Structured requirements and RegEx regular expressions
19 February 2024
Data validation and validation results
We have defined the data requirements. To start validation, we now need to collect the data we want to validate and capture the current state of the information in the databases. Incoming data in unstructured formats must first be prepared and converted into structured formats, as we did in the chapters above on converting different types of data. After this step, any data that enters the process takes the form of open structured tables.
Let's take a snapshot of all incoming data and connected databases at a particular point in time. In our example, the snapshot is fictitious and is taken at 23:00:00 on Friday, 29 March 2024.
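The idea of a snapshot can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the system names, the `take_snapshot` helper and the attribute values are hypothetical, and the tables are kept in memory instead of being read from real databases. The one detail taken from the text is the capture time.

```python
import copy
from datetime import datetime

# Hypothetical in-memory stand-ins for the connected systems.
# Attribute values are illustrative and not taken from a real project.
live_systems = {
    "cad_bim": [{"ID": "W-NEW", "TypeName": "Window", "Width": 1200}],
    "logistics": [],  # the new window has not reached this system yet
}

def take_snapshot(systems, taken_at):
    """Return an independent copy of every table, labelled with the capture time."""
    return {"taken_at": taken_at.isoformat(), "tables": copy.deepcopy(systems)}

# The fictitious capture time used in this chapter.
snapshot = take_snapshot(live_systems, datetime(2024, 3, 29, 23, 0, 0))

# Later changes to the live data do not affect the snapshot.
live_systems["cad_bim"][0]["Width"] = 1500

print(snapshot["taken_at"])                       # 2024-03-29T23:00:00
print(snapshot["tables"]["cad_bim"][0]["Width"])  # 1200
```

The deep copy is the point of the sketch: validation runs against a fixed state of the data, even if the source systems keep changing afterwards.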

A snapshot of the CAD (BIM) table of a project that has either already received or is due to receive attribute information about a new window entity
The image shows the information table of an architect's project created in a CAD (BIM) program. This table, which is in effect a database, lists unique window and door system identifiers (attribute «ID»), types (attribute «TypeName»), dimensions (attributes «Width» and «Length»), materials (attribute «Material»), energy rating, acoustic performance and hundreds of other attributes. The data in such a table, filled in within the CAD (BIM) program, is collected from various departments and documents and forms the information model of the project.
Thanks to the reverse engineering tools discussed in the chapter "Translating CAD (BIM) data into a structured form", this information can be organized into separate tables or combined into one common table that integrates different sections of the project.
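As a rough sketch of what "combined into one common table" can mean, the Python example below merges two section tables into a single table whose columns are the union of all attributes. The `combine_tables` helper, the "D-001" identifier and all attribute values are assumptions made for illustration. They do not reproduce the output of any particular reverse engineering tool.

```python
# Hypothetical per-section tables; attribute values are illustrative only.
windows = [{"ID": "W-NEW", "TypeName": "Window", "Width": 1200, "Material": "Wood"}]
doors = [{"ID": "D-001", "TypeName": "Door", "Width": 900, "Length": 2100}]

def combine_tables(*tables):
    """Merge row lists into one table whose columns are the union of all attributes."""
    columns = []
    for table in tables:
        for row in table:
            for name in row:
                if name not in columns:
                    columns.append(name)
    # Attributes that a row does not carry are filled with None.
    rows = [
        {name: row.get(name) for name in columns}
        for table in tables
        for row in table
    ]
    return columns, rows

columns, rows = combine_tables(windows, doors)
print(columns)
for row in rows:
    print(row)
```

Every element becomes one row and every attribute one column. Cells left empty (here `None`) show directly which attributes a given element does not carry.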

Structured data from different systems is a two-dimensional table with columns that denote element attributes
A project in the form of a CAD (BIM) database consists of tens or hundreds of thousands of elements. These elements are grouped into entities by type and category and range from windows and doors of various kinds to concrete slabs and wall elements. Unique identifiers (native IDs) from the CAD (BIM) program, or type attributes («Type Name», «Type» or «Family»), such as "W-NEW" for a new window on the north side, allow the same entity to be tracked in different systems. Information about the new window must be recorded in all systems under the same numeric or alphabetic ID, in this case "W-NEW".
While the names and IDs of entities should be the same in all systems, the set of attributes and their values differs from one system to another. Different project professionals, such as architects, structural engineers, construction managers, logisticians and property managers, look at the same entities differently.
Each role in the process has its own database with its own user interface, ranging from CAD (BIM) projects and structural impact assessments to installation schedules, delivery and storage logistics, and real estate management.
Each system is managed by a professional team of specialists. At the end of the chain, behind the sum of all decisions made about the entered values, stands the system manager, who is responsible for the legal validity and quality of the entered data.
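A small Python sketch can make this concrete: the same ID is looked up in several systems, and each system returns its own set of attributes. The system names, the `find_entity` helper and every attribute except «TypeName», «Width» and «Material» are hypothetical and chosen only for illustration.

```python
ENTITY_ID = "W-NEW"

# Hypothetical records per system, keyed by entity ID.
# "InstallationDate" and all values are assumptions for this example.
systems = {
    "cad_bim": {"W-NEW": {"TypeName": "Window", "Width": 1200, "Material": "Wood"}},
    "scheduling": {"W-NEW": {"InstallationDate": "2024-04-15"}},
    "logistics": {},  # the new window has not been recorded here yet
}

def find_entity(systems, entity_id):
    """Map each system name to the attribute names it holds for the entity, or None if absent."""
    return {
        name: sorted(table[entity_id]) if entity_id in table else None
        for name, table in systems.items()
    }

attributes_by_system = find_entity(systems, ENTITY_ID)
missing_in = [name for name, attrs in attributes_by_system.items() if attrs is None]

print(attributes_by_system)
print("Not yet recorded in:", missing_in)
```

Because the ID is the only value shared by all systems, it is the key that links the records. The attribute sets themselves are expected to differ.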

One and the same entity has the same identifier in different systems, but in each system it carries different attributes that matter only to that system
Now that we have collected the structured requirements and the data at the logical and physical levels in an organised manner, it remains for us to validate the data from the systems against those requirements.