The fourth part focuses on methodologies and technologies for transforming disparate information into high-quality structured data sets. The processes of forming and documenting data requirements as a basis for effective information architecture in construction projects are discussed in detail. Practical methods of extracting structured information from various sources (PDF documents, images, text files, CAD models) are presented with implementation examples. The use of regular expressions (RegEx) and other tools for automatic validation and verification of data is analyzed. The process of data modeling at the conceptual, logical and physical levels is described step by step, taking into account the specifics of the construction industry. Specific examples of using large language models (LLMs) to automate the structuring and validation of information are demonstrated. Effective approaches to visualizing analysis results are proposed, making analytical information more accessible at all levels of construction project management.
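As a rough illustration of the RegEx-based validation mentioned in this overview, the sketch below checks structured records against a small set of requirements. The attribute names (element_id, width_mm, fire_rating), the patterns and the sample records are hypothetical, chosen only for demonstration; they are not taken from the chapters themselves.

```python
import re

# Hypothetical requirements: each attribute must fully match its pattern.
REQUIREMENTS = {
    "element_id": re.compile(r"[A-Z]{2}-\d{4}"),       # e.g. "WL-0012"
    "width_mm": re.compile(r"\d+(\.\d+)?"),            # plain non-negative number
    "fire_rating": re.compile(r"(REI|EI)\s?\d{2,3}"),  # e.g. "REI 60"
}


def validate(record):
    """Return a list of (attribute, problem) pairs for every failed check."""
    problems = []
    for attr, pattern in REQUIREMENTS.items():
        value = record.get(attr)
        if value is None:
            problems.append((attr, "missing"))
        elif not pattern.fullmatch(str(value).strip()):
            problems.append((attr, "invalid format"))
    return problems


# Illustrative records: the first satisfies all requirements, the second does not.
records = [
    {"element_id": "WL-0012", "width_mm": "200", "fire_rating": "REI 60"},
    {"element_id": "wall12", "width_mm": "20 cm"},
]
results = [validate(r) for r in records]

for record, problems in zip(records, results):
    status = "OK" if not problems else f"FAILED {problems}"
    print(record.get("element_id"), status)
```

Using fullmatch rather than search means a value such as "20 cm" is rejected outright instead of being accepted because it begins with a number.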
051 Learning how to turn documents, PDFs, pictures and texts into structured formats
In the era of the data-driven economy, data is becoming the basis for decision-making rather than an obstacle. Instead of constantly adapting information to each new system and its formats, companies are increasingly striving to...
052 Example of converting a PDF document into a table
One of the most common tasks in construction projects is to process specifications in PDF format. To demonstrate the transition from unstructured data to a structured format, let’s consider a practical example: extracting a table...
054 Converting text data into a structured form
In addition to PDF documents with tables (Fig. 4.1-2) and scanned versions of tabular forms (Fig. 4.1-5), a significant part of information in project documentation is presented in text form. It can be both coherent...
055 Translation of CAD data (BIM) into a structured form
Structuring and categorizing CAD data (BIM) is more challenging because data stored in CAD (BIM) databases is almost always in closed or complex parametric formats, often combining geometric data elements (semi-structured) and metainformation elements (semi-structured...
056 CAD solution vendors move to structured data
Since 2024, the design and construction industry has been undergoing a significant technological shift in the use and processing of data. Instead of free access to design data, CAD system vendors are focusing on promoting the...
057 Speed of decision making depends on data quality
Today’s design data architecture is undergoing fundamental changes. The industry is moving away from bulky, isolated models and closed formats towards more flexible, machine-readable structures focused on analytics, integration and process automation. However, the transition...
058 Data standardization and integration
Effective data management requires a clear standardization strategy. Only with clear requirements for data structure and quality can data validation be automated, manual operations reduced and informed decision making accelerated at all stages of a...
059 Digital interoperability starts with requirements
As the number of digital systems within companies grows, so does the need for data consistency between them. Managers responsible for different IT systems often find themselves unable to keep up with the increasing volume...
060 A common language of construction: the role of classifiers in digital transformation
In the context of digitalization and automation of inspection and processing, a special role is played by element classification systems – a kind of “digital dictionary” that ensures uniformity in the description and parameterization...
061 MasterFormat, OmniClass, Uniclass and CoClass: the evolution of classification systems
Historically, construction element and work classifiers have evolved in three generations, each reflecting the level of available technology and the current needs of the industry in a particular time period (Fig. 4.2-8): First generation (early...
062 Data modeling: conceptual, logical and physical model
Effective management of the data we structured and categorized earlier is impossible without a well-thought-out storage and processing structure. To ensure access and consistency of information at the storage and processing stages, companies use...
064 Creating a database using LLM
With a data model and a description of entities through parameters in hand, we are ready to create databases: the stores that will hold the information on specific processes arriving from the structuring stage. Let’s try to create...
065 Center of Excellence (CoE) for Data Modeling
With data becoming one of the key strategic assets, companies need to do more than just collect and store information correctly – it is important to learn how to manage data systematically. The Center of...
066 Requirements gathering and analysis: transforming communications into structured data
Collecting and managing requirements is the first step to ensuring data quality. Despite the development of digital tools, most requirements are still formulated in an unstructured way: through letters, meeting minutes, phone calls and verbal...
068 Structured requirements and RegEx (regular expressions)
Up to 80% of data created in companies is in unstructured or semi-structured formats (“Structured and unstructured data: What’s the Difference?,” 2024) – text, documents, letters, PDF files, conversations. Such data (Fig. 4.4-1) is difficult...
069 Data collection for the verification process
Before starting validation, it is important to make sure that the data are available in a form suitable for the validation process. This means not just having the information available, but preparing it: the data...
070 Verification of data and results of verification
All new data entering the system – be it documents, tables or database entries from the client, architect, engineer, foreman, logistician or property manager – must be validated against the requirements formulated earlier (Fig. 4.4-9)....
071 Visualization of verification results
Visualization is an essential tool for interpreting inspection results. In addition to the usual summary tables, it can include dashboards, diagrams, and automatically generated PDF documents that group project elements by inspection status. Color coding...
072 Comparison of data quality checks with human life needs
Despite the constant development of data quality control methods and tools, the fundamental principle of information compliance remains unchanged. This principle is built into the foundation of a mature management system, whether in business or...
073 Next steps: turning data into accurate calculations and plans
In this part we looked at how to convert unstructured data into a structured format, develop data models and organize processes for checking the quality of information in construction projects. Data management, standardization and classification...