Effective data management requires a clear standardization strategy. Only with clear requirements for data structure and quality can data validation be automated, manual operations reduced and informed decision making accelerated at all stages of a project.
In daily practice, a construction company has to process hundreds of files every day: e-mails, PDF documents, CAD design files, and data from IoT sensors, all of which need to be integrated into the company’s business processes.
Like a forest, a company’s ecosystem of databases and tools (Fig. 4.2-2) must learn to draw nutrients from incoming multiformat data in order to produce the results the company needs.
To deal with this flow of data effectively, you do not necessarily need to hire an army of managers. You first need to develop strict requirements and standards for data, and then use appropriate tools to validate, unify and process it automatically.

To automate data validation and unification (for subsequent automatic integration), start by describing the minimum necessary data requirements for each specific system. These requirements answer the following questions:
- What exactly do you need to get?
- In what form (structure, format)?
- What attributes are mandatory?
- What tolerances in accuracy and completeness are acceptable?
Data requirements describe the criteria for the quality, structure and completeness of the information received and processed. For example, for texts in PDF documents it is important to ensure accurate formatting in accordance with industry standards (Fig. 7.2-14 – Fig. 7.2-16). Objects in CAD models must have correct attributes (dimensions, codes, links to classifiers) (Fig. 7.3-9, Fig. 7.3-10). And for contract scans, what matters is legible dates and the ability to automatically extract the amount and key terms (Fig. 4.1-7 – Fig. 4.1-10).
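As an illustration, the four questions above can be written down as a small machine-readable rule set and checked automatically. The following is only a minimal sketch in Python: the attribute names (`element_id`, `classification_code`, `length_m`, `height_m`, `fire_rating`), the sample records and the 95% acceptance threshold are assumptions made for the example, not requirements taken from any particular CAD system or standard.

```python
from dataclasses import dataclass
from typing import Any, Dict, List, Optional, Tuple


@dataclass(frozen=True)
class AttributeRule:
    """One requirement for one attribute of an incoming record (hypothetical schema)."""
    name: str                              # what exactly we need to get
    expected_types: Tuple[type, ...]       # in what form
    mandatory: bool = True                 # is the attribute mandatory?
    min_exclusive: Optional[float] = None  # value must be strictly greater than this


# Assumed minimum requirements for a wall element exported from a CAD model.
WALL_RULES: List[AttributeRule] = [
    AttributeRule("element_id", (str,)),
    AttributeRule("classification_code", (str,)),
    AttributeRule("length_m", (int, float), min_exclusive=0.0),
    AttributeRule("height_m", (int, float), min_exclusive=0.0),
    AttributeRule("fire_rating", (str,), mandatory=False),
]

# Assumed completeness tolerance: at least 95% of records must pass all rules.
MIN_VALID_SHARE = 0.95


def validate_record(record: Dict[str, Any], rules: List[AttributeRule]) -> List[str]:
    """Return a list of human-readable violations; an empty list means the record passes."""
    errors: List[str] = []
    for rule in rules:
        value = record.get(rule.name)
        if value is None or value == "":
            if rule.mandatory:
                errors.append(f"{rule.name}: missing mandatory attribute")
            continue
        # bool is a subclass of int in Python, so it is rejected explicitly.
        if isinstance(value, bool) or not isinstance(value, rule.expected_types):
            expected = "/".join(t.__name__ for t in rule.expected_types)
            errors.append(f"{rule.name}: expected {expected}, got {type(value).__name__}")
            continue
        if rule.min_exclusive is not None and value <= rule.min_exclusive:
            errors.append(f"{rule.name}: must be greater than {rule.min_exclusive}")
    return errors


def valid_share(records: List[Dict[str, Any]], rules: List[AttributeRule]) -> float:
    """Share of records that satisfy every rule (0.0 for an empty batch)."""
    if not records:
        return 0.0
    valid = sum(1 for record in records if not validate_record(record, rules))
    return valid / len(records)


# Invented sample data: one clean record, one with an empty code, one with a text length.
SAMPLE_RECORDS: List[Dict[str, Any]] = [
    {"element_id": "W-001", "classification_code": "21.10", "length_m": 4.2, "height_m": 2.8},
    {"element_id": "W-002", "classification_code": "", "length_m": 3.0, "height_m": 2.8},
    {"element_id": "W-003", "classification_code": "21.10", "length_m": "5,1", "height_m": 2.8},
]

for record in SAMPLE_RECORDS:
    print(record["element_id"], validate_record(record, WALL_RULES) or "OK")

share = valid_share(SAMPLE_RECORDS, WALL_RULES)
print(f"valid share: {share:.0%}, batch accepted: {share >= MIN_VALID_SHARE}")
```

Keeping the rules as data rather than hard-coding them into the checking function means that each receiving system can maintain its own rule list, while the same validation routine is reused for all of them.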
Formulating data requirements and automatically checking compliance with them is one of the most time-consuming, yet most critical, steps in business processes.
As mentioned in Part 3 of this book, between 50% and 90% of business intelligence (BI) professionals’ time is spent on data preparation rather than analysis (Fig. 3.2-5). This process includes data collection, verification, validation, harmonization, and structuring.
According to a 2016 survey (National Institutes of Health, “NIH Strategic Plan for Data Science,” 2016), data scientists across a wide range of fields reported that they spend most of their working time (about 80%) on what they least like to do (Fig. 4.2-3): collecting existing datasets and organizing (unifying, structuring) them. This leaves less than 20% of their time for creative tasks, such as finding patterns and regularities that lead to new insights and discoveries.

Successful data management in a construction company requires a comprehensive approach that includes parameterization of tasks, formulation of data quality requirements, and the use of suitable tools to validate data against those requirements automatically.