The third part builds a comprehensive understanding of the types of data used in construction and of methods for organizing them effectively. It analyzes the characteristics and specifics of working with structured, unstructured, semi-structured, textual and geometric data in the context of construction projects, and reviews modern storage formats and the protocols for exchanging information between the different systems used in the industry. Practical tools and techniques for transforming multi-format data into a single structured environment are described, including how to integrate CAD (BIM) data. Approaches to ensuring data quality through standardization and validation, which are critical to the accuracy of construction calculations, are proposed. The practical use of modern technologies (Python Pandas, LLMs) is examined in detail, with code examples that solve typical problems in the construction industry. Finally, the part makes the case for creating a competence center (CoE) as an organizational structure for coordinating and standardizing information management approaches.
024 The most important data types in the construction industry
In the modern construction industry, the systems, applications and data warehouses of companies are actively filled with information and data of various types and formats (Fig. 3.1-1). Let’s take a closer look at the main...
025 Structured Data
In the construction industry, information comes from many sources – drawings, specifications, schedules and reports. To effectively manage this flow of information, it needs to be structured. Structured data allows you to organize information in...
026 Relational databases RDBMS and SQL query language
Relational databases (RDBMS) are data warehousing systems that organize information into tables with defined relationships between them to efficiently store, process, and analyze data. Data organized in databases (RDBMS) are not just digital information; they...
027 SQL queries in databases and new trends
The main advantage of the SQL language, often used in relational databases, over other types of information management (for example, with the help of classic Excel spreadsheets) is the support of very large volumes of...
028 Unstructured data
Although most of the data used in applications and information systems is in structured form, most of the information generated in construction is in the form of unstructured data – images, videos, text documents, audio...
029 Text data: between unstructured chaos and structure
Textual data in the construction industry covers a wide range of formats and types of information, from paper documents to informal methods of communication such as letters, conversations, work correspondence and verbal meetings at the...
030 Semi-structured and loosely structured data
Semi-structured data contains some level of organization, but does not have a strict schema or structure. Although such information includes structured elements (e.g. dates, employee names and lists of tasks completed), the format of presentation...
031 Geometric data and its application
If metadata about project elements are almost always stored in the form of tables, structured or weakly structured formats, then geometric data of project elements in most cases are created using special CAD tools (Fig....
032 CAD data: from design to data storage
Modern CAD and BIM systems store data in their own, often proprietary formats: DWG, DXF, RVT, DGN, PLN and others. These formats support both 2D and 3D representations of objects, preserving not only the geometry...
033 The emergence of the BIM (BOM) concept and the use of CAD in processes
The concept of Building Information Modeling (BIM), first outlined in the 2002 BIM Whitepaper (“Building Information Modeling Whitepaper site,” 2003), originated from the marketing initiatives of CAD software manufacturers. It emerged from the marketing initiatives...
034 Filling systems with data in the construction industry
Whether it is large corporations or medium-sized companies, specialists are daily engaged in filling program systems and databases with various interfaces with multiformat information (Fig. 3.2-1), which, with the help of managers, must interact with...
035 Data transformation: the critical foundation of modern business analysis
Today, most companies are facing a paradox: about 80% of their daily processes still rely on classic structured data – familiar Excel spreadsheets and relational databases (RDBMS) (M. Shacklett, “Structured and unstructured data: Key differences,”...
036 Data models: relationships in data and relationships between elements
Data in information systems are organized in different ways – depending on the tasks and requirements for storing, processing and transmitting information. The key difference between the types of data models, the form in which...
037 Proprietary formats and their impact on digital processes
One of the key challenges faced by construction companies during digitalization is limited access to data. This makes it difficult to integrate systems, reduces the quality of information and complicates the organization of efficient processes....
038 Open formats are changing the approach to digitalization
The construction industry was one of the last to address the problem of closed and proprietary data. Unlike other sectors of the economy, digitalization has been slow to develop here. The reasons for this include...
039 Paradigm Shift: Open Source as the End of the Era of Software Vendor Dominance
The construction industry is undergoing a shift that cannot be monetized in the usual way. The data-driven, data-centric approach and the use of Open Source tools are leading to a rethinking of the...
040 Structured open data: the foundation of digital transformation
While in past decades business sustainability was largely determined by the choice of software solutions and dependence on specific vendors, in today’s digital economy the key factor is data quality and the ability to work...
041 LLM chats: ChatGPT, LlaMa, Mistral, Claude, DeepSeek, QWEN, Grok for automating data processing
The emergence of Large Language Models (LLMs) was a natural extension of the movement towards structured open data and the Open Source philosophy. When data becomes organized, accessible and machine-readable, the next step is a...
042 Large Language Models (LLM): how they work
Large language models (ChatGPT, LlaMa, Mistral, Claude, DeepSeek, QWEN, Grok) are neural networks trained on huge amounts of textual data from the Internet, books, articles and other sources. Their main task is to understand the...
043 Utilizing local LLMs for sensitive company data
The appearance of the first chat-LLMs in 2022 marked a new stage in the development of artificial intelligence. However, immediately after the widespread adoption of these models, a legitimate question arose: how secure is it...
044 Full control of AI in the company and how to deploy your own LLM
Modern tools allow companies to deploy a large language model (LLM) locally in just a few hours. This gives complete control over data and infrastructure, eliminating dependence on external cloud services and minimizing the risk...
045 RAG: intelligent LLM assistants with access to corporate data
The next stage in the evolution of LLM applications in business is the integration of models with up-to-date, real-time corporate data. This approach is called Retrieval-Augmented Generation (RAG). In this architecture, the...
046 Choosing an IDE: from LLM experiments to business solutions
When diving into the world of automation, data analysis, and artificial intelligence – especially when working with large language models (LLMs) – it is critical to choose the right integrated development environment (IDE). This IDE...
048 Python Pandas: an indispensable tool for working with data
Pandas occupies a special place in the world of data analysis and automation. It is one of the most popular and widely used libraries of the Python programming language (“Python Packages Download Stats,” 2024), designed to...
049 DataFrame: a universal tabular data format
DataFrame is the central structure in the Pandas library, which is a two-dimensional table (Fig. 3.4-6) where rows correspond to individual objects or records and columns correspond to their characteristics, parameters, or categories. This structure...
050 Next steps: building a sustainable data framework
In this part, we reviewed the key types of data used in the construction industry, became familiar with the different formats used to store them and analyzed the role of modern tools, including LLMs and IDEs, in...