Translation of CAD (BIM) data into a structured form
19 February 2024
Creating a database using ChatGPT
In the construction industry, companies face the challenge of managing vast amounts of data collected in a variety of formats, from emails and Excel and CSV documents to scanned JPEG and PDF images. This makes information difficult to organise, retrieve and analyse. To solve these access problems and improve data management efficiency, companies create tables and databases that meet their business requirements, using data modelling techniques.
A data model serves as a conceptual tool for articulating and sharing business requirements. It depicts the characteristics of the data, the business rules that govern it, and the organisation of the data in the tables and database.
Data modelling is similar to building a house: a company hires a data architect to define the business requirements for the future database. The architect then creates a data model, after which the company hires builders, i.e., database administrators and developers, to implement the project according to those business requirements.
Thus, every construction company, in addition to erecting buildings, must master the art of "building" databases (tables) and learn to create links between them skilfully, as if laying bricks into a strong and reliable wall of company knowledge and data.
Key concepts in data modeling include:
- Entities: They represent the elements in the business context for which data is collected. At the level of project model creation in CAD (BIM) software, the entities can be individual elements of the project, whereas at the estimate or calculation level the entities can be groups of elements collected by type, category or other attributes.
- Attributes: Attributes serve to organise and define the structure of the data, covering specific details that need to be stored about entities. This can include information about the dimensions of an item, its properties, logistics parameters, assembly prices - all of which are characteristics of the entity. Entities are usually database tables and attributes are usually columns.
- Relationships: The connections between entities indicate how one entity relates to another within a data model. These relationships can be categorised as "one to one," "one to many" (or, read in the other direction, "many to one"), or "many to many."
- ER Diagrams: Entity-Relationship (ER) diagrams visualize the entities and their interrelations. These diagrams can represent a conceptual, logical, or physical data model, each serving different phases of data modeling.
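To make these concepts more concrete, here is a minimal Python sketch of two entities, their attributes, and a "one to many" relationship between them. The class names, attribute names and sample values (`Element`, `Category`, `volume_m3`, "Wall W-01") are hypothetical and chosen only for illustration; a real project model would carry many more attributes.

```python
from dataclasses import dataclass, field


@dataclass
class Element:
    """Entity: one element of the project model (hypothetical attributes)."""
    element_id: int
    name: str
    volume_m3: float


@dataclass
class Category:
    """Entity: a group of elements, as used at the estimate level."""
    category_id: int
    name: str
    # Relationship: one category holds many elements ("one to many").
    elements: list[Element] = field(default_factory=list)

    def total_volume(self) -> float:
        """Sum the volume attribute over all elements in this category."""
        return sum(e.volume_m3 for e in self.elements)


walls = Category(1, "Walls")
walls.elements.append(Element(101, "Wall W-01", 12.5))
walls.elements.append(Element(102, "Wall W-02", 7.5))
print(walls.total_volume())  # prints 20.0
```

In a database, each of these classes would typically become a table and each field a column.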
In database design, understanding the levels of data abstraction is essential to effectively map out the system's architecture. The design process is typically segmented into three key models – each serving a distinct purpose and level of detail in representing the data and its structure. Here's a brief look at each:
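Conceptual Data Model: This model outlines the primary entities and their relationships without going into detail about attributes. It's typically used in the initial planning stages.
Logical Data Model: This model refines the conceptual model by defining the specific attributes, keys and relationships between entities, together with the business rules that govern the data.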
Physical Data Model: This model details the necessary structures for database implementation, including tables, columns, and relationships. It focuses on database performance, indexing strategies, physical storage, and denormalization to optimize the physical deployment of databases.
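As an illustration of what a physical data model might look like, the sketch below uses Python's built-in `sqlite3` module to declare tables, column types, integrity constraints and an index. SQLite is assumed here only because it needs no installation; the table and column names (`category`, `element`, `volume_m3`) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

conn.executescript("""
CREATE TABLE category (
    category_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE
);
CREATE TABLE element (
    element_id  INTEGER PRIMARY KEY,
    category_id INTEGER NOT NULL REFERENCES category(category_id),
    name        TEXT NOT NULL,
    volume_m3   REAL NOT NULL CHECK (volume_m3 >= 0)
);
-- Index to speed up lookups of all elements in a category.
CREATE INDEX idx_element_category ON element(category_id);
""")

conn.execute("INSERT INTO category VALUES (1, 'Walls')")
conn.execute("INSERT INTO element VALUES (101, 1, 'Wall W-01', 12.5)")

# An element pointing at a non-existent category violates the foreign key.
try:
    conn.execute("INSERT INTO element VALUES (102, 99, 'Orphan', 1.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(rejected)  # prints True
```

The constraints let the database itself reject inconsistent rows, so the business rules from the logical model do not depend on every application checking them.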
The structured workflow for data model development and database construction can be described as follows:
- Gathering business requirements: This initial step involves gathering information about operational business needs and data requirements.
- Entity Identification: Based on the collected requirements, this phase involves identifying the specific entities - individual objects, types, categories - about which data needs to be collected.
- Conceptual and logical data modelling: In this stage, the structure of the data model begins to emerge. It starts with a conceptual model that gives a broad view of the entities of the system and their relationships without going into detail. This model then grows into a logical data model that refines the conceptual blueprint by defining in detail the specific attributes, keys and relationships between entities, establishing business data rules and frameworks.
- Physical Data Modelling: An extension of the data modelling process, this phase turns the logical model into a detailed technical schema suitable for implementation. It specifies the exact tables, columns, data types, and integrity constraints that will be used to build the database, taking into account performance considerations such as indexing strategies and physical storage options.
- Database Creation: The culmination of the workflow is the actual database construction, where the physical model is implemented in the database management system. The database is then ready to store and organise business data as described in the previous models.
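The final step of the workflow above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses the built-in `sqlite3` module as the database management system, and the sample elements, categories and unit prices are invented. The helper names `build_database` and `cost_by_category` are likewise hypothetical.

```python
import sqlite3

# Hypothetical sample data: (element name, category, volume in m3).
ELEMENTS = [
    ("Wall W-01", "Walls", 12.5),
    ("Wall W-02", "Walls", 7.5),
    ("Slab S-01", "Slabs", 30.0),
]
# Hypothetical unit prices per m3 for each category.
PRICES = {"Walls": 100.0, "Slabs": 80.0}


def build_database(path=":memory:"):
    """Implement the physical model and load the sample data."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript("""
        CREATE TABLE category (
            name       TEXT PRIMARY KEY,
            unit_price REAL NOT NULL
        );
        CREATE TABLE element (
            name      TEXT PRIMARY KEY,
            category  TEXT NOT NULL REFERENCES category(name),
            volume_m3 REAL NOT NULL
        );
    """)
    conn.executemany("INSERT INTO category VALUES (?, ?)", PRICES.items())
    conn.executemany("INSERT INTO element VALUES (?, ?, ?)", ELEMENTS)
    conn.commit()
    return conn


def cost_by_category(conn):
    """Group elements by category and total their estimated cost."""
    rows = conn.execute("""
        SELECT c.name, SUM(e.volume_m3 * c.unit_price)
        FROM element AS e
        JOIN category AS c ON c.name = e.category
        GROUP BY c.name
        ORDER BY c.name
    """)
    return dict(rows.fetchall())


conn = build_database()
print(cost_by_category(conn))  # prints {'Slabs': 2400.0, 'Walls': 2000.0}
```

Once the data sits in related tables, an estimate-level view (elements grouped by category) becomes a single query rather than a manual spreadsheet exercise.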
In the upcoming chapter «Data quality requirements and assurance», we'll delve deeper into gathering business requirements and identifying entities. For now, let's build a basic database with minimal requirements using ChatGPT.