
QTO with ChatGPT of CAD (BIM) data
24 February 2024
To start working with the data, we first use code to collect the data arrays that need to be checked. We go through all folders on the work server, collect all documents of a certain format and content, and then convert the data into a structured form, as discussed in detail in the chapters "Conversion of unstructured and text data into a structured form" and "Conversion of CAD (BIM) data into a structured form".
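The folder scan described above can be sketched roughly as follows. This is a minimal sketch, assuming the work server is reachable as an ordinary file path; the function name `collect_files` and the default list of extensions are illustrative and do not come from the original text.

```python
from pathlib import Path


def collect_files(root, extensions=(".rvt", ".ifc", ".pdf")):
    """Return every file under `root` whose extension is in `extensions`.

    The comparison is case-insensitive, so "model.RVT" and "model.rvt"
    are both collected. The extension list is an assumption.
    """
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        path
        for path in Path(root).rglob("*")  # walk all subfolders recursively
        if path.is_file() and path.suffix.lower() in wanted
    )
```

Calling `collect_files("/path/to/server/share")` with a real mount point returns a sorted list of `Path` objects that can then be passed to the conversion step.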

Convert CAD (BIM) data into one large DataFrame that will contain all sections of the project.
As an illustrative example, in the Extract step, where the data is loaded and a table of all CAD (BIM) projects is obtained, we can use Pandamo (Pandas + Dynamo) for Revit® and IfcOpenShell for IFC. In our example, we will use DataDrivenConstruction converters for the Revit and IFC formats to get structured tables from all projects and merge them into one big DataFrame.

Conversion of all RVT and IFC files into one large structured DataFrame (df)
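The merge step can be illustrated with a short sketch. It assumes that each project has already been converted to a CSV table; the converter's actual output format is not specified here, so treat this as an assumption. The function name `merge_project_tables` and the `source_file` column are hypothetical.

```python
from pathlib import Path

import pandas as pd


def merge_project_tables(csv_paths):
    """Read one CSV table per project and stack them into a single DataFrame."""
    frames = []
    for path in csv_paths:
        table = pd.read_csv(path, low_memory=False)
        # Remember which project file each row came from (hypothetical column name).
        table["source_file"] = Path(path).name
        frames.append(table)
    if not frames:
        return pd.DataFrame()
    # Projects rarely share identical parameters: concat keeps the union
    # of all columns and fills the missing values with NaN.
    return pd.concat(frames, ignore_index=True, sort=False)
```

Keeping the source file name in its own column makes it possible to group or filter the combined table by project later on.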
Pandas can load into a DataFrame not only structured file formats but also data from many other sources, including the following databases: MySQL, PostgreSQL, SQLite, Microsoft SQL Server, Oracle, Google BigQuery, MariaDB, Amazon Redshift, Teradata, IBM Db2, Snowflake, and SAP HANA, as well as ODBC connections (via SQLAlchemy or PyODBC). We discussed these in more detail in the chapter "Structured Data".
❏ To connect to the database, send a text request to ChatGPT:
"Please write an example of connecting MySQL and translating data into a DataFrame"
➤ ChatGPT Answer:

Example of connecting Pandas to a MySQL database and importing its data into a DataFrame
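A minimal sketch of such a connection is given below. It assumes SQLAlchemy and the PyMySQL driver are installed. The helper names `mysql_url` and `query_to_dataframe`, and all credentials, hosts, and table names, are placeholders and not values from the original.

```python
import pandas as pd
from sqlalchemy import create_engine, text
from sqlalchemy.engine import URL


def mysql_url(user, password, host, database, port=3306):
    """Build a SQLAlchemy URL for MySQL using the PyMySQL driver (assumed installed)."""
    return URL.create(
        drivername="mysql+pymysql",
        username=user,
        password=password,  # URL.create escapes special characters safely
        host=host,
        port=port,
        database=database,
    )


def query_to_dataframe(url, query):
    """Run `query` against the database at `url` and return the rows as a DataFrame."""
    engine = create_engine(url)
    try:
        with engine.connect() as connection:
            return pd.read_sql(text(query), connection)
    finally:
        engine.dispose()  # release pooled connections
```

With placeholder credentials, a call could look like `df = query_to_dataframe(mysql_url("user", "password", "localhost", "construction"), "SELECT * FROM elements")`. The same `query_to_dataframe` function should work with other SQLAlchemy-supported databases when given a different URL.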
By loading data from multiple formats into the variable "df", we have collected and structured it in a DataFrame, one of the most popular data structures in the data processing world. A DataFrame is a two-dimensional structure that organizes data into a table with rows and columns. We will talk more about other formats, such as Parquet, Apache ORC, JSON, Feather, and HDF5, and about data warehouses in the chapter "Modern data technologies in the construction industry".