QTO with ChatGPT of CAD (BIM) data
24 February 2024
PDF to table with ChatGPT
To start working with the data, we first use code to collect the data sets that need to be checked. To do this, we walk through all folders on the work server, collect every document of a given format and content, and then convert the data into a structured form, as discussed in detail in the chapters "Conversion of unstructured and text data into a structured form" and "Conversion of CAD (BIM) data into a structured form".
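The folder walk described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function name `collect_files` and the example extensions are assumptions, and only the file format (extension) is checked, not the content.

```python
from pathlib import Path


def collect_files(root, extensions):
    """Return all files under `root` (recursively) whose extension is in `extensions`.

    The comparison is case-insensitive, so ".IFC" and ".ifc" are treated alike.
    """
    wanted = {ext.lower() for ext in extensions}
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in wanted
    )


# Example call (the server path is hypothetical):
# project_files = collect_files("//server/projects", [".rvt", ".ifc"])
```

Matching on the file suffix keeps the sketch simple; a check of the file content would need a format-specific reader and is left out here.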
As an illustration, in the Extract step, which loads the data and produces a table of all CAD (BIM) projects, we could use Pandamo (Pandas + Dynamo) for Revit® or IfcOpenShell for IFC. In our example, we use the DataDrivenConstruction converters for the Revit and IFC formats to get structured tables from all projects and merge them into one large DataFrame.
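The merge step can be sketched as below. This is a hedged example that assumes each converted project has already been saved as a CSV table; the function name `merge_project_tables` and the added `source_file` column are illustrative and not part of any converter.

```python
from pathlib import Path

import pandas as pd


def merge_project_tables(csv_paths):
    """Read one CSV table per project and stack them into a single DataFrame.

    Columns that exist in only some projects are kept; missing values become NaN.
    """
    frames = []
    for path in csv_paths:
        frame = pd.read_csv(path)
        # Remember which project file each row came from.
        frame["source_file"] = Path(path).name
        frames.append(frame)
    if not frames:
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True, sort=False)
```

Keeping the source file name in a separate column makes it possible to trace every element in the merged table back to its project.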
A Pandas DataFrame can be loaded not only from structured file formats but also from many other data sources, including the following databases: MySQL, PostgreSQL, SQLite, Microsoft SQL Server, Oracle, Google BigQuery, MariaDB, Amazon Redshift, Teradata, IBM Db2, Snowflake and SAP HANA, as well as ODBC connections (via SQLAlchemy or PyODBC), which we discussed in more detail in the chapter "Structured Data".
❏ To connect to the database, send a text request to ChatGPT:
"Please write an example of connecting MySQL and translating data into a DataFrame"
➤ ChatGPT Answer:
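A minimal sketch of what such a connection might look like (not a verbatim ChatGPT answer). It assumes SQLAlchemy with the PyMySQL driver is installed; the credentials, host, database name and the table name `elements` are hypothetical placeholders, and `load_query` is an illustrative helper name.

```python
import pandas as pd
from sqlalchemy import create_engine, text

# Placeholder connection URL: user, password, host, port and database name
# are hypothetical and must be replaced with real values.
MYSQL_URL = "mysql+pymysql://user:password@localhost:3306/construction_db"


def load_query(db_url, query):
    """Run a SQL query against the database at `db_url` and return a DataFrame."""
    engine = create_engine(db_url)
    try:
        with engine.connect() as connection:
            return pd.read_sql_query(text(query), connection)
    finally:
        # Release the connection pool once the data has been loaded.
        engine.dispose()


# Example call (requires a reachable MySQL server and the PyMySQL driver;
# the table name "elements" is hypothetical):
# df = load_query(MYSQL_URL, "SELECT * FROM elements")
# print(df.head())
```

Because the helper takes a SQLAlchemy URL, the same function should work for the other databases supported by SQLAlchemy once the URL prefix and driver are changed.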
By loading multi-format data into the variable “df”, we have collected and structured the data in a DataFrame, one of the most popular data structures in data processing: a two-dimensional structure that organizes data into a table with rows and columns. We will say more about other formats (Parquet, Apache ORC, JSON, Feather, HDF5) and about data warehouses in the chapter "Modern data technologies in the construction industry".