RAG-ready data from Revit, IFC, DWG, DGN

Structured Data as DataFrame

The pandas library, which is built around the DataFrame, is downloaded about 8 million times a day. Because of its popularity and ease of use, the DataFrame has become the main format for data processing and automation in ChatGPT. DataFrames can also be created by converting proprietary and parametric CAD (BIM) formats with the DataDrivenConstruction converter.

 

Examples of using data after conversion in ChatGPT4

Using ChatGPT for CAD (BIM) 

Quick QTO with Graph from Revit

2022_rst_advancedsampleproject.rvt

Group the data in the DataFrame by "Type Name", sum the "Volume" parameter and show the number of items in each group. Show the result as a horizontal bar chart without zero values.
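The prompt above corresponds roughly to the following pandas code. This is only a minimal sketch: the sample rows are made up, the helper names `summarize_volume` and `plot_summary` are hypothetical, and only the column names "Type Name" and "Volume" come from the prompt. The code that ChatGPT actually generates may differ.

```python
import pandas as pd

# Hypothetical sample rows standing in for a converted Revit project.
# Only the column names "Type Name" and "Volume" are taken from the prompt.
df = pd.DataFrame({
    "Type Name": ["Wall A", "Wall A", "Slab B", "Door C", "Slab B"],
    "Volume": [2.5, 3.5, 10.0, 0.0, 4.0],
})


def summarize_volume(frame):
    """Sum "Volume" and count items per "Type Name", dropping zero totals."""
    summary = (
        frame.groupby("Type Name")
        .agg(total_volume=("Volume", "sum"), item_count=("Volume", "size"))
        .sort_values("total_volume")
    )
    return summary[summary["total_volume"] > 0]


def plot_summary(summary):
    """Draw the totals as a horizontal bar chart (needs matplotlib)."""
    import matplotlib.pyplot as plt

    ax = summary["total_volume"].plot.barh()
    ax.set_xlabel("Volume")
    # Write the number of items next to each bar.
    for position, count in enumerate(summary["item_count"]):
        ax.text(summary["total_volume"].iloc[position], position, f" n={count}")
    plt.tight_layout()
    return ax


summary = summarize_volume(df)
print(summary)
```

Calling `plot_summary(summary)` followed by `matplotlib.pyplot.show()` displays the chart; the grouping itself needs only pandas.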

Plot Polylines from DWG

family_house_florida.dwg

Find the IDs in the "Layer" column with the value "wall". Take these IDs and look them up in "ParentID". Then, for each "ParentID" group, take the x, y, z values from the "Point" column of each row. Plot a separate polyline for each "ParentID" group, connecting the first and last points. Plot all lines with matplotlib, without a legend.
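A possible pandas version of this prompt is sketched below. The table layout is an assumption: the sample rows are invented, the "ID" column name is assumed, and "Point" is assumed to hold a comma-separated "x,y,z" string. A real converted DWG file may store coordinates differently. The helper names `wall_polylines` and `plot_polylines` are hypothetical.

```python
import pandas as pd

# Invented rows: one "wall" entity (ID 1) and one "door" entity (ID 5),
# each with child rows that carry the vertex coordinates.
df = pd.DataFrame({
    "ID": [1, 2, 3, 4, 5, 6, 7],
    "Layer": ["wall", None, None, None, "door", None, None],
    "ParentID": [None, 1, 1, 1, None, 5, 5],
    "Point": [None, "0,0,0", "4,0,0", "4,3,0", None, "1,1,0", "2,1,0"],
})


def wall_polylines(frame):
    """Return {ParentID: [(x, y, z), ...]} for children of "wall" entities."""
    wall_ids = frame.loc[frame["Layer"] == "wall", "ID"]
    children = frame[frame["ParentID"].isin(wall_ids)]
    polylines = {}
    for parent_id, group in children.groupby("ParentID"):
        points = [
            tuple(float(value) for value in text.split(","))
            for text in group["Point"]
        ]
        points.append(points[0])  # connect the last point back to the first
        polylines[parent_id] = points
    return polylines


def plot_polylines(polylines):
    """Plot every polyline in the x-y plane without a legend (needs matplotlib)."""
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for points in polylines.values():
        xs, ys, _ = zip(*points)
        ax.plot(xs, ys)
    ax.set_aspect("equal")
    return ax


polylines = wall_polylines(df)
print(polylines)
```

The z coordinate is parsed but ignored when plotting, since the sketch draws a 2D plan view.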

Area Distribution By ObjectType For IfcSlab from IFC

Ifc2x3_Duplex_Architecture.ifc

Take only the items that have Level 1 and Level 2 values in the "Parent" parameter and take the items that have IfcSlab values in the "Category" parameter, then group these items by the "ObjectType" parameter and sum the values in the "PSet_Revit_Dimensions Area" parameter and show them as a pie chart
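In pandas, this prompt could translate into something like the sketch below. The sample rows and the helper name `plot_area_share` are hypothetical; the column names "Parent", "Category", "ObjectType" and "PSet_Revit_Dimensions Area" are the ones named in the prompt.

```python
import pandas as pd

# Invented rows standing in for a converted IFC file.
df = pd.DataFrame({
    "Parent": ["Level 1", "Level 2", "Roof", "Level 1", "Level 2"],
    "Category": ["IfcSlab", "IfcSlab", "IfcSlab", "IfcWall", "IfcSlab"],
    "ObjectType": [
        "Floor:Concrete", "Floor:Timber", "Floor:Concrete",
        "Wall:Basic", "Floor:Concrete",
    ],
    "PSet_Revit_Dimensions Area": [100.0, 60.0, 80.0, 30.0, 40.0],
})

# Keep only slabs that belong to Level 1 or Level 2.
mask = df["Parent"].isin(["Level 1", "Level 2"]) & (df["Category"] == "IfcSlab")

# Sum the area for each object type.
area_by_type = df[mask].groupby("ObjectType")["PSet_Revit_Dimensions Area"].sum()
print(area_by_type)


def plot_area_share(series):
    """Show the area distribution as a pie chart (needs matplotlib)."""
    ax = series.plot.pie(autopct="%1.1f%%")
    ax.set_ylabel("")
    return ax
```

Calling `plot_area_share(area_by_type)` draws the pie chart; the filtering and grouping steps need only pandas.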

Video tutorial on using project data in ChatGPT

CAD (BIM) data processing in ChatGPT

DataFrame Pandas

A DataFrame organizes data into a table, much like a sheet in Excel. The rows are individual records or entities, and the columns are the characteristics or attributes of those entities.

In a table describing a construction project, each row can represent one element of the project, and the columns can hold its attributes: category, parameters, position, or the coordinates of its bounding box.
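As an illustration of this layout, here is a minimal sketch of such a table in pandas. All element IDs, column names and values are invented for the example and do not come from a real project.

```python
import pandas as pd

# Each row is one project element; each column is one attribute.
# All names and values here are illustrative assumptions.
elements = pd.DataFrame({
    "ID": [101, 102, 103],
    "Category": ["Walls", "Floors", "Doors"],
    "Type Name": ["Basic Wall 200mm", "Concrete Slab 250mm", "Single Door"],
    "Volume": [12.4, 35.0, 0.1],
    "BBox Min X": [0.0, 0.0, 2.0],
    "BBox Max X": [6.0, 6.0, 2.9],
})

print(elements.shape)   # (rows, columns)
print(elements.loc[0])  # all attributes of the first element
```

Here `elements.shape` reports three rows (elements) and six columns (attributes), and `elements.loc[0]` returns every attribute of a single element.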

 

🚀 Efficient Data Management

DataFrames are optimized for handling large datasets, providing faster data manipulation.

🌐 Support for Heterogeneous Data

They can store different data types (like integers, strings, and floats) in various columns, ideal for real-world data.

🔍 Built-in Operations

DataFrames come equipped with numerous built-in methods for data filtering, sorting, and aggregating, simplifying complex data operations.

🔬 Ease of Data Exploration

Their tabular structure makes it easy to explore, analyze, and visualize data, aiding in quick data inspection and analysis.

🔗 Compatibility with Data Analysis Tools

They seamlessly integrate with various data analysis and visualization libraries, enhancing productivity in data science tasks.

📊 Advanced Data Integration

DataFrames easily interface with different data sources and formats, facilitating the integration and consolidation of diverse data sets.
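Several of the properties listed above (mixed column types, filtering, sorting and aggregating) can be seen in a few lines of pandas. The sketch below uses made-up data and hypothetical column names.

```python
import pandas as pd

# Made-up elements with integer, string and float columns side by side.
elements = pd.DataFrame({
    "ID": [1, 2, 3, 4],
    "Category": ["Walls", "Walls", "Floors", "Doors"],
    "Volume": [12.0, 8.0, 35.0, 0.5],
})

# Heterogeneous data: each column keeps its own data type.
print(elements.dtypes)

# Filtering: keep only the walls.
walls = elements[elements["Category"] == "Walls"]

# Sorting: largest volume first.
largest_first = elements.sort_values("Volume", ascending=False)

# Aggregating: total volume per category.
volume_by_category = elements.groupby("Category")["Volume"].sum()
print(volume_by_category)
```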

The transition from unmanaged data flow to its effective integration into business processes starts with converting data from closed formats to open formats. In information technology, open-source applications allow developers around the world to collaboratively improve software.

A major benefit of open data is its ability to remove the dependence of application developers on specific platforms to access data.

When choosing between open and closed data, experts generally prefer open data, just as structured data is preferred in automation, processing and data warehousing (Figure 2.2-4). Open, structured data is the default in most systems because it is easy to process and can be interpreted unambiguously, which makes it the preferred form for communication and collaboration at the level of requirements and business processes.