
Significant changes are brewing in the construction data processing industry. DeepSeek-V3, a free, open-source model, is showing capabilities that surpass OpenAI's best models.
22 January 2025
DeepSeek and other LLMs will lead to a revolution in project data processing.
Today, almost every CAD program uses embedded database technologies (SQLite, MySQL and the like) to manage project data. In fact, every construction project created in such a system is a set of files representing specialized database formats.
Professionals from other industries (manufacturing, logistics, retail, etc.) have long been solving similar problems using ETL (Extract, Transform, Load) tools. Almost all business processes in those industries revolve around ETL:
▪️ Extract data from systems, databases, and files (via APIs, parsers, or converters)
▪️ Transform the data by cleaning it, combining it with other sources, and calculating metrics
▪️ Load the prepared information into target systems: BI platforms (Power BI, Tableau), ERP systems, or specialized databases for reporting and forecasting
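The three steps above can be sketched in a few lines of Python. This is a minimal illustration, not a description of any particular CAD product: the `elements` table, its `category` and `volume_m3` columns, and the `volume_report` target table are all hypothetical, and in-memory SQLite databases stand in for both the project file and the reporting system.

```python
import sqlite3


def extract(conn):
    """Extract: read raw element rows from the (hypothetical) project database."""
    return conn.execute("SELECT category, volume_m3 FROM elements").fetchall()


def transform(rows):
    """Transform: skip rows without a volume, normalize names, total per category."""
    totals = {}
    for category, volume in rows:
        if volume is None:
            continue  # cleaning step: ignore elements with no volume
        key = category.strip().title()
        totals[key] = totals.get(key, 0.0) + volume
    return totals


def load(conn, totals):
    """Load: write the aggregated metrics into a (hypothetical) reporting table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS volume_report "
        "(category TEXT PRIMARY KEY, total_m3 REAL)"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO volume_report VALUES (?, ?)",
        sorted(totals.items()),
    )
    conn.commit()


def run_demo():
    # Stand-in for a project file: an in-memory database with made-up sample rows.
    source = sqlite3.connect(":memory:")
    source.execute("CREATE TABLE elements (category TEXT, volume_m3 REAL)")
    source.executemany(
        "INSERT INTO elements VALUES (?, ?)",
        [("wall", 12.5), (" Wall ", 7.5), ("slab", 30.0), ("door", None)],
    )
    # Stand-in for the target system a BI tool would read from.
    target = sqlite3.connect(":memory:")
    load(target, transform(extract(source)))
    return target.execute(
        "SELECT category, total_m3 FROM volume_report ORDER BY category"
    ).fetchall()


if __name__ == "__main__":
    print(run_demo())
```

In a real pipeline the extract step would point at the actual project database or an export from it, and the load step at whatever store the BI platform reads.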
The construction industry will follow the same path in the coming years. Moreover, traditional data-processing approaches (ETL, ELT, OLAP) will be complemented by RAG (Retrieval-Augmented Generation) and LLM (Large Language Model) tools, which will be applied at the stages of data extraction, analysis/transformation, and loading.
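To make the RAG idea concrete, here is a rough sketch of its retrieval half: pick the project records most relevant to a question and place them in the prompt that would be sent to an LLM. Everything here is an assumption for illustration: the sample records are invented, a simple word-count cosine similarity stands in for a real embedding model, and no actual LLM is called.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two token-count vectors (Counters)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def retrieve(question, records, k=2):
    """Return the k records most similar to the question."""
    q = Counter(tokenize(question))
    ranked = sorted(
        records,
        key=lambda r: cosine(q, Counter(tokenize(r))),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question, records, k=2):
    """Assemble the retrieved records and the question into one prompt string."""
    context = "\n".join("- " + r for r in retrieve(question, records, k))
    return f"Context:\n{context}\n\nQuestion: {question}"


# Invented sample records, standing in for rows extracted from a project database.
RECORDS = [
    "Wall W-101: concrete, volume 12.5 m3, level 1",
    "Door D-007: timber, fire rating 30 min, level 2",
    "Slab S-001: concrete, volume 30 m3, level 1",
]

if __name__ == "__main__":
    print(build_prompt("What is the fire rating of door D-007?", RECORDS, k=1))
```

A production setup would typically replace the word-count vectors with embeddings stored in a vector database, but the flow (retrieve, then generate from the retrieved context) stays the same.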
No more headaches with data schemas and differing formats. What remains will be structured, columnar, vectorized databases that any data analytics tool can easily work with.