Textual data in the construction industry spans a wide range of formats, from paper documents to informal communication such as letters, conversations, work correspondence, and verbal meetings at the construction site. This data carries information that is essential for managing construction projects: details of design decisions, changes to plans, discussions of safety issues, and negotiations with contractors and clients (Fig. 3.1-9).

Textual information can be either formalized or unstructured. Formalized data includes Word documents (.doc, .docx), PDF files, and text files of meeting minutes (.txt). Non-formalized (unstructured) data includes messenger and email correspondence, meeting transcripts (Teams, Zoom, Google Meet), and audio recordings of discussions (.mp3, .wav), which must first be converted to text.
Written documents such as formal requests, contract terms and conditions, and emails usually already have some structure. Verbal communications and work correspondence, by contrast, often remain unstructured, which makes them difficult to analyze and integrate into project management systems.
The key to effective management of text data is to convert it into a structured format. This allows you to automatically integrate the processed information into existing systems that already work with structured data.

Automatically converting textual information into a structured form (Fig. 3.1-10) usually involves several steps:
- Text recognition (OCR) – converting images of documents and drawings into a machine-readable format.
- Text analysis (NLP) – automatically identifying key parameters such as dates, amounts, and other figures related to the project.
- Data classification – categorizing information (finance, logistics, risk management).
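
The analysis and classification steps above can be illustrated with a minimal Python sketch. It assumes the text has already been recognized (the OCR step is not shown) and uses only simple regular expressions and keyword counts in place of a real NLP model. The keyword lists, the patterns, the `Record` type, and the sample message are all hypothetical and serve only as an illustration.

```python
from __future__ import annotations

import re
from dataclasses import dataclass

# Hypothetical keyword lists; a real project would tune these
# or replace them with a trained classifier.
CATEGORY_KEYWORDS = {
    "finance": ("invoice", "payment", "budget", "cost"),
    "logistics": ("delivery", "shipment", "supplier", "crane"),
    "risk management": ("safety", "hazard", "delay", "incident"),
}

# Simplified patterns: ISO dates (YYYY-MM-DD) and amounts prefixed by USD or EUR.
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
AMOUNT_RE = re.compile(r"(?:USD|EUR)\s?\d[\d,]*(?:\.\d+)?")


@dataclass
class Record:
    text: str
    dates: list[str]
    amounts: list[str]
    category: str


def classify(text: str) -> str:
    """Pick the category whose keywords occur most often in the text."""
    lowered = text.lower()
    scores = {
        category: sum(lowered.count(keyword) for keyword in keywords)
        for category, keywords in CATEGORY_KEYWORDS.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "unclassified"


def structure(text: str) -> Record:
    """Turn a free-text message into a structured record."""
    return Record(
        text=text,
        dates=DATE_RE.findall(text),
        amounts=AMOUNT_RE.findall(text),
        category=classify(text),
    )


message = "Invoice for concrete delivery on 2024-03-15: payment of EUR 12,500.00 is due."
record = structure(message)
print(record.category, record.dates, record.amounts)
```

Keyword counting is deliberately naive: it shows the shape of the output (dates, amounts, category) rather than a production-quality method.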
After recognition and classification, the structured data can be integrated into databases and used in automated reporting and management systems.
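
As a rough illustration of this integration step, the sketch below loads a few already structured records into an in-memory SQLite database and runs a simple summary query of the kind an automated report might use. The table name, column names, and sample records are assumptions made for the example, not part of any particular project management system.

```python
import sqlite3

# Hypothetical records, as a text-processing step might produce them.
records = [
    {"source": "email_0412.txt", "category": "finance",
     "doc_date": "2024-03-15", "amount": 12500.0},
    {"source": "minutes_0418.txt", "category": "risk management",
     "doc_date": "2024-03-18", "amount": None},
    {"source": "email_0420.txt", "category": "finance",
     "doc_date": "2024-03-20", "amount": 4300.0},
]

# An in-memory database keeps the example self-contained.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE text_records (
        source   TEXT,
        category TEXT,
        doc_date TEXT,
        amount   REAL
    )
    """
)

# Named placeholders map each dictionary key to a column value.
conn.executemany(
    "INSERT INTO text_records VALUES (:source, :category, :doc_date, :amount)",
    records,
)
conn.commit()

# A simple report: number of records and total amount per category.
report = conn.execute(
    """
    SELECT category, COUNT(*), COALESCE(SUM(amount), 0)
    FROM text_records
    GROUP BY category
    ORDER BY category
    """
).fetchall()

for category, count, total in report:
    print(category, count, total)
```

Once the records are in a table, the same data can feed dashboards or scheduled reports without anyone having to reread the original messages.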