###### Key concepts of machine learning

27 February 2024###### ML Linear regression with ChatGPT

27 February 2024In the future, the construction company's core business process of cost and time estimating will be transformed through the use of machine learning models. This step in automating costing and estimating will not only improve accuracy and efficiency, but will also be the starting point for the development and implementation of machine learning models in other company business processes.

The following example will extract key data from past projects, and using this historical data as a basis, a machine learning model will be developed that will provide us with the ability to accurately estimate the cost and time frame for new construction projects.

It is important to be able to quickly determine how long a project will take to build and what the total cost of the project will be. These questions about project time and cost have traditionally been at the forefront of the minds of both clients and construction companies since the beginning of the construction industry.

As an example, consider three projects with three key attributes: the number of apartment), the number of floors, and a conditional measure of construction complexity on a scale of 1 to 10, where 10 represents the highest complexity (where 100 apartments is equivalent to the number 10 for ease of visualization). In machine learning, the process of converting and simplifying values such as 100 into 10 or 50 into 5 is called "normalization".

Normalizationin machine learning is the process of bringing different numerical data to a common scale to make it easier to process and analyze. This process is especially important when the data has different scales and units.

Let's assume that the first project (Figure 4.5-10) had 50 apartments, 7 floors, and a complexity score of 2, indicating a relatively simple construction. In the second project, we already had 80 apartments, 9 floors, and a relatively complex project. Under these conditions, the first and second apartment building took 270 and 330 days to build, and the total project cost was $4.5 million and $5.8 million, respectively.

In building a model for such data, the main task is to identify critical attributes, or labels, for prediction, in this case, construction time and cost.

With a small dataset, we will use information about previous construction projects to plan new ones: using machine learning algorithms, our goal is to predict the cost and construction duration of a new project X based on given new project characteristics such as 40 apartments, 4 floors, and the relative high (7) complexity of the project.

To create a predictive machine learning model, we need to choose an algorithm to create the model. An algorithm in machine learning is like a recipe that teaches a computer how to make predictions or decisions based on data. To analyze data about past construction projects and predict the time and cost of future projects (Figure 4.5-10), we can use for example one of the popular machine learning algorithms:

**Linear regression:**This algorithm tries to find a direct relationship between attributes, for example between the number of floors and the construction cost. The goal is to find a linear equation that best describes this relationship, which allows making predictions.**K-nearest neighbors (k-NN)**. This algorithm compares a new project with past projects that were similar in size or complexity. The k-NN categorizes the data based on which k training examples are closest to it. In the context of regression, the result is the mean or median of the k nearest neighbors.**Decision Trees**: It is a predictive modeling model that divides data into subsets based on different conditions using a tree structure. Each node of the tree represents a condition or question leading to further division of the data, and each leaf represents the final prediction or outcome. The algorithm divides the data into smaller groups based on different characteristics, such as first by the number of floors, then by complexity and so on, to make a prediction.

** **

Let's take a look at machine learning algorithms for estimating the cost of a new project, using two popular algorithms as examples: linear regression and K-nearest neighbors.