In this article, we will cover a use case in the construction industry: forecasting which materials a construction will need and when they will be required. In the construction industry, there is a lot of uncertainty between the time an order is placed and the time it is actually executed, due to several factors which will be described in detail below.
Business Problem
Let’s cover the case where we want to buy heavy industry materials from a supplier, but we only have a high-level estimate of the amount we will need. We do not know up front exactly when the materials will be needed or what their characteristics will be, since the project may be delayed and requirements may change between order and execution. We have clients that are executing constructions and contact us with preliminary orders with their requirements.
We need to know several things about this process:
When will this specific client execute the order?
What material characteristics will be preferable for this customer?
What are the best materials to keep in stock right now, and in what amounts? The supplier needs some fixed lead time (e.g. 4 weeks) to build them and transport them to a storage facility. If we keep too little in stock, we’ll delay the construction. If we keep too much, unused materials can degrade and will waste valuable storage capacity.
Data Entities
Let’s consider the following data entities and associated historical data for this problem:
Supplier
Dispatches the raw materials we need.
Supplier ID
ZIP Code
Builder
Executes the construction.
Builder ID
ZIP Code
Construction
The site which is being built.
Location
Construction Type (Residential/Industrial, industry type)
Building Area/Dimensions
Number of expected workers
Material
A material request.
Material Type (e.g. Cement, beams, rebars)
Characteristics: e.g. – Amount of Cement 3000kg, T-bar beams width 3”, rebars 1/2”, strength, …
For each stakeholder involved in the process, there are factors that add uncertainty to the questions described above. We may over- or underestimate the amount/quality of the required materials (either because of inaccurate information in the construction plans or because of internal uncertainties in the estimates). The builders may waste a certain material or use it more efficiently than expected. The delay between order and execution depends on the complexity of the construction process, the time of the year (holidays!), and supplier bottlenecks, among others.
Relevant Features
There are some relevant features that can be extracted to make this problem easier to predict:
Construction x Material: Amount and type of each material needed at order time and execution time for a certain construction. This tells us which constructions over- and underestimated certain materials, and what was the delay between order and execution. It will be used for building the targets of our problem.
Builder/Supplier: Statistics for the historical differences in material amount/characteristics between order time and execution time (e.g. ordered 3 tons of cement but only needed 2.5, on average, over the past 3 months).
Time: Time of the year (month, quarter, season, …) and historical features on the difference between order time and execution time.
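The Builder/Supplier and Time features above can be sketched with a small aggregation. This is a minimal illustration, not the article's pipeline: the `MaterialRecord` fields, the `builder_history_features` helper, and the sample values are all hypothetical names and data chosen for the example.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from statistics import mean


@dataclass
class MaterialRecord:
    # Hypothetical record layout: one row per construction x material.
    builder_id: str
    ordered_amount: float   # amount requested at order time (e.g. tons)
    executed_amount: float  # amount actually used at execution time
    order_date: date
    execution_date: date


def builder_history_features(records):
    """Per builder: mean executed/ordered ratio and mean order-to-execution delay."""
    grouped = defaultdict(list)
    for record in records:
        grouped[record.builder_id].append(record)

    features = {}
    for builder_id, rows in grouped.items():
        features[builder_id] = {
            "mean_usage_ratio": mean(r.executed_amount / r.ordered_amount for r in rows),
            "mean_delay_days": mean((r.execution_date - r.order_date).days for r in rows),
        }
    return features


# Made-up history, for illustration only.
history = [
    MaterialRecord("builder_a", 3.0, 2.5, date(2021, 1, 4), date(2021, 2, 3)),
    MaterialRecord("builder_a", 2.0, 2.0, date(2021, 3, 1), date(2021, 3, 21)),
    MaterialRecord("builder_b", 4.0, 5.0, date(2021, 1, 10), date(2021, 1, 20)),
]
features = builder_history_features(history)
```

A ratio below 1 indicates a builder that tends to over-order, and a ratio above 1 one that tends to under-order. In a real training set, these statistics should only use records that were already executed before the order being scored, so that the features do not leak the target.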
Modeling
For simplification purposes, let’s assume we only want to predict a single material, e.g. the beams needed for a single unit in the construction.
We need to determine:
Required number of beams
Time until the order is executed, after it’s ordered with some initial characteristics
Option 1 – Multitask Regression Model
In this initial approach, we take the features at order time and try to predict the number of beams needed, and the number of days between order and execution. This is done using a multitask model, with two regression tasks.
The advantages of this approach are that it is easy to set up and its targets are easy to interpret; training the two tasks together may also make the model more robust and might increase performance. However, there are several disadvantages:
There’s a set of defined templates for the beams (SKU – Stock Keeping Unit) and the model might be predicting beam configurations that do not exist!
Hard decision process: There’s no way to measure prediction confidence when all you have is a value.
Difficult convergence: the domain of possible values is very large, and it’s not easy to tell if a prediction is good or not.
Option 2 – Multitask Classification
We can alternatively build a multitask classification model, where we consider two tasks:
Whether or not the execution beam in our hypothesis matched a certain beam in stock. This means we will have to create artificial samples in our dataset: 1 positive row and N negative rows, where N is the number of other possible beams.
The probability of the number of days between today’s date and the execution date being less than N weeks. This will require generating random dates between the order date and the execution date. The value of N is determined according to the needs of production and transportation to storage places by our client.
The table below shows an example of what this artificial sampling would look like:
Sampling Date: Randomly sampled dates between order_date and execution_date
Execution Beam Width (hypothesis): The comparison we’re performing. These are the values of beams that are in stock.
Execution Beam Width: What really happened. We use the comparison to “Execution Beam Width (hypothesis)” as a target.
Rows where the target is positive are shown in blue, and rows where it is negative in orange. For instance, a sampling date of 25/2/2021 is close enough to our execution date to consider it a positive target for prediction, while 20/2/2021 is not. In terms of execution beams, the target is positive when the hypothesis and the execution match.
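The artificial sampling described above could be generated along these lines. This is a sketch under stated assumptions: the function name `build_training_rows`, the column names, and the example dates and widths are hypothetical and do not reproduce the article's table.

```python
import random
from datetime import date, timedelta


def build_training_rows(order_date, execution_date, executed_width,
                        stock_widths, horizon_weeks, n_dates, seed=0):
    """Expand one historical order into rows for the two classification tasks."""
    rng = random.Random(seed)  # seeded so the sampling is reproducible
    span_days = (execution_date - order_date).days
    rows = []
    for _ in range(n_dates):
        # Random "today" between order date and execution date (inclusive).
        sampling_date = order_date + timedelta(days=rng.randint(0, span_days))
        days_left = (execution_date - sampling_date).days
        # One row per beam in stock: 1 positive (the executed beam), the rest negative.
        for hypothesis in stock_widths:
            rows.append({
                "sampling_date": sampling_date,
                "hypothesis_width": hypothesis,
                "executed_width": executed_width,
                "target_match": int(hypothesis == executed_width),
                "target_within_horizon": int(days_left < horizon_weeks * 7),
            })
    return rows


# Illustrative order: widths in inches, horizon of 1 week.
rows = build_training_rows(
    order_date=date(2021, 1, 4),
    execution_date=date(2021, 3, 1),
    executed_width=3.0,
    stock_widths=[2.0, 3.0, 4.0],
    horizon_weeks=1,
    n_dates=5,
)
```

Each sampled date yields one row per stocked beam, so the number of rows grows with both the number of sampled dates and the size of the catalogue. The negative class will dominate the match task, which is worth keeping in mind when choosing metrics and class weights.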
Model Architecture
Regarding model architecture, we can build a two-stream model: we separate the features belonging to the delay between order and execution and the difference between ordered and executed material, since we’ll have multiple rows with similar features, and this tells the model to treat them differently in an explicit way.
The proposed architecture is relatively simple: each stream is a set of dense and dropout layers, and the streams are then merged by an aggregation operation (e.g. concatenation). Afterward, another set of dense/dropout layers transforms this concatenated latent space. At the output, two different softmax layers, one for each task, are added.
Compared to Option 1, this architecture has the advantage of allowing a decision process based on prediction confidence and only predicting items that the client is able to produce. However, the process complexity is higher: you need to create positive and negative training samples, and it is harder to set up.
We can also add custom penalties to our loss function according to the business problem. If we predict 30 beams for a building that needs 20, that is acceptable; if we predict fewer than it needs, the stock will not be sufficient, so under-prediction should cost more than over-prediction. Likewise, when the model predicts a material that is not the same but is compatible, we can penalize it less than when it predicts an incompatible one.
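As a rough illustration of such penalties, the two rules can be written as plain cost functions. The weights, the SKU names, and the `compatible_pairs` set are assumptions made up for this sketch; in practice they would come from the business, and the costs could be used as per-sample weights or folded into a custom loss.

```python
def quantity_penalty(predicted_units, needed_units, under_weight=3.0, over_weight=1.0):
    """Cost of a quantity error; shortfalls weigh more than surplus (illustrative weights)."""
    diff = predicted_units - needed_units
    if diff >= 0:
        return over_weight * diff       # surplus: extra stock, mild penalty
    return under_weight * (-diff)       # shortfall: construction is delayed


def material_penalty(predicted_sku, actual_sku, compatible_pairs,
                     compatible_cost=0.3, incompatible_cost=1.0):
    """Cost of predicting the wrong material; compatible substitutes cost less."""
    if predicted_sku == actual_sku:
        return 0.0
    if (predicted_sku, actual_sku) in compatible_pairs:
        return compatible_cost
    return incompatible_cost


# Hypothetical compatibility table: (predicted, actual) pairs that can substitute.
compatible_pairs = {("beam_4in", "beam_3in")}

surplus_cost = quantity_penalty(30, 20)     # predicted 30, needed 20
shortfall_cost = quantity_penalty(10, 20)   # predicted 10, needed 20
substitute_cost = material_penalty("beam_4in", "beam_3in", compatible_pairs)
wrong_cost = material_penalty("beam_2in", "beam_3in", compatible_pairs)
```

With these weights, missing 10 beams costs three times as much as stocking 10 too many, which mirrors the asymmetry described above.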
Decision Process
Building a multitask classification model allows us to create a decision process based on the expected value. Namely, a probability P of needing K units of a product corresponds to an expected value of P x K units.
To know which materials to keep in stock for the next N weeks, the expected value for all constructions that are ordered and not yet executed can be summed.
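The summation can be sketched as follows. The field names and the example orders are hypothetical, and the sketch assumes the probability of needing a beam within the horizon is the product of the two task outputs, which treats them as independent; the article does not specify how the two probabilities are combined.

```python
from collections import defaultdict


def expected_units_per_sku(predictions):
    """Sum P x K over all open (ordered, not yet executed) constructions, per SKU."""
    totals = defaultdict(float)
    for p in predictions:
        # Assumption: the two task probabilities are independent, so they multiply.
        p_needed = p["p_match"] * p["p_within_horizon"]
        totals[p["sku"]] += p_needed * p["units"]
    return dict(totals)


# Made-up model outputs for two open constructions.
open_orders = [
    {"construction": "site_1", "sku": "beam_3in", "units": 20,
     "p_match": 0.8, "p_within_horizon": 0.5},
    {"construction": "site_1", "sku": "beam_4in", "units": 20,
     "p_match": 0.2, "p_within_horizon": 0.5},
    {"construction": "site_2", "sku": "beam_3in", "units": 10,
     "p_match": 0.6, "p_within_horizon": 1.0},
]
stock_plan = expected_units_per_sku(open_orders)
```

The resulting per-SKU totals are expected values, so they are a starting point for the stock decision rather than a guarantee; a safety margin can be added where a shortfall is costlier than a surplus.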
Impact Measurement
What metrics would be important to measure?
Internally, for building and evaluating your model, you can use Machine Learning metrics:
Classification: PR AUC, ROC AUC, …
Regression: Mean Absolute Error, Mean Squared Error…
But this tells nothing about how good the model is at predicting the amount you need to stock. You need to measure business metrics as well:
How many products did we predict/produce in excess, because no one purchased them?
How many products did we fail to predict/produce on time, leading to an extra delay in the construction?
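These two business metrics can be computed by comparing what was stocked with what was actually requested over the same period. The helper name `stock_outcome` and the example quantities are hypothetical, chosen only to illustrate the calculation.

```python
def stock_outcome(stocked, demanded):
    """Compare stocked vs. demanded units per SKU and total the excess and shortfall."""
    skus = set(stocked) | set(demanded)
    excess = sum(max(stocked.get(s, 0) - demanded.get(s, 0), 0) for s in skus)
    shortfall = sum(max(demanded.get(s, 0) - stocked.get(s, 0), 0) for s in skus)
    return {"excess_units": excess, "shortfall_units": shortfall}


# Illustrative period: units stocked from the forecast vs. units actually requested.
stocked = {"beam_3in": 14, "beam_4in": 2}
demanded = {"beam_3in": 20, "beam_2in": 1}
outcome = stock_outcome(stocked, demanded)
```

Tracking the two totals separately matters because they have different costs: excess units occupy storage and may degrade, while shortfall units delay a construction.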
Conclusion
This article has shown some different ways you can think about product forecasting problems, where there are a lot of products with similar characteristics.
We only cover the specific case of forecasting a single product type (beams) with different characteristics. However, this could be generalized for different products – such as the amount of cement needed – by adapting the model. Since there are no “cement SKUs”, and any amount predicted is valid, you can replace the sigmoid classification with a linear layer, and create a regression model together with binary classification for the time delay.