When we decide to buy or rent a property (an apartment, a room, a house, etc.), price is one of the most important search criteria. It depends mostly on the property's characteristics, such as location, year of construction, number of rooms, area, and central heating.
However, two properties with the same characteristics can be sold at very different prices, and there are deeper reasons for that difference. The urgency of the seller or buyer to close the deal, the market context, the real estate agency managing the deal, and the time of year all contribute to it.
Thus, it can be particularly challenging to determine the real selling price of a given property. The listing prices on real estate websites can give a misleading idea of a property's true value, especially because listings tend to overestimate the realistic value for selling purposes. As a result, we may end up buying or renting a place for far more than it is realistically worth.
As such, we will explore an approach to determine the real selling price of a property by taking into account the different aspects that are relevant when making an offer.
Investing in real estate can mean purchasing or renting a house, an apartment, or a room, for either private or commercial use. Here we will assume the scenario of purchasing an apartment for private use, although the same considerations apply in all these contexts.
Data
Besides the property characteristics, other factors may influence the selling price. We should therefore look at other types of indicators and data when making an evaluation, namely:
Demographics and geo-spatial data: a recent population boost in a given neighborhood may indicate higher demand for that area and, therefore, higher prices. As such, population growth in and around the neighborhood may be an indicator of the property price. The neighborhood's infrastructure, such as the number and variety of stores, malls, public transportation, and schools, may also help adjust the price.
Unstructured data: pictures, descriptions, and opinions, available as images and text, can help us capture the condition of the property, which can affect its price.
Market behavior: current market conditions also affect selling prices. The demand for similar houses and the number of them on sale affect a house's value, following the law of supply and demand. Thus, the average price of similar houses bought recently can be a good indication of the price. Additionally, the length of time a property has been on the market, compared to the average for similar properties, may raise a red flag regarding the current price.
Economic indicators: economic factors also influence house pricing. For example, an increase in the employment rate or in wage growth can lead to an increase in property listing prices. Changes in interest rates or financial incentives can also affect how attractive it is to buy a property on credit.
Selling aspects: the urgency of the sale, the payment conditions, and the expertise of the selling agency are other aspects that may move the selling price away from the property's real value.
In terms of the data available, we can assume we know the apartment characteristics (e.g., number of rooms, location, area, energy efficiency) and some indicators, such as the amount of infrastructure nearby, the employment rate, the average price of similar houses, the urgency of the sale, and images and texts describing the apartments.
Furthermore, in terms of pricing, we will assume that we know the listing price of all apartments and the selling price of only some of them (e.g., the selling prices of deals made by a single real estate agency). This information can be structured as a table with one row per apartment, holding its features, its listing price and, where known, its selling price.
For clarification, the selling price is the value at which a given property is actually sold, and the listing price is the price at which it was first listed on the market.
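As a minimal sketch of that structure, the records could look like the following. The field names and all values are illustrative assumptions, not a schema or data from a real source; the only point is that the listing price is always present while the selling price is optional.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Apartment:
    # Hypothetical fields chosen for illustration only.
    rooms: int
    area_m2: float
    listing_price: float            # assumed known for every apartment
    selling_price: Optional[float]  # assumed known only for some deals


# Made-up example rows.
apartments = [
    Apartment(rooms=2, area_m2=70.0, listing_price=250_000.0, selling_price=235_000.0),
    Apartment(rooms=3, area_m2=95.0, listing_price=340_000.0, selling_price=None),
    Apartment(rooms=1, area_m2=45.0, listing_price=180_000.0, selling_price=None),
]

# Split into the labeled subset (selling price known) and the unlabeled rest.
labeled = [a for a in apartments if a.selling_price is not None]
unlabeled = [a for a in apartments if a.selling_price is None]
```

In practice the feature set would also carry the demographic, market, and economic indicators listed above.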
Price Prediction
Predicting the selling price poses several challenges, namely the following two:
The real selling price is often missing from our dataset, as it is not always available.
The listing price may help us predict the selling price, but the distribution of the ratio between the listing price and the selling price is not known in advance.
Semi-supervised approach
As the selling price is only available for a small set of samples, a fully supervised approach is not suitable.
A first option is a semi-supervised approach, with the goal of predicting the selling price based on the few labeled samples, as follows:
F(apt features) -> selling price
Here, apt features includes, besides the apartment characteristics, all the aspects previously described, such as demographics and geo-spatial data, market behavior, and economic indicators. The text or image data could be encoded so that it can be used in a tabular format.
There are different semi-supervised techniques we could explore for modeling (transductive methods, inductive methods, wrapper methods, etc.).
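To make the wrapper idea concrete, here is a small self-training sketch. It is only one possible instance, under several assumptions: a single feature (area), a k-nearest-neighbors average as the base learner, and distance to the labeled set as a crude confidence proxy. The function names and the data are hypothetical.

```python
def knn_predict(x, labeled, k=2):
    """Average the targets of the k labeled points nearest to x (1-D feature)."""
    nearest = sorted(labeled, key=lambda point: abs(point[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def self_train(labeled, unlabeled, k=2):
    """Self-training: repeatedly pseudo-label the unlabeled point that is
    closest to the current training set, then add it to that set."""
    labeled = list(labeled)
    remaining = list(unlabeled)
    while remaining:
        # Distance to the nearest training point acts as a confidence proxy.
        x = min(remaining, key=lambda u: min(abs(u - lx) for lx, _ in labeled))
        labeled.append((x, knn_predict(x, labeled, k)))
        remaining.remove(x)
    return labeled


# Made-up data: (area in m2, selling price) for the few labeled apartments,
# and areas only for the apartments without a known selling price.
labeled = [(50.0, 200_000.0), (100.0, 400_000.0)]
unlabeled = [60.0, 90.0, 75.0]

train_set = self_train(labeled, unlabeled)
prediction = knn_predict(80.0, train_set)
```

A real setup would use the full feature vector and a stronger base model; the loop structure is what carries over.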
However, this approach would be biased towards the agency from which we gathered the real selling prices. Furthermore, we would not be explicitly taking advantage of the listing price, which is available for every apartment and can be used as a weak label.
As such, another approach is to treat the listing price as a weak label and use it to predict the selling price. To make a direct mapping, we would need to determine the distribution of the difference between the real selling price and the listing price.
Thus, we can combine semi-supervised learning and weakly supervised learning in order to:
Adapt our approach to the fact that we have few labeled samples (semi-supervised approach)
Use a noisy, weak label, the listing price, as a starting point to compute the real selling price (weakly supervised approach)
To achieve that, we will customize a loss function that takes these challenges into consideration.
Generically, we can model our problem as follows:
F(apt features, listing price) -> selling price
Again, apt features consists of all the aspects mentioned before, not only the apartment characteristics.
Distribution of the ratio between selling price and listing price
We will determine the relationship between the listing price and selling price by calculating the distribution of the ratio between them.
Figure: a possible example of the price ratio distribution.
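As a rough illustration of how such a distribution could be estimated, the sketch below builds a normalized histogram of ratios from the samples with a known selling price. It assumes the ratio is defined as selling price divided by listing price (the text does not fix the direction); the bin edges and prices are made up.

```python
def ratio_histogram(selling_prices, listing_prices, bin_edges):
    """Normalized histogram of selling/listing ratios.

    Bins are half-open [low, high); ratios outside the edges are ignored.
    """
    counts = [0] * (len(bin_edges) - 1)
    for sold, listed in zip(selling_prices, listing_prices):
        ratio = sold / listed
        for i in range(len(counts)):
            if bin_edges[i] <= ratio < bin_edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


# Made-up labeled samples.
selling = [235_000, 300_000, 188_000, 410_000]
listing = [250_000, 310_000, 200_000, 400_000]
edges = [0.80, 0.90, 0.95, 1.00, 1.10]

hist = ratio_histogram(selling, listing, edges)
```

Each entry of `hist` is the share of labeled deals whose ratio falls in the corresponding bin.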
Loss function/Optimization
The loss function is customized to compare the price ratio distribution obtained from the model predictions with the real price ratio distribution (computed from the known selling prices), combined with an evaluation of the selling price predictions themselves.
To achieve this, we can use the Kullback-Leibler divergence, which quantifies the difference between two probability distributions using the following formula:
$$D_{KL}(p \,\|\, q) = \sum_{x} p(x) \log \frac{p(x)}{q(x)}$$
Where p and q correspond to the two probability distributions to be compared.
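For discrete distributions such as binned price ratios, the divergence can be sketched as below. The small `eps` smoothing is an assumption added here so that empty bins do not cause a division by zero or a logarithm of zero; it is not part of the definition.

```python
import math


def kl_divergence(p, q, eps=1e-9):
    """Discrete Kullback-Leibler divergence D_KL(p || q) for two histograms."""
    # Smooth and renormalize both inputs (assumption: avoids log(0) and x/0).
    p = [pi + eps for pi in p]
    q = [qi + eps for qi in q]
    sum_p, sum_q = sum(p), sum(q)
    return sum(
        (pi / sum_p) * math.log((pi / sum_p) / (qi / sum_q))
        for pi, qi in zip(p, q)
    )


# Made-up histograms over the same four bins.
p = [0.0, 0.5, 0.25, 0.25]
q = [0.25, 0.25, 0.25, 0.25]

same = kl_divergence(p, p)
different = kl_divergence(p, q)
```

Note that the divergence is not symmetric, so `kl_divergence(p, q)` and `kl_divergence(q, p)` generally differ.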
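To evaluate the selling price predictions, we can use the Mean Absolute Error (MAE):
$$\mathrm{MAE} = \frac{1}{n} \sum_{i=1}^{n} |y_i - x_i|$$
Where x represents the selling price predictions, y represents the real selling prices, and n is the number of samples with a known selling price.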
Thus, our loss function combines the two terms: the Kullback-Leibler divergence between r_p and r_g, and the MAE between selling_price_predicted and selling_price_real.
Here, r_p refers to the price ratio distribution obtained from the model's selling price predictions, and r_g refers to the real price ratio distribution, computed from the samples for which we know the real selling price. selling_price_predicted represents the selling prices predicted by the model and selling_price_real represents the real selling prices.
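One way to read this combination is a weighted sum of the two terms. The sketch below makes several assumptions that the text does not state: the argument order KL(r_p || r_g), a ratio defined as selling price over listing price, histogram binning of the ratios, and a hypothetical `kl_weight` to balance the two scales. All names and numbers are illustrative.

```python
import math


def histogram(values, bin_edges):
    """Normalized histogram with half-open bins; out-of-range values are ignored."""
    counts = [0] * (len(bin_edges) - 1)
    for v in values:
        for i in range(len(counts)):
            if bin_edges[i] <= v < bin_edges[i + 1]:
                counts[i] += 1
                break
    total = sum(counts)
    return [c / total for c in counts] if total else [0.0] * len(counts)


def kl_divergence(p, q, eps=1e-9):
    """Discrete KL divergence with eps smoothing (an added assumption)."""
    p = [pi + eps for pi in p]
    q = [qi + eps for qi in q]
    sp, sq = sum(p), sum(q)
    return sum((pi / sp) * math.log((pi / sp) / (qi / sq)) for pi, qi in zip(p, q))


def mean_absolute_error(predicted, real):
    return sum(abs(x - y) for x, y in zip(predicted, real)) / len(predicted)


def combined_loss(pred_all, listing_all, pred_labeled, real_labeled,
                  listing_labeled, bin_edges, kl_weight=1.0):
    # r_p: ratio distribution from the model predictions on all apartments.
    r_p = histogram([p / l for p, l in zip(pred_all, listing_all)], bin_edges)
    # r_g: ratio distribution from the samples with a known selling price.
    r_g = histogram([s / l for s, l in zip(real_labeled, listing_labeled)], bin_edges)
    return (kl_weight * kl_divergence(r_p, r_g)
            + mean_absolute_error(pred_labeled, real_labeled))


# Made-up data: four apartments, the first two with a known selling price.
edges = [0.80, 0.90, 0.95, 1.00, 1.10]
listing_all = [250_000, 310_000, 200_000, 400_000]
real_labeled = [235_000, 300_000]
listing_labeled = listing_all[:2]

good_pred = [235_000, 300_000, 188_000, 387_000]
bad_pred = [260_000, 330_000, 215_000, 430_000]

loss_good = combined_loss(good_pred, listing_all, good_pred[:2],
                          real_labeled, listing_labeled, edges)
loss_bad = combined_loss(bad_pred, listing_all, bad_pred[:2],
                         real_labeled, listing_labeled, edges)
```

The MAE is expressed in currency units while the divergence is a small unitless number, so some rescaling such as `kl_weight` is likely needed in practice. A hard histogram is also not differentiable, so gradient-based training would need a smooth approximation of the binning step.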
Purchasing a property can have a major impact on our financial life. Therefore, we should make an extra effort to get the best deal in terms of value and quality versus price.
This post discusses an approach for determining the real selling price based on the different factors considered relevant. Many aspects influence a property's value, and even more determine its selling price. Thus, we started with an overview of the aspects that may influence the selling price, including market behavior, demographics and geo-spatial data, unstructured data (reviews, pictures, and descriptions), and economic indicators.
Based on the data that is normally available online, we described an approach that combines weakly supervised and semi-supervised learning with a customized loss function that focuses on learning the real price ratio distribution, i.e., the ratio between the listing price and the selling price.
This can be a realistic approach for predicting the real selling price. As usual, if you have any comments or ideas about Automated Valuation Models for Real Estate, make sure to reach out to us!