Ditch the Crystal Ball: Reverse-Engineering with Machine Learning

In the fast-paced world of business, staying ahead often means relying on external market indicators. However, the conventional approach of buying them from third-party providers comes with drawbacks such as increased costs, limited customization, and dependence on external solutions.


Machine Learning models are estimators – which means they can be used not only to predict unknowns in your business but also to reverse-engineer complex business processes.

In this blog post, you will learn how to identify these potential points of improvement, prioritize them, and create models to estimate them.


Identification

How do you figure out your current dependencies on external market indicators? Start by listing all the third-party services you use to obtain external indicators. For each one, record the volume of queries, the associated cost, a qualitative score of how much value it generates for you, and the business processes that depend on it. This will help you determine which services are most relevant for your business.

Some examples include:

  • Real Estate and Construction: listing prices per area segment, points of interest
  • Retail: item depreciation per segment (e.g. car per year, make, and model; mobile phone per make)
  • FinTech: expected growth per company
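
The inventory described above can be kept as a small table and ranked. The following is a minimal sketch in Python, assuming a hypothetical `ExternalService` record and made-up service names and figures; ranking by value score first and annual cost second is one possible choice, not a prescribed rule.

```python
from dataclasses import dataclass


@dataclass
class ExternalService:
    """One third-party source of market indicators (hypothetical record)."""
    name: str
    monthly_queries: int
    monthly_cost: float            # what you pay the provider per month
    value_score: int               # qualitative score, 1 (low) to 5 (high)
    dependent_processes: list[str]

    @property
    def annual_cost(self) -> float:
        return self.monthly_cost * 12


def rank_by_relevance(services):
    """Highest value score first; ties broken by higher annual cost."""
    return sorted(
        services,
        key=lambda s: (s.value_score, s.annual_cost),
        reverse=True,
    )


# Illustrative entries only; names and numbers are invented.
inventory = [
    ExternalService("depreciation-index", 5_000, 500.0, 3, ["trade-in pricing"]),
    ExternalService("listing-price-feed", 20_000, 1_000.0, 5, ["property valuation"]),
    ExternalService("growth-estimates", 1_000, 450.0, 3, ["credit scoring", "lead scoring"]),
]

ranked = rank_by_relevance(inventory)
for service in ranked:
    print(
        f"{service.name}: value={service.value_score}, "
        f"annual cost={service.annual_cost:.2f}, "
        f"processes={', '.join(service.dependent_processes)}"
    )
```

The services at the top of the ranking are the first candidates to examine in the next step.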

Data Prioritization

Your next objective is to understand how you can stop depending on these third parties, and which ones are easiest to replace. To do so, identify for each process the key features and variables that influence its outcome. Then list all the data sources you have available internally, the ones you can acquire, and the ones you can purchase (and at what cost).

Some examples of data sources that can be used to estimate the indicators listed above are:

  • Real Estate and Construction: real estate listings pages, open geographical data (OpenStreetMap)
  • Retail: listings pages, auctions, classified advertisements (e.g. Craigslist) …
  • FinTech: social media, news, LinkedIn job posts

We dive deeper into the identification and prioritization of data sources as part of our Data Ignite methodology.


Determine the opportunity size

Afterward, weigh the trade-offs between continuing to use a third-party solution and building your own custom one. Here, you’ll want to understand the realistic monetary gain you’ll get from your custom-built solution within a fixed time frame, assuming you can only replicate the original’s performance up to a certain degree.

To ensure an efficient and cost-effective transition, we recommend spending up to 10% of that value on a Proof of Concept. This will allow you to check the viability of the idea in a controlled environment. You should also prioritize the list of ideas to tackle by their expected value, technical feasibility, ease of business integration, and risk of failure. If you’re wondering how to do so, we have the perfect tool to help you. Book a meeting with us to further discuss this process!
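
To make the sizing and prioritization concrete, here is a rough sketch in Python. The value formula, the multiplicative priority score, and every figure below are assumptions made for illustration; only the 10% cap on the Proof of Concept budget comes from the recommendation above.

```python
def opportunity_value(annual_provider_cost, annual_value_at_stake,
                      replicated_share, annual_run_cost, years=1.0):
    """Net gain of replacing a provider over a fixed time frame (assumed model).

    replicated_share is the fraction (0 to 1) of the provider's business
    value that the custom solution is expected to preserve.
    """
    if not 0.0 <= replicated_share <= 1.0:
        raise ValueError("replicated_share must be between 0 and 1")
    savings = annual_provider_cost * years
    value_lost = annual_value_at_stake * (1.0 - replicated_share) * years
    running = annual_run_cost * years
    return savings - value_lost - running


def poc_budget(value, share=0.10):
    """Upper bound for the Proof of Concept: at most `share` of the value."""
    return max(0.0, value) * share


def priority_score(expected_value, feasibility, integration_ease, failure_risk):
    """Assumed scoring rule: value discounted by three factors in [0, 1]."""
    return expected_value * feasibility * integration_ease * (1.0 - failure_risk)


# Hypothetical ideas with invented estimates.
ideas = [
    {"name": "growth estimates", "expected_value": 80_000,
     "feasibility": 0.4, "integration_ease": 0.5, "failure_risk": 0.5},
    {"name": "listing prices", "expected_value": 50_000,
     "feasibility": 0.8, "integration_ease": 0.9, "failure_risk": 0.2},
    {"name": "item depreciation", "expected_value": 30_000,
     "feasibility": 0.9, "integration_ease": 0.9, "failure_risk": 0.1},
]

ranked_ideas = sorted(
    ideas,
    key=lambda i: priority_score(i["expected_value"], i["feasibility"],
                                 i["integration_ease"], i["failure_risk"]),
    reverse=True,
)

for idea in ranked_ideas:
    print(f'{idea["name"]}: PoC budget up to {poc_budget(idea["expected_value"]):.0f}')
```

Note how the idea with the largest expected value can still rank last once feasibility and risk are taken into account.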

Collect data and train a model

At this point, you should start collecting or purchasing the data that you need to replicate the external process, as well as the associated market indicators. To develop the Proof of Concept, focus first on the data that is most easily available and most impactful.

When you have the data available, use Machine Learning to train a model based on the collected, purchased, and/or internal data. This model learns the patterns and relationships within the data, effectively replicating the observed business process.
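
As a toy illustration of this step, the sketch below fits a one-feature least-squares line, a deliberately simple stand-in for whatever Machine Learning model you would actually use. The data is synthetic and generated inside the script: the "floor area" feature and the "provider indicator" target are hypothetical placeholders for your collected data and the third-party values you want to replicate.

```python
import random
from statistics import mean

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Synthetic stand-in data: a feature you collected yourself (floor area in m2)
# and the provider's indicator you want to replicate (a price estimate).
areas = [rng.uniform(40, 160) for _ in range(200)]
indicator = [1500 * a + 20_000 + rng.gauss(0, 15_000) for a in areas]

# Hold out the last 50 rows to evaluate on data the model has not seen.
split = 150
train_x, test_x = areas[:split], areas[split:]
train_y, test_y = indicator[:split], indicator[split:]


def fit_line(xs, ys):
    """Ordinary least squares for a single feature: returns (slope, intercept)."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


slope, intercept = fit_line(train_x, train_y)


def predict(x):
    return slope * x + intercept


# Mean absolute error on the held-out rows: a technical metric of how closely
# the model reproduces the provider's indicator.
mae = mean(abs(predict(x) - y) for x, y in zip(test_x, test_y))
print(f"slope={slope:.1f}, intercept={intercept:.1f}, held-out MAE={mae:.1f}")
```

In practice you would use more features and a richer model, but the workflow stays the same: fit on one part of the data and measure the error on the rest.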

Run a business simulation

Once you have your first model, translate its performance into a business metric. Technical metrics, which we call “Side-kicks”, are good for estimating how good the replacement model is, but it’s important to translate them into a KPI that stakeholders can understand: our “Hero”.

You have built an approximator of your external market indicators which, without a doubt, won’t be as accurate as the original values. What you need to consider at this point is whether your error is already low enough to yield a positive outcome, or whether you need to iterate further. You can make a profit even with a mediocre model, provided the cost of its errors is lower than what you save by dropping the provider.
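
A minimal sketch of such a simulation follows, assuming a made-up profit rule: each transaction earns a margin on the true value, loses money in proportion to the estimation error, and pays a per-query fee when the provider is used. All rates, fees, and values are hypothetical.

```python
from statistics import mean


def mean_absolute_error(true_values, estimates):
    """Technical metric: average size of the estimation error."""
    return mean(abs(e - t) for t, e in zip(true_values, estimates))


def simulated_profit(true_values, estimates, margin_rate,
                     error_penalty_rate, fee_per_query):
    """Business KPI under an assumed profit rule (see lead-in)."""
    profit = 0.0
    for t, e in zip(true_values, estimates):
        profit += margin_rate * t                   # margin earned on the deal
        profit -= error_penalty_rate * abs(e - t)   # cost of a wrong estimate
        profit -= fee_per_query                     # paid only to the provider
    return profit


# Invented example: the provider is more accurate but charges per query.
true_values = [100_000, 150_000, 200_000]
provider_estimates = [101_000, 149_000, 202_000]
model_estimates = [105_000, 144_000, 208_000]

provider_profit = simulated_profit(true_values, provider_estimates,
                                   margin_rate=0.02, error_penalty_rate=0.10,
                                   fee_per_query=600)
model_profit = simulated_profit(true_values, model_estimates,
                                margin_rate=0.02, error_penalty_rate=0.10,
                                fee_per_query=0)

print(f"provider: MAE={mean_absolute_error(true_values, provider_estimates):.0f}, "
      f"profit={provider_profit:.0f}")
print(f"model:    MAE={mean_absolute_error(true_values, model_estimates):.0f}, "
      f"profit={model_profit:.0f}")
```

With these invented numbers the in-house model has a much larger error yet a higher simulated profit, because it avoids the per-query fee. With a steeper error penalty the conclusion would flip, which is exactly what the simulation is meant to reveal.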


Replace the original process

At this point, you can replace the original business process. Monitor the KPIs that depend on that external market indicator, and keep a fallback solution ready in case the business simulation and the real-life results turn out not to be aligned.

You will no longer depend on a third-party provider to give you the data that you were using to run your business – that’s already a great step forward!
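
The monitoring and fallback described in this section could look like the sketch below. The function name `choose_source`, the 10% tolerance, and the weekly KPI figures are all assumptions for illustration; pick a threshold that matches how much drift your business can absorb.

```python
def choose_source(simulated_kpi, observed_kpi, tolerance=0.10):
    """Keep the in-house model while the observed KPI stays close to the
    simulated one; otherwise switch to the fallback solution."""
    if simulated_kpi == 0:
        raise ValueError("simulated_kpi must be non-zero")
    deviation = abs(observed_kpi - simulated_kpi) / abs(simulated_kpi)
    return "model" if deviation <= tolerance else "fallback"


# Hypothetical weekly KPI readings compared against the simulated value.
simulated = 7_100
weekly_observed = [7_000, 6_900, 5_200]

decisions = [choose_source(simulated, kpi) for kpi in weekly_observed]
for week, (kpi, decision) in enumerate(zip(weekly_observed, decisions), start=1):
    print(f"week {week}: observed={kpi}, use={decision}")
```

The fallback can be the original provider kept on a reduced plan, or a previous version of the model.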

Summing up

In this blog post, we covered the steps to reverse-engineer market indicators: identifying potential points of improvement, prioritizing them by opportunity size and effort, creating models to estimate them, and determining whether the models are good enough to be replacements.

If a high dependency on external market indicators is something you’ve been facing in your business, contact me and I can help you discover how to tackle it.

Do you want to further discuss this idea?

Book a meeting with Paulo Maia

