Category: Technical

Classifying text using LLMs

  Text classification is one of the most common use cases in Natural Language Processing, with numerous practical applications – now easier to access with Large Language Models. Companies use text classification in multiple scenarios to become more efficient: Tagging large volumes of data: reducing manual labor with better filtering, automatically organizing large volumes of […]

Written on Aug 29, 2023

Making Money with Mediocre AI Models: A Guide for Business Stakeholders

In the world of AI, it’s easy to assume that only the most accurate models can bring value to your business. However, this is far from the truth. In fact, even mediocre models can be transformed into money-making machines with the right strategies. In this article, we’ll explore three real-life examples of how we turned […]

Written on Aug 15, 2023

Spatial Explanations: Unlocking Insights with Occlusions

Spatial Explanations with Occlusions: In computer vision, businesses must grasp the workings of image models to fully leverage visual data. Our simple method, called spatial explanations with occlusions, helps achieve a deeper understanding. By occluding different regions of an image, this technique reveals the areas that most influence the model’s predictions. What to do with these […]

Written on Aug 2, 2023

Protect your AI Model from attackers!

Machine learning models can achieve amazing results on the tasks they were designed for. They can also perform catastrophically if the data we feed the model does not match the data used to train it. This can be exploited as an adversarial attack on our model. Adversarial attacks are a common and growing problem […]

Written on Jul 10, 2023

In medio stat virtus? Not always!

The Problem: What do you do when the model is underperforming? When the model’s performance does not meet our expectations, we usually spend time searching for the flaws, selecting and analyzing the cases where it failed to understand why it happened. Then, we try to apply more robust solutions, train, test, and repeat. In some […]

Written on Apr 10, 2023

Increasing Efficiency with Active Learning

The problem: Labeling data is boring (and expensive). So there you are. You have collected your data, analyzed it, processed it, and built your sophisticated model architecture. After many hours of training and evaluating, you have come to a very unpleasant conclusion: you need more data. Before you readjust your budget to fit the extra […]

Written on Mar 3, 2023

How to deal with the annoying implications of changing data sources

Let’s discuss a common scenario in AI consulting. The client provides access to data sources in formats such as CSVs or databases that aren’t in a production environment. Why? Usually, they’re exploring the value of the project, do not want to disclose too much data, and want to prevent technical problems from happening at the […]

Written on Nov 20, 2022

Stop removing outliers just because!

Outliers are data points that stand out for being different from the remaining data distribution. An outlier can be: an odd value in a feature; a data point distant from the centroid of the data; or a data point in a region of low density, but between areas of high density. Suppose you have been working […]

Written on Nov 14, 2022

Duplicate detection in text data

A common use case seen across several industries is the creation of systems capable of detecting the similarity between pairs of objects – images and texts. For example, duplicate detection in marketplaces, or recommendation systems that show similar objects to the ones the user has searched for, can use such systems. They can also be […]

Written on Oct 25, 2022

Turning classes into inputs

Let’s face it, we have all worked on an ML project where we had to predict a ridiculously high number of classes, large enough to leave an embarrassingly small number of observations per class. Most people model these tasks as a multiclass classification problem where, for each input observation, we must predict […]

Written on Sep 22, 2022