Let’s face it: we have all worked on an ML project where we had to predict a ridiculously high number of classes, high enough that each class is left with an embarrassingly small number of observations. Most people model these tasks as a multiclass classification problem where, for each input observation, we must predict the most likely class (or the class probabilities).
Examples of such tasks are predicting the model of a car, the species of an animal, the intent of a user in a chat, the SIC/NAICS code of a company, and the product in a marketplace picture, among many others.
These examples also share another trait: the number of classes is dynamic. For example, let’s say we are training a Computer Vision model to recognize the item in a photo for an autonomous retail store. Every day, new products are launched to the market. If you go with the traditional approach, you must retrain the model daily to keep up with the catalog.
This would make the model maintenance (and operations) go wild! You don’t want that!
Our recipe for cooking large Multiclass classification models
Our trick for this kind of model is converting classes into part of the question. So, instead of training a multiclass classification model that predicts:
What’s the class of the observation? – a categorical question
we ask the question:
Is this observation from a given category? – a yes or no question.
I like to call this trick flipping your model upside down, making the outcome part of the inputs.
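Technically, we transform our predictive model, which takes an observation x and returns a class,
f(x) → class
into a model that takes the observation together with a candidate class and returns a single probability:
g(x, class) → P(x belongs to class)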
Then, for any given observation, you just need to ask the question once per class and take the class with the highest probability.
Is there a new class? Don’t worry; just ask an additional question next time you need to generate a prediction. No re-training is required.
Disclaimer: this only holds as long as your initial set of classes is general enough for the model to extend to unseen ones. Otherwise, just re-train every now and then.
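To make the "ask once per class" step concrete, here is a minimal sketch. The names `predict_class` and `toy_score` and the product descriptions are made up for illustration; `toy_score` is only a stand-in for a trained yes/no model, not the model itself.

```python
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def predict_class(
    x: Vector,
    class_features: Dict[str, Vector],
    score: Callable[[Vector, Vector], float],
) -> str:
    """Ask the yes/no question once per class and return the best-scoring class."""
    return max(class_features, key=lambda name: score(x, class_features[name]))


def toy_score(x: Vector, c: Vector) -> float:
    # Stand-in for a trained binary model (an assumption for this sketch):
    # the score grows as the observation gets closer to the class description.
    dist2 = sum((a - b) ** 2 for a, b in zip(x, c))
    return 1.0 / (1.0 + dist2)


# Hypothetical class descriptions, e.g. (volume in litres, is_liquid).
classes = {"soda": [0.33, 1.0], "chips": [0.15, 0.0]}
first = predict_class([0.30, 0.9], classes, toy_score)

# A new class shows up: add one more candidate, no re-training involved.
classes["juice"] = [1.0, 0.5]
second = predict_class([0.95, 0.5], classes, toy_score)
print(first, second)
```

The only thing that changes when the catalog grows is the dictionary of candidates; the scoring function stays untouched.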
How can we encode classes as inputs? Multiclass classification as Binary classification
Let’s say you have features describing the classes. Then, you just need to encode the class as the set of features that describe it. For example, in the retail product recognition example, you can characterize the item by its category, brand, weight, size, color, description, ingredients, etc.
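One way to turn this into training data for the yes/no model is sketched below. Everything here is an assumption made for illustration: the function name `build_pairs`, the idea of sampling a few wrong classes as negatives for each observation, and the default of two negatives per positive.

```python
import random
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def build_pairs(
    observations: Sequence[Tuple[Vector, str]],
    class_features: Dict[str, Vector],
    negatives_per_positive: int = 2,
    seed: int = 0,
) -> List[Tuple[List[float], int]]:
    """Return (observation features + class features, label) rows.

    label is 1 when the class is the true one and 0 for sampled wrong classes.
    """
    rng = random.Random(seed)
    names = list(class_features)
    rows = []
    for x, y in observations:
        # Positive example: the observation paired with its own class.
        rows.append((list(x) + list(class_features[y]), 1))
        # Negative examples: the same observation paired with other classes.
        others = [n for n in names if n != y]
        k = min(negatives_per_positive, len(others))
        for n in rng.sample(others, k):
            rows.append((list(x) + list(class_features[n]), 0))
    return rows


# Hypothetical data: one feature per observation, two features per class.
features = {"a": [0.0, 1.0], "b": [1.0, 0.0], "c": [1.0, 1.0]}
rows = build_pairs([([1.0], "a"), ([2.0], "b")], features)
print(len(rows))
```

Any binary classifier can then be trained on these rows. How many negatives to sample is a tuning choice, not something the approach prescribes.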
However, it’s not so common to have features describing the classes. How would you describe a user’s intent on a chat? How would you describe a car model or an animal?
Yes, it would be possible. But my bet is that you won’t have access to such data.
What to do in such a situation?
A Card Up Your Sleeve
You must agree that you have the features of the entities belonging to each class, right? In that case, you can compute features that summarize the distribution of the observations in that class: statistics like the average, minimum, maximum, and variance of each feature. Now you have features describing the class. You’re welcome.
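A minimal sketch of that idea, using only the standard library. The function name `describe_classes` and the toy data are illustrative, and the population variance is used here so that classes with a single observation still get a value.

```python
from collections import defaultdict
from statistics import fmean, pvariance
from typing import Dict, List, Sequence, Tuple


def describe_classes(
    observations: Sequence[Tuple[Sequence[float], str]],
) -> Dict[str, List[float]]:
    """Describe each class by the mean, min, max and variance of every feature."""
    by_class = defaultdict(list)
    for x, y in observations:
        by_class[y].append(x)

    described = {}
    for y, rows in by_class.items():
        feats = []
        # zip(*rows) iterates over feature columns within the class.
        for column in zip(*rows):
            feats += [fmean(column), min(column), max(column), pvariance(column)]
        described[y] = feats
    return described


obs = [([1.0, 10.0], "a"), ([3.0, 10.0], "a"), ([5.0, 0.0], "b")]
class_description = describe_classes(obs)
print(class_description["a"])
```

Each class ends up with four numbers per original feature, which can then be fed to the model as the "class" part of the input.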
Hey Kelwin, but you know, aren’t features old-fashioned? We all work with deep learning nowadays and leave the model to learn its own features. I’m glad you asked, young grasshopper!
You can train a siamese neural network that answers the question:
Are these two observations from the same class?
Or, in more formal language, a model s(x1, x2) → P(x1 and x2 belong to the same class).
Now, you can ask that question for your new test observation against every training data point, aggregate the probabilities by class (e.g., maximum or average), and return the class with the highest score. Basically, you transform a multiclass problem into a similarity learning one.
Are you crazy? That won’t scale at all. Well, it will. First of all, you only need to compute and index the latent features of the training observations once. Then, whenever a new input arrives, you run your neural network on that single instance to get its latent features and do a simple nearest neighbor comparison against the indexed data points.
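Here is a small sketch of the aggregation step. The function `toy_same_class_prob` is a placeholder for the trained siamese network (an assumption for this example), and `classify_by_similarity` is an illustrative name.

```python
import math
from collections import defaultdict
from statistics import fmean
from typing import Callable, Dict, Sequence, Tuple

Vector = Sequence[float]


def classify_by_similarity(
    x: Vector,
    labelled: Sequence[Tuple[Vector, str]],
    same_class_prob: Callable[[Vector, Vector], float],
    aggregate: Callable = max,
) -> Tuple[str, Dict[str, float]]:
    """Compare x with every labelled point, aggregate scores per class, pick the best."""
    scores = defaultdict(list)
    for ref, label in labelled:
        scores[label].append(same_class_prob(x, ref))
    per_class = {label: aggregate(vals) for label, vals in scores.items()}
    return max(per_class, key=per_class.get), per_class


def toy_same_class_prob(a: Vector, b: Vector) -> float:
    # Placeholder for a siamese network: closer points look more alike.
    return math.exp(-math.dist(a, b))


train = [((0.0, 0.0), "cat"), ((0.0, 1.0), "cat"), ((5.0, 5.0), "dog")]
label_max, _ = classify_by_similarity((0.2, 0.4), train, toy_same_class_prob)
label_avg, _ = classify_by_similarity((0.2, 0.4), train, toy_same_class_prob, aggregate=fmean)
print(label_max, label_avg)
```

Using the maximum rewards a single very close match, while the average rewards classes that are consistently similar; which one works better depends on how noisy the labels are.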
Still, can you imagine doing that over millions of observations? Of course not, but you can always choose pivots that represent your class properly using any technique, such as k-medoids on the latent space. Easy peasy.
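As a rough illustration of the pivot idea, the sketch below keeps a single medoid per class, which is the simplest special case of k-medoids (k = 1), and compares new embeddings against those pivots only. The embeddings are made-up 2D points standing in for the latent features of the network.

```python
import math
from typing import Dict, Sequence, Tuple

Point = Tuple[float, ...]


def medoid(points: Sequence[Point]) -> Point:
    """Return the point with the smallest total distance to all the others."""
    return min(points, key=lambda p: sum(math.dist(p, q) for q in points))


def build_index(embeddings_by_class: Dict[str, Sequence[Point]]) -> Dict[str, Point]:
    """Keep one pivot (the medoid) per class instead of every observation."""
    return {label: medoid(points) for label, points in embeddings_by_class.items()}


def nearest_class(z: Point, index: Dict[str, Point]) -> str:
    """Nearest-neighbor lookup against the pivots only."""
    return min(index, key=lambda label: math.dist(z, index[label]))


# Hypothetical latent features grouped by class.
latent = {
    "a": [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)],
    "b": [(10.0, 10.0), (11.0, 10.0)],
}
index = build_index(latent)
print(index["a"], nearest_class((9.0, 9.0), index))
```

With one pivot per class, the cost of a prediction grows with the number of classes rather than the number of training observations. Keeping several pivots per class is a natural extension when a class has more than one cluster.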
Now, you have a scalable model that adjusts to new classes without the need for re-training.
We have used this trick in several industries and use cases, and it always pays for itself.
You gain a lot of operational efficiency, and you mitigate the problem of classes with low frequency.
Is a class no longer relevant? Remove its observations from your index.
Is there any new class? Add new observations to your index.
As easy as that!
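In code, maintaining the index can be as small as the sketch below. `ClassIndex` is a hypothetical in-memory structure written for this example; a production setup would more likely use a dedicated nearest neighbor index.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, ...]


class ClassIndex:
    """Hypothetical in-memory index mapping a class label to its pivot embeddings."""

    def __init__(self) -> None:
        self._pivots: Dict[str, List[Point]] = {}

    def add_class(self, label: str, pivots: Sequence[Sequence[float]]) -> None:
        # New class: store its embeddings; the model itself is not touched.
        self._pivots[label] = [tuple(p) for p in pivots]

    def remove_class(self, label: str) -> None:
        # Class no longer relevant: drop its embeddings.
        self._pivots.pop(label, None)

    def predict(self, z: Sequence[float]) -> str:
        # Score each class by the distance to its closest pivot.
        return min(
            self._pivots,
            key=lambda label: min(math.dist(z, p) for p in self._pivots[label]),
        )


catalog = ClassIndex()
catalog.add_class("soda", [(0.0, 0.0)])
catalog.add_class("chips", [(5.0, 5.0)])
before = catalog.predict((4.0, 4.0))

catalog.add_class("juice", [(4.0, 4.5)])
after_add = catalog.predict((4.0, 4.0))

catalog.remove_class("chips")
after_remove = catalog.predict((5.0, 5.0))
print(before, after_add, after_remove)
```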
There are a couple of additional tricks we can teach you, but you will need to wait for another article. I have to leave. But you don’t. So, subscribe now to our newsletter below to stay tuned.