An overview of Classification with Reject Option
Paulo Maia on Aug 2, 2021
The Miranda warning protects us against self-incrimination.
You have the right to remain silent. Anything you say will be used against you.
If we hold ML models accountable for their predictions, shouldn’t we at least grant them that right? Can we expect ML models to know everything? We can’t. Moreover, it is useful to know when a model is unsure about what to say.
Granting ML models the right to abstain is known as the reject option. And it’s pretty handy. We will show you how to use it in this article.
Machine Learning and Artificial Intelligence algorithms are currently applied in almost every industry, where they are integrated into numerous value chains that depend on their decisions. However, despite the continuous advances in the state of the art, these algorithms are not perfect and still make mistakes in critical situations. Each mistake can stem from several factors, for example:
Data points that are too close to a decision boundary – In real-world datasets, decision boundaries can be hard to define. For points close to the boundary, the model returns low-confidence predictions, which are more likely to be misclassified.
Outliers – If a data point doesn’t belong to any population seen in the training set, it’s hard for the model to make an inference about it.
Low confidence might lead to misclassification, but is it really the model’s fault? When we ask an algorithm for its prediction about a data point, we force it to return an answer, even if it doesn’t know it. For several use cases (see some examples in “Applications”), it’s beneficial to give the model the option to remain silent: if the algorithm is not confident enough, it can reject the data point and avoid making a mistake. This is Classification with Reject Option.
This approach is only applicable when it is possible to pass on the decision to another available decision system (e.g. another algorithm, a specialist, exams, or tests) or when there’s no need to return predictions for the entire dataset. In other words, apply Reject Option when the cost of rejecting an instance by the model is lower than the error cost. Here are a couple of examples of applications that can benefit from an approach of Classification with Reject Option:
Now that we have seen how Classification with Reject Option can help us in critical use cases, let’s explore how we can integrate it into our model implementations.
The simplest way to integrate Reject Option in a Decision Support System is to post-process the model results, taking into account the confidence level and the performance goal. For example, if the acceptable performance is an average accuracy above 95%, you can optimize the confidence threshold for each class. To do so, follow these steps:
To avoid overfitting to the validation set, apply cross-validation and aggregate the thresholds optimized on each fold with a sample statistic: average, median, or mode.
Despite being easy to implement, this method has some limitations. First, it’s hard to control how much data the post-processing rejects. In the limit, this method can reach perfect metrics by rejecting all the data, so you need an extra mechanism to prevent that during optimization. Second, since this method is applied after getting the predictions, the model doesn’t learn how the feature space relates to data rejection. To overcome these limitations, we present three other methods found in the literature.
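The post-processing idea can be sketched in a few lines of NumPy. This is a minimal illustration under our own assumptions, not a reference implementation: the function names, the `REJECT` marker, and the `min_coverage` guard (one possible "extra mechanism" against rejecting everything) are all hypothetical, and `probs` is assumed to hold class probabilities from any classifier on a validation set.

```python
import numpy as np

REJECT = -1  # hypothetical marker for "the model abstains"


def pick_thresholds(probs, y_true, target_acc=0.95, min_coverage=0.5):
    """For each predicted class, return the lowest confidence threshold whose
    accepted predictions reach target_acc while keeping at least min_coverage
    of that class's predictions. Falls back to 0.0 (accept everything) when
    the goal cannot be met within the coverage budget."""
    preds = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    thresholds = np.zeros(probs.shape[1])
    for k in range(probs.shape[1]):
        mask = preds == k
        if not mask.any():
            continue
        for t in np.unique(conf[mask]):  # candidate thresholds, ascending
            keep = mask & (conf >= t)
            if keep.sum() / mask.sum() < min_coverage:
                break  # a higher threshold would only reject more
            if (y_true[keep] == k).mean() >= target_acc:
                thresholds[k] = t
                break
    return thresholds


def predict_with_reject(probs, thresholds):
    """Return the predicted class, or REJECT when confidence is too low."""
    preds = probs.argmax(axis=1)
    conf = probs.max(axis=1)
    return np.where(conf >= thresholds[preds], preds, REJECT)


# Toy validation set: two classes, two confident mistakes avoided by abstaining.
probs = np.array([[0.95, 0.05], [0.90, 0.10], [0.60, 0.40],
                  [0.55, 0.45], [0.20, 0.80], [0.45, 0.55]])
y_true = np.array([0, 0, 1, 0, 1, 0])
thresholds = pick_thresholds(probs, y_true)
print(thresholds.tolist())                              # [0.9, 0.8]
print(predict_with_reject(probs, thresholds).tolist())  # [0, 0, -1, -1, 1, -1]
```

With cross-validation, one could call `pick_thresholds` on each validation fold and aggregate the results, for example with `np.median(fold_thresholds, axis=0)`.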
This method was explored by Sousa, Ricardo Gamelas, et al. in [1] for a binary problem. The solution implemented by them included the following steps:
A weakness of this model is that it needs two different training sets, one for the first model and a second to be re-labeled and to train the second model. If you’re dealing with small amounts of data, you might compromise the model performance by using only half of it.
This method was also presented by Sousa, Ricardo Gamelas, et al. in [1] for a binary problem. However, like the previous method, it can be adapted to the multi-class problem.
The implementation of this solution integrated the following steps:
To extend this method to a multi-class problem, you must train a different model for each class and then combine the predictions of all the models to check whether there is unanimity; otherwise, the data point is rejected. The computation therefore scales with the number of classes, which makes it impractical for datasets with a large number of classes, such as ImageNet (1000 classes).
The fourth and last method was proposed by Geifman, Yonatan, and Ran El-Yaniv in [2] with the novel Deep Neural Network (DNN) architecture “SelectiveNet”. SelectiveNet can be adapted to any DNN by adding an extra task to the model for data selection. The selection task is self-supervised, which means there’s no ground truth for this task, but its output is supervised by the loss function.
The loss function then has two terms: one to penalize misclassifications on the data points that were not rejected by the model, and a second to penalize the rejection itself, to avoid massive rejection.
Additionally, the authors suggest joining an auxiliary task that can be the same as the classification task or a different one, as long as it doesn’t ignore any data point. The purpose of this task is to force the model to learn the entire feature space represented by the available data and to learn the relation between the feature space and the rejection. Adding the auxiliary task implies the addition of a third term to the loss function, whose impact is regularized by a parameter.
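To make the three terms concrete, here is a NumPy sketch of a loss with this shape. It follows our reading of [2] and should be treated as an assumption rather than the authors' code: the function and argument names are hypothetical, the coverage penalty is written as a squared shortfall below a target coverage, and the default values of `lam` and `alpha` are illustrative. `f_probs` are the classification head's probabilities, `g_select` the selection head's outputs in [0, 1] (1 means "keep"), and `h_probs` the auxiliary head's probabilities.

```python
import numpy as np


def cross_entropy(probs, y):
    """Per-sample negative log-likelihood of the true class."""
    return -np.log(probs[np.arange(len(y)), y] + 1e-12)


def selective_loss(f_probs, g_select, h_probs, y,
                   target_coverage=0.8, lam=32.0, alpha=0.5):
    # Term 1: classification loss, weighted by the selection output and
    # normalized by coverage, so rejected points do not contribute.
    coverage = g_select.mean()
    selective_risk = (cross_entropy(f_probs, y) * g_select).mean() / np.maximum(coverage, 1e-12)

    # Term 2: penalize coverage falling below the target, so the model
    # cannot lower its risk by rejecting everything.
    coverage_penalty = lam * max(0.0, target_coverage - coverage) ** 2

    # Term 3: auxiliary loss over all points, rejected or not.
    aux_loss = cross_entropy(h_probs, y).mean()

    return alpha * (selective_risk + coverage_penalty) + (1 - alpha) * aux_loss


# Toy batch: the third sample is misclassified by the classification head.
f_probs = np.array([[0.9, 0.1], [0.2, 0.8], [0.6, 0.4]])
y = np.array([0, 1, 1])
keep_all = np.ones(3)
drop_error = np.array([1.0, 1.0, 0.0])

print(selective_loss(f_probs, keep_all, f_probs, y, target_coverage=0.6))
print(selective_loss(f_probs, drop_error, f_probs, y, target_coverage=0.6))  # lower
print(selective_loss(f_probs, drop_error, f_probs, y, target_coverage=0.9))  # penalized
```

Rejecting the misclassified sample lowers the loss only while coverage stays above the target; once coverage drops below it, the penalty dominates. In a real network the same expression would be written with a differentiable framework so that the selection head can be trained by gradient descent.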
Of all the methods, this one seems the most practical: it is easy to implement, it doesn’t require an extra data partition to optimize thresholds, and it doesn’t significantly increase the computation cost.
Reject Option methods are useful to increase the trustworthiness of Machine Learning models and to avoid mistakes in critical situations. However, they are not applicable to every use case. When a data point is rejected by the model and it can’t be ignored, someone or something has to handle it, and that option might not be available. Once again, the key to a successful AI system is in understanding the problem, finding the strengths and the limitations associated with each possible method, and designing a solution that fits the problem and its context.
[1] – Sousa, Ricardo Gamelas, et al. “Robust classification with reject option using the self-organizing map.” Neural Computing and Applications 26.7 (2015): 1603-1619.
[2] – Geifman, Yonatan, and Ran El-Yaniv. “Selectivenet: A deep neural network with an integrated reject option.” International Conference on Machine Learning. PMLR, 2019.