In many binary classification problems, especially in domains with highly unbalanced classes (such as medicine and rare event detection), we need to make sure our model does not become too biased toward the majority class.
You may have heard that accuracy is not a good metric for validating classifiers in unbalanced settings. Instead, people tend to use performance metrics that are robust to imbalance, such as the ROC AUC and the F1-score. So why don’t we train models that optimize these metrics directly? In some cases this is not possible or efficient, since the metrics are not differentiable. This is a series of blog posts in which we’ll explain how to optimize your model for such metrics (or soft versions of them), starting with the ROC AUC.
What is the ROC AUC?
A ROC curve is a plot that illustrates the diagnostic ability of a binary classifier as the discrimination threshold applied to its probabilistic output varies. It can be built by thresholding the model’s predicted probabilities at several values between 0 and 1 and calculating, for each threshold, the True Positive Rate (proportion of positive samples correctly predicted as positive) and the False Positive Rate (proportion of negative samples incorrectly predicted as positive). The ROC curve is the plot of all these points.
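The thresholding procedure described above can be sketched in a few lines of NumPy. This is only an illustration: the function name `roc_points`, the toy labels and scores, and the chosen thresholds are assumptions made for this example, not part of the original experiment.

```python
import numpy as np

def roc_points(y_true, y_score, thresholds):
    """Return (FPR, TPR) arrays with one ROC point per threshold."""
    y_true = np.asarray(y_true).astype(bool)
    y_score = np.asarray(y_score, dtype=float)
    n_pos = y_true.sum()
    n_neg = (~y_true).sum()
    fpr, tpr = [], []
    for t in thresholds:
        pred_pos = y_score >= t  # samples predicted as positive at this threshold
        tpr.append((pred_pos & y_true).sum() / n_pos)
        fpr.append((pred_pos & ~y_true).sum() / n_neg)
    return np.array(fpr), np.array(tpr)

# Toy data (hypothetical): two negatives followed by two positives.
y_true = [0, 0, 1, 1]
y_score = [0.1, 0.4, 0.35, 0.8]
fpr, tpr = roc_points(y_true, y_score, thresholds=[0.0, 0.3, 0.5, 1.0])
```

Each threshold yields one (FPR, TPR) point; sweeping the threshold from 0 to 1 traces the curve from the top-right corner to the bottom-left corner.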
In ML, we typically want the curve with the largest possible area underneath it. Why? The area under the ROC curve equals the probability that the classifier ranks a randomly chosen positive instance ahead of a randomly chosen negative one (the proof of such a theorem is available here). In other words, it is the probability of assigning a higher priority to a sick patient than to a healthy one.
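To make this ranking interpretation concrete, the sketch below computes the AUC directly as the fraction of positive-negative pairs in which the positive sample gets the higher score. The function name `pairwise_auc` and the toy data are assumptions for illustration; counting ties as half a correct pair is a common convention that the post itself does not discuss.

```python
import numpy as np

def pairwise_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    y_true = np.asarray(y_true).astype(bool)
    y_score = np.asarray(y_score, dtype=float)
    pos = y_score[y_true]
    neg = y_score[~y_true]
    # Score difference for every positive-negative pair.
    diff = pos[:, None] - neg[None, :]
    # A correctly ordered pair counts 1, a tie counts 0.5 (assumed convention).
    return float(((diff > 0) + 0.5 * (diff == 0)).mean())

# Toy data (hypothetical): 3 of the 4 positive-negative pairs are ordered correctly.
auc = pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
```

This enumerates every pair, so its cost grows with the product of the two class sizes; it is meant to show the definition, not to be an efficient implementation.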
So, we can think of the ROC AUC as the accuracy of a ranking model when exposed to a pair of samples with opposite classes (e.g., one sick and one healthy). Since the cross-entropy loss is the de facto soft approach for learning binary classifiers when we have accuracy in mind, the cross-entropy of a pairwise ranking model (e.g., a siamese neural network) is a soft way of learning that tends to maximize the ROC AUC.
How to optimize the ROC AUC?
Let’s say we have a model (e.g., a deep neural network) that, given the input features, predicts a continuous score.
As we discussed before, maximizing the ROC AUC is equivalent to maximizing the accuracy of the sign of the score difference for a positive-negative pair, that is, how often the score of the positive sample minus the score of the negative sample is greater than zero.
Thus, we can simply use a Siamese architecture, where each stream will contain our target model, trained on positive-negative pairs. In our architecture, the scores will be subtracted and passed through a sigmoid activation in order to approximate the probability of the positive sample having a higher score than the negative sample.
To generate our training batches, each pair takes one sample from each class, one on the negative stream and one on the positive stream. The ground truth is therefore always 1, as the score from the positive stream should always be higher than the score from the negative stream. The model is trained by minimizing the cross-entropy loss of this pairwise target. We won’t converge to a naive solution here, given that the weights on each stream of a siamese network are shared.
This means the model is penalized whenever the negative side has a greater score than the positive side. As such, we are optimizing the model so that it always gives a higher score to a positive sample (input_pos) than to a negative sample (input_neg), which is essentially the definition of optimizing the ROC AUC!
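As a rough sketch of this training scheme, the NumPy example below uses a linear scorer as a stand-in for the neural network, so that the gradient can be written by hand. The names `pairwise_loss` and `train_linear_ranker`, the synthetic Gaussian data, and all hyperparameters are assumptions for illustration; the original experiment used Keras with a feedforward network.

```python
import numpy as np

def pairwise_loss(score_pos, score_neg):
    """Cross-entropy with target 1 applied to sigmoid(score_pos - score_neg)."""
    diff = np.asarray(score_pos, dtype=float) - np.asarray(score_neg, dtype=float)
    # -log(sigmoid(diff)) = log(1 + exp(-diff)), computed in a stable way.
    return np.logaddexp(0.0, -diff)

def train_linear_ranker(x_pos, x_neg, steps=200, batch=64, lr=0.1, seed=0):
    """Fit score(x) = x @ w on random positive-negative pairs."""
    rng = np.random.default_rng(seed)
    w = np.zeros(x_pos.shape[1])  # a single weight vector, shared by both streams
    for _ in range(steps):
        # Each pair has one positive and one negative sample.
        i = rng.integers(0, len(x_pos), size=batch)
        j = rng.integers(0, len(x_neg), size=batch)
        d = x_pos[i] - x_neg[j]
        diff = d @ w  # score_pos - score_neg for the linear scorer
        # d(loss)/d(diff) = -sigmoid(-diff) = -0.5 * (1 - tanh(diff / 2))
        dloss = -0.5 * (1.0 - np.tanh(diff / 2.0))
        w -= lr * (dloss[:, None] * d).mean(axis=0)
    return w

# Synthetic unbalanced data (hypothetical): 20 positives, 200 negatives.
rng = np.random.default_rng(0)
x_pos = rng.normal(loc=1.0, size=(20, 2))
x_neg = rng.normal(loc=-1.0, size=(200, 2))
w = train_linear_ranker(x_pos, x_neg)

# Mean pairwise loss over all positive-negative pairs after training.
s_pos, s_neg = x_pos @ w, x_neg @ w
mean_loss = pairwise_loss(s_pos[:, None], s_neg[None, :]).mean()
```

Because both streams use the same `w`, the only way to lower the loss is to learn a scoring function that separates the classes. With untrained weights every pair has a loss of log 2, so a mean loss below that value indicates that positives are, on average, being ranked above negatives.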
So, how do we turn this network into an actionable model that returns binary classes? We need to reduce the network to a single stream, keeping the trained weights, and determine a threshold on the predicted score above which the model classifies a sample as positive.
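The post does not say how the threshold is chosen. One possible approach, shown here purely as an assumption, is to score a validation set with the single-stream model and pick the threshold that maximizes TPR minus FPR (Youden's J statistic). The function name `pick_threshold` and the validation scores are hypothetical.

```python
import numpy as np

def pick_threshold(y_true, y_score):
    """Return the candidate threshold maximizing TPR - FPR (assumed criterion)."""
    y_true = np.asarray(y_true).astype(bool)
    y_score = np.asarray(y_score, dtype=float)
    candidates = np.unique(y_score)  # every observed score is a candidate
    best_t, best_j = candidates[0], -np.inf
    for t in candidates:
        pred_pos = y_score >= t
        tpr = (pred_pos & y_true).sum() / y_true.sum()
        fpr = (pred_pos & ~y_true).sum() / (~y_true).sum()
        if tpr - fpr > best_j:
            best_t, best_j = t, tpr - fpr
    return float(best_t)

# Hypothetical single-stream scores on a validation set.
val_scores = np.array([-2.0, -1.0, 0.5, 1.5, 3.0])
val_labels = np.array([0, 0, 0, 1, 1])
threshold = pick_threshold(val_labels, val_scores)
predicted_class = (val_scores >= threshold).astype(int)
```

Other criteria, such as fixing a target false positive rate, fit the same pattern; the right choice depends on the relative cost of each type of error in the application.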
Validation
This architecture was tested in Keras on the CIFAR10 dataset, which we turned into an artificially unbalanced problem. The positive class was “airplanes”, and the negative class was all the other classes in the dataset. The positive class was then subsampled to 5%.
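A setup like this can be sketched as follows. The sketch assumes that “subsampled to 5%” means keeping 5% of the positive samples, which is one reading of the description; the function name `make_unbalanced` and the stand-in label array (used instead of the real CIFAR10 labels) are also assumptions.

```python
import numpy as np

def make_unbalanced(labels, positive_class, keep_fraction, seed=0):
    """Keep all negatives and a random fraction of the positive samples."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels).ravel()
    pos_idx = np.flatnonzero(labels == positive_class)
    neg_idx = np.flatnonzero(labels != positive_class)
    n_keep = max(1, int(round(keep_fraction * len(pos_idx))))
    kept_pos = rng.choice(pos_idx, size=n_keep, replace=False)
    idx = np.sort(np.concatenate([kept_pos, neg_idx]))
    binary = (labels[idx] == positive_class).astype(int)  # 1 = positive class
    return idx, binary

# Stand-in labels (hypothetical): 10 classes with 100 samples each.
labels = np.repeat(np.arange(10), 100)
idx, y = make_unbalanced(labels, positive_class=0, keep_fraction=0.05)
```

The returned `idx` can be used to select the matching images, and `y` is the binary target for the one-versus-rest problem.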
Afterward, we used a feedforward network with internal dropout layers and compared two training strategies: a classifier trained with the plain cross-entropy loss, and the siamese-based model discussed in this post. The experiment was repeated 5 times with different random seeds, so that the mean value depends less on the image selection process.
The mean ROC AUC for the siamese network was (86 ± 1.3)%, while for the one-stream network it was (72.2 ± 7.2)%, showing that training the model with the siamese architecture was beneficial for the ROC AUC.
Conclusion
This blog post explained how to optimize your model for the ROC AUC, based on the probabilistic interpretation of this metric.
At NILG.AI, we have worked on a lot of medical and marketing applications, where targets tend to be extremely unbalanced. We have used this strategy in several projects, achieving in each case higher performance than with traditional learning approaches. If you are facing a similar problem, let’s discuss how we can collaborate on these types of learning approaches!