Transparency is of utmost importance when AI is applied to high-stakes decision problems, where information on the underlying process beyond the model's output may be required. Take the automation of loan approval as an example: a client whose loan is denied will surely want to know why that happened and how it could have been avoided. In this blog post, we give some examples of algorithms for Explainable AI, with a focus on healthcare. This is the first part of our special “AI in Healthcare” month, where we highlight state-of-the-art applications of Artificial Intelligence in health.
This kind of insight may not be as easy to obtain as one would think: the main issue with deep learning models is that they lack an explicit representation of knowledge, even when their technical principles are well understood. Hence the rising interest in explainable AI.
Furthermore, regulations and standards such as the EU's General Data Protection Regulation (GDPR) and ISO/IEC 27001 were created to ensure that the widespread adoption of automation techniques, such as deep neural networks, happens in a trustworthy manner, discouraging their careless use in business. This does not mean that these approaches have to explain every aspect at every moment; instead, it should be possible to trace results back on demand, as explained here.
Despite the lack of a consensual definition of the term within the field, the main idea is that there are two different types of understanding. Understandability and interpretability relate to the functional understanding of a model, providing the expert user with insights into the “black box”. Explainability, on the other hand, relates to providing the average user with high-level information about the algorithm, allowing them to answer the “Why?” rather than the “How?”.
Scheme of a truly explainable AI model, extracted from here.
An interesting notion was introduced by Doran et al., who argue that a truly explainable model not only provides a decision and an explanation, but is also able to integrate reasoning. In the figure, the model should classify the provided image as a “factory” because it contains certain elements, and then provide reasoning supporting that decision: the association between the elements and the label should be built into the decision process, not produced after classification. This example makes it easy to see that in this ideal model both the “How?” and the “Why?” are present.
Explainable models can be divided into post-hoc and ante-hoc (or in-model) techniques. The former are applied to an already trained model, fitting explanations on top of it (e.g. saliency maps), whereas the latter are intrinsic to the model itself (e.g. decision trees), as explained in the work of Holzinger et al.
The table below contains some examples of techniques that are currently being used. For simplicity, only visual data will be considered and one example of each group will be detailed, so that the reader gets a clearer picture of what can be achieved. There is much more beyond what is discussed here; for example, Singh et al. wrote a review of the methods currently being used to enhance the transparency of deep learning models applied to medical image analysis.
Post-hoc: Activation Maximization, Layer-wise Relevance Propagation
Ante-hoc: Attention-based models, Part-based models
Activation Maximization
Activation Maximization visualizes the preferred inputs of individual neurons in each layer. This is done by finding the input pattern that leads to the maximum activation of a given neuron. The process is iterative: it starts from a random input image, which is then updated step by step. The gradients with respect to the input can be computed using back-propagation while keeping the parameters learnt by the convolutional neural network fixed. This way, the pixels of the initial noisy image are iteratively changed to maximize the activation of the considered neuron, until the preferred pattern emerges. As we can see in the image, obtained from the work of Nguyen et al., the result on the right corresponds to an abstract representation of pool tables that most activates the corresponding unit of the network. Reyes et al. concentrated their efforts on the current state of the art regarding explainable AI in radiology; in their work they showcase how techniques like this one, among others, can be employed.
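To make the procedure concrete, here is a minimal sketch in PyTorch (an assumption on our part; the papers cited above use their own setups) that performs gradient ascent on a random input image to maximize the activation of one output unit, keeping the network's weights frozen. The model and the chosen unit are illustrative placeholders.

```python
# Minimal activation maximization sketch (PyTorch assumed, not the authors' code).
import torch
import torchvision.models as models

model = models.vgg16(weights="IMAGENET1K_V1").eval()
for p in model.parameters():
    p.requires_grad_(False)          # keep the learned weights fixed

target_unit = 722                    # index of some output unit (hypothetical choice)
x = torch.randn(1, 3, 224, 224, requires_grad=True)   # start from a random noise image
optimizer = torch.optim.Adam([x], lr=0.05)

for step in range(200):
    optimizer.zero_grad()
    logits = model(x)
    loss = -logits[0, target_unit]   # maximize the unit's activation (minimize its negative)
    loss.backward()                  # gradients flow back to the input pixels only
    optimizer.step()

# `x` now approximates the input pattern that most activates the chosen unit.
```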
Layer-wise Relevance Propagation
Layer-wise Relevance Propagation (LRP) uses heatmaps to represent the contribution of each pixel to the output, for non-linear classifiers such as multilayer neural networks and kernel-based models. The decomposition is subject to a series of constraints that guarantee the heatmap is realistic and consistent. Formally, the output is represented as the sum of the relevances of the individual pixels in the input layer. These relevances are computed in a chain-like manner, in which the total relevance in a given layer equals the total relevance in the next layer. This imposes a conservation constraint on the decomposition: the total relevance stays constant from the output layer down to the input layer, meaning that no relevance is lost or created along the way.
For more examples and details on this technique, see the work of Bach et al., from which this image was taken. The technique was recently applied by Karim et al. to convolutional neural networks trained on lung X-ray images for COVID-19 detection.
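To ground the conservation idea, here is a minimal NumPy sketch, under simplified assumptions, of the ε-rule applied to a single fully connected layer; names and shapes are illustrative, not the original authors' implementation. Applying such a rule layer by layer, from the output back to the pixels, yields the heatmaps described above.

```python
# Minimal sketch of the LRP epsilon-rule for one fully connected layer (illustrative only).
import numpy as np

def lrp_epsilon(a, W, b, R_out, eps=1e-6):
    """Redistribute the relevance R_out of a layer's outputs onto its inputs.

    a     : (d_in,)       activations entering the layer
    W     : (d_in, d_out) weight matrix
    b     : (d_out,)      bias
    R_out : (d_out,)      relevance of the layer's outputs
    """
    z = a @ W + b                  # forward pre-activations
    z = z + eps * np.sign(z)       # stabilizer to avoid division by zero
    s = R_out / z                  # share of relevance per output unit
    c = W @ s                      # redistribute back through the weights
    return a * c                   # relevance of each input unit

# Relevance is (approximately) conserved across the layer:
# R_in.sum() ~= R_out.sum(), so nothing is created or lost on the way
# from the output layer down to the input pixels.
```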
Attention Models
Attention models are inspired by the human visual system and its focal perception and processing of objects, though they are not exclusive to computer vision, with applications in Natural Language Processing, statistical learning and speech (see Chaudhari et al. for more on attention models and examples). By selectively weighting certain parts of the input, attention models enhance neural network interpretability while, in some cases, also reducing the computational cost.
Several approaches already take advantage of this idea. For example, Rio-Torto et al. proposed a network containing both a Classifier and an Explainer, in which the explainer gives higher weight to the parts of the input that are relevant for classification, as can be seen in the image below (extracted from their work), where the zebra stripes are highlighted.
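As a generic illustration of why attention weights double as explanations, here is a minimal scaled dot-product attention sketch in PyTorch (an assumed framework; this is not the Classifier-Explainer architecture of Rio-Torto et al.). The returned weights are exactly what one would reshape and overlay on the input as a heatmap.

```python
# Minimal scaled dot-product attention sketch; names and sizes are hypothetical.
import torch
import torch.nn.functional as F

def attention(query, keys, values):
    """query: (1, d), keys/values: (n_patches, d)."""
    d = query.shape[-1]
    scores = query @ keys.T / d ** 0.5     # similarity of the query to each input patch
    weights = F.softmax(scores, dim=-1)    # attention distribution over the patches
    context = weights @ values             # weighted summary used by the rest of the network
    return context, weights                # the weights can be visualized as an explanation

# Example: 49 patches (a 7x7 grid) of 64-dim features from a hypothetical encoder
patches = torch.randn(49, 64)
query = torch.randn(1, 64)
context, weights = attention(query, patches, patches)
heatmap = weights.reshape(7, 7)            # overlay on the image to see where the model attends
```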
Part-based networks
Part-based networks are a recent family of architectures inspired by the way humans explain an image in a classification task: by pointing to parts that are similar to ones we have seen before.
The Prototypical Part Network (ProtoPNet), proposed by Chen et al., is an ante-hoc technique capable of producing explanations alongside its predictions. The explanations are, in effect, a by-product of the activation of prototypes on the network's input. These prototypes are latent representations of parts of inputs from a given class, and the decision is based on a weighted combination of the prototype similarity scores, which dictates the class to which the image belongs. In the original classification task, and following the scheme below, one can see that each prototype in the prototype layer corresponds to a part of a bird: the first is the head of a Clay-colored Sparrow, whereas the second is the head of a Brewer's Sparrow. This means the network learned, for example, that the head of a Clay-colored Sparrow is a distinctive pattern of that class, and that if the input image contains a region similar to this prototype, it will contribute positively to the classification of the image into that class.
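The prototype-layer idea can be sketched in a few lines of PyTorch (heavily simplified, with hypothetical shapes and names, not the authors' implementation): compare convolutional feature patches against learned prototypes, keep the best match per prototype, and combine the resulting similarity scores linearly into class scores.

```python
# Simplified sketch of a ProtoPNet-style prototype layer (illustrative only).
import torch

def prototype_scores(feature_map, prototypes):
    """feature_map: (d, h, w) conv features; prototypes: (n_proto, d)."""
    d, h, w = feature_map.shape
    patches = feature_map.reshape(d, h * w).T               # (h*w, d) spatial patches
    dists = torch.cdist(prototypes, patches)                 # (n_proto, h*w) L2 distances
    min_dists = dists.min(dim=1).values                      # best-matching patch per prototype
    return torch.log((min_dists + 1) / (min_dists + 1e-4))   # turn distances into similarities

# Hypothetical sizes: 10 prototypes of 128-dim features, 2 classes
feature_map = torch.randn(128, 7, 7)
prototypes = torch.randn(10, 128)
class_weights = torch.randn(2, 10)                            # learned weight per (class, prototype)

similarities = prototype_scores(feature_map, prototypes)      # (10,) similarity scores
logits = class_weights @ similarities                          # weighted combination gives the class
```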
Practical case study – Explainable AI applied to epilepsy
During my MSc dissertation, I worked with the Clinical Neurophysiology Group at the University of Twente on the automation of explainable epilepsy diagnosis using deep learning models. It is also worth mentioning Prof. Luís Teixeira, who gave crucial guidance throughout this work.
Epilepsy is a neurological disorder that affects more than 50 million people worldwide and whose diagnosis is based on electroencephalography (EEG), the recording of the brain's electrical activity. Current clinical practice involves trained neurologists analyzing EEG recordings to identify abnormal patterns associated with seizures, characterized by high-frequency abnormalities in the signals. However, this visual analysis, besides being subjective, is extremely laborious, as it requires highly trained neurophysiologists to go through recordings that may span many hours.
A major hindrance to the application of deep learning models in healthcare is that, despite their high performance, they are often seen as “black boxes”. Consequently, they provide little to no insight into the processes underlying a decision, which is not acceptable in a medical context. To tackle this, two explainable approaches were used for seizure detection, in which the models not only identified which portions of the signal were most likely to contain abnormal patterns, but also which regions contributed the most to that decision (visual explanations).
The approaches used were the last two described above: the Classifier-Explainer network (C&E) and the Prototypical Part Network (ProtoPNet). Both were evaluated with respect to the classification task and the explanations provided, which were directly compared to those of experts. Below we can see the EEG signals from a multi-channel montage and the overlaid explanations, in which the regions with high-amplitude spikes are the most highlighted. The similarity to the explanations provided by the experts tells us that both networks were indeed able to provide relevant insights by correctly highlighting seizure-related patterns associated with the model's decision.
Conclusion
As machine learning models spread across different businesses and fields, the number of works on explainable AI in the literature grows as well. High-stakes decisions are the main focus of these models, as they often require more than a decision, or a prediction, to be accepted and employed safely in the desired environment, as in this example with epilepsy. However, these approaches can also be used in many other AI domains to help understand what is really driving your decisions. If you have reservations about applying AI to your business, feel free to discuss this kind of more transparent approach with us and how it could help you!