{"id":3109,"date":"2023-07-10T14:24:49","date_gmt":"2023-07-10T14:24:49","guid":{"rendered":"https:\/\/nilg.ai\/?p=3109"},"modified":"2023-09-06T06:59:06","modified_gmt":"2023-09-06T06:59:06","slug":"protect-your-ai-model-from-attackers","status":"publish","type":"post","link":"https:\/\/nilg.ai\/pt\/202307\/protect-your-ai-model-from-attackers\/","title":{"rendered":"Protect your AI Model from attackers!"},"content":{"rendered":"<p><img decoding=\"async\" class=\"alignnone size-large wp-image-3261\" src=\"https:\/\/nilg.ai\/wp-content\/uploads\/2023\/05\/medium-shot-man-holding-device-small-1024x683.jpg\" alt=\"\" width=\"1024\" height=\"683\" srcset=\"https:\/\/nilg.ai\/wp-content\/uploads\/2023\/05\/medium-shot-man-holding-device-small-1024x683.jpg 1024w, https:\/\/nilg.ai\/wp-content\/uploads\/2023\/05\/medium-shot-man-holding-device-small-300x200.jpg 300w, https:\/\/nilg.ai\/wp-content\/uploads\/2023\/05\/medium-shot-man-holding-device-small-768x512.jpg 768w, https:\/\/nilg.ai\/wp-content\/uploads\/2023\/05\/medium-shot-man-holding-device-small-1536x1024.jpg 1536w, https:\/\/nilg.ai\/wp-content\/uploads\/2023\/05\/medium-shot-man-holding-device-small-600x400.jpg 600w, https:\/\/nilg.ai\/wp-content\/uploads\/2023\/05\/medium-shot-man-holding-device-small.jpg 1728w\" sizes=\"(max-width: 1024px) 100vw, 1024px\" \/><\/p>\n<p>Machine learning models can achieve amazing results on the tasks they were designed for. They can also have <a href=\"https:\/\/nilg.ai\/pt\/202308\/boosting-profits-with-mediocre-ai-models\/\">catastrophic performance<\/a> if the data we feed the model does not follow the distribution of the data used to train it. Attackers can exploit this weakness to mount an adversarial attack on our model. 
Adversarial attacks are a common and growing problem in AI, where attackers use <a href=\"https:\/\/nilg.ai\/pt\/bonus\/data-collection-constraints\/\">misrepresentative data<\/a> to degrade our model\u2019s performance, hurting our product and reputation.<\/p>\n<h2>How can we protect our models from adversarial attacks?<\/h2>\n<p><span style=\"font-weight: 400;\">One way to shield our models from adversarial attacks is to detect <a href=\"https:\/\/nilg.ai\/pt\/202211\/stop-removing-outliers-just-because\/\">out-of-distribution samples<\/a> before feeding them to our models.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">The main idea is to create a pipeline that checks whether each incoming sample follows the same distribution as the data used to train our model. If it does, great &#8211; you\u2019re good to go. Otherwise, reject the input sample.\u00a0<\/span><\/p>\n<p>&nbsp;<\/p>\n<h2>How can we distinguish an in-sample from an out-sample?<\/h2>\n<p><span style=\"font-weight: 400;\">Because they can fit a distribution to data, generative models are well suited to anomaly detection. A <a href=\"https:\/\/en.wikipedia.org\/wiki\/Mixture_model\" target=\"_blank\" rel=\"noopener\">Gaussian Mixture Model<\/a> (GMM) is a probabilistic clustering model that assumes all data was generated by a mixture of <\/span><i><span style=\"font-weight: 400;\">n<\/span><\/i><span style=\"font-weight: 400;\"> Gaussian distributions. 
<a href=\"https:\/\/en.wikipedia.org\/wiki\/Mixture_model\" target=\"_blank\" rel=\"noopener\">GMMs<\/a> are handy for outlier detection because samples that fall in low-density regions of the fitted mixture can be flagged as outliers.\u00a0<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-weight: 400;\">Let\u2019s imagine we have built an image classification model, and we want to prevent users from submitting images that our model is not suited to process.<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-weight: 400;\">First, we need to create a dataset with <\/span><i><span style=\"font-weight: 400;\">in-sample<\/span><\/i><span style=\"font-weight: 400;\"> and <\/span><i><span style=\"font-weight: 400;\">out-sample<\/span><\/i><span style=\"font-weight: 400;\"> data points and label them accordingly. The <\/span><i><span style=\"font-weight: 400;\">in-sample<\/span><\/i><span style=\"font-weight: 400;\"> (positive) data points will be the validation and test data used for the image classification model. The out-sample data points could be, for example, images from the ImageNet or COCO datasets. Ideally, we should also include images that resemble our dataset and that an uninformed user might plausibly submit by mistake. For example, if our model is trained to classify objects that are present in a bathroom (toilet, bathtub, sink, etc.), an out-sample image could be an object you can find inside your house that is not present in bathrooms (bed, couch, chairs, etc.).<\/span><\/p>\n<p>&nbsp;<\/p>\n<p><span style=\"font-weight: 400;\">Second, we need to extract a numerical representation of the images &#8211; the embeddings. Since we are dealing with unstructured data, we need to extract the latent representation of each image in our dataset before we fit the GMM.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">Now, we fit the GMM to the in-sample embeddings and validate how well it detects outliers using both in-sample and out-sample data. 
At this stage, we might need to spend some time tuning the hyperparameters of the GMM. One useful heuristic is that the number of classes in our dataset can suggest how many mixture components the GMM needs.\u00a0<\/span><\/p>\n<p>&nbsp;<\/p>\n<h2>What if the GMM can\u2019t distinguish well enough between real cases and adversarial attacks?<\/h2>\n<p><span style=\"font-weight: 400;\">When a pre-trained model is used to extract the embeddings, the embeddings of in-samples and out-samples can still be very similar relative to the broad domain covered by the dataset used to train that model. ImageNet contains images of animals, nature, and objects of all kinds. So, when we use a model pre-trained on ImageNet to extract embeddings from our domain-specific images, it is normal for those embeddings to be quite similar.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400;\">How can we help the GMM better distinguish in-sample from out-sample embeddings? By applying <a href=\"https:\/\/en.wikipedia.org\/wiki\/Principal_component_analysis\">Principal Component Analysis<\/a> (or another dimensionality reduction technique), we may be able to remove the general similarities shared by the samples. At this stage too, we might have to spend some time tuning the parameters and figuring out how much reduction to apply. After reducing the dimensionality of the embeddings, we fit the GMM to them.\u00a0<\/span><\/p>\n<p>&nbsp;<\/p>\n<h2>Conclusion<\/h2>\n<p><span style=\"font-weight: 400;\">Using GMMs to detect whether a sample is too different from the training distribution is a great way of protecting our models from adversarial usage. 
By checking whether a sample belongs to the training distribution, we help ensure that the model performs the task it was designed for, reducing the chance of it misperforming and thus increasing the reliability and reputation of our product.<\/span><\/p>","protected":false},"excerpt":{"rendered":"<p>Machine learning models can achieve amazing results performing tasks they were designed to. They can also have catastrophic performance if the data we feed the model is not compliant with the data used to train it. This can be exploited as an adversarial attack on our model. Adversarial attacks are a common and growing problem [&hellip;]<\/p>\n","protected":false},"author":128,"featured_media":3261,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[53],"tags":[182,44,73],"class_list":["post-3109","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-technical","tag-ai-safety","tag-ai4business","tag-anomaly-detection"],"acf":[],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v20.8 - https:\/\/yoast.com\/wordpress\/plugins\/seo\/ -->\n<title>Protect your AI Model from attackers! - NILG.AI<\/title>\n<meta name=\"description\" content=\"ML models can perform catastrophically when exposed to adversarial attacks via improper data. We will show you how to prevent it.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/nilg.ai\/pt\/202307\/protect-your-ai-model-from-attackers\/\" \/>\n<meta property=\"og:locale\" content=\"pt_PT\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Protect your AI Model from attackers! - NILG.AI\" \/>\n<meta property=\"og:description\" content=\"ML models can perform catastrophically when exposed to adversarial attacks via improper data. 
We will show you how to prevent it.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/nilg.ai\/pt\/202307\/protect-your-ai-model-from-attackers\/\" \/>\n<meta property=\"og:site_name\" content=\"NILG.AI\" \/>\n<meta property=\"article:published_time\" content=\"2023-07-10T14:24:49+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2023-09-06T06:59:06+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/nilg.ai\/wp-content\/uploads\/2023\/05\/medium-shot-man-holding-device-small.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1728\" \/>\n\t<meta property=\"og:image:height\" content=\"1152\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"Rafael Cavalheiro\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@nilg_ai\" \/>\n<meta name=\"twitter:site\" content=\"@nilg_ai\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"Rafael Cavalheiro\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. 
reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutos\" \/>\n<script type=\"application\/ld+json\" class=\"yoast-schema-graph\">{\"@context\":\"https:\/\/schema.org\",\"@graph\":[{\"@type\":\"Article\",\"@id\":\"https:\/\/nilg.ai\/202307\/protect-your-ai-model-from-attackers\/#article\",\"isPartOf\":{\"@id\":\"https:\/\/nilg.ai\/202307\/protect-your-ai-model-from-attackers\/\"},\"author\":{\"name\":\"Rafael Cavalheiro\",\"@id\":\"https:\/\/nilg.ai\/#\/schema\/person\/335080e48ef99bfa99f8ee59bce6164d\"},\"headline\":\"Protect your AI Model from attackers!\",\"datePublished\":\"2023-07-10T14:24:49+00:00\",\"dateModified\":\"2023-09-06T06:59:06+00:00\",\"mainEntityOfPage\":{\"@id\":\"https:\/\/nilg.ai\/202307\/protect-your-ai-model-from-attackers\/\"},\"wordCount\":709,\"publisher\":{\"@id\":\"https:\/\/nilg.ai\/#organization\"},\"keywords\":[\"AI Safety\",\"AI4business\",\"Anomaly Detection\"],\"articleSection\":[\"Technical\"],\"inLanguage\":\"pt-PT\"},{\"@type\":\"WebPage\",\"@id\":\"https:\/\/nilg.ai\/202307\/protect-your-ai-model-from-attackers\/\",\"url\":\"https:\/\/nilg.ai\/202307\/protect-your-ai-model-from-attackers\/\",\"name\":\"Protect your AI Model from attackers! - NILG.AI\",\"isPartOf\":{\"@id\":\"https:\/\/nilg.ai\/#website\"},\"datePublished\":\"2023-07-10T14:24:49+00:00\",\"dateModified\":\"2023-09-06T06:59:06+00:00\",\"description\":\"ML models can perform catastrophically when exposed to adversarial attacks via improper data. 
We will show you how to prevent it.\",\"inLanguage\":\"pt-PT\",\"potentialAction\":[{\"@type\":\"ReadAction\",\"target\":[\"https:\/\/nilg.ai\/202307\/protect-your-ai-model-from-attackers\/\"]}]},{\"@type\":\"WebSite\",\"@id\":\"https:\/\/nilg.ai\/#website\",\"url\":\"https:\/\/nilg.ai\/\",\"name\":\"NILG.AI\",\"description\":\"Create ever-improving businesses with AI\",\"publisher\":{\"@id\":\"https:\/\/nilg.ai\/#organization\"},\"potentialAction\":[{\"@type\":\"SearchAction\",\"target\":{\"@type\":\"EntryPoint\",\"urlTemplate\":\"https:\/\/nilg.ai\/?s={search_term_string}\"},\"query-input\":\"required name=search_term_string\"}],\"inLanguage\":\"pt-PT\"},{\"@type\":\"Organization\",\"@id\":\"https:\/\/nilg.ai\/#organization\",\"name\":\"NILG.AI\",\"url\":\"https:\/\/nilg.ai\/\",\"logo\":{\"@type\":\"ImageObject\",\"inLanguage\":\"pt-PT\",\"@id\":\"https:\/\/nilg.ai\/#\/schema\/logo\/image\/\",\"url\":\"https:\/\/nilg.ai\/wp-content\/uploads\/2022\/03\/logo.svg\",\"contentUrl\":\"https:\/\/nilg.ai\/wp-content\/uploads\/2022\/03\/logo.svg\",\"caption\":\"NILG.AI\"},\"image\":{\"@id\":\"https:\/\/nilg.ai\/#\/schema\/logo\/image\/\"},\"sameAs\":[\"https:\/\/twitter.com\/nilg_ai\",\"https:\/\/youtube.com\/@nilg_ai\",\"https:\/\/www.linkedin.com\/company\/nilg-ai\/\"]},{\"@type\":\"Person\",\"@id\":\"https:\/\/nilg.ai\/#\/schema\/person\/335080e48ef99bfa99f8ee59bce6164d\",\"name\":\"Rafael Cavalheiro\",\"image\":{\"@type\":\"ImageObject\",\"inLanguage\":\"pt-PT\",\"@id\":\"https:\/\/nilg.ai\/#\/schema\/person\/image\/\",\"url\":\"https:\/\/secure.gravatar.com\/avatar\/91552d1d5f222e1a01b6f82c48bec07c413fc908ac062b6a5208df33b3db745d?s=96&d=mm&r=g\",\"contentUrl\":\"https:\/\/secure.gravatar.com\/avatar\/91552d1d5f222e1a01b6f82c48bec07c413fc908ac062b6a5208df33b3db745d?s=96&d=mm&r=g\",\"caption\":\"Rafael Cavalheiro\"},\"url\":\"https:\/\/nilg.ai\/pt\/author\/rafael\/\"}]}<\/script>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Protect your AI Model from attackers! - NILG.AI","description":"ML models can perform catastrophically when exposed to adversarial attacks via improper data. We will show you how to prevent it.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/nilg.ai\/pt\/202307\/protect-your-ai-model-from-attackers\/","og_locale":"pt_PT","og_type":"article","og_title":"Protect your AI Model from attackers! - NILG.AI","og_description":"ML models can perform catastrophically when exposed to adversarial attacks via improper data. We will show you how to prevent it.","og_url":"https:\/\/nilg.ai\/pt\/202307\/protect-your-ai-model-from-attackers\/","og_site_name":"NILG.AI","article_published_time":"2023-07-10T14:24:49+00:00","article_modified_time":"2023-09-06T06:59:06+00:00","og_image":[{"width":1728,"height":1152,"url":"https:\/\/nilg.ai\/wp-content\/uploads\/2023\/05\/medium-shot-man-holding-device-small.jpg","type":"image\/jpeg"}],"author":"Rafael Cavalheiro","twitter_card":"summary_large_image","twitter_creator":"@nilg_ai","twitter_site":"@nilg_ai","twitter_misc":{"Written by":"Rafael Cavalheiro","Est. 
reading time":"4 minutos"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/nilg.ai\/202307\/protect-your-ai-model-from-attackers\/#article","isPartOf":{"@id":"https:\/\/nilg.ai\/202307\/protect-your-ai-model-from-attackers\/"},"author":{"name":"Rafael Cavalheiro","@id":"https:\/\/nilg.ai\/#\/schema\/person\/335080e48ef99bfa99f8ee59bce6164d"},"headline":"Protect your AI Model from attackers!","datePublished":"2023-07-10T14:24:49+00:00","dateModified":"2023-09-06T06:59:06+00:00","mainEntityOfPage":{"@id":"https:\/\/nilg.ai\/202307\/protect-your-ai-model-from-attackers\/"},"wordCount":709,"publisher":{"@id":"https:\/\/nilg.ai\/#organization"},"keywords":["AI Safety","AI4business","Anomaly Detection"],"articleSection":["Technical"],"inLanguage":"pt-PT"},{"@type":"WebPage","@id":"https:\/\/nilg.ai\/202307\/protect-your-ai-model-from-attackers\/","url":"https:\/\/nilg.ai\/202307\/protect-your-ai-model-from-attackers\/","name":"Protect your AI Model from attackers! - NILG.AI","isPartOf":{"@id":"https:\/\/nilg.ai\/#website"},"datePublished":"2023-07-10T14:24:49+00:00","dateModified":"2023-09-06T06:59:06+00:00","description":"ML models can perform catastrophically when exposed to adversarial attacks via improper data. 
We will show you how to prevent it.","inLanguage":"pt-PT","potentialAction":[{"@type":"ReadAction","target":["https:\/\/nilg.ai\/202307\/protect-your-ai-model-from-attackers\/"]}]},{"@type":"WebSite","@id":"https:\/\/nilg.ai\/#website","url":"https:\/\/nilg.ai\/","name":"NILG.AI","description":"Create ever-improving businesses with AI","publisher":{"@id":"https:\/\/nilg.ai\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/nilg.ai\/?s={search_term_string}"},"query-input":"required name=search_term_string"}],"inLanguage":"pt-PT"},{"@type":"Organization","@id":"https:\/\/nilg.ai\/#organization","name":"NILG.AI","url":"https:\/\/nilg.ai\/","logo":{"@type":"ImageObject","inLanguage":"pt-PT","@id":"https:\/\/nilg.ai\/#\/schema\/logo\/image\/","url":"https:\/\/nilg.ai\/wp-content\/uploads\/2022\/03\/logo.svg","contentUrl":"https:\/\/nilg.ai\/wp-content\/uploads\/2022\/03\/logo.svg","caption":"NILG.AI"},"image":{"@id":"https:\/\/nilg.ai\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/twitter.com\/nilg_ai","https:\/\/youtube.com\/@nilg_ai","https:\/\/www.linkedin.com\/company\/nilg-ai\/"]},{"@type":"Person","@id":"https:\/\/nilg.ai\/#\/schema\/person\/335080e48ef99bfa99f8ee59bce6164d","name":"Rafael Cavalheiro","image":{"@type":"ImageObject","inLanguage":"pt-PT","@id":"https:\/\/nilg.ai\/#\/schema\/person\/image\/","url":"https:\/\/secure.gravatar.com\/avatar\/91552d1d5f222e1a01b6f82c48bec07c413fc908ac062b6a5208df33b3db745d?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/91552d1d5f222e1a01b6f82c48bec07c413fc908ac062b6a5208df33b3db745d?s=96&d=mm&r=g","caption":"Rafael 
Cavalheiro"},"url":"https:\/\/nilg.ai\/pt\/author\/rafael\/"}]}},"_links":{"self":[{"href":"https:\/\/nilg.ai\/pt\/wp-json\/wp\/v2\/posts\/3109","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nilg.ai\/pt\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nilg.ai\/pt\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nilg.ai\/pt\/wp-json\/wp\/v2\/users\/128"}],"replies":[{"embeddable":true,"href":"https:\/\/nilg.ai\/pt\/wp-json\/wp\/v2\/comments?post=3109"}],"version-history":[{"count":9,"href":"https:\/\/nilg.ai\/pt\/wp-json\/wp\/v2\/posts\/3109\/revisions"}],"predecessor-version":[{"id":3113,"href":"https:\/\/nilg.ai\/pt\/wp-json\/wp\/v2\/posts\/3109\/revisions\/3113"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/nilg.ai\/pt\/wp-json\/wp\/v2\/media\/3261"}],"wp:attachment":[{"href":"https:\/\/nilg.ai\/pt\/wp-json\/wp\/v2\/media?parent=3109"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nilg.ai\/pt\/wp-json\/wp\/v2\/categories?post=3109"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nilg.ai\/pt\/wp-json\/wp\/v2\/tags?post=3109"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}