# Adversarial Attack

## Description

Adversarial attacks are a type of attack in which an attacker deliberately
alters input data to mislead the model.

## How to Prevent

**Adversarial training:** One approach to defending against adversarial attacks
is to train the model on adversarial examples. This can help the model become
more robust to attacks and reduce its susceptibility to being misled.

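
A minimal sketch of this idea, assuming a PyTorch image classifier with inputs
scaled to [0, 1] (the `model`, `loader`, `optimizer`, and `epsilon` values below
are illustrative, not prescribed by this document), generates FGSM adversarial
examples on each batch and includes them in the training loss:

```python
import torch.nn.functional as F

def fgsm_perturb(model, x, y, epsilon=0.03):
    """Craft FGSM adversarial examples for a batch (illustrative epsilon)."""
    x_adv = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x_adv), y)
    loss.backward()
    # Step in the direction that increases the loss, then clamp to the valid range.
    return (x_adv + epsilon * x_adv.grad.sign()).clamp(0, 1).detach()

def adversarial_training_epoch(model, loader, optimizer, epsilon=0.03):
    """One epoch of training on a mix of clean and adversarial examples."""
    model.train()
    for x, y in loader:
        x_adv = fgsm_perturb(model, x, y, epsilon)
        optimizer.zero_grad()
        # The combined loss preserves clean accuracy while adding robustness.
        loss = F.cross_entropy(model(x), y) + F.cross_entropy(model(x_adv), y)
        loss.backward()
        optimizer.step()
```

Stronger attacks (for example multi-step PGD) can be substituted for FGSM when
generating the training examples.
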
**Robust models:** Another approach is to use models that are designed to be
robust against adversarial attacks, such as models trained on adversarial
examples or models that incorporate built-in defense mechanisms.

**Input validation:** Input validation is another important defense mechanism
that can be used to detect and prevent adversarial attacks. This involves
checking the input data for anomalies, such as unexpected values or patterns,
and rejecting inputs that are likely to be malicious.

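
As a sketch of such a check (assuming image inputs normalized to [0, 1]; the
shape, thresholds, and function name are illustrative assumptions), basic
validation might reject inputs with unexpected shapes, out-of-range values, or
unusually noisy pixel statistics:

```python
import numpy as np

def validate_image_input(image: np.ndarray, expected_shape=(224, 224, 3)) -> bool:
    """Return False for inputs that fail basic sanity checks (illustrative thresholds)."""
    if image.shape != expected_shape:
        return False                          # unexpected dimensions
    if not np.isfinite(image).all():
        return False                          # NaN or infinite pixel values
    if image.min() < 0.0 or image.max() > 1.0:
        return False                          # values outside the expected [0, 1] range
    # Unusually strong pixel-to-pixel variation can indicate crafted perturbations.
    if np.abs(np.diff(image, axis=0)).mean() > 0.5:
        return False
    return True
```

Inputs that fail these checks can be rejected or routed to a human review queue
before they ever reach the model.
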
## Risk Factors

| Threat Agents/Attack Vectors | Security Weakness | Impact |
| -- | -- | -- |
| Exploitability: 5 (Easy) <br><br> _ML Application Specific: 4_ <br> _ML Operations Specific: 3_ | Detectability: 3 (Moderate) <br><br> _The adversarial image may not be noticeable to the naked eye, making it difficult to detect the attack._ | Technical: 5 (Difficult) <br><br> _The attack requires technical knowledge of deep learning and image processing techniques._ |
| Threat Agent: Attacker with knowledge of deep learning and image processing techniques. <br><br> Attack Vector: Deliberately crafted adversarial image that is similar to a legitimate image. | Vulnerability in the deep learning model's ability to classify images accurately. | Misclassification of the image, leading to security bypass or harm to the system. |

It is important to note that this chart is only a sample based on
[the scenario below](#scenario1). The actual risk assessment will depend on
the specific circumstances of each machine learning system.

## Example Attack Scenarios

### Scenario \#1: Image classification {#scenario1}

A deep learning model is trained to classify images into different
categories, such as dogs and cats. An attacker creates an adversarial
image that is very similar to a legitimate image of a cat, but with
small, carefully crafted perturbations that cause the model to
misclassify it as a dog. When the model is deployed in a real-world
setting, the attacker can use the adversarial image to bypass security.

### Scenario \#2: Network intrusion detection

A deep learning model is trained to detect intrusions in a network. An
attacker creates adversarial network traffic by carefully crafting
packets in such a way that they will evade the model's intrusion
detection system. The attacker can manipulate the features of the
network traffic, such as the source IP address, destination IP address,
or payload, in such a way that they are not detected by the intrusion
detection system. For example, the attacker may hide their source IP
address behind a proxy server or encrypt the payload of their network
traffic. This type of attack can have serious consequences, as it can
lead to data theft, system compromise, or other forms of damage.

# Data Poisoning Attack

## Description

Data poisoning attacks occur when an attacker manipulates the training data to
cause the model to behave in an undesirable way.

## How to Prevent

**Data validation and verification:** Ensure that the training data is
thoroughly validated and verified before it is used to train the model. This can
be done by implementing data validation checks and employing multiple data
labelers to validate the accuracy of the data labeling.

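
A sketch of what such checks might look like, assuming a tabular dataset with
two independent labelers (the column names and the `spam`/`ham` label set are
illustrative assumptions):

```python
import pandas as pd

def validate_training_data(df: pd.DataFrame) -> pd.DataFrame:
    """Apply basic validation checks before the data is used for training."""
    # Structural checks: required columns present, no missing values, no duplicates.
    required = ["text", "label_annotator_1", "label_annotator_2"]
    missing = set(required) - set(df.columns)
    if missing:
        raise ValueError(f"Missing columns: {missing}")
    df = df.dropna(subset=required).drop_duplicates()

    # Labels must come from the known label set.
    allowed = {"spam", "ham"}
    valid = df["label_annotator_1"].isin(allowed) & df["label_annotator_2"].isin(allowed)
    df = df[valid]

    # Keep only records where independent labelers agree; send the rest to review.
    agreed = df["label_annotator_1"] == df["label_annotator_2"]
    return df[agreed].rename(columns={"label_annotator_1": "label"})
```
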
**Secure data storage:** Store the training data in a secure manner, such as
using encryption, secure data transfer protocols, and firewalls.

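
As one possible sketch of encryption at rest, using the `cryptography` package
(the file names are placeholders, and in practice the key would come from a
secrets manager rather than being generated in the training script):

```python
from cryptography.fernet import Fernet

# In practice, load the key from a secrets manager or KMS; never hard-code it.
key = Fernet.generate_key()
fernet = Fernet(key)

# Encrypt the raw training data before it is written to shared storage.
with open("training_data.csv", "rb") as f:
    encrypted = fernet.encrypt(f.read())
with open("training_data.csv.enc", "wb") as f:
    f.write(encrypted)

# Decrypt only inside the controlled training environment.
with open("training_data.csv.enc", "rb") as f:
    plaintext = fernet.decrypt(f.read())
```
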
**Data separation:** Separate the training data from the production data to
reduce the risk of compromising the training data.

**Access control:** Implement access controls to limit who can access the
training data and when they can access it.

**Monitoring and auditing:** Regularly monitor the training data for any
anomalies and conduct audits to detect any data tampering.

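
A simple form of this monitoring is an integrity audit that records a SHA-256
fingerprint of every training data file and flags any change. The paths and
file pattern below are illustrative:

```python
import hashlib
import json
from pathlib import Path

def snapshot_hashes(data_dir: str) -> dict:
    """Record a SHA-256 fingerprint for every training data file."""
    return {
        str(p): hashlib.sha256(p.read_bytes()).hexdigest()
        for p in sorted(Path(data_dir).glob("*.csv"))
    }

def audit(data_dir: str, baseline_file: str = "hash_baseline.json") -> list:
    """Compare current hashes against the recorded baseline and report changed files."""
    baseline = json.loads(Path(baseline_file).read_text())
    current = snapshot_hashes(data_dir)
    return [path for path, digest in current.items() if baseline.get(path) != digest]

# Write the baseline once after the data has been verified, then run audit() on a
# schedule and alert whenever it returns a non-empty list.
```
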
**Model validation:** Validate the model using a separate validation set that
has not been used during training. This can help to detect any data poisoning
attacks that may have affected the training data.

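
A sketch of this check, assuming a scikit-learn-style model and a trusted,
held-out validation set (the accuracy threshold is an illustrative value):

```python
from sklearn.metrics import accuracy_score

def check_model_on_trusted_set(model, X_trusted, y_trusted, min_accuracy=0.90):
    """Evaluate on a trusted validation set that never touched the training pipeline."""
    accuracy = accuracy_score(y_trusted, model.predict(X_trusted))
    if accuracy < min_accuracy:
        # A sharp drop against known-good data is a signal of possible poisoning.
        raise RuntimeError(
            f"Validation accuracy {accuracy:.2f} is below the threshold; "
            "inspect the training data before deploying this model."
        )
    return accuracy
```
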
**Model ensembles:** Train multiple models using different subsets of the
training data and use an ensemble of these models to make predictions. This can
reduce the impact of data poisoning attacks as the attacker would need to
compromise multiple models to achieve their goals.

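
A minimal sketch of this approach, assuming NumPy feature and integer label
arrays (the choice of `LogisticRegression` and the number of models are
illustrative):

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def train_ensemble(X, y, n_models=5, seed=0):
    """Train each model on a different random subset of the training data."""
    rng = np.random.default_rng(seed)
    models = []
    for _ in range(n_models):
        idx = rng.choice(len(X), size=len(X) // n_models, replace=False)
        models.append(LogisticRegression(max_iter=1000).fit(X[idx], y[idx]))
    return models

def ensemble_predict(models, X):
    """Majority vote: poisoned records in one subset cannot flip the result alone."""
    votes = np.stack([m.predict(X) for m in models])
    return np.apply_along_axis(lambda col: np.bincount(col).argmax(), 0, votes)
```
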
**Anomaly detection:** Use anomaly detection techniques to detect any abnormal
behavior in the training data, such as sudden changes in the data distribution
or data labeling. These techniques can be used to detect data poisoning attacks
early on.

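
Two simple techniques in this direction are comparing the label distribution
between data snapshots and flagging outlier records. This sketch assumes
numeric feature arrays; the thresholds and `contamination` rate are
illustrative:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

def label_distribution_shift(old_labels, new_labels) -> float:
    """Total variation distance between the label distributions of two snapshots."""
    classes = sorted(set(old_labels) | set(new_labels))
    old = np.array([np.mean(np.asarray(old_labels) == c) for c in classes])
    new = np.array([np.mean(np.asarray(new_labels) == c) for c in classes])
    return 0.5 * np.abs(old - new).sum()

def flag_outlier_records(X, contamination=0.01):
    """Use IsolationForest to surface records that look unlike the rest of the data."""
    detector = IsolationForest(contamination=contamination, random_state=0)
    return detector.fit_predict(X) == -1   # True for records flagged as anomalies

# Alert if label_distribution_shift(...) exceeds a small threshold (for example 0.1)
# and review flagged records before retraining.
```
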
## Risk Factors

| Threat Agents/Attack Vectors | Security Weakness | Impact |
| -- | -- | -- |
| Threat Agent: Attacker who has access to the training data used for the model. <br><br> Attack Vector: The attacker injects malicious data into the training data set. | Lack of data validation and insufficient monitoring of the training data. | The model will make incorrect predictions based on the poisoned data, leading to false decisions and potentially serious consequences. |

It is important to note that this chart is only a sample based on
[the scenario below](#scenario1). The actual risk assessment will depend on
the specific circumstances of each machine learning system.

## Example Attack Scenarios

### Scenario \#1: Training a spam classifier {#scenario1}

An attacker poisons the training data for a deep learning model that classifies
emails as spam or not spam. The attacker executes this attack by injecting
maliciously labeled spam emails into the training data set. This could be done
by compromising the data storage system, for example by hacking into the network
or exploiting a vulnerability in the data storage software. The attacker could
also manipulate the data labeling process, such as by falsifying the labeling of
the emails or by bribing the data labelers to provide incorrect labels.

### Scenario \#2: Training a network traffic classification system

An attacker poisons the training data for a deep learning model that is used to
classify network traffic into different categories, such as email, web browsing,
and video streaming. They introduce a large number of examples of network
traffic that are incorrectly labeled as a different type of traffic, causing the
model to be trained to classify this traffic as the incorrect category. As a
result, the model may make incorrect traffic classifications when it is
deployed, potentially leading to misallocation of network resources or
degradation of network performance.