Commit f334cbd

Minor corrections
Signed-off-by: Álvaro Bacca Peña <[email protected]>
1 parent fb926b2 commit f334cbd

3 files changed

+27
-15
lines changed


art/defences/detector/poison/clustering_centroid_analysis.py

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@
 from art.defences.detector.poison.poison_filtering_defence import PoisonFilteringDefence

 if TYPE_CHECKING:
-    from tensorflow.keras import Model, Sequential
+    from tensorflow.keras import Model
     from umap import UMAP
     from sklearn.base import ClusterMixin
     from art.utils import CLASSIFIER_TYPE

notebooks/poisoning_defense_clustering_centroid_analysis.ipynb

Lines changed: 24 additions & 12 deletions
@@ -38,24 +38,36 @@
 "source": [
 "### 2.1. I/O CCA-UD\n",
 "\n",
-"The following I/O descriptions do not correspond to a single function's parameters and/or return values, but serve as a general overview of what the algorithm uses and returns in a usage scenario.\n",
+"The following I/O descriptions do not correspond to a single function's parameters and/or return values, but serve as a general overview of what the algorithm uses and returns in a usage scenario:\n",
 "\n",
 "### Inputs\n",
-"| Input | Description |\n",
-"|-------------------------------------------------------------|----------------------------------------------------------------------------------------------------------------|\n",
-"| Poisoned training set features (`x_train`) | Dataset of independent variables used to train the classifier |\n",
-"| Poisoned training set labels (`y_train`) | Labels used to train the classifier |\n",
-"| Benign indices (`benign_indices`) | Indices of `x_train` that are definitely benign samples |\n",
-"| Final feature layer (`final_feature_layer`) | Name of the final layer that builds the feature representation. It is used to slice the DNN into two submodels |\n",
-"| Misclassification threshold (`misclassification_threshold`) | ($\\theta$ in the paper) Minimum percentage of correct classifications needed to consider a cluster as benign |\n",
-"| True poison labels (`is_clean`) | True poison labels used to evaluate the defence's performance agains the detected poisoned points |\n",
+"| Input | Optional | Default Value | Description |\n",
+"|-------------------------------------------------------------|----------|---------------|-----------------------------------------------------------------------------------------------------------------|\n",
+"| Classifier (`classifier`) | N | - | Classifier model that is being analyzed for poisoning. |\n",
+"| Poisoned training set features (`x_train`) | N | - | Dataset of independent variables used to train the classifier. |\n",
+"| Poisoned training set labels (`y_train`) | N | - | Labels used to train the classifier. |\n",
+"| Benign indices (`benign_indices`) | N | - | Indices of `x_train` that are definitely benign samples. |\n",
+"| Final feature layer (`final_feature_layer`) | N | - | Name of the final layer that builds the feature representation. It is used to slice the DNN into two submodels. |\n",
+"| Misclassification threshold (`misclassification_threshold`) | N | - | ($\\theta$ in the paper) Minimum percentage of correct classifications needed to consider a cluster as benign. |\n",
+"| True poison labels (`is_clean`) | N | - | True poison labels used to evaluate the defence's performance against the detected poisoned points. |\n",
 "\n",
 "### Outputs\n",
 "| Ouptut | Description |\n",
 "|-----------------------|-----------------------------------------------------------------------------------------------------------------|\n",
 "| Poisoning verdict | List of `x_train` with 1/0 labels; 1 means the data point is clean, whereas 0 means it was detected as poisoned |\n",
 "| Report | Dictionary with report details on the dataset's performance |\n",
-"| Confusion matrix JSON | JSON-like object with the detection performance results, given the true poisoned labels |\n"
+"| Confusion matrix JSON | JSON-like object with the detection performance results, given the true poisoned labels |\n",
+"\n",
+"\n",
+"The following table shows specific inputs used in the CCA-UD's object creation:\n",
+"\n",
+"### Inputs (implementation-specific)\n",
+"| Input | Optional | Default Value | Description |\n",
+"|-----------------------------------------------------------------|----------|-----------------------------------|--------------------------------------------------------------------------------------------|\n",
+"| Reducer (`reducer`) | Y | `UMAP(n_neighbors=5, min_dist=0)` | Dimensionality reducer used to reduce feature space. |\n",
+"| Clusterer (`clusterer`) | Y | `DBSCAN(eps=0.8, min_samples=20)` | Clustering algorithm used to cluster the reduced features. |\n",
+"| Feature extraction batch size (`feature_extraction_batch_size`) | Y | `32` | Batch size for feature extraction. Use lower values in case of low GPU memory availability |\n",
+"| Misclassification batch size (`misclassification_batch_size`) | Y | `32` | Batch size for misclassification. Use lower values in case of low GPU memory availability |\n"
 ]
 },
 {
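The Outputs table in the hunk above describes a 1/0 verdict per training sample and a confusion matrix computed against the true poison labels (`is_clean`). As a rough illustration only, such a tally could look like the sketch below. The function name `confusion_counts` and the dictionary keys are hypothetical; the JSON layout that the defence actually returns is not shown in this diff.

```python
def confusion_counts(is_clean, verdict):
    """Tally detection outcomes, treating 'poisoned' (label 0) as the positive class.

    Both arguments are sequences of 1/0 labels, where 1 means clean and
    0 means poisoned, matching the convention in the Outputs table.
    The key names below are illustrative assumptions, not ART's format.
    """
    if len(is_clean) != len(verdict):
        raise ValueError("is_clean and verdict must have the same length")

    counts = {
        "true_positive": 0,   # poisoned and flagged as poisoned
        "false_positive": 0,  # clean but flagged as poisoned
        "false_negative": 0,  # poisoned but reported as clean
        "true_negative": 0,   # clean and reported as clean
    }
    for truth, predicted in zip(is_clean, verdict):
        if truth not in (0, 1) or predicted not in (0, 1):
            raise ValueError("labels must be 0 (poisoned) or 1 (clean)")
        if truth == 0 and predicted == 0:
            counts["true_positive"] += 1
        elif truth == 1 and predicted == 0:
            counts["false_positive"] += 1
        elif truth == 0 and predicted == 1:
            counts["false_negative"] += 1
        else:
            counts["true_negative"] += 1
    return counts
```

Calling `confusion_counts(is_clean, verdict)` with the ground truth and the defence's verdict list gives the four counts from which precision and recall on the poisoned class can be derived.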
@@ -110,7 +122,7 @@
 "metadata": {},
 "source": [
 "### 3.1 Setup\n",
-"Loggers are created and libraries are imported."
+"Loggers are created and libraries are imported. The usage of a Conda environment is strongly encouraged, as it not only manages ART's dependencies, but also non-Python dependencies that can boost CCA-UD's performance with a dedicated GPU."
 ]
 },
 {
{
@@ -971,7 +983,7 @@
 "metadata": {},
 "source": [
 "#### 3.5.1. Benign subset selection\n",
-"It is expected from the defender that he/she can provide indices of the training data that correspond to benign data in order to calculate the benign centroids. In this scenario, 40% of the benign samples in the full dataset are given as benign sample to the algorithm."
+"It is expected from the defender that he/she can provide indices of the training data that correspond to benign data in order to calculate the benign centroids. In this scenario, 30% of the benign samples in the full dataset are given as benign sample to the algorithm."
 ]
 },
 {
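The last notebook hunk above changes the share of benign samples handed to the algorithm from 40% to 30%. The notebook's own selection code is not part of this diff, so the following is only a minimal sketch of how such a subset might be drawn from the ground-truth labels. The helper `sample_benign_indices` and its `seed` parameter are assumptions made for illustration.

```python
import random


def sample_benign_indices(is_clean, fraction=0.3, seed=0):
    """Pick a random fraction of the known-benign sample indices.

    `is_clean` holds 1 for clean and 0 for poisoned samples. The function
    name, the default seed, and the rounding-down rule are illustrative
    assumptions, not the notebook's actual implementation.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0 and 1")

    # Collect the positions of all samples known to be benign.
    benign = [index for index, flag in enumerate(is_clean) if flag == 1]

    # Draw without replacement; a seeded generator keeps runs reproducible.
    rng = random.Random(seed)
    subset_size = int(len(benign) * fraction)
    return sorted(rng.sample(benign, subset_size))
```

The returned list could then be supplied as the `benign_indices` input described in the Inputs table.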

run_tests.sh

Lines changed: 2 additions & 2 deletions
@@ -151,10 +151,10 @@ else
 "tests/defences/detector/evasion/test_subsetscanning_detector.py" \
 "tests/defences/detector/poison/test_activation_defence.py" \
 "tests/defences/detector/poison/test_clustering_analyzer.py" \
+"tests/defences/detector/poison/test_clustering_centroid_analysis.py" \
 "tests/defences/detector/poison/test_ground_truth_evaluator.py" \
 "tests/defences/detector/poison/test_provenance_defence.py" \
-"tests/defences/detector/poison/test_roni.py" \
-"tests/defences/detector/poison/test_clustering_centroid_analysis.py" )
+"tests/defences/detector/poison/test_roni.py" )

 declare -a metrics=("tests/metrics/test_gradient_check.py" \
 "tests/metrics/test_metrics.py" \
