
Commit 9a8efff

kunmukh authored and jdw170000 committed
Initial commit
0 parents  commit 9a8efff

536 files changed, +38086 -0 lines changed

.gitignore

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
gadget-finder/logs
.idea
*__pycache__*
*.zip
intrusion-detection-system/graph-based/runs
intrusion-detection-system/graph-based/images
intrusion-detection-system/graph-based/logs
intrusion-detection-system/graph-based/sample-supply-chain-data/
intrusion-detection-system/sample-supply-chain-data.zip

LICENSE.md

Lines changed: 28 additions & 0 deletions
@@ -0,0 +1,28 @@
BSD 3-Clause License

Copyright (c) 2023, syssec-utd

Redistribution and use in source and binary forms, with or without
modification, are permitted provided that the following conditions are met:

1. Redistributions of source code must retain the above copyright notice, this
list of conditions and the following disclaimer.

2. Redistributions in binary form must reproduce the above copyright notice,
this list of conditions and the following disclaimer in the documentation
and/or other materials provided with the distribution.

3. Neither the name of the copyright holder nor the names of its
contributors may be used to endorse or promote products derived from
this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

README.md

Lines changed: 127 additions & 0 deletions
@@ -0,0 +1,127 @@
# Evading Provenance-Based ML Detectors with Adversarial System Actions

Reproducibility artifacts for the paper _Evading Provenance-Based ML Detectors with Adversarial System Actions_.

## Overview


## Folder structure

| Folder | Description |
| ------ | ----------- |
| `gadget-finder` | Code and data for running the gadget-finder algorithms. |
| `intrusion-detection-system` | Code and data files for running the intrusion detection systems (IDS). |

### Environment Setup

We use `conda` to manage the Python environment. Install the project dependencies from [provng.yml](provng.yml) with:

```bash
conda env update --name provng --file provng.yml
```

Activate the conda environment before running the experiments:

```bash
conda activate provng
```

### Gadget Finder

* [Gadget Finder](gadget-finder/gadget-finder.py)
  * Finds the possible gadget chains between two programs, as identified in [input.csv](gadget-finder/input.csv).
  * A sample output is available in the [output](gadget-finder/output) directory.

Running the gadget finder script:

```bash
python gadget-finder.py -i input.csv -p FrequencyDB/SAMPLE_WINDOWS_FREQUENCY_DB.csv -o output/gadgets.txt
```

### Path-based IDS

#### SIGL [[1]](#references)

* [sigl](intrusion-detection-system/path-based/sigl.py)
  * Driver script for SIGL, an autoencoder-based IDS that detects anomalous paths.
  * Sample causal paragraphs and feature vectors for Enterprise APT are available in the [sample-enterprise-data](intrusion-detection-system/path-based/sample-enterprise-data) directory.

Running the SIGL script:

```bash
python sigl.py
```

#### ProvDetector [[2]](#references)

* [provdetector](intrusion-detection-system/path-based/provdetector.py)
  * Driver script for ProvDetector, an LOF-based IDS that detects anomalous paths.
  * Sample causal paragraphs and feature vectors for Enterprise APT are available in the [sample-enterprise-data](intrusion-detection-system/path-based/sample-enterprise-data) directory.

Running the ProvDetector script:

```bash
python provdetector.py
```
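
ProvDetector is described above as an LOF-based detector. As a rough illustration of what a Local Outlier Factor score measures, the sketch below implements the textbook LOF definition in plain Python on made-up 2-D points. It is not taken from `provdetector.py`: the function names (`k_nearest`, `lof_scores`), the choice of `k`, and the toy data are assumptions for illustration only, and the neighbourhood is simplified to exactly `k` neighbours (ties are not expanded).

```python
import math


def k_nearest(points, i, k):
    """Return the k nearest neighbours of points[i] as (distance, index) pairs."""
    dists = sorted(
        (math.dist(points[i], p), j) for j, p in enumerate(points) if j != i
    )
    return dists[:k]


def lof_scores(points, k=2):
    """Compute a Local Outlier Factor score for every point.

    Scores near 1 mean a point is about as dense as its neighbours;
    scores well above 1 mean it sits in a sparser region (a likely outlier).
    Assumes the points are distinct, so no reachability distance is zero.
    """
    n = len(points)
    neighbours = [k_nearest(points, i, k) for i in range(n)]
    # k-distance: distance from each point to its k-th nearest neighbour.
    k_dist = [nb[-1][0] for nb in neighbours]

    # Local reachability density: inverse of the mean reachability distance.
    lrd = []
    for i in range(n):
        reach = [max(k_dist[j], d) for d, j in neighbours[i]]
        lrd.append(len(reach) / sum(reach))

    # LOF: mean density of the neighbours divided by the point's own density.
    return [
        sum(lrd[j] for _, j in neighbours[i]) / (len(neighbours[i]) * lrd[i])
        for i in range(n)
    ]


# Hypothetical feature vectors: four clustered points and one far away.
toy_vectors = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0), (5.0, 5.0)]
scores = lof_scores(toy_vectors, k=2)
most_anomalous = max(range(len(scores)), key=scores.__getitem__)
print(most_anomalous, [round(s, 2) for s in scores])
```

In this toy setting the isolated point receives the highest score; a real detector would apply the same idea to path feature vectors and a tuned threshold.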

### Graph-based IDS

#### S-GAT

* [S-GAT](intrusion-detection-system/graph-based/gnnDriver.py)
  * Driver script for S-GAT, a GNN-based IDS that detects anomalous graphs using graph structure and attributes, e.g., node/edge types.
  * Run [download_sample_supply_chain_data.sh](intrusion-detection-system/graph-based/download_sample_supply_chain_data.sh) to download and unzip the sample Supply-Chain APT data from [Google Drive](https://drive.google.com/file/d/1Jz0ZuiZlUEZdAgqlnfmpN2_X0Cms6Sl8/view?usp=sharing).
  * The weighted average F1 score on the provided data with the provided model should be 0.88.

Running the S-GAT script:

```bash
python gnnDriver.py gat -if 5 -hf 10 -lr 0.001 -e 20 -n 5 -bs 128 -bi -s
```
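
The weighted average F1 score quoted above is the per-class F1 averaged with each class weighted by its number of true samples. The following is a minimal sketch of that metric in plain Python, assuming this standard definition; `weighted_f1` and the toy labels are hypothetical and are not part of `gnnDriver.py`.

```python
from collections import Counter


def weighted_f1(y_true, y_pred):
    """Average the per-class F1 scores, weighting each class by its support."""
    support = Counter(y_true)  # number of true samples per class
    total = len(y_true)
    score = 0.0
    for label in sorted(set(y_true) | set(y_pred)):
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        f1 = 2 * tp / denom if denom else 0.0
        score += f1 * support[label] / total
    return score


# Made-up labels (0 = benign graph, 1 = anomalous graph), for illustration only.
toy_true = [0, 0, 0, 1]
toy_pred = [0, 0, 1, 1]
print(round(weighted_f1(toy_true, toy_pred), 3))
```

Because the weights follow class support, the majority class dominates the score when the data is imbalanced.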

#### Prov-GAT

* [Prov-GAT](intrusion-detection-system/graph-based/gnnDriver.py)
  * Driver script for Prov-GAT, a GNN-based IDS that detects anomalous graphs using node and edge attributes in addition to the features used by S-GAT.
  * Run [download_sample_supply_chain_data.sh](intrusion-detection-system/graph-based/download_sample_supply_chain_data.sh) to download and unzip the sample Supply-Chain APT data from [Google Drive](https://drive.google.com/file/d/1Jz0ZuiZlUEZdAgqlnfmpN2_X0Cms6Sl8/view?usp=sharing).
  * The weighted average F1 score on the provided data with the provided model should be 0.95.

Running the Prov-GAT script:

```bash
python gnnDriver.py gat -if 768 -hf 10 -lr 0.001 -e 20 -n 5 -bs 128 -bi
```

#### ProvNinja-Graph

* [ProvNinja-Graph](intrusion-detection-system/graph-based/provninjaGraph.py)
  * Driver script for ProvNinja-Graph, an adversarial example generator.
  * Run [download_sample_supply_chain_data.sh](intrusion-detection-system/graph-based/download_sample_supply_chain_data.sh) to download and unzip the sample Supply-Chain APT data from [Google Drive](https://drive.google.com/file/d/1Jz0ZuiZlUEZdAgqlnfmpN2_X0Cms6Sl8/view?usp=sharing).
  * Output is written to the [adversarial_examples](intrusion-detection-system/graph-based/adversarial_examples) directory.
  * The evasion rate should be approximately 168 / 198 true positives for the provided data with the provided models.

Running the ProvNinja-Graph script:

```bash
python provninjaGraph.py
```
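
To compare a run against the expected evasion rate, the fraction can be converted to a percentage. This small sketch reads the figure above as 168 evading examples out of 198 true positives; that reading, and the helper name `evasion_rate`, are assumptions and not part of `provninjaGraph.py`.

```python
def evasion_rate(evaded, true_positives):
    """Fraction of originally detected attacks that evade the detector."""
    if true_positives <= 0:
        raise ValueError("true_positives must be positive")
    return evaded / true_positives


# Counts quoted in the README for the provided data and models.
expected = evasion_rate(168, 198)
print(f"{expected:.1%}")  # prints 84.8%
```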

## Citing us

```
@inproceedings{mukherjee2023sec,
title = {Evading Provenance-Based ML Detectors with Adversarial System Actions},
author = {Kunal Mukherjee and Josh Wiedemeier and Tianhao Wang and James Wei and Feng Chen and Muhyun Kim and Murat Kantarcioglu and Kangkook Jee},
year = 2023,
booktitle = {Proceedings of USENIX Security Symposium (SEC)},
series = {USENIX '23}
}
```

## References

[1] X. Han, X. Yu, T. Pasquier, et al., “_SIGL: Securing Software Installations Through Deep Graph Learning_,” in USENIX Security Symposium (SEC), 2021. <br>
[2] Q. Wang, W. U. Hassan, D. Li, et al., “_You Are What You Do: Hunting Stealthy Malware via Data Provenance Analysis_,” in Network and Distributed System Security Symposium (NDSS), Feb. 2020. <br>
