In the summary below you will find exemplary views of the Grafana dashboards. The metrics were obtained using the [mock-generator](./docker/docker-compose.send-real-logs.yml).
##### Or run the modules locally on your machine:

```sh
python -m venv .venv
source .venv/bin/activate

sh install_requirements.sh
```

Alternatively, you can use `pip install` and install all needed requirements individually with `-r requirements.*.txt`.

Now, you can start each stage, e.g. the inspector:

```sh
python src/inspector/inspector.py
```

<p align="right">(<a href="#readme-top">back to top</a>)</p>
## Usage

### Configuration
To configure **heiDGAF** according to your needs, use the provided `config.yaml`.
The most relevant settings are related to your specific log line format, the model you want to use, and possibly your infrastructure.

The section `pipeline.log_collection.collector.logline_format` has to be adjusted to reflect your specific input log line format. Using our adjustable and flexible log line configuration, you can rename, reorder, and fully configure each field of a valid log line. Freely define timestamps, RegEx patterns, lists, and IP addresses. For example, your
| Setting | Description | Default |
|---|---|---|
|`pipeline.data_inspection.inspector.mode`| Mode of operation for the data inspector. |`univariate` (options: `multivariate`, `ensemble`)|
|`pipeline.data_inspection.inspector.ensemble.model`| Model to use when inspector mode is `ensemble`. |`WeightEnsemble`|
|`pipeline.data_inspection.inspector.ensemble.module`| Module name for the ensemble model. |`streamad.process`|
|`pipeline.data_inspection.inspector.models`| List of models to use for data inspection (e.g., anomaly detection). | Array of model definitions (e.g., `{"model": "ZScoreDetector", "module": "streamad.model", "model_args": {"is_global": false}}`)|
|`pipeline.data_inspection.inspector.anomaly_threshold`| Threshold for classifying an observation as an anomaly. |`0.01`|
|`pipeline.data_analysis.detector.model`| Model to use for data analysis (e.g., DGA detection). |`rf` (Random Forest; option: `XGBoost`)|
|`pipeline.data_analysis.detector.checksum`| Checksum for the model file to ensure integrity. |`021af76b2385ddbc76f6e3ad10feb0bb081f9cf05cff2e52333e31040bbf36cc`|
|`pipeline.data_analysis.detector.base_url`| Base URL for downloading the model if not present locally. |`https://heibox.uni-heidelberg.de/d/0d5cbcbe16cd46a58021/`|
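
Put together, the inspection and analysis settings above might look roughly as follows in `config.yaml`. This is a sketch assembled from the table's example values; verify the exact nesting and key names against the shipped `config.yaml`:

```yaml
# Sketch only: assembled from the table's example values above;
# check the provided config.yaml for the authoritative structure.
pipeline:
  data_inspection:
    inspector:
      mode: univariate
      anomaly_threshold: 0.01
      models:
        - model: ZScoreDetector
          module: streamad.model
          model_args:
            is_global: false
  data_analysis:
    detector:
      model: rf
      checksum: 021af76b2385ddbc76f6e3ad10feb0bb081f9cf05cff2e52333e31040bbf36cc
      base_url: https://heibox.uni-heidelberg.de/d/0d5cbcbe16cd46a58021/
```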

<p align="right">(<a href="#readme-top">back to top</a>)</p>

## Models and Training

To train and test our and possibly your own models, we currently rely on the following datasets:

After installing the requirements, use `src/train/train.py`:

```sh
python -m venv .venv
...
Commands:
  train
```

Setting up the [dataset directories](#insert-test-data) (and adding the code for your model class if applicable) lets you start the training process by running the following commands:

This will create a `rules.txt` file containing the innards of the model, explaining the rules it created.

<p align="right">(<a href="#readme-top">back to top</a>)</p>

### Data

> [!IMPORTANT]
> We support custom schemes.

Depending on your data and use case, you can customize the data scheme to fit your needs.
The configuration below is part of the [main configuration file](./config.yaml), which is detailed in our [documentation](https://heidgaf.readthedocs.io/en/latest/usage.html#id2).
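
As a purely illustrative sketch of what such a custom log line scheme might look like: the key names, field types, and nesting below are assumptions for illustration, not heiDGAF's actual schema; consult `config.yaml` and the linked documentation for the real keys.

```yaml
# Illustrative only: field names and type labels are assumptions, not the
# actual heiDGAF schema. See config.yaml and the documentation for the
# authoritative structure.
pipeline:
  log_collection:
    collector:
      logline_format:
        - name: timestamp        # a timestamp field with an explicit format
          type: Timestamp
          format: "%Y-%m-%dT%H:%M:%SZ"
        - name: status_code      # a field restricted to a list of values
          type: ListItem
          allowed: ["NOERROR", "NXDOMAIN"]
        - name: client_ip        # an IP address field
          type: IpAddress
        - name: domain_name      # a RegEx-validated field
          type: RegEx
          pattern: '^[A-Za-z0-9.-]+$'
```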