Commit e83f83f

0ssigeno, cristinaascari and mlodic authored
Datamodel docs (#7)
* Datamodel docs
* Update .gitignore
* Update docs/IntelOwl/contribute.md
  Co-authored-by: Matteo Lodi <[email protected]>
* Update docs/IntelOwl/contribute.md
  Co-authored-by: Matteo Lodi <[email protected]>
* Update docs/IntelOwl/contribute.md
  Co-authored-by: Matteo Lodi <[email protected]>
* Update docs/IntelOwl/contribute.md
  Co-authored-by: Matteo Lodi <[email protected]>
* Update docs/IntelOwl/contribute.md
  Co-authored-by: Matteo Lodi <[email protected]>
* Update docs/IntelOwl/contribute.md
  Co-authored-by: Matteo Lodi <[email protected]>
* Update docs/IntelOwl/contribute.md
  Co-authored-by: Matteo Lodi <[email protected]>
* Update docs/IntelOwl/contribute.md
  Co-authored-by: Matteo Lodi <[email protected]>
* Update docs/IntelOwl/contribute.md
  Co-authored-by: Matteo Lodi <[email protected]>
* Updates data model docs
* Updates usage data model
* Fix
* Updates usage
* fix
* tweaks

---------

Co-authored-by: Cristina Ascari <[email protected]>
Co-authored-by: Cristina Ascari <[email protected]>
Co-authored-by: Matteo Lodi <[email protected]>
1 parent 3c37151 commit e83f83f

3 files changed: +129 -1 lines changed

.gitignore

Lines changed: 4 additions & 1 deletion
@@ -1,2 +1,5 @@
 site/
-venv/
+venv/
+
+# ide
+.idea

docs/IntelOwl/contribute.md

Lines changed: 110 additions & 0 deletions
@@ -430,6 +430,116 @@ You can use the django management command `dumpplugin` to automatically create t

Example: `docker exec -ti intelowl_uwsgi python3 manage.py dumpplugin PlaybookConfig <new_analyzer_name>`

## How to create a DataModel

After an `Analyzer` runs successfully, a `DataModel` is created only if `_do_create_data_model` returns `True` and at least one of the following conditions holds:
1. The `mapping_data_model` field is defined in the `AnalyzerConfig`
2. The `Analyzer` overrides `_update_data_model`
3. The `Analyzer` overrides `_create_data_model_mtm`

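The creation rule above can be read as a single boolean expression. The sketch below is only an illustration: `should_create_data_model` and its four parameters are hypothetical names, not part of IntelOwl; only the rule itself comes from the text.

```python
def should_create_data_model(
    do_create: bool,
    has_mapping: bool,
    overrides_update: bool,
    overrides_mtm: bool,
) -> bool:
    """Hypothetical helper mirroring the rule described above."""
    # `_do_create_data_model` must return True AND at least one
    # of the three customization points must be in use.
    return do_create and (has_mapping or overrides_update or overrides_mtm)
```

For instance, an analyzer whose `_do_create_data_model` returns `True` but that defines no mapping and overrides neither hook would not produce a `DataModel` under this reading.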
### AnalyzerConfig.mapping_data_model
Each `AnalyzerConfig` now has a new field called `mapping_data_model`: a dictionary whose keys are the paths used to retrieve values from the `AnalyzerReport`, and whose values are the corresponding fields in the `DataModel`.

A key prefixed with the symbol `$` is treated as a constant: the text after the `$` is used as the value itself instead of being looked up in the report.

Example:
```python3
report = {
    "data": {
        "urls": [
            {"url": "intelowl.com"},
            {"url": "https://intelowl.com"}
        ],
        "country": "IT",
        "tags": [
            "social_engineering",
            "random_things"
        ]
    }
}
mapping_data_model = {
    "data.urls.url": "external_urls",  # the array is unpacked automatically
    "data.country": "country_code",
    "$malicious": "evaluation",  # the $ prefix marks this key as a constant
    "data.tags.0": "tags"  # we just want the first tag
}
```

With the `AnalyzerReport` and mapping shown above, the resulting `DataModel` satisfies the following conditions:
```python3
# the values are lowercase because everything inside the DataModel is converted to lowercase
assert external_urls == ["intelowl.com", "https://intelowl.com"]
assert country_code == "it"
assert evaluation == "malicious"
```

If you specify a path that is not present in the `DataModel`, an error will be added to the job.
If you specify a path that is not present in the `AnalyzerReport`, a warning will be added to the job.

### Analyzer._do_create_data_model
Every `Analyzer` can override this function. It returns a boolean; if it returns `False`, the `DataModel` is not created.
This is useful when an `Analyzer` succeeds without retrieving useful results.
Take the `UrlHaus` analyzer as an example: if the analyzed domain is not present in its database, the result will be
```python3
{"query_status": "no_results"}
```
so we can use the following code to consider only _real_ results:
```python3
def _do_create_data_model(self) -> bool:
    return (
        super()._do_create_data_model()
        and self.report.report.get("query_status", "no_results") != "no_results"
    )
```
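The override can be tried in isolation with a small stand-in. In the sketch below, `BaseAnalyzer` and `UrlHausLike` are hypothetical classes, and the report is a plain dictionary held directly in `self.report` rather than `self.report.report`; the point is only to show how the default value in `.get` treats a missing `query_status` key.

```python
class BaseAnalyzer:
    """Hypothetical stand-in for the real Analyzer base class."""

    def __init__(self, report: dict):
        self.report = report

    def _do_create_data_model(self) -> bool:
        return True


class UrlHausLike(BaseAnalyzer):
    """Hypothetical analyzer that skips the DataModel when nothing was found."""

    def _do_create_data_model(self) -> bool:
        return (
            super()._do_create_data_model()
            # A missing key falls back to "no_results", so it is skipped too.
            and self.report.get("query_status", "no_results") != "no_results"
        )
```

With this stand-in, both `{"query_status": "no_results"}` and an empty report lead to `False`, while any other status leads to `True`.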

### Analyzer._create_data_model_mtm
Every `Analyzer` can override this function. It returns a dictionary whose keys are the names of the fields and whose values are the objects that will be added to a many-to-many relationship in the `DataModel`.
This is useful when you want to save part of a report in a separate model and reference it through a many-to-many relationship.
Let's use the `Yara` analyzer as an example.

```python3
def _create_data_model_mtm(self):
    from api_app.data_model_manager.models import Signature

    signatures = []
    for yara_signatures in self.report.report.values():
        for yara_signature in yara_signatures:
            url = yara_signature.pop("rule_url", None)
            sign = Signature.objects.create(
                provider=Signature.PROVIDERS.YARA.value,
                signature=yara_signature,
                url=url if url else "",
                score=1,
            )
            signatures.append(sign)

    return {"signatures": signatures}
```
Here we create one `Signature` object for each signature that matched the analyzed sample and add them all to the `signatures` field.

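How the returned dictionary is consumed is not shown here. One plausible reading, sketched below purely as an assumption, is that each key names a many-to-many field and each list of objects is added to it. `FakeManyToMany`, `FakeDataModel` and `attach_mtm` are hypothetical stand-ins so the example runs without Django; only the `add(*objs)` call mirrors the real Django related-manager method.

```python
class FakeManyToMany:
    """Stand-in for a Django many-to-many manager; only `add` is mimicked."""

    def __init__(self):
        self.items = []

    def add(self, *objs):
        self.items.extend(objs)


class FakeDataModel:
    """Stand-in for a DataModel with a single many-to-many field."""

    def __init__(self):
        self.signatures = FakeManyToMany()


def attach_mtm(data_model, mtm: dict) -> None:
    # Keys are field names, values are the objects to link to that field.
    for field_name, objects in mtm.items():
        getattr(data_model, field_name).add(*objects)


data_model = FakeDataModel()
attach_mtm(data_model, {"signatures": ["sig_a", "sig_b"]})
```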
### Analyzer._update_data_model
This is the last function that you can override in the `Analyzer` class. It returns nothing and is called after every other check.
This means that you can use it for more complex transformations when parsing the `AnalyzerReport` into a `DataModel`.

Again, let's use an example, this time with the `AbuseIPDB` analyzer.
```python3
def _update_data_model(self, data_model) -> None:
    super()._update_data_model(data_model)
    if self.report.report.get("totalReports", 0):
        self.report: AnalyzerReport
        if self.report.report["isWhitelisted"]:
            evaluation = (
                self.report.data_model_class.EVALUATIONS.TRUSTED.value
            )
        else:
            evaluation = self.report.data_model_class.EVALUATIONS.MALICIOUS.value
        data_model.evaluation = evaluation
```
Here the `evaluation` field is set by custom logic based on the data inside the report.
If the IP address has been reported by AbuseIPDB users but is also whitelisted by AbuseIPDB, we set its `evaluation` to `trusted`. If it has been reported and is not whitelisted, we set it to `malicious`.

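The decision logic can also be expressed as a pure function, which makes the three possible outcomes easy to check. `abuseipdb_evaluation` is a hypothetical helper written for this sketch, and the plain strings stand in for the `EVALUATIONS` enum values.

```python
from typing import Optional


def abuseipdb_evaluation(report: dict) -> Optional[str]:
    """Return "trusted", "malicious", or None when there are no reports."""
    if not report.get("totalReports", 0):
        return None  # nothing reported: the evaluation is left untouched
    return "trusted" if report["isWhitelisted"] else "malicious"
```

Note the third outcome: when `totalReports` is zero or missing, no evaluation is set at all.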
## How to modify a plugin

If the changes that you have to make should stay local, you can just change the configuration inside the `Django admin` page.

docs/IntelOwl/usage.md

Lines changed: 15 additions & 0 deletions
@@ -298,6 +298,21 @@ The form will open with the fields to fill in to create the analyzer.

![img.png](./static/analyzer_creation.png)

### DataModels

_Available in versions later than 6.2.0_

The main purpose of a `DataModel` is to map an `Analyzer` result onto a set of predefined keys, so that users can easily search, evaluate and use the analyzer result.
The author of an `AnalyzerConfig` decides the mapping between each field of the `AnalyzerReport` and the corresponding field in the `DataModel`.

There are three types of `DataModel`:
1. `DomainDataModel` is the `DataModel` for domains and URLs
2. `IPDataModel` is the `DataModel` for IP addresses
3. `FileDataModel` is the `DataModel` for files and hashes

The `DataModel` is not created for generic observables. This is a design choice and could change in the future.

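The correspondence between what is analyzed and the `DataModel` type can be summarized as a lookup table. This is only a sketch: the lowercase type labels and the `data_model_for` helper are hypothetical, and class names are given as strings rather than real model classes.

```python
from typing import Optional

# Type labels are hypothetical names chosen for this sketch.
DATA_MODEL_BY_TYPE = {
    "domain": "DomainDataModel",
    "url": "DomainDataModel",
    "ip": "IPDataModel",
    "file": "FileDataModel",
    "hash": "FileDataModel",
}


def data_model_for(observable_type: str) -> Optional[str]:
    """Return the DataModel name for a type, or None (e.g. for "generic")."""
    return DATA_MODEL_BY_TYPE.get(observable_type)
```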
This feature is still under development. For now, the `DataModel` objects that are created are saved in the database, but they are not yet used for further operations.

### Connectors
