Pull request overview
Adds support for ingesting YARA rules directly from the Unprotect.it detection rules API, extending the existing YARA repository update/compile flow beyond Git/zip sources.
Changes:
- Add Unprotect API detection + ingestion flow that downloads rules, validates syntax, and persists valid `.yar` files.
- Improve YARA rule compilation logging/robustness (per-file validation, better error handling, safer lockfile deletion).
- Add a unit test covering Unprotect API rule download and `.yar` file creation.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
| api_app/analyzers_manager/file_analyzers/yara_scan.py | Adds Unprotect API ingestion path and adjusts compile/update behavior for YARA repositories. |
| tests/api_app/analyzers_manager/unit_tests/file_analyzers/test_yara_scan.py | Adds a unit test for Unprotect API ingestion creating .yar files. |

      # We check the specific repo directory and any first-level subdirectories
      for directory in self.first_level_directories + [self.directory]:
          if directory != self.directory:
    -         # recursive
    -         rules = directory.rglob("*")
    +         rules = directory.rglob("*")  # recursive for subfolders
          else:
    -         # not recursive
    -         rules = directory.glob("*")
    +         rules = directory.glob("*")  # non-recursive for main folder

first_level_directories/compiled_paths are @cached_property values derived from the filesystem, but update() mutates the repo contents (git pull/clone, zip extract, Unprotect ingestion). If those cached values were computed before the update (e.g., _update_git() calls self.compiled_paths before pulling), compile() can use a stale directory list and miss newly added first-level subfolders (and therefore skip compiling rules in them). Consider removing caching here or explicitly invalidating these cached properties at the start/end of update() (e.g., self.__dict__.pop("first_level_directories", None) / ...pop("compiled_paths", None) / ...pop("head_branch", None) as appropriate).
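
The invalidation suggested above can be sketched as follows. This is a minimal illustration, not the code from this PR: `RepoSketch`, `invalidate_cache`, and the `new_subdir` parameter are hypothetical names, and `update()` only creates a folder as a stand-in for git pull, zip extraction, or API ingestion.

```python
from functools import cached_property
from pathlib import Path


class RepoSketch:
    """Hypothetical stand-in for the YARA repository class."""

    def __init__(self, directory: Path):
        self.directory = directory

    @cached_property
    def first_level_directories(self) -> list[Path]:
        # Computed once from the filesystem, then stored in self.__dict__.
        return sorted(p for p in self.directory.iterdir() if p.is_dir())

    def invalidate_cache(self) -> None:
        # cached_property keeps its value in the instance __dict__ under the
        # attribute name, so removing the key forces a recompute on next access.
        self.__dict__.pop("first_level_directories", None)

    def update(self, new_subdir: str) -> None:
        # Stand-in for any operation that changes the repo contents on disk.
        (self.directory / new_subdir).mkdir(exist_ok=True)
        self.invalidate_cache()
```

Without the call to `invalidate_cache()`, a value read before `update()` would still be returned afterwards, which is the stale-list scenario described in the comment.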

    # Normalize path to handle optional leading/trailing slash
    path = parsed.path.strip("/")
    return netloc == "unprotect.it" and path.startswith("api/detection_rules")

is_unprotect_api() currently matches any path starting with api/detection_rules, which can yield false positives (e.g., api/detection_rules_old). Since this branch changes update behavior significantly, it would be safer to match the endpoint exactly (after normalizing slashes) or check path segment boundaries (e.g., equality to api/detection_rules or prefix api/detection_rules/).

    - return netloc == "unprotect.it" and path.startswith("api/detection_rules")
    + return netloc == "unprotect.it" and (
    +     path == "api/detection_rules" or path.startswith("api/detection_rules/")
    + )

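
For reference, a self-contained sketch of the segment-boundary check is shown below. How `netloc` is derived in the PR is not visible in the snippet, so reading it from `urlparse(...).hostname` is an assumption here, as are the two module-level constants.

```python
from urllib.parse import urlparse

# Assumed constants, named here only for illustration.
UNPROTECT_HOST = "unprotect.it"
UNPROTECT_PATH = "api/detection_rules"


def is_unprotect_api(url: str) -> bool:
    parsed = urlparse(url)
    # Assumption: compare against the lowercased hostname (no port, no userinfo).
    netloc = (parsed.hostname or "").lower()
    # Normalize path to handle optional leading/trailing slash.
    path = parsed.path.strip("/")
    # Accept the endpoint itself or anything below it, but not sibling paths
    # that merely share the prefix (e.g. api/detection_rules_old).
    return netloc == UNPROTECT_HOST and (
        path == UNPROTECT_PATH or path.startswith(UNPROTECT_PATH + "/")
    )
```

With this check, `https://unprotect.it/api/detection_rules/` and `https://unprotect.it/api/detection_rules` both match, while `https://unprotect.it/api/detection_rules_old` does not.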

    "id": 1,
    "name": "Test Rule",
    "yara_rule": "rule test_rule { condition: true }",
    },
    {

The new validation behavior that discards syntactically invalid YARA rules (compile failure -> unlink) isn’t exercised here. Add a test case with an invalid yara_rule string and assert that no .yar file remains for it after update().
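
A rough shape for such a test is sketched below. It does not use the PR's classes or the real YARA compiler: `persist_valid_rules`, the `<id>.yar` naming scheme, and `fake_compiles` are all hypothetical stand-ins, with the validator injected so the example runs without `yara-python`. In the real test the compile step would presumably be exercised through `update()` itself.

```python
import tempfile
import unittest
from pathlib import Path
from typing import Callable


def persist_valid_rules(
    rules: list[dict], directory: Path, compiles: Callable[[Path], bool]
) -> list[Path]:
    """Write each rule to <id>.yar, then unlink the ones that fail validation."""
    kept = []
    for rule in rules:
        path = directory / f"{rule['id']}.yar"  # hypothetical naming scheme
        path.write_text(rule["yara_rule"])
        if compiles(path):
            kept.append(path)
        else:
            path.unlink()  # compile failure -> discard the file
    return kept


def fake_compiles(path: Path) -> bool:
    # Placeholder check for the sketch, NOT real YARA validation.
    return "condition:" in path.read_text()


class InvalidRuleTest(unittest.TestCase):
    def test_invalid_rule_is_discarded(self):
        rules = [
            {"id": 1, "name": "Test Rule", "yara_rule": "rule test_rule { condition: true }"},
            {"id": 2, "name": "Broken Rule", "yara_rule": "rule broken {"},
        ]
        with tempfile.TemporaryDirectory() as tmp:
            directory = Path(tmp)
            persist_valid_rules(rules, directory, fake_compiles)
            remaining = sorted(p.name for p in directory.glob("*.yar"))
        # Only the valid rule should be left on disk.
        self.assertEqual(remaining, ["1.yar"])
```

Injecting the validator keeps the sketch independent of the YARA bindings; the assertion on the directory listing is the part that carries over to the real test.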
Description
This PR adds support for downloading YARA detection rules from the
Unprotect API: "https://unprotect.it/api/detection_rules/".
The implementation extends the existing YARA repository update logic to
support API-based rule ingestion in addition to Git repositories.
When the repository URL matches the Unprotect API endpoint, IntelOwl will download the rules, validate their syntax, and persist the valid ones as `.yar` files.
This improves rule coverage by allowing IntelOwl to automatically ingest
community-maintained detection rules from Unprotect.
Closes #1711
Screenshots
1. Test execution showing all tests passing

2. `.yar` files from Unprotect API
