
fix yara scan logic #3570

Open
Gagan144-blip wants to merge 1 commit into intelowlproject:develop from Gagan144-blip:yara-fix-clean


@Gagan144-blip

Description

This PR adds support for downloading YARA detection rules from the
Unprotect API: "https://unprotect.it/api/detection_rules/".

The implementation extends the existing YARA repository update logic to
support API-based rule ingestion in addition to Git repositories.

When the repository URL matches the Unprotect API endpoint, IntelOwl will:

  • Fetch detection rules from the API
  • Validate rule syntax before saving
  • Store valid rules locally as .yar files
  • Skip invalid or malformed rules safely

This improves rule coverage by allowing IntelOwl to automatically ingest
community-maintained detection rules from Unprotect.

Closes #1711

Type of change

  • New feature (non-breaking change which adds functionality).
  • Bug Fix

Checklist

  • I have read and understood the rules about how to Contribute to this project
  • The pull request is for the branch develop
  • A new plugin (analyzer, connector, visualizer, playbook, pivot or ingestor) was added or changed, in which case:
    • I strictly followed the documentation "How to create a Plugin"
    • Usage file was updated. A link to the PR to the docs repo has been added as a comment here.
    • Advanced-Usage was updated (in case the plugin provides additional optional configuration). A link to the PR to the docs repo has been added as a comment here.
    • I have dumped the configuration from Django Admin using the dumpplugin command and added it in the project as a data migration. ("How to share a plugin with the community")
    • If a File analyzer was added and it supports a mimetype which is not already supported, you added a sample of that type inside the archive test_files.zip and you added the default tests for that mimetype in test_classes.py.
    • If you created a new analyzer and it is free (does not require any API key), please add it in the FREE_TO_USE_ANALYZERS playbook by following this guide.
    • Check if it could make sense to add that analyzer/connector to other freely available playbooks.
    • I have provided the resulting raw JSON of a finished analysis and a screenshot of the results.
    • If the plugin interacts with an external service, I have created an attribute called precisely url that contains this information. This is required for Health Checks (HEAD HTTP requests).
    • If a new analyzer has been added, I have created a unittest for it in the appropriate dir. I have also mocked all the external calls, so that no real calls are being made while testing.
    • I have added that raw JSON sample to the get_mocker_response() method of the unittest class. This serves us to provide a valid sample for testing.
    • I have created the corresponding DataModel for the new analyzer following the documentation
  • I have inserted the copyright banner at the start of the file: # This file is a part of IntelOwl https://github.com/intelowlproject/IntelOwl # See the file 'LICENSE' for copying permission.
  • Please avoid adding new libraries as requirements whenever it is possible. Use new libraries only if strictly needed to solve the issue you are working for. In case of doubt, ask a maintainer permission to use a specific library.
  • If external libraries/packages with restrictive licenses were added, they were added in the Legal Notice section.
  • Linters (Ruff) gave 0 errors. If you have correctly installed pre-commit, it does these checks and adjustments on your behalf.
  • I have added tests for the feature/bug I solved (see tests folder). All the tests (new and old ones) gave 0 errors.
  • If the GUI has been modified:
    • I have provided a screenshot of the result in the PR.
    • I have created new frontend tests for the new component or updated existing ones.
  • After submitting the PR, if DeepSource, Django Doctors or other third-party linters triggered any alerts during the CI checks, I have resolved those alerts.

Screenshots

1. Test execution showing all tests passing
2. '.yar' files from Unprotect API
3. Content of a downloaded YARA rule file

Copilot AI review requested due to automatic review settings March 28, 2026 01:57
Contributor

Copilot AI left a comment


Pull request overview

Adds support for ingesting YARA rules directly from the Unprotect.it detection rules API, extending the existing YARA repository update/compile flow beyond Git/zip sources.

Changes:

  • Add Unprotect API detection + ingestion flow that downloads rules, validates syntax, and persists valid .yar files.
  • Improve YARA rule compilation logging/robustness (per-file validation, better error handling, safer lockfile deletion).
  • Add a unit test covering Unprotect API rule download and .yar file creation.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| api_app/analyzers_manager/file_analyzers/yara_scan.py | Adds Unprotect API ingestion path and adjusts compile/update behavior for YARA repositories. |
| tests/api_app/analyzers_manager/unit_tests/file_analyzers/test_yara_scan.py | Adds a unit test for Unprotect API ingestion creating .yar files. |


Comment on lines +327 to +333
  # We check the specific repo directory and any first-level subdirectories
  for directory in self.first_level_directories + [self.directory]:
      if directory != self.directory:
-         # recursive
-         rules = directory.rglob("*")
+         rules = directory.rglob("*")  # recursive for subfolders
      else:
-         # not recursive
-         rules = directory.glob("*")
+         rules = directory.glob("*")  # non-recursive for main folder


Copilot AI Mar 28, 2026


first_level_directories/compiled_paths are @cached_property values derived from the filesystem, but update() mutates the repo contents (git pull/clone, zip extract, Unprotect ingestion). If those cached values were computed before the update (e.g., _update_git() calls self.compiled_paths before pulling), compile() can use a stale directory list and miss newly added first-level subfolders (and therefore skip compiling rules in them). Consider removing caching here or explicitly invalidating these cached properties at the start/end of update() (e.g., self.__dict__.pop("first_level_directories", None) / ...pop("compiled_paths", None) / ...pop("head_branch", None) as appropriate).
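The invalidation suggested above can be sketched as follows. This is a minimal illustration, not IntelOwl's code: `RepoSketch`, `invalidate_cache()` and the `update(new_subdir)` signature are hypothetical stand-ins, and only the `self.__dict__.pop(...)` idiom comes from the review comment.

```python
from functools import cached_property
from pathlib import Path


class RepoSketch:
    """Hypothetical stand-in for the YARA repository class."""

    def __init__(self, directory: Path):
        self.directory = directory

    @cached_property
    def first_level_directories(self) -> list[Path]:
        # Computed once from the filesystem, then stored in self.__dict__.
        return sorted(p for p in self.directory.iterdir() if p.is_dir())

    def invalidate_cache(self) -> None:
        # Removing the entry from __dict__ forces recomputation on next access.
        self.__dict__.pop("first_level_directories", None)

    def update(self, new_subdir: str) -> None:
        # Stand-in for git pull / zip extract / API ingestion (assumption).
        (self.directory / new_subdir).mkdir(exist_ok=True)
        self.invalidate_cache()
```

Without the `invalidate_cache()` call, a value cached before `update()` would keep being returned, so a later `compile()` would not see the new subfolder.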

Copilot uses AI. Check for mistakes.

# Normalize path to handle optional leading/trailing slash
path = parsed.path.strip("/")
return netloc == "unprotect.it" and path.startswith("api/detection_rules")

Copilot AI Mar 28, 2026


is_unprotect_api() currently matches any path starting with api/detection_rules, which can yield false positives (e.g., api/detection_rules_old). Since this branch changes update behavior significantly, it would be safer to match the endpoint exactly (after normalizing slashes) or check path segment boundaries (e.g., equality to api/detection_rules or prefix api/detection_rules/).

Suggested change
- return netloc == "unprotect.it" and path.startswith("api/detection_rules")
+ return netloc == "unprotect.it" and (
+     path == "api/detection_rules" or path.startswith("api/detection_rules/")
+ )
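As a self-contained sketch of the segment-boundary check, the function below takes the URL as an argument. That signature, the module-level constants, and the lowercasing of `netloc` are assumptions for illustration; the PR's actual method may differ.

```python
from urllib.parse import urlparse

UNPROTECT_HOST = "unprotect.it"        # host taken from the PR description
UNPROTECT_PATH = "api/detection_rules"  # endpoint path, slashes stripped


def is_unprotect_api(url: str) -> bool:
    """Return True only for the Unprotect endpoint or paths beneath it."""
    parsed = urlparse(url)
    netloc = parsed.netloc.lower()  # assumption: compare hosts case-insensitively
    # Normalize path to handle optional leading/trailing slash
    path = parsed.path.strip("/")
    return netloc == UNPROTECT_HOST and (
        path == UNPROTECT_PATH or path.startswith(UNPROTECT_PATH + "/")
    )
```

With this check, `https://unprotect.it/api/detection_rules_old` no longer matches, while the endpoint with or without a trailing slash still does.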

Comment on lines +62 to +66
"id": 1,
"name": "Test Rule",
"yara_rule": "rule test_rule { condition: true }",
},
{

Copilot AI Mar 28, 2026


The new validation behavior that discards syntactically invalid YARA rules (compile failure -> unlink) isn’t exercised here. Add a test case with an invalid yara_rule string and assert that no .yar file remains for it after update().
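One possible shape for such a test is sketched below. Everything here is illustrative: `save_valid_rules` is a hypothetical helper that mimics the "compile failure -> unlink" behavior, `fake_compile` is a stand-in validator (real code would more likely call something like `yara.compile(filepath=...)`), and the `<id>.yar` file naming is an assumption. The real test would go through the PR's `update()` with mocked HTTP calls.

```python
import tempfile
import unittest
from pathlib import Path
from typing import Callable


def save_valid_rules(
    rules: list[dict], dest: Path, compile_fn: Callable[[str], object]
) -> list[Path]:
    """Hypothetical helper: write each rule to <id>.yar, unlink on compile failure."""
    kept = []
    for rule in rules:
        path = dest / f"{rule['id']}.yar"
        path.write_text(rule["yara_rule"])
        try:
            compile_fn(str(path))
        except Exception:
            path.unlink()  # discard the invalid rule file
            continue
        kept.append(path)
    return kept


def fake_compile(filepath: str) -> None:
    # Stand-in validator (assumption): rejects text without a condition section.
    if "condition:" not in Path(filepath).read_text():
        raise ValueError("syntax error")


class InvalidRuleIsDiscarded(unittest.TestCase):
    def test_invalid_rule_leaves_no_file(self):
        rules = [
            {"id": 1, "name": "Test Rule", "yara_rule": "rule test_rule { condition: true }"},
            {"id": 2, "name": "Broken Rule", "yara_rule": "rule broken {"},
        ]
        with tempfile.TemporaryDirectory() as tmp:
            dest = Path(tmp)
            save_valid_rules(rules, dest, fake_compile)
            remaining = sorted(p.name for p in dest.glob("*.yar"))
            self.assertEqual(remaining, ["1.yar"])
```

Asserting on the directory contents rather than on the helper's return value checks the behavior the comment cares about: no file is left behind for the invalid rule.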

