PR for issue #29 #48

Open
OG-Shabba-Ranks wants to merge 3 commits into Koen1999:master from OG-Shabba-Ranks:master

@OG-Shabba-Ranks

No description provided.

Implement multiprocessing for rule processing: Refactored process_rules_file to use multiprocessing.Pool to significantly speed up the evaluation of large rulesets.

Add top-level worker functions: Extracted the core rule parsing and checking logic into a new _process_rule_task function so it can be safely pickled and distributed across worker processes.

Preserve logging across processes: Added a _worker_init function that initializes each worker process with a QueueHandler, routing child process logs back to the QueueListener in the main process (building on the Koen1999#30 logging setup).

Batch file reading: Updated the file reading loop in process_rules_file to sequentially parse the file and group multi-line rules into a list of tasks before handing them off to the process pool.

Maintain output order: Used pool.map to ensure that the final output report preserves the exact original order of the rules in the file, regardless of which worker finishes first.

Comment on lines -462 to -475
    """Processes a rule file and returns a list of rules and their issues.

    Args:
        rules: A path to a Suricata rules file.
        evaluate_disabled: A flag indicating whether disabled rules should be evaluated.
        checkers: The checkers to be used when processing the rule file.

    Returns:
        A list of rules and their issues.

    Raises:
        RuntimeError: If no checkers could be automatically discovered.

    """
Owner

Can you clarify why documentation was removed in several places?

Comment on lines +568 to +574
    # Spin up the process pool
    with multiprocessing.Pool(
        initializer=_worker_init,
        initargs=(log_queue,)
    ) as pool:
        # pool.map preserves the original order of the rules in the file
        results = pool.map(_process_rule_task, tasks)
Owner

This approach implies processing only begins after the entire file is read. Perhaps it can start earlier and tasks can be added as lines are read?
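
One possible shape for this, as a rough sketch and not a drop-in change: feed the pool from a generator through `pool.imap`, which dispatches tasks while the file is still being read and still yields results in input order. The names `iter_rule_tasks`, `check_rule`, and `process_rules_streaming` are hypothetical, the per-rule work is a placeholder, and joining lines on a trailing backslash is an assumption about how multi-line rules are grouped.

```python
import multiprocessing


def iter_rule_tasks(lines):
    """Lazily yield one complete rule at a time, joining backslash-continued lines."""
    buffer = []
    for line in lines:
        stripped = line.strip()
        if stripped.endswith("\\"):
            # Rule continues on the next line: keep the part before the backslash.
            buffer.append(stripped[:-1].strip())
            continue
        buffer.append(stripped)
        rule = " ".join(part for part in buffer if part)
        buffer = []
        if rule:
            yield rule
    # A dangling continuation at end of input is still emitted as a rule.
    rule = " ".join(part for part in buffer if part)
    if rule:
        yield rule


def check_rule(rule):
    """Hypothetical stand-in for the per-rule work done in a worker."""
    return rule, len(rule)


def process_rules_streaming(path, chunksize=16):
    """Check rules while the file is still being read; results keep file order."""
    with open(path, encoding="utf-8") as rules_file, multiprocessing.Pool() as pool:
        # imap pulls tasks from the generator as it goes, so workers can start
        # before the whole file has been parsed. Results are consumed here,
        # while the file is still open.
        return list(
            pool.imap(check_rule, iter_rule_tasks(rules_file), chunksize=chunksize)
        )
```

Note that `imap` does not bound how far ahead the generator is read, so the reading is not throttled by the workers. The gain is only that the workers no longer wait for the full task list.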

Comment on lines -602 to -616
    """Checks a rule and returns a dictionary containing the rule and a list of issues found.

    Args:
        rule: The rule to be checked.
        checkers: The checkers to be used to check the rule.
        ignore: Regular expressions to match checker codes to ignore

    Returns:
        A list of issues found in the rule.
        Each issue is typed as a `dict`.

    Raises:
        InvalidRuleError: If the rule does not follow the Suricata syntax.

    """
Owner

See my other comment about the removed documentation.

Comment on lines +455 to +457
# ----------------------------------------------------------------------
# NEW MULTIPROCESSING WORKER FUNCTIONS
# ----------------------------------------------------------------------
Owner

These comments should be removed.
