
feat: Refine HTTP timeouts across numerous analyzers and ingestors. Closes #3495 #3501

Open
chauhan-varun wants to merge 6 commits into intelowlproject:develop from chauhan-varun:refactor/http-timeout-refinement-3495

Conversation

@chauhan-varun chauhan-varun commented Mar 20, 2026

Description

This PR implements a centralized HTTP timeout mechanism project-wide to prevent indefinite hangs in Celery workers, as described in issue #3495.

Key changes:

  • Introduced a global HTTP_TIMEOUT setting in intel_owl/settings/commons.py (default 30 seconds).
  • Created a robust wrapper in api_app/http_utils.py for the requests library which automatically applies the default timeout to all requests (including a new Session wrapper).
  • Migrated over 125 files, including analyzers, connectors, and ingestors, from direct requests calls to http_utils.
  • Refined the health_check method in api_app/analyzers_manager/classes.py to use settings.HTTP_TIMEOUT.
  • Fixed several secondary bugs caught during migration:
    • Added missing quote and urljoin imports in mmdb_server.py and malprob.py.
    • Ensured proper URL encoding of observable names in these analyzers.
  • Performed a project-wide cleanup with Ruff, ensuring 0 linting errors.
  • Verified in Docker: Ran the full analyzer unit test suite (199 tests) inside the container, resolving all regressions (AttributeError, TypeError) caused by the migration.

Type of change

  • Bug fix (non-breaking change which fixes an issue).
  • New feature (non-breaking change which adds functionality).
  • Breaking change (fix or feature that would cause existing functionality to not work as expected).

Checklist

  • I have read and understood the rules about how to Contribute to this project
  • The pull request is for the branch develop
  • A new plugin (analyzer, connector, visualizer, playbook, pivot or ingestor) was added or changed, in which case:
    • I strictly followed the documentation "How to create a Plugin"
    • Usage file was updated. A link to the PR to the docs repo has been added as a comment here.
    • Advanced-Usage was updated (in case the plugin provides additional optional configuration). A link to the PR to the docs repo has been added as a comment here.
    • I have dumped the configuration from Django Admin using the dumpplugin command and added it in the project as a data migration. ("How to share a plugin with the community")
    • If a File analyzer was added and it supports a mimetype which is not already supported, you added a sample of that type inside the archive test_files.zip and you added the default tests for that mimetype in test_classes.py.
    • If you created a new analyzer and it is free (does not require any API key), please add it in the FREE_TO_USE_ANALYZERS playbook by following this guide.
    • Check if it could make sense to add that analyzer/connector to other freely available playbooks.
    • I have provided the resulting raw JSON of a finished analysis and a screenshot of the results.
    • If the plugin interacts with an external service, I have created an attribute called precisely url that contains this information. This is required for Health Checks (HEAD HTTP requests).
    • If a new analyzer has been added, I have created a unittest for it in the appropriate dir. I have also mocked all the external calls, so that no real calls are being made while testing.
    • I have added that raw JSON sample to the get_mocker_response() method of the unittest class. This serves us to provide a valid sample for testing.
    • I have created the corresponding DataModel for the new analyzer following the documentation
  • I have inserted the copyright banner at the start of the file: # This file is a part of IntelOwl https://github.com/intelowlproject/IntelOwl # See the file 'LICENSE' for copying permission. (Retained in migrated files)
  • Please avoid adding new libraries as requirements whenever it is possible. Use new libraries only if strictly needed to solve the issue you are working for. In case of doubt, ask a maintainer permission to use a specific library. (Used urllib.parse which is standard library)
  • If external libraries/packages with restrictive licenses were added, they were added in the Legal Notice section. (None added)
  • Linters (Ruff) gave 0 errors. If you have correctly installed pre-commit, it does these checks and adjustments on your behalf.
  • I have added tests for the feature/bug I solved (see tests folder). All the tests (new and old ones) gave 0 errors.
  • If the GUI has been modified:
    • I have provided a screenshot of the result in the PR.
    • I have created new frontend tests for the new component or updated existing ones.
  • After you had submitted the PR, if DeepSource, Django Doctors or other third-party linters have triggered any alerts during the CI checks, I have solved those alerts.


chauhan-varun commented Mar 20, 2026

Hey @mlodic, I've finished implementing the centralized HTTP timeouts. I've migrated all the analyzers to the new http_utils wrapper and verified them with a full test run; all 199 tests passed.

The default timeout is now set to 30s as requested. Please have a look and let me know if everything looks good!

(two screenshots of the test run attached)


mlodic commented Mar 20, 2026

It seems a bit overkill to be honest and could result in possible unwanted behaviors. Not sure if it is worth adding this.
We already have a soft time limit in place, which would kill the task anyway.


chauhan-varun commented Mar 20, 2026

I totally understand the concern about breaking long-running analyzers. The reason I thought the HTTP-level timeout was useful is that it allows the analyzer to catch the timeout exception and fail gracefully (showing a nice error in the UI) rather than having Celery kill the entire task, which can be a bit more abrupt.

That said, since my http_utils wrapper still allows passing an explicit timeout, any analyzer that needs more than 30s can just override the default.
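The graceful-failure behavior argued for here can be illustrated with a sketch. The function and return shape are hypothetical, not taken from the PR; `_get` is injectable only so the sketch can be exercised without a network:

```python
import requests

def fetch_with_graceful_timeout(url, timeout=30, _get=requests.get):
    """Catch the HTTP timeout inside the analyzer so it can report a clean,
    user-visible error, instead of letting Celery's soft time limit kill
    the whole task abruptly."""
    try:
        resp = _get(url, timeout=timeout)
        resp.raise_for_status()
        return {"success": True, "status": resp.status_code}
    except requests.exceptions.Timeout:
        # Surfaced to the UI as a readable error rather than an abrupt kill.
        return {"success": False, "error": f"request timed out after {timeout}s"}
```

An analyzer that needs more than the default simply passes `timeout=300` (or similar) explicitly, which overrides the wrapper's default.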

@chauhan-varun

To address the 'unwanted behaviors' concern, what if I increase the default timeout to 60 or 90 seconds? That would still prevent truly 'infinite' hangs while being safe for slow analyzers.

Or, if you prefer, I can revert the mass migration and only apply this new http_utils to the most critical analyzers. Let me know what you think is best!


mlodic commented Mar 20, 2026

90 seconds is definitely better

@mlodic mlodic left a comment

are there any other places where the previously existing timeout has been changed?

@chauhan-varun

The only spots where I'd accidentally changed an existing short timeout were the health_check methods. I've now reverted those back to the original 10s so they stay fast.
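A sketch of what that revert implies: health checks keep their own short, explicit timeout rather than inheriting the project-wide default. Names and structure below are assumed for illustration (`_head` is injectable only to make the sketch testable offline); the checklist notes that health checks issue HEAD requests against the plugin's `url` attribute:

```python
import requests

HEALTH_CHECK_TIMEOUT = 10  # seconds, the original short value restored here

def health_check(url, _head=requests.head):
    """HEAD the service URL with an explicit short timeout so health checks
    stay fast instead of waiting up to the 90s project default."""
    try:
        resp = _head(url, timeout=HEALTH_CHECK_TIMEOUT)
    except requests.exceptions.RequestException:
        return False
    return resp.status_code < 400
```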

