
fix(web): don't cache empty VM tags when machines table is transiently empty #2947

Merged
kevoreilly merged 1 commit into kevoreilly:master from intezer:fix/dont-cache-empty-vms-tags
Mar 21, 2026

Conversation

@itamarga
Contributor

The CAPE service clears the machines table on startup (clean_machines) and re-adds them from config. If it crashes before re-adding, the table is temporarily empty. Gunicorn workers that handle a tagged submission during this window permanently cache an empty tag list, causing all subsequent tagged submissions to be rejected with "incorrect tag(s)".

Only cache the result when machines actually exist in the DB.
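
The guard described above can be sketched as follows. This is a minimal illustration, not the code from lib/cuckoo/common/web_utils.py: the `list_machines` callable stands in for the database query, machines are modelled as plain lists of tag names, and using `None` as the "nothing cached yet" marker is an assumption made for the example.

```python
from typing import Callable, List, Optional, Sequence

# Hypothetical stand-in for the per-worker cache named in the PR.
_all_vms_tags: Optional[List[str]] = None


def load_vms_tags(list_machines: Callable[[], Sequence[Sequence[str]]]) -> List[str]:
    """Return sorted unique tags, caching only when machines exist."""
    global _all_vms_tags
    if _all_vms_tags is not None:
        return _all_vms_tags

    machines = list_machines()
    all_tags = [tag for machine_tags in machines for tag in machine_tags]
    result = sorted(set(all_tags))

    if machines:
        # The table is populated, so the result can be trusted,
        # even if no machine carries a tag.
        _all_vms_tags = result
    # With an empty table, return [] without caching so the next
    # call queries again once the machines have been re-added.
    return result
```

A call made while the table is empty returns `[]` and leaves the cache unset; the first call after the machines reappear fills the cache.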

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@gemini-code-assist
Contributor

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request resolves a critical issue in the CAPE service where Gunicorn workers could permanently cache an empty list of VM tags if the machine table was temporarily empty during startup or recovery. This led to subsequent tagged submissions being incorrectly rejected. The fix ensures that VM tags are only cached when machines are actually present in the database, preventing the erroneous caching of empty tag lists and improving the robustness of tag-based submission handling.

Highlights

  • Prevent caching of empty VM tags: Modified the load_vms_tags function to only cache VM tags when actual machines are present in the database. This addresses an issue where a transiently empty machine table could lead to permanent caching of an empty tag list, causing subsequent tagged submissions to be rejected.
  • Ensure consistent return type: Updated load_vms_tags to always return an empty list ([]) if no VM tags are available or found, preventing None returns and ensuring a predictable output.

Changelog

  • lib/cuckoo/common/web_utils.py
    • Ensured the load_vms_tags function returns an empty list instead of None when no VM tags are available.
    • Modified the load_vms_tags function to only update the cached _all_vms_tags if machines are found in the database.

Contributor

@gemini-code-assist gemini-code-assist bot left a comment

Code Review

The pull request addresses an issue where an empty list of VM tags could be cached, leading to subsequent tagged submissions being rejected. The changes correctly prevent caching an empty list when no machines are found. However, there is an opportunity to further refine the caching logic to ensure that an empty tag list is not cached even if machines exist but none of them have associated tags. A specific suggestion is provided to improve this condition.

Comment on lines +249 to +250
    if machines:
        _all_vms_tags = list(sorted(set(all_tags)))
high

The current condition if machines: correctly prevents caching an empty tag list when no machines are returned by Database().list_machines(). However, if machines are present but none of them have any associated tags, all_tags will still be an empty list. In this scenario, _all_vms_tags would be updated to [], which could still lead to the "permanently cache an empty tag list" issue described in the pull request. To fully address this, _all_vms_tags should only be updated if all_tags actually contains tags.

Suggested change
-    if machines:
-        _all_vms_tags = list(sorted(set(all_tags)))
+    if all_tags:
+        _all_vms_tags = list(sorted(set(all_tags)))

Contributor Author


This is intentional; some deployments may not use tags.
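
One plausible reading of this reply is that, in a deployment where machines exist but carry no tags, an `if all_tags:` guard would never fill the cache, so every call would query the database again. The sketch below compares the two guards under that assumption. `make_loader`, `CountingTable`, and `count_queries` are hypothetical helpers written for this illustration and are not part of CAPE.

```python
from typing import Callable, List, Optional, Sequence

Machines = Sequence[Sequence[str]]  # each machine modelled as its list of tag names


class CountingTable:
    """Hypothetical machines table that counts how often it is queried."""

    def __init__(self, machines: Machines) -> None:
        self.machines = machines
        self.queries = 0

    def list_machines(self) -> Machines:
        self.queries += 1
        return self.machines


def make_loader(
    should_cache: Callable[[Machines, List[str]], bool],
    list_machines: Callable[[], Machines],
) -> Callable[[], List[str]]:
    """Build a tag loader whose caching condition is supplied by the caller."""
    cache: Optional[List[str]] = None

    def load() -> List[str]:
        nonlocal cache
        if cache is not None:
            return cache
        machines = list_machines()
        tags = sorted({tag for machine in machines for tag in machine})
        if should_cache(machines, tags):
            cache = tags
        return tags

    return load


def count_queries(
    should_cache: Callable[[Machines, List[str]], bool],
    machines: Machines,
    calls: int,
) -> int:
    """Call the loader `calls` times and report how many queries it issued."""
    table = CountingTable(machines)
    load = make_loader(should_cache, table.list_machines)
    for _ in range(calls):
        load()
    return table.queries


untagged = [[], []]  # two machines, neither has tags

# Guard from the merged change: cache whenever machines exist.
print(count_queries(lambda machines, tags: bool(machines), untagged, 3))  # 1
# Guard from the review suggestion: cache only when tags exist.
print(count_queries(lambda machines, tags: bool(tags), untagged, 3))  # 3
```

Both guards behave the same when the table is empty: neither caches, so the transient-empty case from the description is covered either way.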

@kevoreilly kevoreilly merged commit 5222726 into kevoreilly:master Mar 21, 2026
4 checks passed
@itamarga itamarga deleted the fix/dont-cache-empty-vms-tags branch March 21, 2026 19:18