Reduce memory footprint of module metadata Obj instances #21230

Open
bcoles wants to merge 2 commits into rapid7:master from bcoles:obj-dedup-cache
Conversation

bcoles (Contributor) commented Apr 4, 2026

Description

Reduces per-instance memory overhead in Msf::Modules::Metadata::Obj when loading the ~6,600 modules from the JSON metadata cache. The metadata cache is the first thing loaded on startup, so every allocation here is multiplied across every module entry.

All changes are internal to init_from_hash and the class-level helpers. No public API changes.

Three techniques are applied:

1. String deduplication via String#-@ (fstring interning)

Fields like type, platform, arch, and author are highly duplicated across modules (e.g., only 7 unique type values across 6,600+ entries). String#-@ returns a frozen, deduplicated copy from Ruby's internal string table, so identical values share a single object.
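A minimal sketch of the interning behaviour described above, using made-up sample values rather than the real metadata cache. It assumes Ruby 2.5 or later, where String#-@ deduplicates unfrozen strings; the variable names are illustrative only.

```ruby
# Hypothetical sample data standing in for values parsed from the JSON cache.
# dup guarantees three distinct, unfrozen String objects.
raw_types = ['exploit', 'auxiliary', 'exploit'].map(&:dup)

# Without interning, the two 'exploit' strings are separate heap objects.
distinct_before = raw_types.map(&:object_id).uniq.size

# String#-@ returns a frozen string shared by all equal values.
interned = raw_types.map { |s| -s }
distinct_after = interned.map(&:object_id).uniq.size

puts distinct_before # 3
puts distinct_after  # 2
puts interned[0].equal?(interned[2]) # true
```

Because the interned strings are frozen, any code that later mutates these fields in place would raise FrozenError, so this only suits values that are treated as read-only.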

2. Shared frozen empty containers

Many modules have empty aliases, references, notes, or targets. Previously each got its own [] or {} allocation. These now share EMPTY_ARRAY and EMPTY_HASH frozen constants.
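The idea can be sketched as follows. EMPTY_ARRAY and EMPTY_HASH are the constant names mentioned above; the module name and the two helper methods are hypothetical and are not taken from the PR.

```ruby
module MetadataSketch
  # Shared, immutable placeholders for empty fields.
  EMPTY_ARRAY = [].freeze
  EMPTY_HASH = {}.freeze

  # Hypothetical helper: reuse the shared constant when the parsed
  # value is nil or empty, otherwise keep the value as-is.
  def self.array_or_empty(value)
    value.nil? || value.empty? ? EMPTY_ARRAY : value
  end

  # Hypothetical helper: same idea for hash-valued fields.
  def self.hash_or_empty(value)
    value.nil? || value.empty? ? EMPTY_HASH : value
  end
end

aliases_a = MetadataSketch.array_or_empty([])
aliases_b = MetadataSketch.array_or_empty(nil)
notes = MetadataSketch.hash_or_empty({})

puts aliases_a.equal?(aliases_b) # true
puts notes.frozen?               # true
```

The trade-off is that the shared containers are frozen, so a caller that appends to an empty field gets a FrozenError instead of silently modifying a private copy.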

3. PlatformList caching

PlatformList.transform was called once per module (~6,600 times) even though only ~100 unique platform strings exist. A class-level cache (cached_platform_list) now returns the same PlatformList object for identical platform strings. To support this, the private parse_platform_list instance method is moved to a shared class method.
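A rough sketch of such a class-level cache follows. The method names cached_platform_list and parse_platform_list come from the description; everything else is an assumption: FakePlatformList stands in for Msf::Module::PlatformList, ObjSketch stands in for the real Obj class, and the comma-splitting parser is a placeholder for whatever the real code passes to PlatformList.transform.

```ruby
# Stand-in for Msf::Module::PlatformList; the real class is not loaded here.
FakePlatformList = Struct.new(:names)

class ObjSketch
  @platform_list_cache = {}
  @parse_count = 0

  class << self
    # Exposed only so the sketch can show how often parsing happens.
    attr_reader :parse_count

    # Returns the same list object for identical platform strings.
    def cached_platform_list(platform_string)
      @platform_list_cache[platform_string] ||= parse_platform_list(platform_string)
    end

    private

    # Hypothetical parser: splits a comma-separated string and freezes the result.
    def parse_platform_list(platform_string)
      @parse_count += 1
      FakePlatformList.new(platform_string.split(',').map(&:strip)).freeze
    end
  end
end

first = ObjSketch.cached_platform_list('Linux,Unix')
second = ObjSketch.cached_platform_list('Linux,Unix')

puts first.equal?(second)  # true
puts ObjSketch.parse_count # 1
```

Since every module with the same platform string now shares one list object, the cached value has to be treated as immutable; the sketch freezes it for that reason. The `||=` lookup is also not synchronized, which may or may not matter depending on how the cache is loaded.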

Copilot AI (Contributor) left a comment

Pull request overview

This PR reduces memory overhead when instantiating Msf::Modules::Metadata::Obj entries from the on-disk JSON metadata cache, which is loaded early during framework startup and multiplied across thousands of modules.

Changes:

  • Interns highly-duplicated strings (e.g., type/platform/arch/author entries) to reduce duplicate heap allocations.
  • Reuses shared frozen empty array/hash constants for empty metadata fields.
  • Adds a class-level cache to reuse parsed Msf::Module::PlatformList objects for repeated platform strings.

Copilot AI (Contributor) left a comment

Pull request overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated no new comments.

Projects

Status: Todo

3 participants