(fix) Fetch referenced config on models that require it by Napuh · Pull Request #38 · alvarobartt/hf-mem

Napuh · 2026-03-01T22:21:26Z

Description

Some models, like llava-hf/llava-1.5-7b-hf, do not carry the config needed to calculate the KV-cache size themselves; their text_config references another model instead. For example:

"text_config": {
  "_name_or_path": "lmsys/vicuna-7b-v1.5",
  "architectures": [
    "LlamaForCausalLM"
  ],
  "max_position_embeddings": 4096,
  "model_type": "llama",
  "rms_norm_eps": 1e-05,
  "torch_dtype": "float16",
  "vocab_size": 32064
}

Estimating the memory requirements for this model with the --experimental flag for KV-cache size estimation throws an error:

TypeError: unsupported operand type(s) for *: 'int' and 'NoneType'
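
A minimal sketch of how this error can arise, assuming the estimator reads fields with `dict.get` (an assumption; the actual lookup in hf-mem may differ). The abridged `text_config` above has no `num_hidden_layers`, so the lookup yields `None` and the multiplication fails:

```python
# Abridged text_config: the layer and head counts are absent,
# because they live in the referenced model's config instead.
text_config = {
    "_name_or_path": "lmsys/vicuna-7b-v1.5",
    "max_position_embeddings": 4096,
    "model_type": "llama",
}

# dict.get returns None for a missing key instead of raising KeyError.
num_hidden_layers = text_config.get("num_hidden_layers")

try:
    2 * num_hidden_layers  # int * None
except TypeError as exc:
    message = str(exc)

print(message)
```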

This PR fetches the referenced config and uses its values to estimate the KV-cache size. If the _name_or_path key is present in text_config, the config it points to serves as the base for the LLM parameters (num_hidden_layers, hidden_size, etc.), and values from the original text_config (like max_position_embeddings, which may have been modified) then overwrite it.
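
The merge order described above can be sketched as follows. This is not the code from the PR: `resolve_text_config`, `estimate_kv_cache_bytes`, and `fake_fetch` are hypothetical names, the fetcher is a stand-in for the real Hub request, its values are illustrative, and the KV-cache formula is the common textbook estimate rather than necessarily the one hf-mem uses.

```python
from typing import Any, Callable, Dict


def resolve_text_config(
    text_config: Dict[str, Any],
    fetch_config: Callable[[str], Dict[str, Any]],
) -> Dict[str, Any]:
    """Complete text_config with fields from the config it references, if any."""
    # Bind and test the reference in one expression (walrus operator).
    if referenced_model := text_config.get("_name_or_path"):
        base = fetch_config(referenced_model)
        # Referenced values are the base; the original text_config wins on conflict.
        return {**base, **text_config}
    return dict(text_config)


def estimate_kv_cache_bytes(
    cfg: Dict[str, Any], batch_size: int = 1, dtype_bytes: int = 2
) -> int:
    """Textbook estimate: 2 (K and V) * layers * KV heads * head dim * tokens * bytes."""
    head_dim = cfg["hidden_size"] // cfg["num_attention_heads"]
    kv_heads = cfg.get("num_key_value_heads", cfg["num_attention_heads"])
    return (
        2 * cfg["num_hidden_layers"] * kv_heads * head_dim
        * cfg["max_position_embeddings"] * batch_size * dtype_bytes
    )


def fake_fetch(model_id: str) -> Dict[str, Any]:
    # Hypothetical stand-in for a Hub request; values are illustrative, not fetched.
    return {
        "hidden_size": 4096,
        "num_attention_heads": 32,
        "num_hidden_layers": 32,
        "max_position_embeddings": 2048,
        "vocab_size": 32000,
    }


text_config = {
    "_name_or_path": "lmsys/vicuna-7b-v1.5",
    "max_position_embeddings": 4096,
    "vocab_size": 32064,
}

merged = resolve_text_config(text_config, fake_fetch)
kv_bytes = estimate_kv_cache_bytes(merged)
print(merged["num_hidden_layers"], merged["max_position_embeddings"], kv_bytes)
```

Spreading the referenced config first and the original `text_config` second keeps any locally modified field, such as `max_position_embeddings`, while filling in the fields that were missing.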

Before this PR, ...ForConditionalGeneration models with referenced configs failed to calculate the KV-cache size:

hf-mem --model-id llava-hf/llava-1.5-7b-hf --experimental

Traceback (most recent call last):
  File "/Users/napuh/CODE/hf-mem/.venv/bin/hf-mem", line 10, in <module>
    sys.exit(main())
             ^^^^^^
  File "/Users/napuh/CODE/hf-mem/src/hf_mem/cli.py", line 457, in main
    asyncio.run(
  File "/Users/napuh/.local/share/uv/python/cpython-3.12.11-macos-aarch64-none/lib/python3.12/asyncio/runners.py", line 195, in run
    return runner.run(main)
           ^^^^^^^^^^^^^^^^
  File "/Users/napuh/.local/share/uv/python/cpython-3.12.11-macos-aarch64-none/lib/python3.12/asyncio/runners.py", line 118, in run
    return self._loop.run_until_complete(task)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/napuh/.local/share/uv/python/cpython-3.12.11-macos-aarch64-none/lib/python3.12/asyncio/base_events.py", line 691, in run_until_complete
    return future.result()
           ^^^^^^^^^^^^^^^
  File "/Users/napuh/CODE/hf-mem/src/hf_mem/cli.py", line 355, in run
    2
TypeError: unsupported operand type(s) for *: 'int' and 'NoneType'

After this PR, the command executes successfully.

I have read and followed the guidelines in CONTRIBUTING.md.
This has been discussed over an issue or discussion.

alvarobartt

Thanks @Napuh! I've left a nit; with that, I think we can merge 🤗

src/hf_mem/cli.py

Napuh added 2 commits March 1, 2026 23:07

fix: fetch referenced config if _name_or_path present on text_config

0d00ada

fix: update referenced config fields with original config fields

3648552

alvarobartt approved these changes Mar 2, 2026

fix: use walrus operator for getting referenced_model config

a265313

alvarobartt merged commit ca9f0ec into alvarobartt:main Mar 5, 2026
1 check passed

alvarobartt merged 3 commits into alvarobartt:main from Napuh:main