Description
Any ideas or fixes for this error? Thanks!
After cloning the repo https://github.com/davidbrowne17/csm-streaming and following the setup steps, I ran "python ./run_csm.py" in the repo directory and got the error below. Here is the full log and stack trace from my terminal:
(.venv) csm-streaming$ python ./run_csm.py
Using device: cuda
Loading CSM-1B model with extreme optimizations for real-time performance...
Model compilation complete. Creating generator...
Starting maximum-intensity warmup sequence...
Optimizing GPU memory allocation...
Creating diverse audio contexts...
Forcing compilation of critical components...
Running final generation with exact same setup as a real request...
Preparing GPU for low-latency generation...
Starting audio generation for: 'This is the final warmup that exactly matches a re...'
First chunk latency: 26059.0ms
Total time: 26.77s
Generated 52 frames (4.16s of audio)
Real-time factor: 6.435x (target: <1.0)
==================================================
AUDIO GENERATION PERFORMANCE METRICS
First chunk latency: 26059.0ms
Total generation time: 27.30s
Audio duration: 4.16s
Real-time factor (RTF): 6.562x (target: <1.0)
Number of chunks: 3
Average chunk size: 1386.7ms
Average inter-chunk latency: 368.7ms
Min/Max inter-chunk latency: 246.2ms / 491.2ms
Chunks per second: 0.11
Output file: warmup_final.wav
Final GPU optimization...
Maximum-intensity warmup complete. First generation should now be MUCH faster.
Generating: Hey how are you doing?
W0521 20:49:28.140000 683560 .venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py:906] [0/8] torch._dynamo hit config.cache_size_limit (8)
W0521 20:49:28.140000 683560 .venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py:906] [0/8] function: 'forward' (/home/tmp/csm-streaming/.venv/lib/python3.10/site-packages/torchtune/modules/transformer.py:554)
W0521 20:49:28.140000 683560 .venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py:906] [0/8] last reason: 0/0: tensor 'L['mask']' size mismatch at index 1. expected 10, actual 1
W0521 20:49:28.140000 683560 .venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py:906] [0/8] To log all recompilation reasons, use TORCH_LOGS="recompiles".
W0521 20:49:28.140000 683560 .venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py:906] [0/8] To diagnose recompilation issues, see https://pytorch.org/docs/main/torch.compiler_troubleshooting.html.
Traceback (most recent call last):
File "/home/tmp/csm-streaming/./run_csm.py", line 115, in
main()
File "/home/tmp/csm-streaming/./run_csm.py", line 97, in main
audio_tensor = generator.generate(
File "/home/tmp/csm-streaming/.venv/lib/python3.10/site-packages/torch/utils/_contextlib.py", line 116, in decorate_context
return func(*args, **kwargs)
File "/home/tmp/csm-streaming/generator.py", line 441, in generate
sample = self._model.generate_frame(curr_tokens, curr_tokens_mask, curr_pos, temperature, topk)
File "/home/tmp/csm-streaming/models.py", line 152, in generate_frame
h = self.backbone(h, input_pos=input_pos, mask=curr_backbone_mask).to(dtype=dtype)
File "/home/tmp/csm-streaming/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/tmp/csm-streaming/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
File "/home/tmp/csm-streaming/.venv/lib/python3.10/site-packages/torch/_dynamo/eval_frame.py", line 574, in _fn
return fn(*args, **kwargs)
File "/home/tmp/csm-streaming/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1739, in _wrapped_call_impl
return self._call_impl(*args, **kwargs)
File "/home/tmp/csm-streaming/.venv/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1750, in _call_impl
return forward_call(*args, **kwargs)
File "/home/tmp/csm-streaming/.venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 1380, in call
return self._torchdynamo_orig_callable(
File "/home/tmp/csm-streaming/.venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 547, in call
return _compile(
File "/home/tmp/csm-streaming/.venv/lib/python3.10/site-packages/torch/_dynamo/convert_frame.py", line 925, in _compile
raise RecompileLimitExceeded(f"{limit_type} reached")
torch._dynamo.exc.RecompileLimitExceeded: cache_size_limit reached
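
In case it helps narrow things down, here is a minimal sketch of the two generic torch._dynamo knobs the warning itself points at (logging recompile reasons and raising cache_size_limit). I'm assuming these would go in run_csm.py before the model is loaded and compiled; the exact placement and the limit value of 64 are my guesses, not something from the repo:

# Hedged sketch, not a verified fix: just the knobs referenced by the Dynamo warning above.
import torch
import torch._dynamo
import torch._logging

# Print the reason for every recompilation (equivalent to running with TORCH_LOGS="recompiles").
torch._logging.set_logs(recompiles=True)

# Dynamo's default cache_size_limit is 8, which the repeated recompiles exhaust here.
# Raising it lets Dynamo compile additional graph variants instead of raising
# torch._dynamo.exc.RecompileLimitExceeded.
torch._dynamo.config.cache_size_limit = 64

Alternatively, running TORCH_LOGS="recompiles" python ./run_csm.py (as the warning suggests) shows every recompile reason without touching the code. Raising the limit is only a workaround, though; the "mask size mismatch at index 1, expected 10, actual 1" line suggests the backbone is being recompiled for each new sequence length, which is probably the underlying issue.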