Conversation

Contributor

@DefTruth DefTruth commented Oct 11, 2025

Upgrade the cache-dit API usage to 1.0.x.

python run_benchmark.py \
    --ckpt ${CKPT} \
    --trace-file optimized_cache_dit.json.gz \
    --compile_export_mode compile \
    --disable_fa3 \
    --num_inference_steps 28 \
    --cache_dit_config cache_config.yaml \
    --output-file optimized_cache_dit.png --disable_quant
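
For reference, a minimal sketch of what the `cache_config.yaml` passed via `--cache_dit_config` might contain. The field names are taken from the `BasicCacheConfig` and `TaylorSeerCalibratorConfig` values logged below; the exact YAML schema accepted by cache-dit 1.0.x (including the `cache_type` key name) is an assumption, not a verified spec:

```yaml
# Hypothetical cache-dit 1.0.x config sketch; field names mirror the
# BasicCacheConfig / TaylorSeerCalibratorConfig values in the run log.
cache_type: DBCache            # required in 1.0.x (see the error report below)
Fn_compute_blocks: 1
Bn_compute_blocks: 0
residual_diff_threshold: 0.3
max_warmup_steps: 0
max_continuous_cached_steps: 2
enable_separate_cfg: false
calibrator: taylorseer
taylorseer_order: 2
```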

Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 3/3 [00:00<00:00, 40.76it/s]
Loading pipeline components...:  71%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████▏                                                | 5/7 [00:00<00:00, 14.82it/s]You set `add_prefix_space`. The tokenizer needs to be converted from the slow tokenizers
Loading checkpoint shards: 100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 78.88it/s]
Loading pipeline components...: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 7/7 [00:00<00:00, 10.54it/s]
INFO 10-11 06:19:58 [cache_adapter.py:46] FluxPipeline is officially supported by cache-dit. Use it's pre-defined BlockAdapter directly!
INFO 10-11 06:19:58 [block_adapters.py:201] Auto fill blocks_name: ['transformer_blocks', 'single_transformer_blocks'].
INFO 10-11 06:19:58 [block_adapters.py:482] Match Block Forward Pattern: FluxTransformerBlock, ForwardPattern.Pattern_1
INFO 10-11 06:19:58 [block_adapters.py:482] IN:('hidden_states', 'encoder_hidden_states'), OUT:('encoder_hidden_states', 'hidden_states'))
INFO 10-11 06:19:58 [block_adapters.py:482] Match Block Forward Pattern: FluxSingleTransformerBlock, ForwardPattern.Pattern_1
INFO 10-11 06:19:58 [block_adapters.py:482] IN:('hidden_states', 'encoder_hidden_states'), OUT:('encoder_hidden_states', 'hidden_states'))
INFO 10-11 06:19:58 [cache_adapter.py:141] Use default 'enable_separate_cfg' from block adapter register: False, Pipeline: FluxPipeline.
INFO 10-11 06:19:58 [cache_adapter.py:275] Collected Cache Config: DBCACHE_F1B0_W0M0MC2_R0.3, Calibrator Config: TaylorSeer_O(2)
INFO 10-11 06:19:58 [cache_adapter.py:275] Collected Cache Config: DBCACHE_F1B0_W0M0MC2_R0.3, Calibrator Config: TaylorSeer_O(2)
INFO 10-11 06:19:58 [pattern_base.py:51] Match Cached Blocks: CachedBlocks_Pattern_0_1_2, for transformer_blocks, cache_context: transformer_blocks_140567643384928, cache_manager: FluxPipeline_140567649235936.
INFO 10-11 06:19:58 [pattern_base.py:51] Match Cached Blocks: CachedBlocks_Pattern_0_1_2, for single_transformer_blocks, cache_context: single_transformer_blocks_140567528265904, cache_manager: FluxPipeline_140567649235936.
time mean/var: tensor([13.3521, 13.3886, 13.4104, 13.4335, 13.4646, 13.4767, 13.4747, 13.4828,
        13.4852, 13.4868]) 13.445541381835938 0.0022438988089561462

🤗Cache Options: FluxSingleTransformerBlock

{'cache_config': BasicCacheConfig(Fn_compute_blocks=1, Bn_compute_blocks=0, residual_diff_threshold=0.3, max_warmup_steps=0, max_cached_steps=-1, max_continuous_cached_steps=2, enable_separate_cfg=False, cfg_compute_first=False, cfg_diff_compute_separate=True), 'calibrator_config': TaylorSeerCalibratorConfig(enable_calibrator=True, enable_encoder_calibrator=True, calibrator_type='taylorseer', calibrator_cache_type='residual', calibrator_kwargs={}, taylorseer_order=2), 'name': 'single_transformer_blocks_140567528265904'}

⚡️Cache Steps and Residual Diffs Statistics: FluxSingleTransformerBlock

| Cache Steps | Diffs P00 | Diffs P25 | Diffs P50 | Diffs P75 | Diffs P95 | Diffs Min | Diffs Max |
|-------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
| 7           | 0.117     | 0.304     | 0.523     | 0.6       | 0.769     | 0.117     | 0.793     |


🤗Cache Options: FluxTransformerBlock

{'cache_config': BasicCacheConfig(Fn_compute_blocks=1, Bn_compute_blocks=0, residual_diff_threshold=0.3, max_warmup_steps=0, max_cached_steps=-1, max_continuous_cached_steps=2, enable_separate_cfg=False, cfg_compute_first=False, cfg_diff_compute_separate=True), 'calibrator_config': TaylorSeerCalibratorConfig(enable_calibrator=True, enable_encoder_calibrator=True, calibrator_type='taylorseer', calibrator_cache_type='residual', calibrator_kwargs={}, taylorseer_order=2), 'name': 'transformer_blocks_140567643384928'}

⚡️Cache Steps and Residual Diffs Statistics: FluxTransformerBlock

| Cache Steps | Diffs P00 | Diffs P25 | Diffs P50 | Diffs P75 | Diffs P95 | Diffs Min | Diffs Max |
|-------------|-----------|-----------|-----------|-----------|-----------|-----------|-----------|
| 17          | 0.034     | 0.064     | 0.111     | 0.163     | 0.312     | 0.034     | 0.32      |
  • baseline (bf16): NVIDIA L20, 25s
  • w/ cache-dit: NVIDIA L20, 13s


@Met4physics Met4physics left a comment


Running this, I hit an error:

Traceback (most recent call last):
  File "/storage/flux-fast/run_benchmark.py", line 92, in <module>
    main(args)
    ~~~~^^^^^^
  File "/storage/flux-fast/run_benchmark.py", line 27, in main
    pipeline = load_pipeline(args)
  File "/storage/utils/pipeline_utils.py", line 476, in load_pipeline
    pipeline = optimize(pipeline, args)
  File "/storage/flux-fast/utils/pipeline_utils.py", line 413, in optimize
    pipeline, **cache_dit.load_options(args.cache_dit_config),
                ~~~~~~~~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^
  File "/storage/miniconda3/envs/fastflux/lib/python3.13/site-packages/cache_dit/cache_factory/utils.py", line 57, in load_options
    return load_cache_options_from_yaml(path)
  File "/storage/miniconda3/envs/fastflux/lib/python3.13/site-packages/cache_dit/cache_factory/utils.py", line 20, in load_cache_options_from_yaml
    raise ValueError(
        f"Configuration file missing required item: {key}"
    )
ValueError: Configuration file missing required item: cache_type
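
The failure comes from cache-dit's YAML loader rejecting a config file that lacks a `cache_type` entry. A minimal sketch of that required-key check, inferred from the traceback (`cache_dit/cache_factory/utils.py`); the exact set of required keys is an assumption, as only `cache_type` is confirmed by the error message:

```python
# Sketch of the validation step implied by the traceback: the loader
# checks the parsed YAML mapping for required keys and raises ValueError
# when one is missing. `validate_cache_options` is a hypothetical name.
def validate_cache_options(options: dict) -> dict:
    required_keys = ("cache_type",)  # assumed minimal set; only this key is confirmed
    for key in required_keys:
        if key not in options:
            raise ValueError(f"Configuration file missing required item: {key}")
    return options

# A pre-1.0.x style config without `cache_type` fails the check:
try:
    validate_cache_options({"residual_diff_threshold": 0.3})
except ValueError as e:
    print(e)  # Configuration file missing required item: cache_type
```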

@DefTruth
Contributor Author

@Met4physics please install cache-dit==1.0.3 and retry.
