
Stop requiring CacheConfig in GenerationConfig with StaticCache #35026

@poedator


I have observed that in some common use cases, configuring StaticCache during GenerationConfig initialisation is unnecessary.
This requirement was introduced in #32830.

Specifically, when the model is used with .generate, only the cache_implementation config option is relevant. The rest of the cache config is determined inside generate(), in transformers.generation.utils::GenerationMixin._get_cache(). There, StaticCache is created from the requested generation parameters, and cache_config is ignored entirely.

Suggestions:

  • Do not require a StaticCache config in GenerationConfig.__init__.
  • Keep any custom logic for ExecuTorch separate, so that it does not affect other use cases.
  • Give StaticCache.__init__() default arguments so that it is quietly created with default parameters (not a good idea, though).
  • Add tests in transformers that set just cache_implementation="static" and then call .generate().

Current workaround:

Do not set cache_implementation in the GenerationConfig constructor; set it afterwards with:
model.generation_config.cache_implementation = "static"
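
For a standalone GenerationConfig object, the same workaround can be sketched as below. This is a minimal illustration, not a confirmed fix: the max_new_tokens value is an arbitrary assumption, and whether the constructor complains when cache_implementation is passed directly depends on the transformers version.

```python
from transformers import GenerationConfig

# Build the config WITHOUT cache_implementation, so the constructor
# has no reason to look for an accompanying cache config.
# max_new_tokens=32 is an arbitrary value chosen for illustration.
generation_config = GenerationConfig(max_new_tokens=32)

# Select the static cache afterwards by plain attribute assignment.
generation_config.cache_implementation = "static"

print(generation_config.cache_implementation)
```

The resulting object can then be passed as generation_config=generation_config to .generate() on an already loaded model.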

System info

Applies to transformers releases from September 2024 up to the current 4.46.3.

Who can help?

@gante
