Description
I have an observation that in some common use cases, configuring `StaticCache` during `GenerationConfig` initialisation is unnecessary. This requirement was introduced in #32830.

Specifically, when the model is used with `.generate()`, only the `cache_implementation` config option is relevant. The rest of the cache config is determined inside `generate()`, specifically in `transformers.generation.utils::GenerationMixin._get_cache()`. In that function, the `StaticCache` is created based on the requested generation parameters, ignoring `cache_config` entirely.
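To illustrate the observation, here is a minimal sketch of a plain `.generate()` call where only `cache_implementation` is set and the static cache is built entirely inside `generate()`; the checkpoint and generation parameters below are placeholders, not taken from this issue:

```python
# Minimal sketch: only cache_implementation is set; the StaticCache itself is
# instantiated inside generate() (via GenerationMixin._get_cache()) from the
# generation parameters, without any explicit cache_config.
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("Hello, my name is", return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=20,
    cache_implementation="static",  # the only cache-related option passed here
)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```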
Suggestions:
- do not require the `StaticCache` config in `GenerationConfig.__init__()`
- have some custom logic for ExecuTorch that does not affect other use cases
- have default arguments for `StaticCache.__init__()` so that it would quietly get created with some default parameters (not a good idea, though)
- have tests in transformers that set just `cache_implementation="static"` and then call `.generate()` (a rough sketch of such a test follows below)
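A rough sketch of what such a test could look like (the test name and checkpoint are placeholders; on the affected versions, constructing the `GenerationConfig` with only `cache_implementation="static"` is presumably the step that would currently fail):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer, GenerationConfig


def test_generate_with_static_cache_implementation_only():
    # Tiny checkpoint of the kind used in the transformers test suite.
    checkpoint = "hf-internal-testing/tiny-random-gpt2"
    tokenizer = AutoTokenizer.from_pretrained(checkpoint)
    model = AutoModelForCausalLM.from_pretrained(checkpoint)

    # Only cache_implementation is set -- no StaticCache / cache_config parameters.
    generation_config = GenerationConfig(cache_implementation="static", max_new_tokens=5)

    inputs = tokenizer("Hello", return_tensors="pt")
    outputs = model.generate(**inputs, generation_config=generation_config)

    # generate() should succeed and append at least one new token
    # without any explicit StaticCache configuration.
    assert outputs.shape[-1] > inputs["input_ids"].shape[-1]
```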
Current workaround:
do not set `cache_implementation` in the `GenerationConfig` constructor, but set it afterwards with:
`model.generation_config.cache_implementation = "static"`
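For context, a minimal sketch of this workaround end to end (checkpoint and prompt are placeholders):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # placeholder checkpoint
model = AutoModelForCausalLM.from_pretrained("gpt2")

# Workaround: do not pass cache_implementation to the GenerationConfig
# constructor; set it on the model's generation config after loading instead.
model.generation_config.cache_implementation = "static"

inputs = tokenizer("Hello, my name is", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```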
System info
Applies to transformers from September 2024 up to the current 4.46.3.