Closed
Description
Feature request
Be able to construct and load a model like:
model = AutoModelForCausalLM.from_pretrained(
    hf_model_repo,
    attn_implementation="sdpa",
    generation_config=GenerationConfig(
        use_cache=True,
        cache_implementation=cache_implementation,
        max_length=max_cache_len,
        cache_config={
            "batch_size": batch_size,
            "max_cache_len": max_cache_len,
        },
    ),
)
See additional context in #32253
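For illustration only, the generation arguments in the snippet above could be collected as plain data before being handed to GenerationConfig. This is a minimal sketch, not part of the proposal: build_generation_kwargs is a hypothetical helper, and the "static" default for cache_implementation is an assumption, since the request leaves that value as a variable.

```python
from typing import Any, Dict


def build_generation_kwargs(
    batch_size: int,
    max_cache_len: int,
    cache_implementation: str = "static",  # assumed value, not stated in the request
) -> Dict[str, Any]:
    """Hypothetical helper: gather the requested generation arguments as a dict."""
    if batch_size < 1 or max_cache_len < 1:
        raise ValueError("batch_size and max_cache_len must be positive")
    return {
        "use_cache": True,
        "cache_implementation": cache_implementation,
        # The request sets the generation length equal to the cache length.
        "max_length": max_cache_len,
        "cache_config": {
            "batch_size": batch_size,
            "max_cache_len": max_cache_len,
        },
    }


kwargs = build_generation_kwargs(batch_size=1, max_cache_len=128)
```

If the requested behavior were available, these could be passed as GenerationConfig(**kwargs) in the from_pretrained call above.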
Motivation
This feature request is to support torch.export() and to ensure the model is exportable in a way that can be further lowered and run in ExecuTorch with good out-of-the-box performance.
Your contribution
TBD