System Info
LAMP stack, Debian 10
Python 3.10
pip-installed ORTModel, onnx, transformers, etc., all upgraded to the latest versions (as of 19 March 2025):
pip install --upgrade optimum transformers
pip install --upgrade huggingface huggingface-hub
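For reference, a quick way to pin the exact versions behind "latest" (a minimal sketch; assumes the standard __version__ attributes on these packages):

# Sketch: record the exact installed versions, since "latest" drifts over time.
import optimum.version, transformers, onnxruntime, huggingface_hub
print("optimum", optimum.version.__version__)
print("transformers", transformers.__version__)
print("onnxruntime", onnxruntime.__version__)
print("huggingface_hub", huggingface_hub.__version__)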
Who can help?
Hi Josh
Actual behaviour:
The quantized encoder is cached correctly via "encoder_file_name",
BUT "file_name" caches the original full-sized decoder_model_merged.onnx instead of the decoder_model_quantized.onnx I requested.
This mimics fallback behaviour: if I remove the "file_name" parameter entirely, I get exactly the same result (a quick cache check is sketched below).
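A minimal sketch of how I confirm which ONNX files actually landed in the local Hugging Face cache (uses huggingface_hub's scan_cache_dir; the size comparison is my own heuristic):

# Sketch: list the cached .onnx files for the repo; a quantized decoder should be
# much smaller on disk than the full decoder_model_merged.onnx.
from huggingface_hub import scan_cache_dir

for repo in scan_cache_dir().repos:
    if repo.repo_id == "Xenova/opus-mt-en-de":
        for revision in repo.revisions:
            for f in revision.files:
                if f.file_name.endswith(".onnx"):
                    print(f.file_name, f.size_on_disk, "bytes")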
Information
- The official example scripts
- My own modified scripts
Tasks
- An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
- My own task or dataset (give details below)
Reproduction
Issue:
When running the model directly from the repo, I use the following settings and expect the named quantized versions of the model to be cached and used, e.g.:
from transformers import AutoConfig
from optimum.onnxruntime import ORTModelForSeq2SeqLM

config = AutoConfig.from_pretrained(path)  # path points at the model repo/directory
model = ORTModelForSeq2SeqLM.from_pretrained(
    "Xenova/opus-mt-en-de",
    config=config,
    subfolder="onnx",
    encoder_file_name="encoder_model_quantized.onnx",
    file_name="decoder_model_quantized.onnx",
    accelerator="ort",
)
I have also tested with decoder_file_name="decoder_model_quantized.onnx" in place of file_name.
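As a debugging aid, here's a minimal sketch of how I check which file each session was actually built from; model.encoder.session / model.decoder.session are optimum internals and _model_path is a private onnxruntime attribute, so treat both as assumptions that may change between versions:

# Sketch: peek at the file each ONNX session was loaded from.
# _model_path is private onnxruntime API and may be unset if the model was
# loaded from bytes; this is a debugging heuristic only.
print(getattr(model.encoder.session, "_model_path", None))
print(getattr(model.decoder.session, "_model_path", None))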
Expected behavior
Expected behaviour, one of two alternatives:
- When the model file names are stated explicitly in the file_name and encoder_file_name parameters, those are the files cached and used from the repo; or
- When only the encoder name is stated explicitly, the FALLBACK should be the corresponding decoder model. E.g. encoder_file_name="encoder_model_quantized.onnx" should fall back to decoder_model_quantized.onnx, and encoder_file_name="encoder_model_fp16.onnx" should fall back to decoder_model_merged_fp16.onnx.
Apologies if I've overlooked something in the docs or other issues, but I've searched both GitHub and Google (site:github.com) and come up with bupkis.