
[Feature request] Add support for external data file (.onnx_data) #105

@felladrin

Description

This is not really a bug with Transformers.js, but with the conversion script.

I got an error when trying to convert lmsys/fastchat-t5-3b-v1.0 with the text2text-generation-with-past task.
Using the task text2text-generation works fine, though.

Am I missing something?

Also, is there a way to run the model without it having been created with -with-past?
Currently, running const pipe = await pipeline("text2text-generation", "lmsys/fastchat-t5-3b-v1.0"); triggers:

Error: File not found. Could not locate "models/lmsys/fastchat-t5-3b-v1.0/seq2seq-lm-with-past/encoder_model.onnx".

The files inside models/lmsys/fastchat-t5-3b-v1.0/seq2seq-lm-with-past/ are:

added_tokens.json
config.json
generation_config.json
model.onnx
model.onnx_data
special_tokens_map.json
spiece.model
tokenizer.json
tokenizer_config.json
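
For context: the model is large enough that the ONNX exporter split it into model.onnx (the graph) and model.onnx_data (the weights), since a single protobuf file is capped at 2 GB. Both the onnx and onnxruntime Python packages resolve the side file automatically when given a path to the .onnx file; a minimal sketch (the path is the one from the listing above):

import onnx
import onnxruntime as ort

model_path = "models/lmsys/fastchat-t5-3b-v1.0/seq2seq-lm-with-past/model.onnx"

# onnx.load resolves external tensors (model.onnx_data) from the model's
# directory by default (load_external_data=True).
model = onnx.load(model_path)
print(sum(t.ByteSize() for t in model.graph.initializer), "bytes of initializers")

# onnxruntime likewise picks up model.onnx_data next to model.onnx when the
# session is created from a file path.
session = ort.InferenceSession(model_path, providers=["CPUExecutionProvider"])
print([i.name for i in session.get_inputs()])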

How to reproduce

Run:

python -m scripts.convert --model_id lmsys/fastchat-t5-3b-v1.0 --from_hub --quantize --task text2text-generation-with-past

The command fails with output like this:

Using framework PyTorch: 2.0.0
Overriding 1 configuration item(s)
        - use_cache -> True
Asked a sequence length of 16, but a sequence length of 1 will be used with use_past == True for `input_ids`.
Asked a sequence length of 16, but a sequence length of 1 will be used with use_past == True for `decoder_input_ids`.
/opt/homebrew/lib/python3.10/site-packages/transformers/modeling_utils.py:828: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  if causal_mask.shape[1] < attention_mask.shape[1]:
/opt/homebrew/lib/python3.10/site-packages/transformers/models/t5/modeling_t5.py:507: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
  elif past_key_value.shape[2] != key_value_states.shape[1]:
In-place op on output of tensor.shape. See https://pytorch.org/docs/master/onnx.html#avoid-inplace-operations-when-using-tensor-shape-in-tracing-mode
In-place op on output of tensor.shape. See https://pytorch.org/docs/master/onnx.html#avoid-inplace-operations-when-using-tensor-shape-in-tracing-mode
================ Diagnostic Run torch.onnx.export version 2.0.0 ================
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================

Saving external data to one file...
Asked a sequence length of 16, but a sequence length of 1 will be used with use_past == True for `input_ids`.
Asked a sequence length of 16, but a sequence length of 1 will be used with use_past == True for `decoder_input_ids`.
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/opt/homebrew/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/Users/user/Repositories/transformers.js/scripts/convert.py", line 310, in <module>
    main()
  File "/Users/user/Repositories/transformers.js/scripts/convert.py", line 282, in main
    _, onnx_outputs = export_models(
  File "/opt/homebrew/lib/python3.10/site-packages/optimum/exporters/onnx/convert.py", line 609, in export_models
    export(
  File "/opt/homebrew/lib/python3.10/site-packages/optimum/exporters/onnx/convert.py", line 722, in export
    config.fix_dynamic_axes(output, device=device, input_shapes=input_shapes, dtype=dtype)
  File "/opt/homebrew/lib/python3.10/site-packages/optimum/exporters/onnx/base.py", line 285, in fix_dynamic_axes
    outputs = session.run(None, onnx_inputs)
  File "/opt/homebrew/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 200, in run
    return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Invalid Feed Input Name:past_key_values.9.encoder.value
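
The traceback points at a feed-name mismatch rather than at the external data itself: optimum's fix_dynamic_axes runs the freshly exported model with generated dummy inputs, and the feed contains a past_key_values.9.encoder.value key that the loaded graph does not declare. A diagnostic sketch to confirm that kind of mismatch (the feed_names set below is a hypothetical stand-in for optimum's generated inputs):

import onnxruntime as ort

session = ort.InferenceSession(
    "models/lmsys/fastchat-t5-3b-v1.0/seq2seq-lm-with-past/model.onnx",
    providers=["CPUExecutionProvider"],
)
declared = {i.name for i in session.get_inputs()}

# Hypothetical stand-in for the dummy feed optimum builds in fix_dynamic_axes;
# per the traceback, it contains past_key_values.*.encoder.* keys the exported
# graph does not declare as inputs.
feed_names = {"input_ids", "attention_mask", "past_key_values.9.encoder.value"}
print("fed but not declared:", sorted(feed_names - declared))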

Expected behavior

I was expecting it to work, the same way python -m scripts.convert --model_id lmsys/fastchat-t5-3b-v1.0 --from_hub --quantize --task text2text-generation did.
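
If support for external data lands in the conversion script, the quantization step presumably needs it too: onnxruntime's dynamic quantizer accepts use_external_data_format=True for models whose weights exceed the 2 GB protobuf limit. A minimal sketch of what that might look like (a hypothetical adaptation, not the script's current code):

from onnxruntime.quantization import QuantType, quantize_dynamic

# Hypothetical adaptation of the convert script's quantization step: with
# use_external_data_format=True the quantizer can read a model that ships a
# .onnx_data side file and write the quantized weights the same way.
quantize_dynamic(
    model_input="models/lmsys/fastchat-t5-3b-v1.0/seq2seq-lm-with-past/model.onnx",
    model_output="models/lmsys/fastchat-t5-3b-v1.0/seq2seq-lm-with-past/model_quantized.onnx",
    weight_type=QuantType.QUInt8,
    use_external_data_format=True,
)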

Environment

  • Transformers.js version: N/A
  • Browser (if applicable): N/A
  • Operating system (if applicable): macOS
