Description
Not really a bug in Transformers.js itself, but in the conversion script.
I got an error when trying to convert `lmsys/fastchat-t5-3b-v1.0` with the `text2text-generation-with-past` task. Using the `text2text-generation` task works fine, though.
Am I missing something? And is there a way to run the model without it being converted with `-with-past`?
Currently, when I run:

```js
const pipe = await pipeline("text2text-generation", "lmsys/fastchat-t5-3b-v1.0");
```

it triggers:

```
Error: File not found. Could not locate "models/lmsys/fastchat-t5-3b-v1.0/seq2seq-lm-with-past/encoder_model.onnx".
```
Files inside `models/lmsys/fastchat-t5-3b-v1.0/seq2seq-lm-with-past/` are the following:

```
added_tokens.json
config.json
generation_config.json
model.onnx
model.onnx_data
special_tokens_map.json
spiece.model
tokenizer.json
tokenizer_config.json
```
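Note that the folder contains a single `model.onnx` rather than the separate `encoder_model.onnx` the error message asks for. To see what that single file actually is, one can dump its graph signature; a minimal sketch, assuming the `onnx` package is installed and using the path from the listing above:

```python
import onnx

# Load only the graph structure; load_external_data=False skips reading the
# multi-GB weights in model.onnx_data, which aren't needed for the signature.
model = onnx.load(
    "models/lmsys/fastchat-t5-3b-v1.0/seq2seq-lm-with-past/model.onnx",
    load_external_data=False,
)

print("inputs: ", [i.name for i in model.graph.input])
print("outputs:", [o.name for o in model.graph.output])
```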
How to reproduce
Run:

```
python -m scripts.convert --model_id lmsys/fastchat-t5-3b-v1.0 --from_hub --quantize --task text2text-generation-with-past
```

Expect an output like this:

```
Using framework PyTorch: 2.0.0
Overriding 1 configuration item(s)
- use_cache -> True
Asked a sequence length of 16, but a sequence length of 1 will be used with use_past == True for `input_ids`.
Asked a sequence length of 16, but a sequence length of 1 will be used with use_past == True for `decoder_input_ids`.
/opt/homebrew/lib/python3.10/site-packages/transformers/modeling_utils.py:828: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
if causal_mask.shape[1] < attention_mask.shape[1]:
/opt/homebrew/lib/python3.10/site-packages/transformers/models/t5/modeling_t5.py:507: TracerWarning: Converting a tensor to a Python boolean might cause the trace to be incorrect. We can't record the data flow of Python values, so this value will be treated as a constant in the future. This means that the trace might not generalize to other inputs!
elif past_key_value.shape[2] != key_value_states.shape[1]:
In-place op on output of tensor.shape. See https://pytorch.org/docs/master/onnx.html#avoid-inplace-operations-when-using-tensor-shape-in-tracing-mode
In-place op on output of tensor.shape. See https://pytorch.org/docs/master/onnx.html#avoid-inplace-operations-when-using-tensor-shape-in-tracing-mode
================ Diagnostic Run torch.onnx.export version 2.0.0 ================
verbose: False, log level: Level.ERROR
======================= 0 NONE 0 NOTE 0 WARNING 0 ERROR ========================
Saving external data to one file...
Asked a sequence length of 16, but a sequence length of 1 will be used with use_past == True for `input_ids`.
Asked a sequence length of 16, but a sequence length of 1 will be used with use_past == True for `decoder_input_ids`.
Traceback (most recent call last):
File "/opt/homebrew/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/opt/homebrew/Cellar/[email protected]/3.10.11/Frameworks/Python.framework/Versions/3.10/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/Users/user/Repositories/transformers.js/scripts/convert.py", line 310, in <module>
main()
File "/Users/user/Repositories/transformers.js/scripts/convert.py", line 282, in main
_, onnx_outputs = export_models(
File "/opt/homebrew/lib/python3.10/site-packages/optimum/exporters/onnx/convert.py", line 609, in export_models
export(
File "/opt/homebrew/lib/python3.10/site-packages/optimum/exporters/onnx/convert.py", line 722, in export
config.fix_dynamic_axes(output, device=device, input_shapes=input_shapes, dtype=dtype)
File "/opt/homebrew/lib/python3.10/site-packages/optimum/exporters/onnx/base.py", line 285, in fix_dynamic_axes
outputs = session.run(None, onnx_inputs)
File "/opt/homebrew/lib/python3.10/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 200, in run
return self._sess.run(output_names, input_feed, run_options)
onnxruntime.capi.onnxruntime_pybind11_state.InvalidArgument: [ONNXRuntimeError] : 2 : INVALID_ARGUMENT : Invalid Feed Input Name:past_key_values.9.encoder.value
```
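The InvalidArgument error means onnxruntime was handed a feed whose name is not among the graph's declared inputs: optimum's `fix_dynamic_axes` builds dummy inputs (including `past_key_values.9.encoder.value`) and runs them through the freshly exported model, so the export apparently declares fewer past-key-value inputs than the dummy generator produces. A sketch of how to confirm the mismatch; the traceback doesn't say which sub-model was being validated, so the filename here is a guess:

```python
import onnxruntime as ort

# Hypothetical filename; substitute whichever ONNX file the export step had
# just written when fix_dynamic_axes failed.
sess = ort.InferenceSession(
    "decoder_with_past_model.onnx",
    providers=["CPUExecutionProvider"],
)

# If "past_key_values.9.encoder.value" is absent from this list, the dummy
# inputs disagree with the exported graph's input signature.
print(sorted(i.name for i in sess.get_inputs()))
```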
Expected behavior
I was expecting it to work, the same way `python -m scripts.convert --model_id lmsys/fastchat-t5-3b-v1.0 --from_hub --quantize --task text2text-generation` did.
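As a cross-check, the same export can be driven through optimum's own exporter directly, bypassing `scripts/convert.py` (the output directory name here is arbitrary, and I'm assuming the installed optimum accepts the same task name):

```
python -m optimum.exporters.onnx --model lmsys/fastchat-t5-3b-v1.0 --task text2text-generation-with-past fastchat-t5-onnx/
```

If this fails with the same InvalidArgument error, the bug is upstream in optimum rather than in the conversion script.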
Environment
- Transformers.js version: N/A
- Browser (if applicable): N/A
- Operating system (if applicable): macOS