[multimodal] Let Audio take float data blob (#14427)
If the audio has already gone through a Mel transform, the resulting
spectrogram consists of float values. The `Audio` class should be able to
hold this data, since the multimodal runner pybind API will need to accept
preprocessed input. Once we have the pybind API we can do something like:
```python
from transformers import AutoProcessor

# NOTE: MultimodalRunner, GenerationConfig, make_text_input and
# make_audio_input come from the planned pybind API, which does not exist
# yet, so their import path is intentionally left out here.

model_id = "mistralai/Voxtral-Mini-3B-2507"
processor = AutoProcessor.from_pretrained(model_id)
audio_url = "https://huggingface.co/datasets/eustlb/audio-samples/resolve/main/dude_where_is_my_car.wav"

conversation = [
    {
        "role": "user",
        "content": [
            {"type": "audio", "url": audio_url},
            {
                "type": "text",
                "text": "What can you tell me about this audio?",
            },
        ],
    },
]

# The processor runs the Mel transform; inputs["input_features"] holds floats.
inputs = processor.apply_chat_template(
    conversation,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
)

# Interleave prompt text with the preprocessed (float) audio features.
inputs_combined = [
    make_text_input("<s>[INST][BEGIN_AUDIO]"),
    make_audio_input(inputs["input_features"]),
    make_text_input("\nWhat can you tell me about this audio?[/INST]"),
]

runner = MultimodalRunner("voxtral.pte", "tekken.json", None)
config = GenerationConfig()
config.max_new_tokens = 100
runner.generate(inputs_combined, config)
```
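To make "float data blob" concrete, here is a minimal sketch of how preprocessed features could be packed into one flat float32 buffer plus the dimensions needed to interpret it. `FloatAudioBlob`, `pack_features`, and the field names `batch_size`, `n_bins`, `n_frames` are hypothetical names chosen for illustration; they are not the actual `Audio` definition in ExecuTorch, and the row-major (batch, bins, frames) layout is an assumption.

```python
from array import array
from dataclasses import dataclass


@dataclass
class FloatAudioBlob:
    """Hypothetical container: a flat float32 buffer plus its dimensions."""

    data: array  # typecode "f", assumed row-major (batch, bins, frames)
    batch_size: int
    n_bins: int
    n_frames: int

    def at(self, b: int, m: int, t: int) -> float:
        # Row-major index into the flat buffer.
        return self.data[(b * self.n_bins + m) * self.n_frames + t]


def pack_features(features) -> FloatAudioBlob:
    """Flatten a nested [batch][bin][frame] list of floats into one buffer."""
    batch_size = len(features)
    n_bins = len(features[0])
    n_frames = len(features[0][0])
    flat = array("f")
    for clip in features:
        if len(clip) != n_bins:
            raise ValueError("every clip must have the same number of bins")
        for row in clip:
            if len(row) != n_frames:
                raise ValueError("every bin must have the same number of frames")
            flat.extend(row)
    return FloatAudioBlob(flat, batch_size, n_bins, n_frames)


# Toy spectrogram: 1 clip, 2 Mel bins, 3 frames.
features = [[[0.5, -1.25, 2.0], [0.0, 0.25, -0.75]]]
blob = pack_features(features)
```

Carrying the dimensions alongside the buffer is what lets a consumer rebuild the tensor shape from a flat blob. A real binding would more likely read the tensor's memory directly than copy element by element.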