Description
System Info
- OS: Ubuntu 22.04.5 LTS
- Framework: pure PHP
PHP Version
8.3.25
Environment/Platform
- Command-line application
- Web application
- Serverless
- Other (please specify)
Description
When pre-downloading or using the default model for text generation (as described in the documentation), the library requests a model file path that does not exist, and the download fails.
Reproduction
Steps
- Set up a minimal environment.
- Use the following snippet for example:
use function Codewithkyrian\Transformers\Pipelines\pipeline;
$textGen = pipeline(Codewithkyrian\Transformers\Pipelines\Task::TextGeneration);
Expected outcome
The library will download the default model.
Actual outcome
As part of the download process, the library tries to download the file https://huggingface.co/Xenova/gpt2/resolve/main/onnx/model_quantized.onnx and fails, because it does not exist.
Analysis
This seems to be related to certain defaults, which may not be safe to assume ('onnx' for $subfolder in PretrainedModel::constructSession() and true for $quantized in PretrainedModel::fromPretrained()).
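To make the failure concrete, here is a minimal sketch of how those two defaults would combine into the URL quoted above. The helper buildModelUrl() is hypothetical and is not the library's actual code; only the default values and the resulting URL come from this report, and the non-quantized file name model.onnx is an assumption.

```php
<?php
declare(strict_types=1);

/**
 * Hypothetical helper (NOT part of the library): shows how a default
 * subfolder of 'onnx' and a default of quantized = true lead to the
 * URL reported under "Actual outcome".
 */
function buildModelUrl(
    string $model,
    string $subfolder = 'onnx',   // default named in the analysis above
    bool $quantized = true,       // default named in the analysis above
    string $revision = 'main'
): string {
    // Assumption: the non-quantized weights are named "model.onnx".
    $fileName = $quantized ? 'model_quantized.onnx' : 'model.onnx';

    // An empty subfolder means the file sits at the repository root.
    $path = $subfolder === '' ? $fileName : trim($subfolder, '/') . '/' . $fileName;

    return sprintf('https://huggingface.co/%s/resolve/%s/%s', $model, $revision, $path);
}

// With both defaults left untouched, this prints the failing URL from the report.
echo buildModelUrl('Xenova/gpt2'), PHP_EOL;
```

Under these assumptions, changing either argument changes the requested path, which is why a repository that stores its files elsewhere or without a quantized variant cannot be fetched with the defaults alone.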
I have tried to download a variety of models with the download command. For some of them, using the --quantized option was enough; for others, it would have been necessary to set the subfolder, which isn't possible with the download command.
When using different models while relying on the "on-the-fly" download capabilities, the $quantized option can be toggled, for example:
$textGen = pipeline(
    task: Codewithkyrian\Transformers\Pipelines\Task::TextGeneration,
    quantized: false
);
However, the $subFolder parameter remains hidden and cannot be altered.
Possible Solution
Would it be feasible to always require those values explicitly?
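As a rough illustration of what "explicit" could look like, here is a small sketch of an options object without defaults. The class ModelFileOptions and its method are hypothetical names invented for this example and do not exist in the library; the file naming is the same assumption as in the report (model_quantized.onnx versus an assumed model.onnx).

```php
<?php
declare(strict_types=1);

// Hypothetical value object (NOT part of the library): both values are
// required, so a caller can no longer rely on an unsafe default.
final class ModelFileOptions
{
    public function __construct(
        public readonly string $subfolder, // no default; '' means the repository root
        public readonly bool $quantized,   // no default; the caller has to decide
    ) {
    }

    // Relative path of the model file inside the repository.
    public function relativePath(): string
    {
        // Assumption: the non-quantized weights are named "model.onnx".
        $file = $this->quantized ? 'model_quantized.onnx' : 'model.onnx';

        return $this->subfolder === '' ? $file : rtrim($this->subfolder, '/') . '/' . $file;
    }
}

// Named arguments keep the call site readable.
$options = new ModelFileOptions(subfolder: 'onnx', quantized: false);
echo $options->relativePath(), PHP_EOL;
```

Omitting either argument raises an ArgumentCountError at construction time, so a missing choice would surface before any download is attempted rather than as a failed request.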