## Summary
This pull request makes the code try all dataset deserializers before
failing, so that a data format one deserializer cannot handle no longer
causes a premature failure.
## Details
It collects the errors raised by each deserializer. If no deserializer
succeeds, it either reports the collected errors, if any, or falls
through to the old code path, which reports that there are no suitable
deserializers for the data.
Here is an example of the error message raised if I force it to fail:
```
guidellm.data.deserializers.deserializer.DataNotSupportedError: data
deserialization failed; 2 errors occurred while attempting to
deserialize data {"prompt_tokens": 1, "output_tokens": 100}:
[HFValidationError('Repo id must use alphanumeric chars or \'-\', \'_\',
\'.\', \'--\' and \'..\' are forbidden, \'-\' and \'.\' cannot start or
end the name, max length is 96: \'{"prompt_tokens": 1, "output_tokens":
100}\'.'), TypeError('InMemoryDictDatasetDeserializer.__call__() takes 4
positional arguments but 5 were given')]
```
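For reviewers, the control flow described in Details can be sketched roughly as follows. This is a simplified illustration, not the actual GuideLLM code: `deserialize_with_fallback` and its single-argument deserializer signature are hypothetical, and `DataNotSupportedError` is redefined locally as a stand-in for the exception named in the message above.

```python
from typing import Any, Callable, Sequence


class DataNotSupportedError(Exception):
    """Local stand-in for the exception named in the error message above."""


def deserialize_with_fallback(
    data: Any,
    deserializers: Sequence[Callable[[Any], Any]],
) -> Any:
    """Try every deserializer in order; fail only after all have been tried.

    Hypothetical helper: the real deserializers take more arguments.
    """
    errors: list[Exception] = []
    for deserializer in deserializers:
        try:
            result = deserializer(data)
        except Exception as err:  # keep going so later deserializers get a turn
            errors.append(err)
            continue
        if result is not None:
            return result  # first successful deserializer wins

    if errors:
        # At least one deserializer raised: surface every collected error.
        raise DataNotSupportedError(
            f"data deserialization failed; {len(errors)} errors occurred "
            f"while attempting to deserialize data {data}: {errors}"
        )
    # Nothing raised and nothing matched: the pre-existing failure path.
    raise DataNotSupportedError(
        f"no suitable deserializer found for data {data}"
    )
```

In this sketch a deserializer that returns `None` is treated as "not applicable" rather than as an error, which is why the final fallback message is still reachable when the error list is empty.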
## Test Plan
Run GuideLLM with various formats of data to ensure the proper one is
used, and test invalid inputs, too.
---
- [x] "I certify that all code in this PR is my own, except as noted
below."
## Use of AI
- [ ] Includes AI-assisted code completion
- [ ] Includes code generated by an AI application
- [ ] Includes AI-generated tests (NOTE: AI written tests should have a
docstring that includes `## WRITTEN BY AI ##`)