Open
Description
It seems that the latest datasets release is incompatible with span_marker.tokenizer.py. The problem is tied to the dependency transformers>=4.23.0,<5, since the tokenizers in older transformers versions seem to be incompatible with the new datasets. As a result, several methods such as SpanMarkerModel.from_pretrained(...).predict fail regardless of the input format and always raise the following error:
ValueError: text input must of type `str` (single example), `List[str]` (batch or single pretokenized example) or `List[List[str]]` (batch of pretokenized examples).
For now, users can work around it by pinning an older datasets version (e.g. 3.6.0), but I would suggest changing the datasets dependency to datasets>=2.14.0,<4. Does that sound alright (I can open a PR for it), or were you thinking of moving to a newer transformers version instead, @tomaarsen?
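In case it helps anyone hitting this, here is a minimal sketch for checking whether the installed datasets version falls inside the range proposed above (datasets>=2.14.0,<4). The helper names (`release_tuple`, `in_proposed_range`, `check_installed_datasets`) are hypothetical and not part of span_marker. The version parsing is deliberately simplified: it only looks at the leading numeric release and ignores pre-release suffixes, so it is not a full PEP 440 comparison.

```python
import re
from importlib import metadata

# Bounds proposed in this issue: datasets>=2.14.0,<4
LOWER = (2, 14, 0)
UPPER = (4, 0, 0)


def release_tuple(version: str) -> tuple:
    """Return the leading numeric release as a 3-tuple, e.g. '3.6.0.dev0' -> (3, 6, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    parts = [int(p) for p in match.group(0).split(".")]
    # Pad to three components so '4' compares like '4.0.0'.
    parts += [0] * (3 - len(parts))
    return tuple(parts[:3])


def in_proposed_range(version: str) -> bool:
    """True if the version satisfies >=2.14.0,<4 (simplified, see note above)."""
    return LOWER <= release_tuple(version) < UPPER


def check_installed_datasets():
    """Return (version, ok), or (None, False) when datasets is not installed."""
    try:
        version = metadata.version("datasets")
    except metadata.PackageNotFoundError:
        return None, False
    return version, in_proposed_range(version)


if __name__ == "__main__":
    print(check_installed_datasets())
```

If the check reports a version outside the range, pinning with `pip install "datasets>=2.14.0,<4"` applies the same constraint as the workaround described above.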