Unpickling a pickled List of Doc objects
#10544
I have referred to the following issues within the spaCy repo to get answers for my issue but I haven't found a solution:
How to reproduce the behaviour
Here's my code to generate
This is part of the code where this function is called:
Here's the pickling code:
where the variables in UPPERCASE are just filenames. When I use
it gives me the following error:
I looked at spacy.tokens.doc.unpickle_doc() but couldn't really find an issue. I did notice there is a free-floating line in the code (Line 1739). I'm not sure if that's intended or an oversight, so I thought I'd bring it to your attention. Any suggestions/guidance would be greatly appreciated.
Your Environment
Replies: 3 comments
Hmm, I can't reproduce this. That line in the code is what tells pickle how to handle Doc objects.
In general, you should use a DocBin to save your docs instead of pickle. It's much more efficient, safe, and portable: https://spacy.io/usage/saving-loading#docs
If you really want to use pickle: my initial guesses might be that the file is corrupted somehow or that there are differences in the environments where you're saving/loading?
If you install all the dev dependencies, you can run a few tests to make sure that nothing is wrong with your install (would be unlikely):
python -m pip install -r https://raw.githubusercontent.com/explosion/spaCy/v3.2.3/requirements.txt
python -m pytest --pyargs spacy.tests.doc.test_pickle_doc
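For a quicker check than the full test suite, a minimal sketch of a pickle round trip inside a single environment might look like the following. It assumes a blank English pipeline (spacy.blank("en")) and made-up example texts and file name, none of which come from the original post. If this works but loading your own file fails, that would point toward a corrupted file or mismatched environments rather than spaCy itself.

```python
import pickle
import tempfile
from pathlib import Path

import spacy

# Assumption: a blank English pipeline (tokenizer only), so no model download is needed.
nlp = spacy.blank("en")

# Hypothetical example texts, not taken from the original post.
texts = ["This is a sentence.", "Here is another one."]
docs = [nlp(text) for text in texts]

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name, standing in for the UPPERCASE filename variables.
    pickle_path = Path(tmp_dir) / "docs.pkl"

    # Save the whole list of Doc objects in one pickle file.
    with pickle_path.open("wb") as f:
        pickle.dump(docs, f)

    # Load it back in the same environment (same Python and spaCy versions).
    with pickle_path.open("rb") as f:
        restored_docs = pickle.load(f)

# The restored docs should carry the same text and tokens as the originals.
restored_texts = [doc.text for doc in restored_docs]
restored_tokens = [[token.text for token in doc] for doc in restored_docs]
```

Only unpickle files you created yourself: loading a pickle runs arbitrary code, which is part of why DocBin is the safer option.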
Thank you for the suggestion, @adrianeboyd! I ran the
You are spot-on! I cloned
Having said that, I'm still interested in employing the
Just so that I understand correctly, you're suggesting to use
and later in the code instead of
and after the
instead of pickle? So I did that, and then tried to retrieve it using the following code:
that gives me
You just need to use
And the API docs (but there's not much else to it): https://spacy.io/api/docbin
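For reference, here is a hedged sketch of a full DocBin round trip based on the documented API. It is an illustration only and not necessarily the exact call meant above. The blank English pipeline, the example texts, and the file name docs.spacy are assumptions made for the example. Note that get_docs needs a vocab and returns a generator, so it is wrapped in list() to get the Doc objects back.

```python
import tempfile
from pathlib import Path

import spacy
from spacy.tokens import DocBin

# Assumption: a blank English pipeline (tokenizer only), so no model download is needed.
nlp = spacy.blank("en")

# Hypothetical example texts, not taken from the original post.
texts = ["This is a sentence.", "Here is another one."]

# Collect the processed docs in a DocBin instead of a pickled list.
doc_bin = DocBin()
for doc in nlp.pipe(texts):
    doc_bin.add(doc)

with tempfile.TemporaryDirectory() as tmp_dir:
    # Hypothetical file name; ".spacy" is the conventional extension.
    path = Path(tmp_dir) / "docs.spacy"
    doc_bin.to_disk(path)

    # Later (possibly in another script): load the DocBin from disk.
    loaded_bin = DocBin().from_disk(path)

# get_docs() yields Doc objects lazily and needs a vocab to rebuild them.
loaded_docs = list(loaded_bin.get_docs(nlp.vocab))
loaded_texts = [doc.text for doc in loaded_docs]
```

If the docs carry custom data in doc.user_data, DocBin(store_user_data=True) keeps it; by default it is not stored.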