Closed
Description
How to reproduce the behaviour
I am trying to build a custom NER model. As a reference, I ran the code below, which generated a demo_train.spacy file.
import spacy
from spacy.tokens import DocBin

nlp = spacy.blank("en")
training_data = [
    ("Tokyo Tower is 333m tall.", [(0, 11, "BUILDING")]),
]
# the DocBin will store the example documents
db = DocBin()
for text, annotations in training_data:
    doc = nlp(text)
    ents = []
    for start, end, label in annotations:
        span = doc.char_span(start, end, label=label)
        ents.append(span)
    doc.ents = ents
    db.add(doc)
db.to_disk("./spacy3/demo_train.spacy")
After demo_train.spacy was created, I ran debug data on it:
!python -m spacy debug data /home/sooraj/rough/doccano/spacy3/demo_train.spacy
The command failed with the following error:
Traceback (most recent call last):
  File "/usr/lib/python3.8/runpy.py", line 194, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.8/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/home/sooraj/.virtualenvs/spacy3/lib/python3.8/site-packages/spacy/__main__.py", line 4, in <module>
    setup_cli()
  File "/home/sooraj/.virtualenvs/spacy3/lib/python3.8/site-packages/spacy/cli/_util.py", line 69, in setup_cli
    command(prog_name=COMMAND)
  File "/home/sooraj/.virtualenvs/spacy3/lib/python3.8/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/home/sooraj/.virtualenvs/spacy3/lib/python3.8/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/home/sooraj/.virtualenvs/spacy3/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/sooraj/.virtualenvs/spacy3/lib/python3.8/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/sooraj/.virtualenvs/spacy3/lib/python3.8/site-packages/click/core.py", line 1066, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/sooraj/.virtualenvs/spacy3/lib/python3.8/site-packages/click/core.py", line 610, in invoke
    return callback(*args, **kwargs)
  File "/home/sooraj/.virtualenvs/spacy3/lib/python3.8/site-packages/typer/main.py", line 497, in wrapper
    return callback(**use_params) # type: ignore
  File "/home/sooraj/.virtualenvs/spacy3/lib/python3.8/site-packages/spacy/cli/debug_data.py", line 65, in debug_data_cli
    debug_data(
  File "/home/sooraj/.virtualenvs/spacy3/lib/python3.8/site-packages/spacy/cli/debug_data.py", line 89, in debug_data
    cfg = util.load_config(config_path, overrides=config_overrides)
  File "/home/sooraj/.virtualenvs/spacy3/lib/python3.8/site-packages/spacy/util.py", line 549, in load_config
    return config.from_disk(
  File "/home/sooraj/.virtualenvs/spacy3/lib/python3.8/site-packages/thinc/config.py", line 454, in from_disk
    text = file_.read()
  File "/usr/lib/python3.8/codecs.py", line 322, in decode
    (result, consumed) = self._buffer_decode(data, self.errors, final)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9c in position 1: invalid start byte
I used the example from the spaCy website to generate the .spacy file. Why does this error occur?
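One possible reading of the traceback, sketched with the standard library only: the `util.load_config(config_path, ...)` frame suggests the first argument to debug data is read as a text config file, while a .spacy file is binary. Assuming the file is a zlib-compressed stream (an assumption about the DocBin format, not something shown in the traceback), its second byte would be 0x9c, which matches the error message. The payload below is a hypothetical stand-in, not a real DocBin.

```python
import zlib

# Hypothetical stand-in payload: any zlib stream compressed at the default
# level begins with the same two header bytes.
payload = zlib.compress(b"stand-in for a serialized DocBin")

# Show the two-byte zlib header.
print(payload[:2].hex())

# Reading binary data as UTF-8 text fails on the second header byte.
message = None
try:
    payload.decode("utf-8")
except UnicodeDecodeError as err:
    message = str(err)
    print(message)
```

If that reading is right, the file itself is fine and the command is simply being handed the wrong kind of input. The spaCy CLI documentation describes passing a config file as the first argument to debug data and supplying the data paths as `--paths.train` and `--paths.dev` overrides.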
Your Environment
- Operating System: Ubuntu 18.04
- Python Version Used: 3.8
- spaCy Version Used: 3.1.0
- Environment Information: