packaging data with a custom language component #9915
Replies: 2 comments 4 replies
-
If you're using a separate package with entry points, you would install this package alongside your pipeline. If it's not an entry point in a separate package, then the typical method would be to load the data from a kwarg setting in the component's config. If you try to use both at once, it's easy to end up with two conflicting sources for the same data. |
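For the entry-points route mentioned above, the registration lives in the separate package's setup.py. A minimal sketch, assuming a made-up package `my_postprocess` exposing a factory named `postprocess` (these names are invented for illustration, not taken from this thread):

```python
# setup.py of a hypothetical standalone package exposing a spaCy
# component factory through the "spacy_factories" entry point group.
# Package and module names here are invented for illustration.
from setuptools import setup

setup(
    name="my-postprocess",
    version="0.1.0",
    py_modules=["my_postprocess"],
    entry_points={
        # spaCy scans this group on import, so once the package is
        # installed the factory resolves without passing --code.
        "spacy_factories": ["postprocess = my_postprocess:create_postprocess"],
    },
)
```

With such a package installed, `spacy.load` can resolve the `postprocess` factory referenced in config.cfg without any `--code` flag.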
-
I can see that that would work in the end, but you're doing a lot of steps by hand that you can implement as part of the component pretty easily. I'm not 100% sure what happens with a JSON file passed to `--code`. But in general, managing additional packages is a hassle, so I would pick the following option if I were implementing this, with the data loaded from a path during initialization: https://spacy.io/usage/processing-pipelines#component-data-initialization

Then you only need to include the code with `--code` when packaging.

Another example with the standard initialization and serialization is the component in this PR, which loads patterns from JSONL: https://github.com/explosion/spaCy/pull/9880/files

And I suppose the more hacky option, if the JSON file isn't that large, is to convert it to a native Python data structure and import it directly. |
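The initialization/serialization pattern from the linked docs boils down to a component shaped like the sketch below. The names (`Postprocess`, `mapping_path`, `mapping.json`) are invented, and the `@Language.factory` registration is shown only as a comment so the snippet stands alone without spaCy installed:

```python
import json
from pathlib import Path

# In a real pipeline this class would be registered with
# @Language.factory("postprocess") from spacy.language; the
# decorator is omitted so the sketch is self-contained.
class Postprocess:
    def __init__(self):
        # Data is NOT loaded in __init__: it arrives either via
        # initialize() (training time) or from_disk() (load time).
        self.mapping = {}

    def initialize(self, get_examples=None, nlp=None, mapping_path=None):
        # Called once by nlp.initialize(); mapping_path would come
        # from the [initialize.components.*] block of config.cfg.
        if mapping_path is not None:
            self.mapping = json.loads(
                Path(mapping_path).read_text(encoding="utf8")
            )

    def __call__(self, doc):
        # Apply self.mapping to doc.ents etc.; identity for brevity.
        return doc

    def to_disk(self, path, exclude=tuple()):
        # Saved inside the pipeline directory by nlp.to_disk(), so
        # the data ships with the packaged model.
        path = Path(path)
        path.mkdir(parents=True, exist_ok=True)
        (path / "mapping.json").write_text(
            json.dumps(self.mapping), encoding="utf8"
        )

    def from_disk(self, path, exclude=tuple()):
        # Called when the packaged pipeline is loaded; no external
        # JSON file is needed anymore at that point.
        path = Path(path)
        self.mapping = json.loads(
            (path / "mapping.json").read_text(encoding="utf8")
        )
        return self
```

The JSON path then only appears in the initialize block of config.cfg; after `nlp.to_disk` the data is stored inside the pipeline directory, so the packaged model no longer depends on the external file.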
-
I am trying to add a custom language component to an existing NER pipeline. First, I added it in a Python script and saved the pipeline with `to_disk`.

My custom component file, `postprocess.py`, looks like this:

Now I have a config.cfg:

In setup.py I've added these lines:

When I package it with

`python -m spacy package ./sdoh-best ./packages --code sdoh-best/postprocess.py,sdoh-best/functions.py,sdoh-best/mapping-ontology-to-umls.json`

I am running into:

I tried adding this to the config.cfg, but that didn't help: