assemble cli generates incomplete meta.json #11515
Hi,

Using the assemble command, I am trying to put together a model that sources the morphologizer, tagger, parser, and lemmatizer from one model, and the senter from another model that contains only the senter (the senter is trained on a different corpus). Everything works, but the meta.json file generated by the assemble command does not include the "performance" section, so the accuracy values are missing. When I do something similar, adding a ner model instead of the senter, the assemble command works perfectly.

I wonder if the issue is that the first model (with the parser) and the second model (with the senter) have overlapping but different sent_p, sent_r, and sent_f values that the assemble command cannot resolve (it should replace the ones generated by the parser with those of the senter). But it could also be that I'm omitting something in the assemble config. My configuration is the following:

[components]
[components.lemmatizer]
[components.morphologizer]
[components.parser]
#[components.sentencizer]
[components.tagger]
[components.tok2vec]
[components.senter]
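A quick way to confirm the symptom described here is to check the generated meta.json for a "performance" key. The following is a minimal sketch using only the Python standard library; the function name `read_performance` and the directory passed to it are hypothetical, not part of spaCy.

```python
import json
from pathlib import Path


def read_performance(pipeline_dir):
    """Return the "performance" section of meta.json, or None if it is missing or empty."""
    meta_path = Path(pipeline_dir) / "meta.json"
    with meta_path.open(encoding="utf8") as f:
        meta = json.load(f)
    # An empty dict is treated the same as a missing section.
    return meta.get("performance") or None
```

Calling `read_performance("path/to/assembled_pipeline")` (a placeholder path) returns `None` when the assembled pipeline carries no scores.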
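Hello,
We've looked into the issue you were experiencing but couldn't reproduce the same behavior of the assemble command. I know this discussion is a bit older and that you might've moved on to work on something different, but by any chance, are you still experiencing the issue when using spacy assemble? Thanks in advance 😄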
Hi,
I will recreate the project in which I ran into this issue and make it available as a repo. I'm afraid I deleted the original project. It could take me one to two weeks since I am rather busy right now. Thanks for looking into this.
… On Jan 9, 2023, at 7:46 AM, Edward ***@***.***> wrote:
Hello,
We've looked into the issue you were experiencing but couldn't reproduce the same behavior of the assemble command. I know this discussion is a bit older and that you might've moved on to work on something different, but by any chance, are you still experiencing the issue when using spacy assemble? Thanks in advance 😄
It sounds like the spacy assemble behavior around performance metadata is inconsistent.

In general it makes sense to throw out all the performance metadata when using spacy assemble, because the final pipeline might not include all the components from the original pipeline(s), and combining components in a different order (like senter vs. parser) can also affect the final performance. There's no way for spacy assemble to have the info to be able to combine these numbers -- you really need to run a new evaluation on the new pipeline.

We will take a look to see about making this more consistent and try to provide more info to users running spacy assemble so it's clear what to expect.
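The re-evaluation suggested in this reply could be followed by a small script that copies the fresh scores into meta.json. This is only a sketch, not how spaCy itself handles metadata: the function name `add_performance` and all paths are hypothetical, and it assumes the metrics file (for example one written by `spacy evaluate` with its `--output` option) is a single JSON object.

```python
import json
from pathlib import Path


def add_performance(pipeline_dir, metrics_path):
    """Copy scores from an evaluation metrics file into meta.json."""
    meta_path = Path(pipeline_dir) / "meta.json"
    meta = json.loads(meta_path.read_text(encoding="utf8"))
    metrics = json.loads(Path(metrics_path).read_text(encoding="utf8"))
    # Overwrite any stale scores inherited from the source pipelines.
    meta["performance"] = metrics
    meta_path.write_text(json.dumps(meta, indent=2), encoding="utf8")
    return meta


# The metrics file would be produced beforehand, e.g. with:
#   python -m spacy evaluate ./assembled ./dev.spacy --output ./metrics.json
# (./assembled, ./dev.spacy and ./metrics.json are placeholder paths)
```

Overwriting the whole section, instead of merging it with older values, matches the point made above: scores from the source pipelines do not describe the assembled one.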