assemble CLI generates incomplete meta.json #11515

jmyerston · 2022-09-17T04:18:19Z

Hi,

Using the assemble command, I am trying to put together a pipeline that sources the morphologizer, tagger, parser, and lemmatizer from one model and the senter from another model that contains only the senter (the senter was trained on a different corpus).

Everything works, but the meta.json file generated by the assemble command does not include the "performance" section, so the accuracy values are missing. When I do something similar and add an ner component instead of the senter, the assemble command works perfectly.

I wonder if the issue is that the first model (with the parser) and the second model (with the senter) have overlapping but different sents_p, sents_r, and sents_f values that the assemble command cannot reconcile (it should replace the ones generated by the parser with those of the senter). But it could also be that I'm omitting something in the assemble config.

My configuration is the following:

[components]

[components.lemmatizer]
source = ${paths.tagger_source}

[components.morphologizer]
source = ${paths.tagger_source}

[components.parser]
source = ${paths.tagger_source}

#[components.sentencizer]
#source = ${paths.tagger_source}

[components.tagger]
source = ${paths.tagger_source}

[components.tok2vec]
source = ${paths.tagger_source}

[components.senter]
source = "training/small/senter/model-best/"
component = "senter"
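
A quick way to confirm the symptom described above is to inspect the meta.json in the assembled output directory. This is only a minimal sketch using the Python standard library: the function name and the "assembled_pipeline" directory are hypothetical placeholders, and it assumes meta.json sits at the top level of the output directory.

```python
import json
from pathlib import Path


def missing_meta_sections(pipeline_dir, expected=("performance",)):
    """Return the expected top-level meta.json keys that are absent or empty."""
    meta_path = Path(pipeline_dir) / "meta.json"
    with meta_path.open(encoding="utf8") as f:
        meta = json.load(f)
    # An empty section is treated the same as a missing one.
    return [key for key in expected if not meta.get(key)]


if __name__ == "__main__":
    # "assembled_pipeline" is a placeholder for the assemble output directory.
    out_dir = Path("assembled_pipeline")
    if (out_dir / "meta.json").exists():
        print(missing_meta_sections(out_dir))
```

An empty list means every expected section is present; a list containing "performance" matches the behavior reported here.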

Answered by adrianeboyd

adrianeboyd · 2022-09-23T09:04:19Z

It sounds like the spacy assemble behavior around performance metadata is inconsistent.

In general, it makes sense to throw out all the performance metadata when using spacy assemble, because the final pipeline might not include all the components from the original pipeline(s), and combining components in a different order (like senter vs. parser) can also affect the final performance. There's no way for spacy assemble to know how to combine these numbers, so you really need to run a new evaluation on the new pipeline.

We will look into making this more consistent and try to give users running spacy assemble more information so it's clear what to expect.
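
One possible way to act on the advice above is to re-evaluate the assembled pipeline, save the scores to a JSON file (for example with the --output option of spacy evaluate), and then copy them into meta.json by hand. The sketch below covers only that last step with the standard library. The function name and file paths are hypothetical, and writing the scores into meta.json manually is an assumption of this example, not something the thread says spacy assemble does.

```python
import json
from pathlib import Path


def record_performance(pipeline_dir, metrics_path):
    """Copy freshly computed scores into the "performance" section of meta.json."""
    meta_path = Path(pipeline_dir) / "meta.json"
    meta = json.loads(meta_path.read_text(encoding="utf8"))
    metrics = json.loads(Path(metrics_path).read_text(encoding="utf8"))
    # Replace any stale scores wholesale instead of merging them with old ones,
    # since scores from the source pipelines do not describe the new pipeline.
    meta["performance"] = metrics
    meta_path.write_text(json.dumps(meta, indent=2), encoding="utf8")
    return meta["performance"]
```

Replacing the section wholesale, rather than merging, follows the reasoning in the answer: scores inherited from the source pipelines cannot be combined meaningfully, so only the new evaluation is kept.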

thomashacker · 2023-01-09T15:46:14Z

Hello,

We looked into the issue you were experiencing but couldn't reproduce this behavior of the assemble command. I know this discussion is a bit old and you might have moved on to something else, but are you by any chance still experiencing the issue when using spacy assemble? Thanks in advance 😄

jmyerston · 2023-01-09T18:19:16Z

Hi, I will recreate the project in which I ran into this issue and make it available as a repo. I'm afraid I deleted the original project. It could take me one to two weeks since I am rather busy right now. Thanks for looking into this.

1 reply

thomashacker Jan 10, 2023

Thanks for the assistance!
