Appending morphologizer to Japanese pipeline #8677
hiroshi-matsuda-rit
started this conversation in
Language Support
Thanks for the report! It's great to hear the morphologizer can improve the quality of POS tags. I guess this means we can get rid of the old hacky tag conversion too. Looking forward to the ELECTRA model! Can you clarify what's "proprietary" about it?
I found that the spaCy v3 morphologizer works well with the Japanese pipeline.
We see a significant improvement in UPOS, with no obvious degradation in any other metric.
Please try ja_gsd-3.1.1.tar.gz from the release below.
https://github.com/megagonlabs/UD_Japanese-GSD/releases/tag/r2.8-NE
https://github.com/megagonlabs/UD_Japanese-GSD/blob/master/leader_board.md
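For anyone who wants to inspect the morphologizer output after installing the archive, a minimal sketch might look like the following. It assumes the archive installs as a package named `ja_gsd` (inferred from the file name, not confirmed here) and that the loaded pipeline sets `pos_` and `morph` on tokens; `describe_tokens` and `load_and_describe` are hypothetical helper names used only for illustration.

```python
def describe_tokens(doc):
    """Return (text, UPOS, morphological features) for each token-like object."""
    return [(t.text, t.pos_, str(t.morph)) for t in doc]


def load_and_describe(text, model_name="ja_gsd"):
    """Load the pipeline and describe the tokens of `text`.

    `model_name` is an assumption based on the archive file name.
    """
    # Imported lazily so describe_tokens can be used without spaCy installed.
    import spacy

    nlp = spacy.load(model_name)
    # Shows which components are present, e.g. whether "morphologizer" is listed.
    print(nlp.pipe_names)
    return describe_tokens(nlp(text))
```

Calling `load_and_describe("今日は良い天気です。")` should then return one tuple per token, which makes it easy to compare UPOS tags between pipeline versions.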
We're also evaluating Japanese pipelines with various transformer settings, including a UD-centric ELECTRA pretrained model.
After discussing the pros and cons of these model settings in our Gitter room, we'd like to submit the best one here.
https://gitter.im/spaCy_ja/spacy-transformers?utm_source=share-link&utm_medium=link&utm_campaign=share-link
Thanks,