Hmm, as long as you're using UD French Sequoia v2.5 and the exact same config, that sounds unexpected. Our reported evaluation is on the dev set rather than the test set, so maybe that explains the difference? For that particular corpus I'd be surprised if the splits were so different, but for some UD corpora there are large differences/imbalances between test and the other splits. (We're concerned about repeatedly evaluating on the test sets in case we want to run a clean evaluation for a future publication, so we set the test sets aside and don't use them in our standard training setup.)
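If you want to compare against the reported numbers directly, a rough sketch of how you might score a trained pipeline on the dev split (or, separately, the test split) with spaCy v3's API is below. The paths and pipeline directory are placeholders, and it assumes you've already converted the relevant CoNLL-U file to spaCy's binary format with `spacy convert`.

```python
import spacy
from spacy.training import Corpus

# Placeholder paths: a trained pipeline directory and the UD French Sequoia
# v2.5 dev split converted to .spacy with `spacy convert`.
nlp = spacy.load("training/model-best")
corpus = Corpus("corpus/fr_sequoia-ud-dev.spacy")

# Corpus yields gold-standard Example objects; Language.evaluate scores
# the pipeline's predictions against them.
examples = list(corpus(nlp))
scores = nlp.evaluate(examples)

# Score keys depend on which components are in the pipeline.
print(scores.get("tag_acc"), scores.get("dep_uas"), scores.get("dep_las"))
```

Running the same snippet against the test split instead of the dev split should tell you whether the gap you're seeing comes from the choice of evaluation split rather than from the config or data version.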
