@nattsw nattsw commented Mar 5, 2025

The backfill currently runs very inefficiently: it selects a Topic or Post as long as it does not have any of the target languages, checking against the whole array of locales rather than a single locale.

This PR splits the backfill job so that it runs once per locale, instead of grouping all the locales together and then finding which topics have missing translations. For example, the log line below suggests that the same 18 topics are "shared" between all the languages, even when a language already has the translation.

DiscourseTranslator: Translating 18 topics and 19 posts to en, zh_CN, es, fr, de, pt_BR, it, ar, he

This can end up loading the same topic multiple times, but the alternative is a much slower backfill whenever new languages are added to the target languages.
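To make the difference concrete, here is a minimal sketch in plain Ruby. It is not the plugin's actual code: `Topic`, `grouped_backfill`, and `per_locale_backfill` are hypothetical names, the data is an in-memory stand-in for the database, and the grouped selection rule (pick a topic if it is missing at least one target locale, then translate it to every locale) is an assumption based on the log line above.

```ruby
# Hypothetical stand-in for a topic and the locales it is already translated into.
Topic = Struct.new(:id, :locales)

# Assumed "grouped" behaviour: build one batch of topics missing at least one
# target locale, then send that whole batch to every target locale.
def grouped_backfill(topics, target_locales)
  batch = topics.select { |t| (target_locales - t.locales).any? }
  target_locales.to_h { |locale| [locale, batch.map(&:id)] }
end

# Per-locale behaviour: for each locale, pick only the topics missing that locale.
def per_locale_backfill(topics, target_locales)
  target_locales.to_h do |locale|
    [locale, topics.reject { |t| t.locales.include?(locale) }.map(&:id)]
  end
end

topics = [
  Topic.new(1, %w[en]),    # missing fr
  Topic.new(2, %w[en fr]), # fully translated
  Topic.new(3, []),        # missing everything
]
targets = %w[en fr]

grouped = grouped_backfill(topics, targets)
# => {"en"=>[1, 3], "fr"=>[1, 3]}  (topic 1 is sent to "en" again)

per_locale = per_locale_backfill(topics, targets)
# => {"en"=>[3], "fr"=>[1, 3]}     (topic 1 is only sent to "fr")
```

In this sketch topic 3 still appears under both locales, which mirrors the trade-off noted above: a topic can be loaded more than once, but no locale receives work it already has.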

@nattsw nattsw merged commit c953291 into main Mar 6, 2025
6 checks passed
@nattsw nattsw deleted the update-backfill branch March 6, 2025 14:55
@nattsw nattsw mentioned this pull request Mar 7, 2025