@nattsw
Contributor

@nattsw nattsw commented Jan 13, 2025

If the user's locale is different from the post's detected locale, a post can be translated, and the 🌐 icon will appear.

  • User is "de", post is "en", 🌐 will show
  • User is "en_GB", post is "en", 🌐 will show (bug)
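
The check above can be sketched as a plain string comparison. This is only an illustration of the behaviour described here; `translatable?` is a hypothetical helper name, not the plugin's actual method.

```ruby
# Hypothetical helper (assumption, not the plugin's API): a post is offered
# for translation when the user's locale differs from the post's detected locale.
def translatable?(user_locale, post_locale)
  user_locale != post_locale
end

translatable?("de", "en")    # => true
translatable?("en_GB", "en") # => true, although both are English (the bug)
```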

For each translator (e.g. Google, Amazon, Azure), we maintain a mapping, SUPPORTED_LANG_MAPPING, from each locale to a language that provider supports. The same mapping also determines which language a post is translated to. With most translators, "en_GB" forums translate text to "en" because of these mappings. This is not a problem for the Discourse AI provider, however: since it has no restrictions on language, every post is translatable to any (regionalised) locale.
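
As a rough sketch of how such a mapping collapses regional variants for a provider: the entries below are an illustrative subset, and `target_language` is a hypothetical lookup helper. The real tables differ per provider and are much longer.

```ruby
# Illustrative subset only (assumption); each provider keeps its own table.
SUPPORTED_LANG_MAPPING = {
  en: "en",
  en_GB: "en",
  de: "de",
}.freeze

# Hypothetical lookup: returns the provider's target language code,
# or nil when the locale has no entry in the mapping.
def target_language(locale)
  SUPPORTED_LANG_MAPPING[locale.to_sym]
end

target_language("en_GB") # => "en"
target_language("xx")    # => nil
```

Because "en_GB" and "en" resolve to the same target here, a provider using this table cannot produce a distinct "en_GB" translation.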

This PR introduces a site setting normalize_language_variants so that admins can indicate whether (using "ar" as an example locale):

  1. They want "ar" == "ar_SA" == "ar_MA", or
  2. They would prefer that "ar_SA" is not the same as "ar".

This PR normalizes locales within the Discourse AI translator, so that it treats "ar" as the same language as "ar_SA".
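
A minimal sketch of what such normalization could look like, assuming locales are compared by their base language (the part before the first `_` or `-`). Both `normalize_locale` and `translatable?` are hypothetical names for illustration and may not match the merged code.

```ruby
# Hypothetical helper (assumption): reduce a locale to its base language,
# so "ar_SA" and "ar-MA" both become "ar".
def normalize_locale(locale)
  locale.to_s.split(/[_-]/).first
end

# Compare normalized locales, so regional variants of the same
# language are no longer treated as different languages.
def translatable?(user_locale, post_locale)
  normalize_locale(user_locale) != normalize_locale(post_locale)
end

normalize_locale("ar_SA")    # => "ar"
translatable?("en_GB", "en") # => false
translatable?("de", "en")    # => true
```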

@nattsw nattsw force-pushed the normalize-language branch 7 times, most recently from b5da794 to 07eabc3 Compare January 13, 2025 04:46
The criterion for allowing a post to be translated is that the user's locale is different from the site's default locale.
For each translator (e.g. Google, Amazon, Azure), we maintain a list of mappings `SUPPORTED_LANG_MAPPING` for each locale. On top of that, this mapping is also used to define what language the post should be translated to. When using most translators available in the plugin, "en_GB" forums will translate text to "en" due to these mappings. This is not a problem for the Discourse AI provider, since there are no restrictions on language.

This PR introduces a site setting `normalize_language_variants_map` so that admins can indicate whether
A. They want `"pt" == "pt_BR" == "pt_PT"` or `"zh" == "zh_CN" == "zh_TW"`, or
B. They would prefer that `"en"` is not the same as `"en_GB"`.
@nattsw nattsw force-pushed the normalize-language branch from 07eabc3 to 6eaac39 Compare January 13, 2025 04:48
@SamSaffron
Member

My only concern here is that this problem is being offloaded to the admin. Should the plugin just handle all of this transparently, meaning use maps for providers but pass the raw information on to Discourse AI?

@nattsw
Contributor Author

nattsw commented Jan 13, 2025

My original implementation was for the site setting to be a simple toggle that normalizes by default.

We can try something simpler for now without a site setting, which is to have Discourse AI just normalize by default.

@nattsw nattsw changed the title from "FIX: Allow admins to define locales that should be normalized" to "FIX: Normalize languages within Discourse AI translator" Jan 13, 2025
@nattsw nattsw merged commit 346d47c into main Jan 13, 2025
3 checks passed
@nattsw nattsw deleted the normalize-language branch January 13, 2025 09:13