Commit 6dffee0

fix handling chat templates in utf-8 encoding (#3563)

1 parent cfa983e

File tree

3 files changed: +1 -3 lines changed

docs/deploying_server_baremetal.md

Lines changed: 0 additions & 2 deletions

@@ -127,8 +127,6 @@ Run `setupvars` script to set required environment variables.
 
 > **Note**: If package contains Python, running this script changes Python settings for the shell that runs it. Environment variables are set only for the current shell so make sure you rerun the script before using model server in a new shell.
 
-> **Note**: If package contains Python, OVMS uses Python's Jinja package to apply chat template when serving LLMs. In such case, please ensure you have Windows "Beta Unicode UTF-8 for worldwide language support" enabled. [Instruction](llm_utf8_troubleshoot.png)
-
 You can also build model server from source by following the [developer guide](windows_developer_guide.md).
 
 :::

docs/llm_utf8_troubleshoot.png

Binary file deleted (-88.3 KB); not shown.

src/llm/servable_initializer.cpp

Lines changed: 1 addition & 1 deletion

@@ -92,7 +92,7 @@ void GenAiServableInitializer::loadPyTemplateProcessor(std::shared_ptr<GenAiServ
             # Try to read data from tokenizer_config.json
             tokenizer_config_file = Path(templates_directory + "/tokenizer_config.json")
             if tokenizer_config_file.is_file():
-                f = open(templates_directory + "/tokenizer_config.json")
+                f = open(templates_directory + "/tokenizer_config.json", "r", encoding="utf-8")
                 data = json.load(f)
                 bos_token = data.get("bos_token", "")
                 bos_token = "" if bos_token is None else bos_token  # Null token conversion to empty string.
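The one-line fix above matters because Python's built-in `open()` defaults to the platform's locale encoding (often cp1252 on Windows), so a `tokenizer_config.json` whose chat template contains non-ASCII characters can raise `UnicodeDecodeError` or be silently mis-decoded. A minimal sketch of the fixed read path, with an illustrative (hypothetical) config containing non-ASCII content:

```python
import json
import tempfile
from pathlib import Path

# Hypothetical tokenizer_config.json with non-ASCII text in the chat
# template, as real tokenizer configs often have.
config = {
    "bos_token": "<s>",
    "chat_template": "{{ '用户: ' + messages[0]['content'] }}",
}

with tempfile.TemporaryDirectory() as templates_directory:
    path = Path(templates_directory) / "tokenizer_config.json"
    # Tokenizer configs ship as UTF-8, so write the sample file as UTF-8.
    path.write_text(json.dumps(config, ensure_ascii=False), encoding="utf-8")

    # Without encoding="utf-8", open() falls back to the locale encoding
    # (e.g. cp1252 on Windows) and decoding the template can fail.
    with open(path, "r", encoding="utf-8") as f:
        data = json.load(f)

    bos_token = data.get("bos_token", "")
    bos_token = "" if bos_token is None else bos_token  # Null -> empty string.
    print(bos_token)  # -> <s>
```

Passing the encoding explicitly makes the read deterministic across platforms instead of depending on the user's locale settings, which is why the commit could also drop the documentation note about enabling Windows' beta UTF-8 mode.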

0 commit comments