hub/apps/design/globalizing/use-utf8-code-page.md
Lines changed: 11 additions & 7 deletions
@@ -1,7 +1,7 @@
 ---
-description: How to use UTF code pages in Windows apps.
 title: Use UTF-8 code pages in Windows apps
-ms.date: 01/11/2022
+description: How to use UTF code pages in Windows apps.
+ms.date: 06/21/2023
 ms.topic: article
 ms.custom: seo-windows-dev
 ---
@@ -16,7 +16,10 @@ UTF-8 is the universal code page for internationalization and is able to encode

 As of Windows Version 1903 (May 2019 Update), you can use the ActiveCodePage property in the appxmanifest for packaged apps, or the fusion manifest for unpackaged apps, to force a process to use UTF-8 as the process code page.

-You can declare this property and target/run on earlier Windows builds, but you must handle legacy code page detection and conversion as usual. With a minimum target version of Windows Version 1903, the process code page will always be UTF-8 so legacy code page detection and conversion can be avoided.
+> [!NOTE]
+> GDI doesn't currently support setting the ActiveCodePage property per process. Instead, GDI defaults to the active system codepage. To configure your app to render UTF-8 text via GDI, go to Windows **Settings** > **Time \& language** > **Language \& region** > **Administrative language settings** > **Change system locale**, and check **Beta: Use Unicode UTF-8 for worldwide language support**. Then reboot the PC for the change to take effect.
+
+You can declare the ActiveCodePage property, and target/run on earlier Windows builds, but you must handle legacy code page detection and conversion as usual. With a minimum target version of Windows Version 1903, the process code page will always be UTF-8, so legacy code page detection and conversion can be avoided.

 > [!NOTE]
 > An encoded character takes between 1 and 4 bytes. UTF-8 encoding supports longer byte sequences, up to 6 bytes, but the biggest code point of Unicode 6.0 (U+10FFFF) only takes 4 bytes.
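
The 1-to-4-byte range in the note above follows from the standard UTF-8 code point ranges. The sketch below illustrates that mapping; `Utf8EncodedLength` is a hypothetical helper name introduced here, not part of any Windows API or of the article being changed.

```cpp
#include <cstdint>

// Hypothetical helper (illustration only): number of bytes UTF-8 needs to
// encode a single Unicode code point. Returns 0 for values that are not
// valid Unicode scalar values.
int Utf8EncodedLength(std::uint32_t codePoint)
{
    // UTF-16 surrogate halves are not encodable on their own.
    if (codePoint >= 0xD800 && codePoint <= 0xDFFF) return 0;

    if (codePoint <= 0x7F)     return 1;  // ASCII
    if (codePoint <= 0x7FF)    return 2;  // e.g. U+00E9
    if (codePoint <= 0xFFFF)   return 3;  // rest of the Basic Multilingual Plane
    if (codePoint <= 0x10FFFF) return 4;  // supplementary planes

    return 0;  // beyond the Unicode range
}
```

For example, `Utf8EncodedLength(0x20AC)` (the euro sign) returns 3, and `Utf8EncodedLength(0x10FFFF)` returns 4, which matches the upper bound stated in the note.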
@@ -59,22 +62,23 @@ You can declare this property and target/run on earlier Windows builds, but you
 ```

 > [!NOTE]
-> Add a manifest to an existing executable from the command line with `mt.exe -manifest <MANIFEST> -outputresource:<EXE>;#1`
+> Add a manifest to an existing executable from the command line with `mt.exe -manifest <MANIFEST> -outputresource:<EXE>;#1`.

 ## -A vs. -W APIs

 Win32 APIs often support both -A and -W variants.

 -A variants recognize the ANSI code page configured on the system and support `char*`, while -W variants operate in UTF-16 and support `WCHAR`.

-Until recently, Windows has emphasized "Unicode" -W variants over -A APIs. However, recent releases have used the ANSI code page and -A APIs as a means to introduce UTF-8 support to apps. If the ANSI code page is configured for UTF-8, -A APIs typically operate in UTF-8. This model has the benefit of supporting existing code built with -A APIs without any code changes.
+Until recently, Windows has emphasized "Unicode" -W variants over -A APIs. However, recent releases have used the ANSI code page and -A APIs as a means to introduce UTF-8 support to apps. If the ANSI code page is configured for UTF-8, then -A APIs typically operate in UTF-8. This model has the benefit of supporting existing code built with -A APIs without any code changes.

 ## Code page conversion

-As Windows operates natively in UTF-16 (`WCHAR`), you might need to convert UTF-8 data to UTF-16 (or vice versa) to interoperate with Windows APIs.
+Because Windows operates natively in UTF-16 (`WCHAR`), you might need to convert UTF-8 data to UTF-16 (or vice versa) to interoperate with Windows APIs.

 [MultiByteToWideChar](/windows/desktop/api/stringapiset/nf-stringapiset-multibytetowidechar) and [WideCharToMultiByte](/windows/desktop/api/stringapiset/nf-stringapiset-widechartomultibyte) let you convert between UTF-8 and UTF-16 (`WCHAR`) (and other code pages). This is particularly useful when a legacy Win32 API might only understand `WCHAR`. These functions allow you to convert UTF-8 input to `WCHAR` to pass into a -W API and then convert any results back if necessary.
-When using these functions with `CodePage` set to `CP_UTF8`, use `dwFlags` of either `0` or `MB_ERR_INVALID_CHARS`, otherwise an `ERROR_INVALID_FLAGS` occurs.
+
+Use `dwFlags` of either `0` or `MB_ERR_INVALID_CHARS` when using these functions with `CodePage` set to `CP_UTF8` (otherwise an `ERROR_INVALID_FLAGS` occurs).

 > [!NOTE]
 > `CP_ACP` equates to `CP_UTF8` only if running on Windows Version 1903 (May 2019 Update) or above and the ActiveCodePage property described above is set to UTF-8. Otherwise, it honors the legacy system code page. We recommend using `CP_UTF8` explicitly.
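
The conversion guidance in this hunk can be sketched roughly as follows. This is a minimal illustration, not code from the article: `Utf8ToWide` and `WideToUtf8` are hypothetical helper names, the error handling is an assumption, and the `#ifdef _WIN32` guard is only there so the file still compiles on other platforms. It passes `CP_UTF8` explicitly rather than `CP_ACP`, and uses only `MB_ERR_INVALID_CHARS` or `0` for `dwFlags`.

```cpp
#ifdef _WIN32  // Windows-only: these APIs are declared via <windows.h>
#include <windows.h>
#include <stdexcept>
#include <string>

// Hypothetical helper: UTF-8 -> UTF-16, for passing text to a -W API.
std::wstring Utf8ToWide(const std::string& utf8)
{
    if (utf8.empty()) return std::wstring();

    // First call: query the required length in WCHARs.
    // MB_ERR_INVALID_CHARS makes malformed UTF-8 fail instead of being replaced.
    const int needed = MultiByteToWideChar(
        CP_UTF8, MB_ERR_INVALID_CHARS,
        utf8.data(), static_cast<int>(utf8.size()), nullptr, 0);
    if (needed == 0) throw std::runtime_error("MultiByteToWideChar failed");

    // Second call: perform the conversion into the sized buffer.
    std::wstring wide(static_cast<size_t>(needed), L'\0');
    const int written = MultiByteToWideChar(
        CP_UTF8, MB_ERR_INVALID_CHARS,
        utf8.data(), static_cast<int>(utf8.size()), &wide[0], needed);
    if (written == 0) throw std::runtime_error("MultiByteToWideChar failed");
    return wide;
}

// Hypothetical helper: UTF-16 -> UTF-8, for converting results back.
std::string WideToUtf8(const std::wstring& wide)
{
    if (wide.empty()) return std::string();

    // With CP_UTF8 the last two parameters must be null; dwFlags is 0 here.
    const int needed = WideCharToMultiByte(
        CP_UTF8, 0, wide.data(), static_cast<int>(wide.size()),
        nullptr, 0, nullptr, nullptr);
    if (needed == 0) throw std::runtime_error("WideCharToMultiByte failed");

    std::string utf8(static_cast<size_t>(needed), '\0');
    const int written = WideCharToMultiByte(
        CP_UTF8, 0, wide.data(), static_cast<int>(wide.size()),
        &utf8[0], needed, nullptr, nullptr);
    if (written == 0) throw std::runtime_error("WideCharToMultiByte failed");
    return utf8;
}
#endif  // _WIN32
```

Both helpers call the API twice: once with a zero-sized output buffer to learn the required length, and once to convert. Passing an explicit length instead of `-1` means the result carries no extra terminating null.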