From 680651c31a656572f9cb72d71fc1eddfdc2b3a9e Mon Sep 17 00:00:00 2001
From: Peter Bierma <zintensitydev@gmail.com>
Date: Sat, 30 Nov 2024 17:10:54 -0500
Subject: [PATCH 1/9] Document what happens when PyUnicode_AsUTF8() is given a
 string with embedded null characters.

---
 Doc/c-api/unicode.rst | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index 59bd7661965d93..7b77305ab889de 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -1035,6 +1035,12 @@ These are the UTF-8 codec APIs:
 
    As :c:func:`PyUnicode_AsUTF8AndSize`, but does not store the size.
 
+   .. warning::
+
+      This function does not strip null bytes from *unicode*, so the length of the
+      returned string (from ``strlen()``) is possibly smaller than the length of the
+      passed unicode object.
+
    .. versionadded:: 3.3
 
    .. versionchanged:: 3.7

From 8bfd541063edd934c9c00fcbf284e328efa4ed4e Mon Sep 17 00:00:00 2001
From: Peter Bierma <zintensitydev@gmail.com>
Date: Sat, 30 Nov 2024 17:15:54 -0500
Subject: [PATCH 2/9] Suggest PyUnicode_AsUTF8AndSize for user input.

---
 Doc/c-api/unicode.rst | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index 7b77305ab889de..28fb7a4e304752 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -1039,7 +1039,8 @@ These are the UTF-8 codec APIs:
 
       This function does not strip null bytes from *unicode*, so the length of the
       returned string (from ``strlen()``) is possibly smaller than the length of the
-      passed unicode object.
+      passed unicode object. Prefer :c:func:`PyUnicode_AsUTF8AndSize` when dealing with
+      user input.
 
    .. versionadded:: 3.3
 

From 52e91172badf8ebddef18e160ca472424813bfb2 Mon Sep 17 00:00:00 2001
From: Peter Bierma <zintensitydev@gmail.com>
Date: Sat, 30 Nov 2024 17:18:47 -0500
Subject: [PATCH 3/9] Switch to a note instead of a warning.

---
 Doc/c-api/unicode.rst | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index 28fb7a4e304752..db61c76090d386 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -1035,12 +1035,12 @@ These are the UTF-8 codec APIs:
 
    As :c:func:`PyUnicode_AsUTF8AndSize`, but does not store the size.
 
-   .. warning::
+   .. note::
 
-      This function does not strip null bytes from *unicode*, so the length of the
-      returned string (from ``strlen()``) is possibly smaller than the length of the
-      passed unicode object. Prefer :c:func:`PyUnicode_AsUTF8AndSize` when dealing with
-      user input.
+      This function does not handle null bytes inside of *unicode*, so the length of the
+      returned string (from ``strlen()``) could be smaller than the length of the
+      passed unicode object, if the string contained embedded null characters. Prefer
+      :c:func:`PyUnicode_AsUTF8AndSize` when dealing with user input.
 
    .. versionadded:: 3.3
 

From 1b393d4aa35a3baab6036bcdaf46fad14fa37c5b Mon Sep 17 00:00:00 2001
From: Peter Bierma <zintensitydev@gmail.com>
Date: Sun, 1 Dec 2024 08:53:45 -0500
Subject: [PATCH 4/9] Reword the PyUnicode_AsUTF8() note in Doc/c-api/unicode.rst

Co-authored-by: Stan U. <89152624+StanFromIreland@users.noreply.github.com>
---
 Doc/c-api/unicode.rst | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index db61c76090d386..1b55f804e0da73 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -1037,10 +1037,10 @@ These are the UTF-8 codec APIs:
 
    .. note::
 
-      This function does not handle null bytes inside of *unicode*, so the length of the
+      This function does not handle null bytes within the unicode object. As a result, the length of the
       returned string (from ``strlen()``) could be smaller than the length of the
-      passed unicode object, if the string contained embedded null characters. Prefer
-      :c:func:`PyUnicode_AsUTF8AndSize` when dealing with user input.
+      passed unicode object, if the string contained embedded null characters. When handling user input, 
+      it is recommended to use :c:func:`PyUnicode_AsUTF8AndSize` instead.
 
    .. versionadded:: 3.3
 

From 040608b59e4437766eb587700a60074d5bc654cb Mon Sep 17 00:00:00 2001
From: Peter Bierma <zintensitydev@gmail.com>
Date: Sun, 1 Dec 2024 09:41:18 -0500
Subject: [PATCH 5/9] Remove trailing whitespace in Doc/c-api/unicode.rst

Co-authored-by: Tomas R. <tomas.roun8@gmail.com>
---
 Doc/c-api/unicode.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index 1b55f804e0da73..a487997c7406bb 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -1039,7 +1039,7 @@ These are the UTF-8 codec APIs:
 
       This function does not handle null bytes within the unicode object. As a result, the length of the
       returned string (from ``strlen()``) could be smaller than the length of the
-      passed unicode object, if the string contained embedded null characters. When handling user input, 
+      passed unicode object, if the string contained embedded null characters. When handling user input,
       it is recommended to use :c:func:`PyUnicode_AsUTF8AndSize` instead.
 
    .. versionadded:: 3.3

From 6fb8cbe80a609c15c9f9c6839800f04d65e15943 Mon Sep 17 00:00:00 2001
From: Peter Bierma <zintensitydev@gmail.com>
Date: Sun, 15 Dec 2024 10:46:19 -0500
Subject: [PATCH 6/9] Tighten the wording and switch back to a warning.

---
 Doc/c-api/unicode.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index a487997c7406bb..2fa481e5daad6d 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -1035,11 +1035,11 @@ These are the UTF-8 codec APIs:
 
    As :c:func:`PyUnicode_AsUTF8AndSize`, but does not store the size.
 
-   .. note::
+   .. warning::
 
-      This function does not handle null bytes within the unicode object. As a result, the length of the
-      returned string (from ``strlen()``) could be smaller than the length of the
-      passed unicode object, if the string contained embedded null characters. When handling user input,
+      This function does not handle null bytes within the unicode object.
+      As a result, the length of the returned string could be interpreted as
+      smaller than the length of *unicode*. When handling user input,
       it is recommended to use :c:func:`PyUnicode_AsUTF8AndSize` instead.
 
    .. versionadded:: 3.3

From 3c7b6be694a7175a1583aa4e3a2bec08adeaccb6 Mon Sep 17 00:00:00 2001
From: Peter Bierma <zintensitydev@gmail.com>
Date: Mon, 13 Jan 2025 12:37:19 -0500
Subject: [PATCH 7/9] Add a reference for null bytes and describe the truncation.

---
 Doc/c-api/unicode.rst | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index 2fa481e5daad6d..35e388f7cf7667 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -1037,10 +1037,12 @@ These are the UTF-8 codec APIs:
 
    .. warning::
 
-      This function does not handle null bytes within the unicode object.
-      As a result, the length of the returned string could be interpreted as
-      smaller than the length of *unicode*. When handling user input,
-      it is recommended to use :c:func:`PyUnicode_AsUTF8AndSize` instead.
+      This function does not have any special behavior for
+      `null bytes <https://en.wikipedia.org/wiki/Null_character>`_ embedded within
+      *unicode*. As a result, strings containing null bytes will remain in the returned
+      string, which some C functions might interpret as the end of the string, leading to
+      truncation. When handling user input, it is recommended to use :c:func:`PyUnicode_AsUTF8AndSize`
+      instead.
 
    .. versionadded:: 3.3
 

From 0eac45f93af4328a0c11c1804370eba86e80827b Mon Sep 17 00:00:00 2001
From: Peter Bierma <zintensitydev@gmail.com>
Date: Mon, 13 Jan 2025 15:45:51 -0500
Subject: [PATCH 8/9] Say 'null characters' instead of 'null bytes'

Co-authored-by: Victor Stinner <vstinner@python.org>
---
 Doc/c-api/unicode.rst | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index 35e388f7cf7667..c3c7516d3b908b 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -1038,8 +1038,8 @@ These are the UTF-8 codec APIs:
    .. warning::
 
       This function does not have any special behavior for
-      `null bytes <https://en.wikipedia.org/wiki/Null_character>`_ embedded within
-      *unicode*. As a result, strings containing null bytes will remain in the returned
+      `null characters <https://en.wikipedia.org/wiki/Null_character>`_ embedded within
+      *unicode*. As a result, strings containing null characters will remain in the returned
       string, which some C functions might interpret as the end of the string, leading to
       truncation. When handling user input, it is recommended to use :c:func:`PyUnicode_AsUTF8AndSize`
       instead.

From 35e078386e454696182be5372af3d0b4941a94c9 Mon Sep 17 00:00:00 2001
From: Peter Bierma <zintensitydev@gmail.com>
Date: Mon, 13 Jan 2025 15:47:01 -0500
Subject: [PATCH 9/9] Switch the wording away from 'user input'

---
 Doc/c-api/unicode.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/Doc/c-api/unicode.rst b/Doc/c-api/unicode.rst
index c3c7516d3b908b..cd878f13765d15 100644
--- a/Doc/c-api/unicode.rst
+++ b/Doc/c-api/unicode.rst
@@ -1041,7 +1041,7 @@ These are the UTF-8 codec APIs:
       `null characters <https://en.wikipedia.org/wiki/Null_character>`_ embedded within
       *unicode*. As a result, strings containing null characters will remain in the returned
       string, which some C functions might interpret as the end of the string, leading to
-      truncation. When handling user input, it is recommended to use :c:func:`PyUnicode_AsUTF8AndSize`
+      truncation. If truncation is an issue, it is recommended to use :c:func:`PyUnicode_AsUTF8AndSize`
       instead.
 
    .. versionadded:: 3.3
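
Reviewer note (not part of the patch): the truncation described in the final
warning can be illustrated with a small standalone C sketch. It does not link
against Python. The names utf8_data, utf8_size, has_embedded_null and
c_string_length are hypothetical and exist only for this illustration;
utf8_data stands in for the buffer, and utf8_size for the size, that
PyUnicode_AsUTF8AndSize() would be expected to report for the Python string
"ab\0cd".

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the UTF-8 buffer of the Python string "ab\0cd".
 * The array holds 5 data bytes plus the trailing terminator. */
static const char utf8_data[] = "ab\0cd";

/* Stand-in for the size that the size-reporting API would store:
 * the number of bytes, excluding the trailing terminator. */
static const size_t utf8_size = sizeof(utf8_data) - 1;

/* Return 1 if a null character appears before the reported end of the
 * buffer, 0 otherwise. */
static int has_embedded_null(const char *buf, size_t size)
{
    return memchr(buf, '\0', size) != NULL;
}

/* Length as seen by strlen()-style C functions, which stop at the first
 * null character and therefore ignore everything after it. */
static size_t c_string_length(const char *buf)
{
    return strlen(buf);
}
```

Here c_string_length(utf8_data) sees only the bytes before the first null
character, while utf8_size covers the whole buffer. In an extension module, the
same comparison could be made between strlen() of the returned pointer and the
size stored by PyUnicode_AsUTF8AndSize(); a mismatch indicates an embedded null
character that the caller can reject or handle explicitly.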