Commit 1bc84ca

further edits to address feedback
1 parent 8c0b883 commit 1bc84ca

1 file changed: +11 -9 lines changed

doc/source/user_guide/migration-3-strings.rst

Lines changed: 11 additions & 9 deletions
@@ -86,14 +86,17 @@ It can also be specified explicitly using the ``"str"`` alias:
 2 NaN
 dtype: str

+Similarly, functions like :func:`read_csv`, :func:`read_parquet`, and otherwise
+will now use the new string dtype when reading string data.
+
 In contrast to the current object dtype, the new string dtype will only store
 strings. This also means that it will raise an error if you try to store a
 non-string value in it (see below for more details).

-Missing values with the new string dtype are always represented as ``NaN``, and
-the missing value behaviour is similar to other default dtypes.
+Missing values with the new string dtype are always represented as ``NaN`` (``np.nan``),
+and the missing value behavior is similar to other default dtypes.

-This new string dtype should work the same as how you have been
+This new string dtype should otherwise work the same as how you have been
 using pandas with string data today. For example, all string-specific methods
 through the ``str`` accessor will work the same:

@@ -112,13 +115,13 @@ through the ``str`` accessor will work the same:
 class. The dtype can be constructed as ``pd.StringDtype(na_value=np.nan)``,
 but for general usage we recommend to use the shorter ``"str"`` alias.

-Overview of behaviour differences and how to address them
+Overview of behavior differences and how to address them
 ---------------------------------------------------------

 The dtype is no longer object dtype
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

-When inferring string data, the data type of the resulting DataFrame column or
+When inferring or reading string data, the data type of the resulting DataFrame column or
 Series will silently start being the new ``"str"`` dtype instead of ``"object"``
 dtype, and this can have some impact on your code.

@@ -209,7 +212,7 @@ the missing value sentinel is always NaN (``np.nan``):
 >>> print(ser[2])
 nan

-Generally this should be no problem when relying on missing value behaviour in
+Generally this should be no problem when relying on missing value behavior in
 pandas methods (for example, ``ser.isna()`` will give the same result as before).
 But when you relied on the exact value of ``None`` being present, that can
 impact your code.
@@ -227,9 +230,8 @@ the dtype and the exact missing value sentinel:
 True

 One caveat: this function works both on scalars and on array-likes, and in the
-latter case it will return an array of boolean dtype. When using it in a boolean
-context (for example, ``if pd.isna(..): ..``) be sure to only pass a scalar to
-it.
+latter case it will return an array of bools. When using it in a Boolean context
+(for example, ``if pd.isna(..): ..``) be sure to only pass a scalar to it.

 "setitem" operations will now raise an error for non-string data
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
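
The ``pd.isna`` caveat in the last hunk (scalar versus array-like input) can be sketched as follows. This is a minimal illustration and not part of the commit. The variable names are made up for the example, and it assumes NumPy and pandas are installed.

```python
import numpy as np
import pandas as pd

# Scalar input: pd.isna returns a single bool, which is safe inside an `if`.
scalar_results = [pd.isna(None), pd.isna(np.nan), pd.isna("a")]

# Array-like input: pd.isna returns an element-wise array of bools.
mask = pd.isna(np.array(["a", None, np.nan], dtype=object))

# Using that array directly in a Boolean context is ambiguous and raises.
try:
    if mask:
        pass
    raised = False
except ValueError:
    raised = True

# When a single truth value is needed, reduce the array explicitly.
any_missing = bool(mask.any())
```

Reducing with ``.any()`` or ``.all()`` makes the intent explicit when the input may be array-like. Passing only scalars, as the text advises, avoids the problem altogether.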
