@@ -50,7 +50,7 @@ Since its introduction, the `StringDtype` has always been opt-in, and has used
  the experimental `pd.NA` sentinel for missing values (which was also [introduced
  in pandas 1.0](https://pandas.pydata.org/docs/whatsnew/v1.0.0.html#experimental-na-scalar-to-denote-missing-values)).
  However, up to this date, pandas has not yet taken the step to use `pd.NA` by
- default, and thus the `StringDtype` deviates in missing value behaviour compared
+ default for any dtype, and thus the `StringDtype` deviates in missing value behaviour compared
  to the default data types.

  In 2023, [PDEP-10](https://pandas.pydata.org/pdeps/0010-required-pyarrow-dependency.html)
@@ -116,7 +116,7 @@ By default, pandas will infer this new string dtype instead of object dtype for
  string data (when creating pandas objects, such as in constructors or IO
  functions).

- The existing `future.infer_string` option can be used to opt-in to the future
+ In pandas 2.2, the existing `future.infer_string` option can be used to opt-in to the future
  default behaviour:

  ```python
@@ -202,9 +202,9 @@ dtype need a way to specify this.

  Currently (pandas 2.2), `StringDtype(storage="pyarrow_numpy")` is used, where
  the `"pyarrow_numpy"` storage was used to disambiguate from the existing
- `"pyarrow"` option using `pd.NA`. However, "pyarrow_numpy" is a rather confusing
+ `"pyarrow"` option using `pd.NA`. However, `"pyarrow_numpy"` is a rather confusing
  option and doesn't generalize well. Therefore, this PDEP proposes a new naming
- scheme as outlined below, and "pyarrow_numpy" will be deprecated and removed
+ scheme as outlined below, and `"pyarrow_numpy"` will be deprecated and removed
  before pandas 3.0.

  The `storage` keyword of `StringDtype` is kept to disambiguate the underlying
@@ -258,7 +258,7 @@ However:
  dtype that has massive benefits for users, both in usability as (for the
  significant part of the user base that has PyArrow installed) in performance.
  2. In case pandas eventually transitions to use `pd.NA` as the default missing value
- sentinel, a migration path for _all_ our data types will be needed, and thus
+ sentinel, a migration path for _all_ pandas data types will be needed, and thus
  the challenges around this will not be unique to the string dtype and
  therefore not a reason to delay this.
