@@ -161,7 +161,7 @@ the missing value sentinel, and:
 
 Because the original `StringDtype` implementations already use `pd.NA` and
 return masked integer and boolean arrays in operations, a new variant of the
-existing dtypes that uses `NaN` and default data types is needed. The original
+existing dtypes that uses `NaN` and default data types was needed. The original
 variant of `StringDtype` using `pd.NA` will still be available for those who
 want to keep using it (see below in the "Naming" subsection for how to specify
 this).
@@ -175,8 +175,8 @@ this (adding a new variant of the dtype) and a new `StringArray` subclass only
 needs minor changes to follow the above-mentioned missing value semantics
 ([GH-58451](https://github.com/pandas-dev/pandas/pull/58451)).
 
-For pandas 3.0, this is the most realistic option given this implementation is
-already available for a long time. Beyond 3.0, we can still explore further
+For pandas 3.0, this is the most realistic option given this implementation has
+already been available for a long time. Beyond 3.0, we can still explore further
 improvements such as using NumPy 2.0 ([GH-58503](https://github.com/pandas-dev/pandas/issues/58503))
 or nanoarrow ([GH-58552](https://github.com/pandas-dev/pandas/issues/58552)),
 but at that point that is an implementation detail that should not have a
@@ -362,16 +362,23 @@ options:
 ## Timeline
 
 The future PyArrow-backed string dtype was already made available behind a feature
-flag in pandas 2.1 (by `pd.options.future.infer_string = True`).
+flag in pandas 2.1 (enabled by `pd.options.future.infer_string = True`).
 
-Some small enhancements or fixes (or naming changes) might still be needed and
-can be backported to pandas 2.2.x.
+Some small enhancements or fixes might still be needed and can continue to be
+backported to pandas 2.2.x.
 
-The variant using numpy object-dtype could potentially also be backported to
-2.2.x to allow easier testing.
+The variant using numpy object-dtype can also be backported to the 2.2.x branch
+to allow easier testing. We would propose to release this as 2.3.0 (created from
+the 2.2.x branch, given that the main branch already includes many other changes
+targeted for 3.0), together with the deprecation warning when creating a dtype
+from `"string"` / `pd.StringDtype()`.
 
-For pandas 3.0, this flag becomes enabled by default.
+The 2.3.0 release would then have all future string functionality available
+(both the pyarrow and object-dtype based variants of the default string dtype),
+and warn existing users of the `StringDtype` in advance of 3.0 about how to
+update their code.
 
+For pandas 3.0, this `future.infer_string` flag becomes enabled by default.
 
 ## PDEP-XX History
 