API: Setting Arrow-backed dtypes by default #51433

I've been using the new Arrow-backed dtypes, and I'm a bit confused about how the backend is chosen. One example:

```python
>>> import pandas
>>> with pandas.option_context("mode.dtype_backend", "pyarrow"):
...     pandas.Series([1, 2, 3, 4])
... 
0    1
1    2
2    3
3    4
dtype: int64
```

Why is setting the `dtype_backend` to `pyarrow` not enough to use Arrow in the `Series` constructor when no dtype is specified?

Also, when using `read_csv`, for example:

```python
>>> import pandas
>>> pandas.read_csv('test.csv').dtypes
name    object
age      int64
dtype: object
>>> pandas.read_csv('test.csv', use_nullable_dtypes=True).dtypes
name    string[python]
age              Int64
dtype: object
>>> with pandas.option_context("mode.dtype_backend", "pyarrow"):
...     pandas.read_csv('test.csv').dtypes
... 
name    object
age      int64
dtype: object
>>> with pandas.option_context("mode.dtype_backend", "pyarrow"):
...     pandas.read_csv('test.csv', use_nullable_dtypes=True).dtypes
... 
name    string[pyarrow]
age      int64[pyarrow]
dtype: object
```

Again, why is it not enough for the user to set the backend to `pyarrow` to get Arrow dtypes? Why do they also need to pass `use_nullable_dtypes=True`? This is what we return, which doesn't make sense to me:

| | dtype_backend=None | dtype_backend=pyarrow |
|-|-|-|
| use_nullable_dtypes=False| NumPy | **NumPy ???** |
| use_nullable_dtypes=True| Arrow+NumPy nullables | Arrow |

What I would expect:

| | dtype_backend=None | dtype_backend=pyarrow |
|-|-|-|
| use_nullable_dtypes=False| NumPy | Arrow |
| use_nullable_dtypes=True| Arrow eventually, Arrow+NumPy nullables for now | Arrow |
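
To make the difference between the two tables concrete, here is a small sketch in plain Python. The function names `observed_backend` and `expected_backend` are hypothetical and not part of pandas; they only encode the table cells above, with the "Arrow eventually" cell mapped to its "for now" value.

```python
from typing import Optional


def observed_backend(use_nullable_dtypes: bool, dtype_backend: Optional[str]) -> str:
    """Hypothetical helper: the cells of the first table (what is returned today)."""
    if not use_nullable_dtypes:
        # The option is ignored unless use_nullable_dtypes=True is also passed.
        return "NumPy"
    return "Arrow" if dtype_backend == "pyarrow" else "Arrow+NumPy nullables"


def expected_backend(use_nullable_dtypes: bool, dtype_backend: Optional[str]) -> str:
    """Hypothetical helper: the cells of the second table (what I would expect)."""
    if dtype_backend == "pyarrow":
        # Setting the backend is enough on its own.
        return "Arrow"
    return "Arrow+NumPy nullables" if use_nullable_dtypes else "NumPy"


# Collect the (use_nullable_dtypes, dtype_backend) combinations where the tables disagree.
differing = [
    (nullable, backend)
    for nullable in (False, True)
    for backend in (None, "pyarrow")
    if observed_backend(nullable, backend) != expected_backend(nullable, backend)
]
print(differing)  # [(False, 'pyarrow')]
```

The only disagreement is the cell marked with question marks: the backend is set to `pyarrow` but `use_nullable_dtypes` is left at `False`.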

Sorry if I missed the discussion; maybe I'm just missing something. But I don't see the use case for a user explicitly asking for Arrow types via the option and still getting NumPy-backed Series and DataFrames. Is this something that was agreed on, or did we just not make the changes needed for more intuitive behavior?

CC: @mroeschke 
