Commit b7ea9ae

whatsnew
1 parent 36143ad commit b7ea9ae

1 file changed: +49 -0 lines changed

doc/source/whatsnew/v3.0.0.rst

Lines changed: 49 additions & 0 deletions
@@ -335,6 +335,55 @@ small behavior differences as collateral:
- Adding or subtracting a :class:`Day` with a :class:`Timedelta` is no longer supported.
- Adding or subtracting a :class:`Day` offset to a timezone-aware :class:`Timestamp` or datetime-like may lead to an ambiguous or non-existent time, which will raise.

.. _whatsnew_300.api_breaking.nan_vs_na:

Changed treatment of NaN values in pyarrow and numpy-nullable floating dtypes
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Previously, when dealing with a nullable dtype (e.g. ``Float64Dtype`` or ``int64[pyarrow]``), ``NaN`` was treated as interchangeable with :class:`NA` in some circumstances but not others. This was done to make adoption easier, but caused some confusion (:issue:`32265`). In 3.0, the option ``"mode.nan_is_na"`` (default ``True``) controls whether ``NaN`` is treated as equivalent to :class:`NA`.

With ``pd.set_option("mode.nan_is_na", True)`` (the default), ``NaN`` can be passed to constructors, ``__setitem__``, and ``__contains__``, and is treated the same as :class:`NA`. The only change users will see is that arithmetic and ``np.ufunc`` operations that previously introduced ``NaN`` entries now produce :class:`NA` entries instead:

*Old behavior:*

.. code-block:: ipython

    In [2]: ser = pd.Series([0, None], dtype=pd.Float64Dtype())
    In [3]: ser / 0
    Out[3]:
    0     NaN
    1    <NA>
    dtype: Float64

*New behavior:*

.. ipython:: python

    ser = pd.Series([0, None], dtype=pd.Float64Dtype())
    ser / 0

By contrast, with ``pd.set_option("mode.nan_is_na", False)``, ``NaN`` is always treated as distinct from :class:`NA` and specifically as a floating-point value, so it cannot be used with integer dtypes:

*Old behavior:*

.. code-block:: ipython

    In [2]: ser = pd.Series([1, np.nan], dtype=pd.Float64Dtype())
    In [3]: ser[1]
    Out[3]: <NA>

*New behavior:*

.. ipython:: python

    pd.set_option("mode.nan_is_na", False)
    ser = pd.Series([1, np.nan], dtype=pd.Float64Dtype())
    ser[1]

If we had passed ``pd.Int64Dtype()`` or ``"int64[pyarrow]"`` for the dtype in the latter example, this would raise, as a float ``NaN`` cannot be held by an integer dtype.

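
The option can also be scoped to a block rather than set globally. The following is a minimal sketch, not part of the change itself: it assumes ``pd.option_context`` accepts ``"mode.nan_is_na"`` like any other registered option, and the helpers ``has_nan_is_na_option`` and ``second_entry`` are hypothetical names used only for illustration. On a pandas version that predates the option, the sketch falls back to the default behavior.

.. code-block:: python

    import contextlib

    import numpy as np
    import pandas as pd


    def has_nan_is_na_option():
        """Return True if this pandas build registers ``mode.nan_is_na``."""
        try:
            pd.get_option("mode.nan_is_na")
        except KeyError:
            # Unknown options raise OptionError, which subclasses KeyError.
            return False
        return True


    def second_entry(nan_is_na):
        """Build ``[1, NaN]`` as Float64 under the given setting; return entry 1."""
        if has_nan_is_na_option():
            ctx = pd.option_context("mode.nan_is_na", nan_is_na)
        else:
            # Older pandas: there is no option to set, so only the default applies.
            ctx = contextlib.nullcontext()
        with ctx:
            ser = pd.Series([1, np.nan], dtype=pd.Float64Dtype())
            return ser[1]


    # Default setting: the NaN passed to the constructor is stored as NA.
    print(second_entry(True) is pd.NA)

    # Only exercise the non-default setting where the option exists.
    if has_nan_is_na_option():
        print(repr(second_entry(False)))

Using a context manager restores the previous setting on exit, so the rest of a session is unaffected by the experiment.
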
With ``"mode.nan_is_na"`` set to ``False``, ``ser.to_numpy()`` (and ``frame.values`` and ``np.asarray(obj)``) will convert to ``object`` dtype if :class:`NA` entries are present, where previously they would coerce to ``NaN``. To retain a float numpy dtype, explicitly pass ``na_value=np.nan`` to :meth:`Series.to_numpy`.

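
As an illustration of the explicit conversion, the sketch below requests a float result directly. Passing ``dtype="float64"`` alongside ``na_value=np.nan`` is an addition made here, on the assumption that pinning the dtype keeps the result independent of the installed pandas version; the variable names are illustrative only.

.. code-block:: python

    import numpy as np
    import pandas as pd

    ser = pd.Series([1.5, None], dtype=pd.Float64Dtype())

    # Ask for float64 and fill missing entries with NaN explicitly, so the
    # result does not depend on how NA is converted by default.
    arr = ser.to_numpy(dtype="float64", na_value=np.nan)

    print(arr.dtype)
    print(np.isnan(arr[1]))
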
.. _whatsnew_300.api_breaking.deps:

Increased minimum version for Python
