
Conversation

snitish
Member

@snitish snitish commented Mar 1, 2025

…or when called on a float array with missing values

The cause of the error is a numpy array created via np.empty(). Sometimes it contains garbage values which subsequently get passed to np.round causing errors.
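The failure mode can be illustrated with a minimal sketch. This is not the code in conversion.pyx and not the patch in this PR; the array names and the zero-fill choice are assumptions made only for illustration.

```python
import numpy as np

values = np.array([1.5, np.nan, 2.5])
mask = np.isnan(values)

# np.empty allocates without initialising: any slot that is never written
# keeps whatever bits were already in memory (possibly huge or non-finite).
frac = np.empty(values.shape, dtype="f8")
frac[~mask] = values[~mask] % 1
# frac[mask] is undefined here, so scaling and rounding it may overflow.

# Assumed safer pattern: start from a defined fill value for the masked slots.
frac_safe = np.zeros(values.shape, dtype="f8")
frac_safe[~mask] = values[~mask] % 1

with np.errstate(all="raise"):
    # Every slot is defined, so this cannot trip on leftover garbage.
    rounded = np.round(frac_safe * 1_000_000)
```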

@snitish snitish requested a review from MarcoGorelli as a code owner March 1, 2025 03:21
@mroeschke mroeschke added Datetime Datetime data dtype Missing-data np.nan, pd.NaT, pd.NA, dropna, isnull, interpolate labels Mar 1, 2025
@mroeschke mroeschke added this to the 3.0 milestone Mar 2, 2025
@mroeschke mroeschke merged commit 57fd502 into pandas-dev:main Mar 2, 2025
42 checks passed
@mroeschke
Member

Thanks @snitish

@vgolskiy

vgolskiy commented Apr 1, 2025

The bug is still present in pandas 2.2.3. I am processing a pd.Series of float timestamps, and an overflow occurs when converting with to_datetime and then rounding to ms with dt.round:
time = pd.to_datetime(time, unit='ms', errors='coerce').dt.round(freq='ms')
The error is raised inside pd.to_datetime, as the traceback below shows.

More details:
File "../pandas/core/tools/datetimes.py", line 1067, in to_datetime
values = convert_listlike(arg._values, format)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "../pandas/core/tools/datetimes.py", line 407, in _convert_listlike_datetimes
return _to_datetime_with_unit(arg, unit, name, utc, errors)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "../pandas/core/tools/datetimes.py", line 512, in _to_datetime_with_unit
arr = cast_from_unit_vectorized(arg, unit=unit)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "conversion.pyx", line 149, in pandas._libs.tslibs.conversion.cast_from_unit_vectorized
File "../numpy/_core/fromnumeric.py", line 3710, in round
return _wrapfunc(a, 'round', decimals=decimals, out=out)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "../numpy/_core/fromnumeric.py", line 57, in _wrapfunc
return bound(*args, **kwds)
^^^^^^^^^^^^^^^^^^^^
FloatingPointError: overflow encountered in multiply

@asishm
Member

asishm commented Apr 3, 2025

@vgolskiy yes, the fix will be released with 3.0 (unless this gets backported to the 2.3.x branch)

@RyuuOujiXS

What workaround is recommended until pandas 3 is released?

@snitish
Member Author

snitish commented Apr 10, 2025

@RyuuOujiXS you can call to_datetime() after dropping NaNs, then add back the NaNs in the right places.
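A minimal sketch of that workaround, assuming a float Series of epoch timestamps. The helper name `to_datetime_skipping_nans` is hypothetical and not part of pandas; missing entries come back as NaT because the result has a datetime dtype.

```python
import numpy as np
import pandas as pd


def to_datetime_skipping_nans(values: pd.Series, unit: str = "ms") -> pd.Series:
    """Hypothetical helper: convert only the non-missing entries, keep the rest as NaT."""
    mask = values.notna().to_numpy()
    # Pre-fill the output with NaT so the missing positions are well defined.
    out = np.full(len(values), np.datetime64("NaT"), dtype="datetime64[ns]")
    # to_datetime only ever sees values without NaN.
    converted = pd.to_datetime(values[mask], unit=unit)
    out[mask] = converted.to_numpy(dtype="datetime64[ns]")
    return pd.Series(out, index=values.index, name=values.name)


raw = pd.Series([1.5e12, np.nan, 1.6e12], name="time")
result = to_datetime_skipping_nans(raw, unit="ms").dt.round(freq="ms")
```

Writing into a pre-filled array by position, rather than reindexing, keeps the sketch valid even when the Series index contains duplicates.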

@vgolskiy

vgolskiy commented Apr 10, 2025

@snitish, so it was the NaN values that were causing the overflow? Also, after dropping the NaNs and converting, it is better to put NaT back in their places, since the data will have a datetime dtype by then.
@RyuuOujiXS I described the workaround in the last comment here #58419 (comment)


5 participants