Commit b917b37

BUG: Converting string of type lxml.etree._ElementUnicodeResult to a datetime using pandas.to_datetime (#62604)
Co-authored-by: Matthew Roeschke <[email protected]>
1 parent b6c2964 commit b917b37

3 files changed: 18 additions, 0 deletions

doc/source/whatsnew/v3.0.0.rst

Lines changed: 2 additions & 0 deletions
@@ -971,6 +971,8 @@ Datetimelike
 - Bug in constructing arrays with :class:`ArrowDtype` with ``timestamp`` type incorrectly allowing ``Decimal("NaN")`` (:issue:`61773`)
 - Bug in constructing arrays with a timezone-aware :class:`ArrowDtype` from timezone-naive datetime objects incorrectly treating those as UTC times instead of wall times like :class:`DatetimeTZDtype` (:issue:`61775`)
 - Bug in setting scalar values with mismatched resolution into arrays with non-nanosecond ``datetime64``, ``timedelta64`` or :class:`DatetimeTZDtype` incorrectly truncating those scalars (:issue:`56410`)
+- Bug in :func:`to_datetime` where passing an ``lxml.etree._ElementUnicodeResult`` together with ``format`` raised ``TypeError``. Now subclasses of ``str`` are handled. (:issue:`60933`)
+
 
 Timedelta
 ^^^^^^^^^

pandas/_libs/tslibs/strptime.pyx

Lines changed: 5 additions & 0 deletions
@@ -405,6 +405,11 @@ def array_strptime(
                 if len(val) == 0 or val in nat_strings:
                     iresult[i] = NPY_NAT
                     continue
+                elif type(val) is not str:
+                    # GH#60933: normalize string subclasses
+                    # (e.g. lxml.etree._ElementUnicodeResult). The downstream Cython
+                    # path expects an exact `str`, so ensure we pass a plain str
+                    val = str(val)
             elif checknull_with_nat_and_na(val):
                 iresult[i] = NPY_NAT
                 continue
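
Note on the change above: the added branch separates "is a string" from "is exactly `str`". The following is a minimal pure-Python sketch of that normalization, not the Cython code from the commit. `TaggedStr` and `normalize_str` are hypothetical names introduced only for illustration; `TaggedStr` stands in for a subclass such as `lxml.etree._ElementUnicodeResult`.

```python
class TaggedStr(str):
    """Hypothetical str subclass, standing in for lxml's _ElementUnicodeResult."""


def normalize_str(val):
    """Return an exact ``str`` for str subclasses; pass everything else through.

    Illustrative helper only; it is not part of pandas.
    """
    if isinstance(val, str) and type(val) is not str:
        # str(val) keeps the text but drops the subclass type.
        return str(val)
    return val


raw = TaggedStr("2025-02-05 16:59:57")
plain = normalize_str(raw)

print(type(raw) is str)    # the subclass passes isinstance() but is not exactly str
print(type(plain) is str)  # after normalization the type is exactly str
print(plain == raw)        # the text itself is unchanged
```

The exact-type check matters because `isinstance(val, str)` is true for any subclass, while code that requires an exact `str` (as the comment in the diff says of the downstream Cython path) rejects it.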

pandas/tests/tools/test_to_datetime.py

Lines changed: 11 additions & 0 deletions
@@ -3790,3 +3790,14 @@ def test_to_datetime_wrapped_datetime64_ps():
         ["1970-01-01 00:00:01.901901901"], dtype="datetime64[ns]", freq=None
     )
     tm.assert_index_equal(result, expected)
+
+
+def test_to_datetime_lxml_elementunicoderesult_with_format(cache):
+    etree = pytest.importorskip("lxml.etree")
+
+    s = "2025-02-05 16:59:57"
+    node = etree.XML(f"<date>{s}</date>")
+    val = node.xpath("/date/node()")[0]  # _ElementUnicodeResult
+
+    out = to_datetime(Series([val]), format="%Y-%m-%d %H:%M:%S", cache=cache)
+    assert out.iloc[0] == Timestamp(s)
