Open
Description
Describe the bug, including details regarding any error messages, version, and platform.
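It seems pandas 3 changed the default string type used when converting from Python types? As a result, string columns now come through as `large_string` instead of `string`, which breaks this doctest: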
_________________ [doctest] pyarrow.lib.RecordBatch.add_column _________________
2861 >>> import pyarrow as pa
2862 >>> import pandas as pd
2863 >>> df = pd.DataFrame({'n_legs': [2, 4, 5, 100],
2864 ... 'animals': ["Flamingo", "Horse", "Brittle stars", "Centipede"]})
2865 >>> batch = pa.RecordBatch.from_pandas(df)
2866
2867 Add column:
2868
2869 >>> year = [2021, 2022, 2019, 2021]
2870 >>> batch.add_column(0,"year", year)
Differences (unified diff with -expected +actual):
@@ -2,5 +2,5 @@
year: int64
n_legs: int64
-animals: string
+animals: large_string
----
year: [2021,2022,2019,2021]

Here's the same check before and after the upgrade:
pandas 2.3.3
pandas 3.0.0
Component(s)
Python