You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: .github/PULL_REQUEST_TEMPLATE.md
+1Lines changed: 1 addition & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -3,3 +3,4 @@
3
3
-[ ] All [code checks passed](https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#pre-commit).
4
4
-[ ] Added [type annotations](https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#type-hints) to new arguments/methods/functions.
5
5
-[ ] Added an entry in the latest `doc/source/whatsnew/vX.X.X.rst` file if fixing a bug or adding a new feature.
6
+
-[ ] If I used AI to develop this pull request, I prompted it to follow `AGENTS.md`.
`pandas` is an open source, BSD-licensed library providing high-performance, easy-to-use data structures and data analysis tools for the Python programming language.
5
+
6
+
## Purpose
7
+
- Assist contributors by suggesting code changes, tests, and documentation edits for the pandas repository while preserving stability and compatibility.
8
+
9
+
## Persona & Tone
10
+
- Concise, neutral, code-focused. Prioritize correctness, readability, and tests.
11
+
12
+
## Project Guidelines
13
+
- Be sure to follow all guidelines for contributing to the codebase specified at https://pandas.pydata.org/docs/development/contributing_codebase.html
14
+
- These guidelines are also available in the following local files, which should be loaded into context and adhered to
Specifies what to do upon encountering a bad line (a line with too many fields).
344
344
Allowed values are :
345
345
346
-
- 'error', raise an ParserError when a bad line is encountered.
346
+
- 'error', raise a ParserError when a bad line is encountered.
347
347
- 'warn', print a warning when a bad line is encountered and skip that line.
348
348
- 'skip', skip bad lines without raising or warning when they are encountered.
349
349
@@ -3717,6 +3717,7 @@ The look and feel of Excel worksheets created from pandas can be modified using
3717
3717
3718
3718
* ``float_format`` : Format string for floating point numbers (default ``None``).
3719
3719
* ``freeze_panes`` : A tuple of two integers representing the bottommost row and rightmost column to freeze. Each of these parameters is one-based, so (1, 1) will freeze the first row and first column (default ``None``).
3720
+
* ``autofilter`` : A boolean indicating whether to add automatic filters to all columns (default ``False``).
In most cases, parsing strings to datetimes (with any of :func:`to_datetime`, :class:`DatetimeIndex`, or :class:`Timestamp`) will produce objects with microsecond ("us") unit. The exception to this rule is if your strings have nanosecond precision, in which case the result will have "ns" unit:
Previously, :func:`to_datetime` and :class:`DatetimeIndex` would always parse strings to "ns" unit. During pandas 2.x, :class:`Timestamp` could give any of "s", "ms", "us", or "ns" depending on the specificity of the input string.
256
+
244
257
.. _timeseries.converting.format:
245
258
246
259
Providing a format argument
@@ -379,6 +392,16 @@ We subtract the epoch (midnight at January 1, 1970 UTC) and then floor divide by
Another common way to perform this conversion is to convert directly to an integer dtype. Note that the exact integers this produces will depend on the specific unit
Copy file name to clipboardExpand all lines: doc/source/whatsnew/v3.0.0.rst
+13-3Lines changed: 13 additions & 3 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -203,6 +203,7 @@ Other enhancements
203
203
- :class:`Rolling` and :class:`Expanding` now support ``nunique`` (:issue:`26958`)
204
204
- :class:`Rolling` and :class:`Expanding` now support aggregations ``first`` and ``last`` (:issue:`33155`)
205
205
- :func:`qcut` now accepts the ``right`` parameter, consistent with :func:`cut` (:issue:`63053`)
206
+
- :func:`DataFrame.to_excel` has a new ``autofilter`` parameter to add automatic filters to all columns (:issue:`61194`)
206
207
- :func:`read_parquet` accepts ``to_pandas_kwargs`` which are forwarded to :meth:`pyarrow.Table.to_pandas` which enables passing additional keywords to customize the conversion to pandas, such as ``maps_as_pydicts`` to read the Parquet map data type as python dictionaries (:issue:`56842`)
207
208
- :func:`to_numeric` on big integers converts to ``object`` datatype with python integers when not coercing. (:issue:`51295`)
208
209
- :meth:`.DataFrameGroupBy.transform`, :meth:`.SeriesGroupBy.transform`, :meth:`.DataFrameGroupBy.agg`, :meth:`.SeriesGroupBy.agg`, :meth:`.SeriesGroupBy.apply`, :meth:`.DataFrameGroupBy.apply` now support ``kurt`` (:issue:`40139`)
@@ -358,7 +359,7 @@ When passing strings, the resolution will depend on the precision of the string,
358
359
In [5]: pd.to_datetime(["2024-03-22 11:43:01.002003004"]).dtype
359
360
Out[5]: dtype('<M8[ns]')
360
361
361
-
The inferred resolution now matches that of the input strings:
362
+
The inferred resolution now matches that of the input strings for nanosecond-precision strings, otherwise defaulting to microseconds:
362
363
363
364
.. ipython:: python
364
365
@@ -367,13 +368,17 @@ The inferred resolution now matches that of the input strings:
367
368
In [4]: pd.to_datetime(["2024-03-22 11:43:01.002003"]).dtype
368
369
In [5]: pd.to_datetime(["2024-03-22 11:43:01.002003004"]).dtype
369
370
371
+
This is also a change for the :class:`Timestamp` constructor with a string input, which in version 2.x.y could give second or millisecond unit, which users generally disliked (:issue:`52653`)
372
+
370
373
In cases with mixed-resolution inputs, the highest resolution is used:
371
374
372
375
.. code-block:: ipython
373
376
374
377
In [2]: pd.to_datetime([pd.Timestamp("2024-03-22 11:43:01"), "2024-03-22 11:43:01.002"]).dtype
375
378
Out[2]: dtype('<M8[ns]')
376
379
380
+
.. warning:: Many users will now get "M8[us]" dtype data in cases when they used to get "M8[ns]". For most use cases they should not notice a difference. One big exception is converting to integers, which will give integers 1000x smaller.
:func:`concat` no longer ignores ``sort`` when all objects have a :class:`DatetimeIndex`
@@ -1032,13 +1037,13 @@ Bug fixes
1032
1037
Categorical
1033
1038
^^^^^^^^^^^
1034
1039
- Bug in :class:`Categorical` where constructing from a pandas :class:`Series` or :class:`Index` with ``dtype='object'`` did not preserve the categories' dtype as ``object``; now the ``categories.dtype`` is preserved as ``object`` for these cases, while numpy arrays and Python sequences with ``dtype='object'`` continue to infer the most specific dtype (for example, ``str`` if all elements are strings) (:issue:`61778`)
1040
+
- Bug in :class:`pandas.Categorical` displaying string categories without quotes when using "string" dtype (:issue:`63045`)
1035
1041
- Bug in :func:`Series.apply` where ``nan`` was ignored for :class:`CategoricalDtype` (:issue:`59938`)
1036
1042
- Bug in :func:`bdate_range` raising ``ValueError`` with frequency ``freq="cbh"`` (:issue:`62849`)
1037
1043
- Bug in :func:`testing.assert_index_equal` raising ``TypeError`` instead of ``AssertionError`` for incomparable ``CategoricalIndex`` when ``check_categorical=True`` and ``exact=False`` (:issue:`61935`)
1038
1044
- Bug in :meth:`Categorical.astype` where ``copy=False`` would still trigger a copy of the codes (:issue:`62000`)
1039
1045
- Bug in :meth:`DataFrame.pivot` and :meth:`DataFrame.set_index` raising an ``ArrowNotImplementedError`` for columns with pyarrow dictionary dtype (:issue:`53051`)
1040
1046
- Bug in :meth:`Series.convert_dtypes` with ``dtype_backend="pyarrow"`` where empty :class:`CategoricalDtype` :class:`Series` raised an error or got converted to ``null[pyarrow]`` (:issue:`59934`)
1041
-
-
1042
1047
1043
1048
Datetimelike
1044
1049
^^^^^^^^^^^^
@@ -1111,12 +1116,14 @@ Conversion
1111
1116
- Bug in :meth:`DataFrame.astype` not casting ``values`` for Arrow-based dictionary dtype correctly (:issue:`58479`)
1112
1117
- Bug in :meth:`DataFrame.update` bool dtype being converted to object (:issue:`55509`)
1113
1118
- Bug in :meth:`Series.astype` might modify read-only array inplace when casting to a string dtype (:issue:`57212`)
1119
+
- Bug in :meth:`Series.convert_dtypes` and :meth:`DataFrame.convert_dtypes` raising ``TypeError`` when called on data with complex dtype (:issue:`60129`)
1114
1120
- Bug in :meth:`Series.convert_dtypes` and :meth:`DataFrame.convert_dtypes` removing timezone information for objects with :class:`ArrowDtype` (:issue:`60237`)
1115
1121
- Bug in :meth:`Series.reindex` not maintaining ``float32`` type when a ``reindex`` introduces a missing value (:issue:`45857`)
1116
1122
- Bug in :meth:`to_datetime` and :meth:`to_timedelta` with input ``None`` returning ``None`` instead of ``NaT``, inconsistent with other conversion methods (:issue:`23055`)
1117
1123
1118
1124
Strings
1119
1125
^^^^^^^
1126
+
- Bug in :meth:`Series.str.match` failing to raise when given a compiled ``re.Pattern`` object and conflicting ``case`` or ``flags`` arguments (:issue:`62240`)
1120
1127
- Bug in :meth:`Series.str.replace` raising an error on valid group references (``\1``, ``\2``, etc.) on series converted to PyArrow backend dtype (:issue:`62653`)
1121
1128
- Bug in :meth:`Series.str.zfill` raising ``AttributeError`` for :class:`ArrowDtype` (:issue:`61485`)
1122
1129
- Bug in :meth:`Series.value_counts` would not respect ``sort=False`` for series having ``string`` dtype (:issue:`55224`)
@@ -1127,12 +1134,12 @@ Interval
1127
1134
- :meth:`Index.is_monotonic_decreasing`, :meth:`Index.is_monotonic_increasing`, and :meth:`Index.is_unique` could incorrectly be ``False`` for an ``Index`` created from a slice of another ``Index``. (:issue:`57911`)
1128
1135
- Bug in :class:`Index`, :class:`Series`, :class:`DataFrame` constructors when given a sequence of :class:`Interval` subclass objects casting them to :class:`Interval` (:issue:`46945`)
1129
1136
- Bug in :func:`interval_range` where start and end numeric types were always cast to 64 bit (:issue:`57268`)
1137
+
- Bug in :func:`pandas.interval_range` incorrectly inferring ``int64`` dtype when ``np.float32`` and ``int`` are used for ``start`` and ``freq`` (:issue:`58964`)
1130
1138
- Bug in :meth:`IntervalIndex.get_indexer` and :meth:`IntervalIndex.drop` when one of the sides of the index is non-unique (:issue:`52245`)
1131
1139
- Construction of :class:`IntervalArray` and :class:`IntervalIndex` from arrays with mismatched signed/unsigned integer dtypes (e.g., ``int64`` and ``uint64``) now raises a :exc:`TypeError` instead of proceeding silently. (:issue:`55715`)
1132
1140
1133
1141
Indexing
1134
1142
^^^^^^^^
1135
-
- Bug in :meth:`DataFrame.__getitem__` returning modified columns when called with ``slice`` in Python 3.12 (:issue:`57500`)
1136
1143
- Bug in :meth:`DataFrame.__getitem__` when slicing a :class:`DataFrame` with many rows raised an ``OverflowError`` (:issue:`59531`)
1137
1144
- Bug in :meth:`DataFrame.__setitem__` on an empty :class:`DataFrame` with a tuple corrupting the frame (:issue:`54385`)
1138
1145
- Bug in :meth:`DataFrame.from_records` throwing a ``ValueError`` when passed an empty list in ``index`` (:issue:`58594`)
@@ -1260,6 +1267,8 @@ Groupby/resample/rolling
1260
1267
- Bug in :meth:`Rolling.skew` incorrectly computing skewness for windows following outliers due to numerical instability. The calculation now properly handles catastrophic cancellation by recomputing affected windows (:issue:`47461`)
1261
1268
- Bug in :meth:`Series.resample` could raise when the date range ended shortly before a non-existent time. (:issue:`58380`)
1262
1269
- Bug in :meth:`Series.resample` raising error when resampling non-nanosecond resolutions out of bounds for nanosecond precision (:issue:`57427`)
1270
+
- Bug in :meth:`Series.rolling.var` and :meth:`Series.rolling.std` computing incorrect results due to numerical instability. (:issue:`47721`, :issue:`52407`, :issue:`54518`, :issue:`55343`)
1271
+
- Bug in :meth:`DataFrame.groupby` methods when operating on NumPy-nullable data failing when the NA mask was not C-contiguous (:issue:`61031`)
1263
1272
1264
1273
Reshaping
1265
1274
^^^^^^^^^
@@ -1299,6 +1308,7 @@ ExtensionArray
1299
1308
- Bug in :class:`Categorical` when constructing with an :class:`Index` with :class:`ArrowDtype` (:issue:`60563`)
1300
1309
- Bug in :meth:`.arrays.ArrowExtensionArray.__setitem__` which caused wrong behavior when using an integer array with repeated values as a key (:issue:`58530`)
1301
1310
- Bug in :meth:`ArrowExtensionArray.factorize` where NA values were dropped when input was dictionary-encoded even when dropna was set to False(:issue:`60567`)
1311
+
- Bug in :meth:`NDArrayBackedExtensionArray.take` which produced arrays whose dtypes didn't match their underlying data, when called with integer arrays (:issue:`62448`)
1302
1312
- Bug in :meth:`api.types.is_datetime64_any_dtype` where a custom :class:`ExtensionDtype` would return ``False`` for array-likes (:issue:`57055`)
1303
1313
- Bug in comparison between object with :class:`ArrowDtype` and incompatible-dtyped (e.g. string vs bool) incorrectly raising instead of returning all-``False`` (for ``==``) or all-``True`` (for ``!=``) (:issue:`59505`)
1304
1314
- Bug in constructing pandas data structures when passing into ``dtype`` a string of the type followed by ``[pyarrow]`` while PyArrow is not installed would raise ``NameError`` rather than ``ImportError`` (:issue:`57928`)
0 commit comments