Commit 295e778

Merge branch 'pandas-dev:main' into remove-prompts-sphinx-copybutton
2 parents e8d7be2 + 1d22331 commit 295e778

60 files changed: 713 additions and 212 deletions

doc/source/reference/indexing.rst

Lines changed: 1 addition & 0 deletions
@@ -390,6 +390,7 @@ Conversion
    DatetimeIndex.to_pydatetime
    DatetimeIndex.to_series
    DatetimeIndex.to_frame
+   DatetimeIndex.to_julian_date
 
 Methods
 ~~~~~~~

doc/source/reference/testing.rst

Lines changed: 6 additions & 0 deletions
@@ -52,6 +52,12 @@ Exceptions and warnings
    errors.OptionError
    errors.OutOfBoundsDatetime
    errors.OutOfBoundsTimedelta
+   errors.PandasChangeWarning
+   errors.Pandas4Warning
+   errors.Pandas5Warning
+   errors.PandasPendingDeprecationWarning
+   errors.PandasDeprecationWarning
+   errors.PandasFutureWarning
    errors.ParserError
    errors.ParserWarning
    errors.PerformanceWarning

doc/source/whatsnew/v0.23.0.rst

Lines changed: 10 additions & 0 deletions
@@ -105,6 +105,16 @@ Please note that the string ``index`` is not supported with the round trip forma
 .. ipython:: python
    :okwarning:
 
+   df = pd.DataFrame(
+       {
+           'foo': [1, 2, 3, 4],
+           'bar': ['a', 'b', 'c', 'd'],
+           'baz': pd.date_range('2018-01-01', freq='d', periods=4),
+           'qux': pd.Categorical(['a', 'b', 'c', 'c'])
+       },
+       index=pd.Index(range(4), name='idx')
+   )
+
    df.index.name = 'index'
 
    df.to_json('test.json', orient='table')

doc/source/whatsnew/v2.3.2.rst

Lines changed: 2 additions & 0 deletions
@@ -26,6 +26,8 @@ Bug fixes
   "string" type in the JSON Table Schema for :class:`StringDtype` columns
   (:issue:`61889`)
 - Boolean operations (``|``, ``&``, ``^``) with bool-dtype objects on the left and :class:`StringDtype` objects on the right now cast the string to bool, with a deprecation warning (:issue:`60234`)
+- Fixed ``~Series.str.match``, ``~Series.str.fullmatch`` and ``~Series.str.contains``
+  with compiled regex for the Arrow-backed string dtype (:issue:`61964`, :issue:`61942`)
 
 .. ---------------------------------------------------------------------------
 .. _whatsnew_232.contributors:

doc/source/whatsnew/v3.0.0.rst

Lines changed: 13 additions & 0 deletions
@@ -24,6 +24,18 @@ Enhancement1
 Enhancement2
 ^^^^^^^^^^^^
 
+New Deprecation Policy
+^^^^^^^^^^^^^^^^^^^^^^
+pandas 3.0.0 introduces a new 3-stage deprecation policy: using ``DeprecationWarning`` initially, then switching to ``FutureWarning`` for broader visibility in the last minor version before the next major release, and then removal of the deprecated functionality in the major release. This was done to give downstream packages more time to adjust to pandas deprecations, which should reduce the amount of warnings that a user gets from code that isn't theirs. See `PDEP 17 <https://pandas.pydata.org/pdeps/0017-backwards-compatibility-and-deprecation-policy.html>`_ for more details.
+
+All warnings for upcoming changes in pandas will have the base class :class:`pandas.errors.PandasChangeWarning`. Users may also use the following subclasses to control warnings.
+
+- :class:`pandas.errors.Pandas4Warning`: Warnings which will be enforced in pandas 4.0.
+- :class:`pandas.errors.Pandas5Warning`: Warnings which will be enforced in pandas 5.0.
+- :class:`pandas.errors.PandasPendingDeprecationWarning`: Base class of all warnings which emit a ``PendingDeprecationWarning``, independent of the version they will be enforced.
+- :class:`pandas.errors.PandasDeprecationWarning`: Base class of all warnings which emit a ``DeprecationWarning``, independent of the version they will be enforced.
+- :class:`pandas.errors.PandasFutureWarning`: Base class of all warnings which emit a ``FutureWarning``, independent of the version they will be enforced.
+
 .. _whatsnew_300.enhancements.other:
 
 Other enhancements
@@ -857,6 +869,7 @@ I/O
 - Bug in :meth:`read_csv` raising ``TypeError`` when ``index_col`` is specified and ``na_values`` is a dict containing the key ``None``. (:issue:`57547`)
 - Bug in :meth:`read_csv` raising ``TypeError`` when ``nrows`` and ``iterator`` are specified without specifying a ``chunksize``. (:issue:`59079`)
 - Bug in :meth:`read_csv` where the order of the ``na_values`` makes an inconsistency when ``na_values`` is a list non-string values. (:issue:`59303`)
+- Bug in :meth:`read_csv` with ``engine="pyarrow"`` and ``dtype="Int64"`` losing precision (:issue:`56136`)
 - Bug in :meth:`read_excel` raising ``ValueError`` when passing array of boolean values when ``dtype="boolean"``. (:issue:`58159`)
 - Bug in :meth:`read_html` where ``rowspan`` in header row causes incorrect conversion to ``DataFrame``. (:issue:`60210`)
 - Bug in :meth:`read_json` ignoring the given ``dtype`` when ``engine="pyarrow"`` (:issue:`59516`)

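Review note on the new warning classes: the whatsnew entry says users can filter on the base class or on the subclasses. The sketch below illustrates that idea with the standard `warnings` module only. The three classes are hypothetical stand-ins defined locally, not imports from `pandas.errors`, and the assumption that `Pandas4Warning` derives from `PandasDeprecationWarning` is mine; this page does not show the class definitions.

```python
import warnings


# Hypothetical stand-ins; the real classes live in pandas.errors and their
# exact inheritance is an assumption here.
class PandasChangeWarning(Warning):
    """Base class for all warnings about upcoming changes."""


class PandasDeprecationWarning(PandasChangeWarning, DeprecationWarning):
    """Change warnings that are also DeprecationWarning instances."""


class Pandas4Warning(PandasDeprecationWarning):
    """Changes that would be enforced in version 4.0."""


def old_api():
    """Hypothetical deprecated function that emits the versioned warning."""
    warnings.warn("old_api is deprecated", Pandas4Warning, stacklevel=2)
    return 42


def collect_categories(action, category):
    """Call old_api() under one extra filter and return the warning classes seen."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        # Filters added later are consulted first, so this one wins when it matches.
        warnings.simplefilter(action, category)
        old_api()
    return [w.category for w in caught]


# Ignoring the base class silences every subclass.
seen_when_ignoring_base = collect_categories("ignore", PandasChangeWarning)
# Ignoring FutureWarning leaves a DeprecationWarning-based class visible.
seen_when_ignoring_future = collect_categories("ignore", FutureWarning)
```

With real pandas 3.0 the same filter calls would presumably target `pandas.errors.PandasChangeWarning` and its subclasses directly.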
pandas/_libs/tslibs/timestamps.pyx

Lines changed: 6 additions & 2 deletions
@@ -1994,12 +1994,14 @@ class Timestamp(_Timestamp):
         >>> pd.Timestamp.utcnow() # doctest: +SKIP
         Timestamp('2020-11-16 22:50:18.092888+0000', tz='UTC')
         """
+        from pandas.errors import Pandas4Warning
+
         warnings.warn(
             # The stdlib datetime.utcnow is deprecated, so we deprecate to match.
             # GH#56680
             "Timestamp.utcnow is deprecated and will be removed in a future "
             "version. Use Timestamp.now('UTC') instead.",
-            FutureWarning,
+            Pandas4Warning,
             stacklevel=find_stack_level(),
         )
         return cls.now(UTC)
@@ -2036,13 +2038,15 @@ class Timestamp(_Timestamp):
         >>> pd.Timestamp.utcfromtimestamp(1584199972)
         Timestamp('2020-03-14 15:32:52+0000', tz='UTC')
         """
+        from pandas.errors import Pandas4Warning
+
         # GH#22451
         warnings.warn(
             # The stdlib datetime.utcfromtimestamp is deprecated, so we deprecate
             # to match. GH#56680
             "Timestamp.utcfromtimestamp is deprecated and will be removed in a "
             "future version. Use Timestamp.fromtimestamp(ts, 'UTC') instead.",
-            FutureWarning,
+            Pandas4Warning,
             stacklevel=find_stack_level(),
         )
         return cls.fromtimestamp(ts, tz="UTC")

pandas/core/arrays/_arrow_string_mixins.py

Lines changed: 10 additions & 4 deletions
@@ -302,23 +302,29 @@ def _str_contains(
 
     def _str_match(
         self,
-        pat: str,
+        pat: str | re.Pattern,
         case: bool = True,
         flags: int = 0,
         na: Scalar | lib.NoDefault = lib.no_default,
     ):
-        if not pat.startswith("^"):
+        if isinstance(pat, re.Pattern):
+            # GH#61952
+            pat = pat.pattern
+        if isinstance(pat, str) and not pat.startswith("^"):
             pat = f"^{pat}"
         return self._str_contains(pat, case, flags, na, regex=True)
 
     def _str_fullmatch(
         self,
-        pat,
+        pat: str | re.Pattern,
         case: bool = True,
         flags: int = 0,
         na: Scalar | lib.NoDefault = lib.no_default,
     ):
-        if not pat.endswith("$") or pat.endswith("\\$"):
+        if isinstance(pat, re.Pattern):
+            # GH#61952
+            pat = pat.pattern
+        if isinstance(pat, str) and (not pat.endswith("$") or pat.endswith("\\$")):
             pat = f"{pat}$"
         return self._str_match(pat, case, flags, na)

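Review note on the compiled-regex handling: both methods now unwrap a `re.Pattern` to its source string before adding anchors. A minimal standalone sketch of that normalization follows; the two helper names are hypothetical and exist only for illustration, and the logic is copied from the hunk above rather than from any other part of pandas.

```python
import re


def normalize_match_pattern(pat):
    """Hypothetical helper: unwrap a compiled pattern, then anchor at the start."""
    if isinstance(pat, re.Pattern):
        pat = pat.pattern  # source string only; compiled flags are not included
    if not pat.startswith("^"):
        pat = f"^{pat}"
    return pat


def normalize_fullmatch_pattern(pat):
    """Hypothetical helper: unwrap, anchor at the end, then reuse the start anchoring."""
    if isinstance(pat, re.Pattern):
        pat = pat.pattern
    # A trailing "\$" is an escaped literal dollar, so an end anchor is still needed.
    if not pat.endswith("$") or pat.endswith("\\$"):
        pat = f"{pat}$"
    return normalize_match_pattern(pat)


match_pat = normalize_match_pattern(re.compile("ab"))
fullmatch_pat = normalize_fullmatch_pattern(re.compile("ab"))
```

One thing worth checking in review: `Pattern.pattern` carries only the source string, so any flags compiled into the object (for example `re.IGNORECASE`) would have to be handled elsewhere. Whether the calling code does so is not visible in this hunk.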
pandas/core/arrays/datetimes.py

Lines changed: 19 additions & 2 deletions
@@ -2254,9 +2254,26 @@ def isocalendar(self) -> DataFrame:
 
     def to_julian_date(self) -> npt.NDArray[np.float64]:
         """
-        Convert Datetime Array to float64 ndarray of Julian Dates.
-        0 Julian date is noon January 1, 4713 BC.
+        Convert TimeStamp to a Julian Date.
+
+        This method returns the number of days as a float since noon January 1, 4713 BC.
+
         https://en.wikipedia.org/wiki/Julian_day
+
+        Returns
+        -------
+        ndarray or Index
+            Float values that represent each date in Julian Calendar.
+
+        See Also
+        --------
+        Timestamp.to_julian_date : Equivalent method on ``Timestamp`` objects.
+
+        Examples
+        --------
+        >>> idx = pd.DatetimeIndex(["2028-08-12 00:54", "2028-08-12 02:06"])
+        >>> idx.to_julian_date()
+        Index([2461995.5375, 2461995.5875], dtype='float64')
         """
 
         # http://mysite.verizon.net/aesir_research/date/jdalg2.htm

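Review note on the new docstring example: the two values can be cross-checked without pandas. The sketch below is an independent calculation, not the algorithm used in `datetimes.py`. It relies on the well-known fact that the Unix epoch (1970-01-01 00:00 UTC) corresponds to Julian date 2440587.5, and it treats the naive timestamps as UTC, which is an assumption made for the check.

```python
from datetime import datetime, timezone

# Julian date of 1970-01-01 00:00:00 UTC.
UNIX_EPOCH_JD = 2440587.5
SECONDS_PER_DAY = 86400


def julian_date(ts: datetime) -> float:
    """Days since noon, January 1, 4713 BC, for a naive datetime read as UTC."""
    seconds = ts.replace(tzinfo=timezone.utc).timestamp()
    return seconds / SECONDS_PER_DAY + UNIX_EPOCH_JD


jd_first = julian_date(datetime(2028, 8, 12, 0, 54))
jd_second = julian_date(datetime(2028, 8, 12, 2, 6))
```

Both results agree with the docstring output (2461995.5375 and 2461995.5875) to within floating-point rounding.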
pandas/core/arrays/string_arrow.py

Lines changed: 2 additions & 0 deletions
@@ -346,6 +346,8 @@ def _str_contains(
     ):
         if flags:
             return super()._str_contains(pat, case, flags, na, regex)
+        if isinstance(pat, re.Pattern):
+            pat = pat.pattern
 
         return ArrowStringArrayMixin._str_contains(self, pat, case, flags, na, regex)

pandas/core/frame.py

Lines changed: 15 additions & 12 deletions
@@ -56,6 +56,7 @@
 from pandas.errors import (
     ChainedAssignmentError,
     InvalidIndexError,
+    Pandas4Warning,
 )
 from pandas.errors.cow import (
     _chained_assignment_method_msg,
@@ -12061,7 +12062,7 @@ def all(
         **kwargs,
     ) -> Series | bool: ...
 
-    @deprecate_nonkeyword_arguments(version="4.0", allowed_args=["self"], name="all")
+    @deprecate_nonkeyword_arguments(Pandas4Warning, allowed_args=["self"], name="all")
     @doc(make_doc("all", ndim=1))
     def all(
         self,
@@ -12108,7 +12109,7 @@ def min(
         **kwargs,
     ) -> Series | Any: ...
 
-    @deprecate_nonkeyword_arguments(version="4.0", allowed_args=["self"], name="min")
+    @deprecate_nonkeyword_arguments(Pandas4Warning, allowed_args=["self"], name="min")
     @doc(make_doc("min", ndim=2))
     def min(
         self,
@@ -12155,7 +12156,7 @@ def max(
         **kwargs,
     ) -> Series | Any: ...
 
-    @deprecate_nonkeyword_arguments(version="4.0", allowed_args=["self"], name="max")
+    @deprecate_nonkeyword_arguments(Pandas4Warning, allowed_args=["self"], name="max")
     @doc(make_doc("max", ndim=2))
     def max(
         self,
@@ -12171,7 +12172,7 @@ def max(
             result = result.__finalize__(self, method="max")
         return result
 
-    @deprecate_nonkeyword_arguments(version="4.0", allowed_args=["self"], name="sum")
+    @deprecate_nonkeyword_arguments(Pandas4Warning, allowed_args=["self"], name="sum")
     def sum(
         self,
         axis: Axis | None = 0,
@@ -12272,7 +12273,7 @@ def sum(
             result = result.__finalize__(self, method="sum")
         return result
 
-    @deprecate_nonkeyword_arguments(version="4.0", allowed_args=["self"], name="prod")
+    @deprecate_nonkeyword_arguments(Pandas4Warning, allowed_args=["self"], name="prod")
     def prod(
         self,
         axis: Axis | None = 0,
@@ -12390,7 +12391,7 @@ def mean(
         **kwargs,
     ) -> Series | Any: ...
 
-    @deprecate_nonkeyword_arguments(version="4.0", allowed_args=["self"], name="mean")
+    @deprecate_nonkeyword_arguments(Pandas4Warning, allowed_args=["self"], name="mean")
     @doc(make_doc("mean", ndim=2))
     def mean(
         self,
@@ -12437,7 +12438,9 @@ def median(
         **kwargs,
     ) -> Series | Any: ...
 
-    @deprecate_nonkeyword_arguments(version="4.0", allowed_args=["self"], name="median")
+    @deprecate_nonkeyword_arguments(
+        Pandas4Warning, allowed_args=["self"], name="median"
+    )
     @doc(make_doc("median", ndim=2))
     def median(
         self,
@@ -12487,7 +12490,7 @@ def sem(
         **kwargs,
     ) -> Series | Any: ...
 
-    @deprecate_nonkeyword_arguments(version="4.0", allowed_args=["self"], name="sem")
+    @deprecate_nonkeyword_arguments(Pandas4Warning, allowed_args=["self"], name="sem")
     def sem(
         self,
         axis: Axis | None = 0,
@@ -12607,7 +12610,7 @@ def var(
         **kwargs,
     ) -> Series | Any: ...
 
-    @deprecate_nonkeyword_arguments(version="4.0", allowed_args=["self"], name="var")
+    @deprecate_nonkeyword_arguments(Pandas4Warning, allowed_args=["self"], name="var")
     def var(
         self,
         axis: Axis | None = 0,
@@ -12726,7 +12729,7 @@ def std(
         **kwargs,
     ) -> Series | Any: ...
 
-    @deprecate_nonkeyword_arguments(version="4.0", allowed_args=["self"], name="std")
+    @deprecate_nonkeyword_arguments(Pandas4Warning, allowed_args=["self"], name="std")
     def std(
         self,
         axis: Axis | None = 0,
@@ -12849,7 +12852,7 @@ def skew(
         **kwargs,
     ) -> Series | Any: ...
 
-    @deprecate_nonkeyword_arguments(version="4.0", allowed_args=["self"], name="skew")
+    @deprecate_nonkeyword_arguments(Pandas4Warning, allowed_args=["self"], name="skew")
     def skew(
         self,
         axis: Axis | None = 0,
@@ -12969,7 +12972,7 @@ def kurt(
         **kwargs,
     ) -> Series | Any: ...
 
-    @deprecate_nonkeyword_arguments(version="4.0", allowed_args=["self"], name="kurt")
+    @deprecate_nonkeyword_arguments(Pandas4Warning, allowed_args=["self"], name="kurt")
     def kurt(
         self,
         axis: Axis | None = 0,

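Review note on the decorator change: every call site in `frame.py` now passes a warning class as the first argument instead of `version="4.0"`. The sketch below shows the general shape of a decorator that takes a warning class this way. `deprecate_positional`, `Frame`, and the local `Pandas4Warning` are hypothetical stand-ins written for this note; this is not the implementation of `deprecate_nonkeyword_arguments`, whose body is not part of this page.

```python
import functools
import warnings


class Pandas4Warning(DeprecationWarning):
    """Hypothetical stand-in for pandas.errors.Pandas4Warning."""


def deprecate_positional(klass, allowed_args, name):
    """Simplified sketch: warn with `klass` when extra positional arguments are passed."""

    def decorate(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Anything beyond the allowed positional arguments triggers the warning.
            if len(args) > len(allowed_args):
                warnings.warn(
                    f"Passing arguments other than {allowed_args} positionally "
                    f"to {name} is deprecated; use keyword arguments instead.",
                    klass,
                    stacklevel=2,
                )
            return func(*args, **kwargs)

        return wrapper

    return decorate


class Frame:
    """Tiny hypothetical container used only to exercise the decorator."""

    def __init__(self, values):
        self.values = values

    @deprecate_positional(Pandas4Warning, allowed_args=["self"], name="sum")
    def sum(self, axis=0):
        return sum(self.values)


with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    positional_result = Frame([1, 2, 3]).sum(0)      # warns
    keyword_result = Frame([1, 2, 3]).sum(axis=0)    # does not warn

categories = [w.category for w in caught]
```

Passing the class rather than a version string lets the call site state which release will enforce the change through the type itself, which is consistent with the versioned warning classes added elsewhere in this commit.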