Commit 8400cd3

Merge branch 'pandas-dev:main' into main
Authored by Abu Jabar Mubarak
2 parents a12294b + a2315af commit 8400cd3

9 files changed: +59 -19 lines changed

README.md

Lines changed: 1 addition & 1 deletion
@@ -175,7 +175,7 @@ All contributions, bug reports, bug fixes, documentation improvements, enhanceme
 A detailed overview on how to contribute can be found in the **[contributing guide](https://pandas.pydata.org/docs/dev/development/contributing.html)**.

-If you are simply looking to start working with the pandas codebase, navigate to the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues) and start looking through interesting issues. There are a number of issues listed under [Docs](https://github.com/pandas-dev/pandas/issues?labels=Docs&sort=updated&state=open) and [good first issue](https://github.com/pandas-dev/pandas/issues?labels=good+first+issue&sort=updated&state=open) where you could start out.
+If you are simply looking to start working with the pandas codebase, navigate to the [GitHub "issues" tab](https://github.com/pandas-dev/pandas/issues) and start looking through interesting issues. There are a number of issues listed under [Docs](https://github.com/pandas-dev/pandas/issues?q=is%3Aissue%20state%3Aopen%20label%3ADocs%20sort%3Aupdated-desc) and [good first issue](https://github.com/pandas-dev/pandas/issues?q=is%3Aissue%20state%3Aopen%20label%3A%22good%20first%20issue%22%20sort%3Aupdated-desc) where you could start out.

 You can also triage issues which may include reproducing bug reports, or asking for vital information such as version numbers or reproduction instructions. If you would like to start triaging issues, one easy way to get started is to [subscribe to pandas on CodeTriage](https://www.codetriage.com/pandas-dev/pandas).

doc/source/whatsnew/v0.4.x.rst

Lines changed: 1 addition & 1 deletion
@@ -11,7 +11,7 @@ New features
 - Added Python 3 support using 2to3 (:issue:`200`)
 - :ref:`Added <dsintro.name_attribute>` ``name`` attribute to ``Series``, now
   prints as part of ``Series.__repr__``
-- :meth:`Series.isnull`` and :meth:`Series.notnull` (:issue:`209`, :issue:`203`)
+- :meth:`Series.isnull` and :meth:`Series.notnull` (:issue:`209`, :issue:`203`)
 - :ref:`Added <basics.align>` ``Series.align`` method for aligning two series
   with choice of join method (ENH56_)
 - :ref:`Added <advanced.get_level_values>` method ``get_level_values`` to

doc/source/whatsnew/v2.0.0.rst

Lines changed: 2 additions & 2 deletions
@@ -984,7 +984,7 @@ Removal of prior version deprecations/changes
 - Removed :meth:`Series.str.__iter__` (:issue:`28277`)
 - Removed ``pandas.SparseArray`` in favor of :class:`arrays.SparseArray` (:issue:`30642`)
 - Removed ``pandas.SparseSeries`` and ``pandas.SparseDataFrame``, including pickle support. (:issue:`30642`)
-- Enforced disallowing passing an integer ``fill_value`` to :meth:`DataFrame.shift` and :meth:`Series.shift`` with datetime64, timedelta64, or period dtypes (:issue:`32591`)
+- Enforced disallowing passing an integer ``fill_value`` to :meth:`DataFrame.shift` and :meth:`Series.shift` with datetime64, timedelta64, or period dtypes (:issue:`32591`)
 - Enforced disallowing a string column label into ``times`` in :meth:`DataFrame.ewm` (:issue:`43265`)
 - Enforced disallowing passing ``True`` and ``False`` into ``inclusive`` in :meth:`Series.between` in favor of ``"both"`` and ``"neither"`` respectively (:issue:`40628`)
 - Enforced disallowing using ``usecols`` with out of bounds indices for ``read_csv`` with ``engine="c"`` (:issue:`25623`)
@@ -1045,7 +1045,7 @@ Removal of prior version deprecations/changes
 - Enforced deprecation of silently dropping columns that raised a ``TypeError`` in :class:`Series.transform` and :class:`DataFrame.transform` when used with a list or dictionary (:issue:`43740`)
 - Changed behavior of :meth:`DataFrame.apply` with list-like so that any partial failure will raise an error (:issue:`43740`)
 - Changed behaviour of :meth:`DataFrame.to_latex` to now use the Styler implementation via :meth:`.Styler.to_latex` (:issue:`47970`)
-- Changed behavior of :meth:`Series.__setitem__` with an integer key and a :class:`Float64Index` when the key is not present in the index; previously we treated the key as positional (behaving like ``series.iloc[key] = val``), now we treat it is a label (behaving like ``series.loc[key] = val``), consistent with :meth:`Series.__getitem__`` behavior (:issue:`33469`)
+- Changed behavior of :meth:`Series.__setitem__` with an integer key and a :class:`Float64Index` when the key is not present in the index; previously we treated the key as positional (behaving like ``series.iloc[key] = val``), now we treat it is a label (behaving like ``series.loc[key] = val``), consistent with :meth:`Series.__getitem__` behavior (:issue:`33469`)
 - Removed ``na_sentinel`` argument from :func:`factorize`, :meth:`.Index.factorize`, and :meth:`.ExtensionArray.factorize` (:issue:`47157`)
 - Changed behavior of :meth:`Series.diff` and :meth:`DataFrame.diff` with :class:`ExtensionDtype` dtypes whose arrays do not implement ``diff``, these now raise ``TypeError`` rather than casting to numpy (:issue:`31025`)
 - Enforced deprecation of calling numpy "ufunc"s on :class:`DataFrame` with ``method="outer"``; this now raises ``NotImplementedError`` (:issue:`36955`)

doc/source/whatsnew/v2.0.3.rst

Lines changed: 1 addition & 1 deletion
@@ -13,7 +13,7 @@ including other versions of pandas.
 Fixed regressions
 ~~~~~~~~~~~~~~~~~
-- Bug in :meth:`Timestamp.weekday`` was returning incorrect results before ``'0000-02-29'`` (:issue:`53738`)
+- Bug in :meth:`Timestamp.weekday` was returning incorrect results before ``'0000-02-29'`` (:issue:`53738`)
 - Fixed performance regression in merging on datetime-like columns (:issue:`53231`)
 - Fixed regression when :meth:`DataFrame.to_string` creates extra space for string dtypes (:issue:`52690`)

doc/source/whatsnew/v2.1.0.rst

Lines changed: 1 addition & 1 deletion
@@ -721,7 +721,7 @@ Conversion
 Strings
 ^^^^^^^
 - Bug in :meth:`Series.str` that did not raise a ``TypeError`` when iterated (:issue:`54173`)
-- Bug in ``repr`` for :class:`DataFrame`` with string-dtype columns (:issue:`54797`)
+- Bug in ``repr`` for :class:`DataFrame` with string-dtype columns (:issue:`54797`)

 Interval
 ^^^^^^^^

doc/source/whatsnew/v3.0.0.rst

Lines changed: 2 additions & 1 deletion
@@ -784,7 +784,7 @@ MultiIndex
 I/O
 ^^^
-- Bug in :class:`DataFrame` and :class:`Series` ``repr`` of :py:class:`collections.abc.Mapping`` elements. (:issue:`57915`)
+- Bug in :class:`DataFrame` and :class:`Series` ``repr`` of :py:class:`collections.abc.Mapping` elements. (:issue:`57915`)
 - Bug in :meth:`.DataFrame.to_json` when ``"index"`` was a value in the :attr:`DataFrame.column` and :attr:`Index.name` was ``None``. Now, this will fail with a ``ValueError`` (:issue:`58925`)
 - Bug in :meth:`.io.common.is_fsspec_url` not recognizing chained fsspec URLs (:issue:`48978`)
 - Bug in :meth:`DataFrame._repr_html_` which ignored the ``"display.float_format"`` option (:issue:`59876`)
@@ -869,6 +869,7 @@ Reshaping
 - Bug in :meth:`DataFrame.merge` when merging two :class:`DataFrame` on ``intc`` or ``uintc`` types on Windows (:issue:`60091`, :issue:`58713`)
 - Bug in :meth:`DataFrame.pivot_table` incorrectly subaggregating results when called without an ``index`` argument (:issue:`58722`)
 - Bug in :meth:`DataFrame.pivot_table` incorrectly ignoring the ``values`` argument when also supplied to the ``index`` or ``columns`` parameters (:issue:`57876`, :issue:`61292`)
+- Bug in :meth:`DataFrame.pivot_table` where ``margins=True`` did not correctly include groups with ``NaN`` values in the index or columns when ``dropna=False`` was explicitly passed. (:issue:`61509`)
 - Bug in :meth:`DataFrame.stack` with the new implementation where ``ValueError`` is raised when ``level=[]`` (:issue:`60740`)
 - Bug in :meth:`DataFrame.unstack` producing incorrect results when manipulating empty :class:`DataFrame` with an :class:`ExtentionDtype` (:issue:`59123`)
 - Bug in :meth:`concat` where concatenating DataFrame and Series with ``ignore_index = True`` drops the series name (:issue:`60723`, :issue:`56257`)
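
The ``pivot_table`` entry added above comes down to how ``groupby`` treats missing keys. The following is a minimal sketch of that behaviour, not the code from this commit: it reuses the frame from the new test in ``test_pivot.py``, and the variable names ``without_nan`` and ``with_nan`` are illustrative only.

```python
import pandas as pd

# Same data as the new test in this commit.
df = pd.DataFrame(
    {"i": [1, 2, 3], "g1": ["a", "b", "b"], "g2": ["x", None, None]}
)

# Default dropna=True: rows whose grouping key is missing are left out,
# so a margin built this way cannot contain a NaN group.
without_nan = df.groupby("g2")["i"].count()

# dropna=False keeps the missing key as its own group.
with_nan = df.groupby("g2", dropna=False)["i"].count()

print(without_nan.sum())  # 1
print(with_nan.sum())  # 3
```

A margin computed with the default therefore counts only one of the three rows, which is consistent with the commit passing ``dropna`` through to every ``groupby`` call that builds a margin.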

pandas/core/reshape/pivot.py

Lines changed: 16 additions & 7 deletions
@@ -396,6 +396,7 @@ def __internal_pivot_table(
             observed=dropna,
             margins_name=margins_name,
             fill_value=fill_value,
+            dropna=dropna,
         )

     # discard the top level
@@ -422,6 +423,7 @@ def _add_margins(
     observed: bool,
     margins_name: Hashable = "All",
     fill_value=None,
+    dropna: bool = True,
 ):
     if not isinstance(margins_name, str):
         raise ValueError("margins_name argument must be a string")
@@ -461,6 +463,7 @@ def _add_margins(
             kwargs,
             observed,
             margins_name,
+            dropna,
         )
         if not isinstance(marginal_result_set, tuple):
             return marginal_result_set
@@ -469,7 +472,7 @@ def _add_margins(
         # no values, and table is a DataFrame
         assert isinstance(table, ABCDataFrame)
         marginal_result_set = _generate_marginal_results_without_values(
-            table, data, rows, cols, aggfunc, kwargs, observed, margins_name
+            table, data, rows, cols, aggfunc, kwargs, observed, margins_name, dropna
         )
         if not isinstance(marginal_result_set, tuple):
             return marginal_result_set
@@ -538,6 +541,7 @@ def _generate_marginal_results(
     kwargs,
     observed: bool,
     margins_name: Hashable = "All",
+    dropna: bool = True,
 ):
     margin_keys: list | Index
     if len(cols) > 0:
@@ -551,7 +555,7 @@ def _all_key(key):
         if len(rows) > 0:
             margin = (
                 data[rows + values]
-                .groupby(rows, observed=observed)
+                .groupby(rows, observed=observed, dropna=dropna)
                 .agg(aggfunc, **kwargs)
             )
             cat_axis = 1
@@ -567,7 +571,7 @@ def _all_key(key):
         else:
             margin = (
                 data[cols[:1] + values]
-                .groupby(cols[:1], observed=observed)
+                .groupby(cols[:1], observed=observed, dropna=dropna)
                 .agg(aggfunc, **kwargs)
                 .T
             )
@@ -610,7 +614,9 @@ def _all_key(key):
     if len(cols) > 0:
         row_margin = (
-            data[cols + values].groupby(cols, observed=observed).agg(aggfunc, **kwargs)
+            data[cols + values]
+            .groupby(cols, observed=observed, dropna=dropna)
+            .agg(aggfunc, **kwargs)
         )
         row_margin = row_margin.stack()

@@ -633,6 +639,7 @@ def _generate_marginal_results_without_values(
     kwargs,
     observed: bool,
     margins_name: Hashable = "All",
+    dropna: bool = True,
 ):
     margin_keys: list | Index
     if len(cols) > 0:
@@ -645,7 +652,7 @@ def _all_key():
             return (margins_name,) + ("",) * (len(cols) - 1)

         if len(rows) > 0:
-            margin = data.groupby(rows, observed=observed)[rows].apply(
+            margin = data.groupby(rows, observed=observed, dropna=dropna)[rows].apply(
                 aggfunc, **kwargs
             )
             all_key = _all_key()
@@ -654,7 +661,9 @@ def _all_key():
             margin_keys.append(all_key)

         else:
-            margin = data.groupby(level=0, observed=observed).apply(aggfunc, **kwargs)
+            margin = data.groupby(level=0, observed=observed, dropna=dropna).apply(
+                aggfunc, **kwargs
+            )
             all_key = _all_key()
             table[all_key] = margin
             result = table
@@ -665,7 +674,7 @@ def _all_key():
         margin_keys = table.columns

     if len(cols):
-        row_margin = data.groupby(cols, observed=observed)[cols].apply(
+        row_margin = data.groupby(cols, observed=observed, dropna=dropna)[cols].apply(
             aggfunc, **kwargs
         )
     else:

pandas/tests/reshape/test_crosstab.py

Lines changed: 5 additions & 5 deletions
@@ -289,7 +289,7 @@ def test_margin_dropna4(self):
         # GH: 10772: Keep np.nan in result with dropna=False
         df = DataFrame({"a": [1, 2, 2, 2, 2, np.nan], "b": [3, 3, 4, 4, 4, 4]})
         actual = crosstab(df.a, df.b, margins=True, dropna=False)
-        expected = DataFrame([[1, 0, 1.0], [1, 3, 4.0], [0, 1, np.nan], [2, 4, 6.0]])
+        expected = DataFrame([[1, 0, 1], [1, 3, 4], [0, 1, 1], [2, 4, 6]])
         expected.index = Index([1.0, 2.0, np.nan, "All"], name="a")
         expected.columns = Index([3, 4, "All"], name="b")
         tm.assert_frame_equal(actual, expected)
@@ -301,11 +301,11 @@ def test_margin_dropna5(self):
         )
         actual = crosstab(df.a, df.b, margins=True, dropna=False)
         expected = DataFrame(
-            [[1, 0, 0, 1.0], [0, 1, 0, 1.0], [0, 3, 1, np.nan], [1, 4, 0, 6.0]]
+            [[1, 0, 0, 1.0], [0, 1, 0, 1.0], [0, 3, 1, 4.0], [1, 4, 1, 6.0]]
         )
         expected.index = Index([1.0, 2.0, np.nan, "All"], name="a")
         expected.columns = Index([3.0, 4.0, np.nan, "All"], name="b")
-        tm.assert_frame_equal(actual, expected)
+        tm.assert_frame_equal(actual, expected, check_dtype=False)

     def test_margin_dropna6(self):
         # GH: 10772: Keep np.nan in result with dropna=False
@@ -326,7 +326,7 @@ def test_margin_dropna6(self):
             names=["b", "c"],
         )
         expected = DataFrame(
-            [[1, 0, 1, 0, 0, 0, 2], [2, 0, 1, 1, 0, 1, 5], [3, 0, 2, 1, 0, 0, 7]],
+            [[1, 0, 1, 0, 0, 0, 2], [2, 0, 1, 1, 0, 1, 5], [3, 0, 2, 1, 0, 1, 7]],
             columns=m,
         )
         expected.index = Index(["bar", "foo", "All"], name="a")
@@ -349,7 +349,7 @@ def test_margin_dropna6(self):
                 [0, 0, np.nan],
                 [2, 0, 2.0],
                 [1, 1, 2.0],
-                [0, 1, np.nan],
+                [0, 1, 1.0],
                 [5, 2, 7.0],
             ],
             index=m,
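
The corrected expectations in these crosstab tests follow from each margin being a plain row or column sum once the NaN group is kept. Below is a pandas-free sketch of that arithmetic for the ``test_margin_dropna4`` data. It assumes ``None`` as a stand-in for ``np.nan`` so that the keys compare equal, and it illustrates the expected numbers only, not how ``crosstab`` is implemented.

```python
from collections import Counter

# Data from test_margin_dropna4; None stands in for np.nan (assumption).
a = [1, 2, 2, 2, 2, None]
b = [3, 3, 4, 4, 4, 4]

# Count each (a, b) pair; absent pairs read as 0 from a Counter.
cells = Counter(zip(a, b))
row_keys = [1, 2, None]
col_keys = [3, 4]

# Body of the table, then an "All" column holding each row's sum.
table = [[cells[(r, c)] for c in col_keys] for r in row_keys]
table = [row + [sum(row)] for row in table]

# "All" row: column-wise sums over the rows built so far.
table.append([sum(col) for col in zip(*table)])

print(table)  # [[1, 0, 1], [1, 3, 4], [0, 1, 1], [2, 4, 6]]
```

The third row is the missing-key group: its margin is 1 rather than NaN, matching the updated ``expected`` frame in ``test_margin_dropna4``.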

pandas/tests/reshape/test_pivot.py

Lines changed: 30 additions & 0 deletions
@@ -2585,6 +2585,36 @@ def test_pivot_table_values_as_two_params(
         expected = DataFrame(data=e_data, index=e_index, columns=e_cols)
         tm.assert_frame_equal(result, expected)

+    def test_pivot_table_margins_include_nan_groups(self):
+        # GH#61509
+        df = DataFrame(
+            {
+                "i": [1, 2, 3],
+                "g1": ["a", "b", "b"],
+                "g2": ["x", None, None],
+            }
+        )
+
+        result = df.pivot_table(
+            index="g1",
+            columns="g2",
+            values="i",
+            aggfunc="count",
+            dropna=False,
+            margins=True,
+        )
+
+        expected = DataFrame(
+            {
+                "x": {"a": 1.0, "b": np.nan, "All": 1.0},
+                np.nan: {"a": np.nan, "b": 2.0, "All": 2.0},
+                "All": {"a": 1.0, "b": 2.0, "All": 3.0},
+            }
+        )
+        expected.index.name = "g1"
+        expected.columns.name = "g2"
+        tm.assert_frame_equal(result, expected, check_dtype=False)
+

 class TestPivot:
     def test_pivot(self):
