BUG: DataFrameGroupBy.apply ignores group_keys setting when empty #60471

@ManelBH

Pandas version checks

  • I have checked that this issue has not already been reported.

  • I have confirmed this bug exists on the latest version of pandas.

  • I have confirmed this bug exists on the main branch of pandas.

Reproducible Example

import pandas as pd

# Non-empty frame, grouped with group_keys=False
df = pd.DataFrame({'A': 'a a b'.split(), 'B': [1, 2, 3], 'C': [4, 6, 5]})
g1 = df.groupby('A', group_keys=False)

# Empty frame, grouped with both settings of group_keys
df = pd.DataFrame({'A': [], 'B': [], 'C': []})
g2 = df.groupby('A', group_keys=False)
g3 = df.groupby('A', group_keys=True)

r1 = g1.apply(lambda x: x / x.sum())
r2 = g2.apply(lambda x: x / x.sum())
r3 = g3.apply(lambda x: x / x.sum())

print(r1.index) # Index([0, 1, 2], dtype='int64')
print(r2.index) # Index([], dtype='float64', name='A')
print(r3.index) # Index([], dtype='float64', name='A')

Issue Description

The group_keys parameter has no effect when the source DataFrame is empty: the group key ends up in the result's index for both group_keys=False and group_keys=True.
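To make the symptom easier to check across versions, here is a minimal sketch of a helper that reports whether the grouping key appears in the result's index names. The name `key_in_index` is hypothetical, and the explicit `[["B", "C"]]` column selection is an assumption added for illustration (it keeps the grouping column out of the applied function); neither is part of the original example.

```python
import pandas as pd


def key_in_index(df: pd.DataFrame, group_keys: bool) -> bool:
    """Hypothetical helper: True if 'A' appears in the result's index names."""
    # Assumption: select the value columns explicitly so the lambda never
    # sees the string grouping column.
    grouped = df.groupby("A", group_keys=group_keys)[["B", "C"]]
    result = grouped.apply(lambda x: x / x.sum())
    return "A" in result.index.names


non_empty = pd.DataFrame({"A": ["a", "a", "b"], "B": [1, 2, 3], "C": [4, 6, 5]})
empty = pd.DataFrame({"A": [], "B": [], "C": []})

print(key_in_index(non_empty, group_keys=True))
print(key_in_index(non_empty, group_keys=False))
print(key_in_index(empty, group_keys=False))
```

For the non-empty frame, the two calls should differ according to group_keys. Going by the report above, the last call would be expected to return True on affected versions, even though group_keys=False was requested.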

Expected Behavior

group_keys=False should not include the group keys in the index, regardless of whether the source DataFrame is empty.
I would expect a result such as either of the following:

print(r2.index) # Index([], dtype='float64')
print(r2.index) # RangeIndex(start=0, stop=0, step=1)
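Until the behavior changes, one possible workaround is to short-circuit on empty input. The sketch below is only an illustration: `apply_without_group_keys` and `normalize` are hypothetical names, and returning the empty value columns unchanged is an assumption about what the "right" empty result should be, not something pandas itself does.

```python
import pandas as pd


def apply_without_group_keys(df, key, value_cols, func):
    """Hypothetical wrapper around groupby(...).apply(...) with group_keys=False."""
    if df.empty:
        # Assumption: for empty input, hand back the (empty) value columns
        # with their original index, so the key never lands in the index.
        return df[value_cols].copy()
    return df.groupby(key, group_keys=False)[value_cols].apply(func)


def normalize(x):
    # Same operation as the lambda in the reproducible example.
    return x / x.sum()


empty = pd.DataFrame({"A": [], "B": [], "C": []})
result = apply_without_group_keys(empty, "A", ["B", "C"], normalize)
print(result.index)
```

The trade-off is that the empty branch never calls `func`, so the column dtypes of the empty result are those of the input rather than whatever `func` would have produced.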

Installed Versions

INSTALLED VERSIONS

commit : 0691c5c
python : 3.10.11
python-bits : 64
OS : Windows
OS-release : 10
Version : 10.0.22631
machine : AMD64
processor : Intel64 Family 6 Model 141 Stepping 1, GenuineIntel
byteorder : little
LC_ALL : None
LANG : None
LOCALE : es_ES.cp1252

pandas : 2.2.3
numpy : 1.26.4
pytz : 2024.2
dateutil : 2.9.0.post0
pip : 23.0.1
Cython : None
sphinx : None
IPython : 8.30.0
adbc-driver-postgresql: None
...
zstandard : None
tzdata : 2024.2
qtpy : None
pyqt5 : None

Labels

Apply (Apply, Aggregate, Transform, Map), Bug, Groupby
