Open
Labels: Enhancement, Needs Triage (issue that has not been reviewed by a pandas team member)
Description
Feature Type
- Adding new functionality to pandas
- Changing existing functionality in pandas
- Removing existing functionality in pandas
Problem Description
Currently, df.explode() and s.explode() flatten list-like values within a Series/DataFrame into one row per element. However, the position each element held within its original list is lost. This makes it difficult to:
- Easily access specific sub-values after exploding.
- Reconstruct the original nested structure if needed.
Proposed Solution:
Introduce a new parameter, offset, to both df.explode() and s.explode().
Example Usage:
>>> import pandas as pd
>>> s = pd.Series([[1, 2, 3], 'foo', [], [3, 4]])
>>> s
0    [1, 2, 3]
1          foo
2           []
3       [3, 4]
dtype: object
>>> s.explode()  # <- Current behavior
0      1
0      2
0      3
1    foo
2    NaN
3      3
3      4
dtype: object
>>> s.explode(offset=True)  # <- With proposed feature
0  1      1
   2      2
   3      3
1  1    foo
2  1    NaN
3  1      3
   2      4
dtype: object
Feature Description
Introduce a new parameter, offset, to both df.explode() and s.explode().
def explode(self, ..., offset: bool = False):  # Default to False for backward compatibility
    """
    Parameters
    ----------
    ...
    offset : bool, default False
        If True, include each element's offset within its original
        list-like as an extra level in the resulting MultiIndex.
    """
Alternative Solutions
While it's technically possible to infer the offset in some cases, it requires additional steps and assumptions about the data. The offset parameter provides a direct, intuitive solution.
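For reference, here is a rough sketch of the kind of workaround meant above, using only existing pandas calls. The helper name explode_with_offset is hypothetical and not part of pandas. It assumes the original index labels are unique, since otherwise rows from different source lists would be counted as one group, and it numbers offsets from 1 to match the example in the problem description.

```python
import pandas as pd


def explode_with_offset(s: pd.Series) -> pd.Series:
    """Hypothetical helper: emulate the proposed s.explode(offset=True).

    Assumes the index of `s` has unique labels.
    """
    exploded = s.explode()
    # Number the rows within each original index label: 0, 1, 2, ...
    # then shift to 1-based to match the example in this issue.
    offset = exploded.groupby(level=0).cumcount() + 1
    # Combine the original label and the offset into a two-level index.
    exploded.index = pd.MultiIndex.from_arrays(
        [exploded.index, offset.to_numpy()],
        names=[s.index.name, "offset"],
    )
    return exploded


s = pd.Series([[1, 2, 3], "foo", [], [3, 4]])
result = explode_with_offset(s)
print(result)
```

This works for the example Series, but it relies on the uniqueness assumption and takes three steps where a built-in offset parameter would take one.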
Additional Context
No response