ENH: Add Series method to explode a list-like column (#27267)
* [ENH] Add DataFrame method to explode a list-like column (GH #16538)
Sometimes a values column is presented with list-like values on one row.
Instead we may want to split each individual value onto its own row,
keeping the same mapping to the other key columns. While it's possible
to chain together existing pandas operations (in fact that's exactly
what this implementation is) to do this, the sequence of operations
is not obvious. By contrast, this is available as a built-in operation
in, say, Spark, and is a fairly common use case.
* move to Series
* handle generic list-like
* lint on asv
* move is_list_like to cython and share impl
* moar docs
* test larger sides to avoid a segfault
* fix ref
* typos
* benchmarks wrong
* add inversion
* add usecase
* cimport is_list_like
* use cimports
* doc-string
* docs & lint
* isort
* clean object check & update doc-strings
* lint
* test for nested
* better test
* try adding frame
* test for nested EA
* lint
* remove multi subset support
* update docs
* doc-string
* add test for MI
* lint and docs
* ordering
* moar lint
* multi-index column support
* 32-bit compat
* moar 32-bit compat
We can 'explode' the ``values`` column, transforming each list-like to a separate row, by using :meth:`~Series.explode`. This will replicate the index values from the original row:

.. ipython:: python

   df['values'].explode()

You can also explode the column in the ``DataFrame``.

.. ipython:: python

   df.explode('values')

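The ``df`` used above is defined earlier in the guide and is not shown in this hunk. As a minimal sketch, assuming a hypothetical frame with a ``keys`` column and a list-valued ``values`` column, the two calls could be exercised like this:

```python
import pandas as pd

# Hypothetical sample data; the frame used in the docs is defined outside this hunk.
df = pd.DataFrame({'keys': ['k1', 'k2'],
                   'values': [['a', 'b'], ['c', 'd', 'e']]})

# Series.explode: one row per list element, original index labels repeated.
exploded_series = df['values'].explode()

# DataFrame.explode: same expansion, with the other columns replicated.
exploded_frame = df.explode('values')

print(exploded_series)
print(exploded_frame)
```

Both results keep the original index labels, so rows that came from the same list can still be grouped back together by index.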
:meth:`Series.explode` will replace empty lists with ``np.nan`` and preserve scalar entries. The dtype of the resulting ``Series`` is always ``object``.

.. ipython:: python

   s = pd.Series([[1, 2, 3], 'foo', [], ['a', 'b']])
   s
   s.explode()

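To make these rules concrete, here is a small sketch that inspects the result of the same call (the name ``exploded`` is chosen only for illustration):

```python
import pandas as pd

s = pd.Series([[1, 2, 3], 'foo', [], ['a', 'b']])
exploded = s.explode()

# Index labels repeat once per list element; the scalar 'foo' keeps its
# single row, and the empty list at label 2 becomes a single NaN row.
print(exploded.index.tolist())   # [0, 0, 0, 1, 2, 3, 3]
print(exploded.isna().tolist())  # only the row from the empty list is True
print(exploded.dtype)            # object
```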
Here is a typical use case: you have comma-separated strings in a column and want to expand them into rows.

.. ipython:: python

   df = pd.DataFrame([{'var1': 'a,b,c', 'var2': 1},
                      {'var1': 'd,e,f', 'var2': 2}])
   df

Creating a long-form ``DataFrame`` is now straightforward using ``explode`` and chained operations.
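
One way to chain the operations, sketched here with the same sample frame, is to split the strings into lists with ``Series.str.split`` and then explode the resulting column (``long_df`` is a name chosen only for illustration):

```python
import pandas as pd

df = pd.DataFrame([{'var1': 'a,b,c', 'var2': 1},
                   {'var1': 'd,e,f', 'var2': 2}])

# Split each comma-separated string into a list, then give every list
# element its own row; var2 and the index labels are replicated alongside.
long_df = df.assign(var1=df.var1.str.split(',')).explode('var1')
print(long_df)
```

Because ``assign`` returns a new frame, the original ``df`` is left untouched and the whole transformation stays a single expression.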
doc/source/whatsnew/v0.25.0.rst
22 additions & 0 deletions
@@ -182,6 +182,28 @@ The repr now looks like this:
    json_normalize(data, max_level=1)


.. _whatsnew_0250.enhancements.explode:

Series.explode to split list-like values to rows
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

:class:`Series` and :class:`DataFrame` have gained the :meth:`Series.explode` and :meth:`DataFrame.explode` methods to transform list-likes to individual rows. See the :ref:`section on Exploding list-like column <reshaping.explode>` in the docs for more information (:issue:`16538`, :issue:`10511`).

Here is a typical use case: you have comma-separated strings in a column.

.. ipython:: python

    df = pd.DataFrame([{'var1': 'a,b,c', 'var2': 1},
                       {'var1': 'd,e,f', 'var2': 2}])
    df

Creating a long-form ``DataFrame`` is now straightforward using chained operations.