@@ -194,195 +194,6 @@ levels <merging.merge_on_columns_and_levels>` documentation section.
194194
195195 .. _whatsnew_0230.enhancements.sort_by_columns_and_levels:
196196
197- Sorting by a combination of columns and index levels
198- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
199-
200- Strings passed to :meth:`DataFrame.sort_values` as the ``by`` parameter may
201- now refer to either column names or index level names. This enables sorting
202- ``DataFrame`` instances by a combination of index levels and columns without
203- resetting indexes. See the :ref:`Sorting by Indexes and Values
204- <basics.sort_indexes_and_values>` documentation section.
205- (:issue:`14353`)
206-
207- .. ipython:: python
208-
209-    # Build MultiIndex
210-    idx = pd.MultiIndex.from_tuples([('a', 1), ('a', 2), ('a', 2),
211-                                     ('b', 2), ('b', 1), ('b', 1)])
212-    idx.names = ['first', 'second']
213-
214-    # Build DataFrame
215-    df_multi = pd.DataFrame({'A': np.arange(6, 0, -1)},
216-                            index=idx)
217-    df_multi
218-
219-    # Sort by 'second' (index) and 'A' (column)
220-    df_multi.sort_values(by=['second', 'A'])
221-
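As a rough sketch of what this enhancement replaces, the example below sorts the same frame twice: once by naming the index level directly, and once with the older pattern of resetting and restoring the index. The names ``by_name`` and ``by_reset`` are illustrative only and are not part of the pandas documentation.

```python
import numpy as np
import pandas as pd

# Same data as the example above: a two-level index and one column 'A'.
idx = pd.MultiIndex.from_tuples(
    [("a", 1), ("a", 2), ("a", 2), ("b", 2), ("b", 1), ("b", 1)],
    names=["first", "second"],
)
df_multi = pd.DataFrame({"A": np.arange(6, 0, -1)}, index=idx)

# Direct form: 'second' is an index level name, 'A' is a column name.
by_name = df_multi.sort_values(by=["second", "A"])

# Older workaround (assumed here for comparison): move the levels into
# columns, sort, then rebuild the index.
by_reset = (
    df_multi.reset_index()
    .sort_values(by=["second", "A"])
    .set_index(["first", "second"])
)

print(by_name)
print(by_reset)
```

Both frames should list the rows in the same order; the direct form simply avoids the round trip through ``reset_index``.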
222-
223- .. _whatsnew_023.enhancements.extension:
224-
225- Extending pandas with custom types (experimental)
226- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
227-
228- pandas now supports storing array-like objects that aren't necessarily 1-D NumPy
229- arrays as columns in a DataFrame or values in a Series. This allows third-party
230- libraries to implement extensions to NumPy's types, similar to how pandas
231- implemented categoricals, datetimes with timezones, periods, and intervals.
232-
233- As a demonstration, we'll use cyberpandas_, which provides an ``IPArray`` type
234- for storing IP addresses.
235-
236- .. code-block:: ipython
237-
238-    In [1]: from cyberpandas import IPArray
239-
240-    In [2]: values = IPArray([
241-       ...:     0,
242-       ...:     3232235777,
243-       ...:     42540766452641154071740215577757643572
244-       ...: ])
245-       ...:
246-       ...:
247-
248- ``IPArray`` isn't a normal 1-D NumPy array, but because it's a pandas
249- :class:`~pandas.api.extensions.ExtensionArray`, it can be stored properly inside pandas' containers.
250-
251- .. code-block:: ipython
252-
253-    In [3]: ser = pd.Series(values)
254-
255-    In [4]: ser
256-    Out[4]:
257-    0                         0.0.0.0
258-    1                     192.168.1.1
259-    2    2001:db8:85a3::8a2e:370:7334
260-    dtype: ip
261-
262- Notice that the dtype is ``ip``. The missing value semantics of the underlying
263- array are respected:
264-
265- .. code-block:: ipython
266-
267-    In [5]: ser.isna()
268-    Out[5]:
269-    0     True
270-    1    False
271-    2    False
272-    dtype: bool
273-
274- For more, see the :ref:`extension types <extending.extension-types>`
275- documentation. If you build an extension array, publicize it on `the ecosystem page <https://pandas.pydata.org/community/ecosystem.html>`_.
276-
277- .. _cyberpandas: https://cyberpandas.readthedocs.io/en/latest/
278-
279-
280- .. _whatsnew_0230.enhancements.categorical_grouping:
281-
282- New ``observed`` keyword for excluding unobserved categories in ``GroupBy``
283- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
284-
285- Grouping by a categorical includes the unobserved categories in the output.
286- When grouping by multiple categorical columns, this means you get the Cartesian product of all the
287- categories, including combinations where there are no observations, which can result in a large
288- number of groups. We have added a keyword ``observed`` to control this behavior; it defaults to
289- ``observed=False`` for backward compatibility. (:issue:`14942`, :issue:`8138`, :issue:`15217`, :issue:`17594`, :issue:`8669`, :issue:`20583`, :issue:`20902`)
290-
291- .. ipython:: python
292-
293-    cat1 = pd.Categorical(["a", "a", "b", "b"],
294-                          categories=["a", "b", "z"], ordered=True)
295-    cat2 = pd.Categorical(["c", "d", "c", "d"],
296-                          categories=["c", "d", "y"], ordered=True)
297-    df = pd.DataFrame({"A": cat1, "B": cat2, "values": [1, 2, 3, 4]})
298-    df['C'] = ['foo', 'bar'] * 2
299-    df
300-
301- To show all values, the previous behavior:
302-
303- .. ipython:: python
304-
305-    df.groupby(['A', 'B', 'C'], observed=False).count()
306-
307-
308- To show only observed values:
309-
310- .. ipython:: python
311-
312-    df.groupby(['A', 'B', 'C'], observed=True).count()
313-
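To make the size difference concrete, here is a small sketch that counts the groups produced under each setting, using two categorical keys with three categories each. The names ``all_groups`` and ``seen_groups`` are illustrative and not taken from the pandas documentation.

```python
import pandas as pd

# Two categorical keys, each with one category that never occurs ('z', 'y').
cat1 = pd.Categorical(["a", "a", "b", "b"],
                      categories=["a", "b", "z"], ordered=True)
cat2 = pd.Categorical(["c", "d", "c", "d"],
                      categories=["c", "d", "y"], ordered=True)
df = pd.DataFrame({"A": cat1, "B": cat2, "values": [1, 2, 3, 4]})

# observed=False keeps every combination of categories (3 x 3 groups).
all_groups = df.groupby(["A", "B"], observed=False)["values"].count()

# observed=True keeps only the combinations present in the data.
seen_groups = df.groupby(["A", "B"], observed=True)["values"].count()

print(len(all_groups), len(seen_groups))
```

With more keys or more categories the gap between the two grows multiplicatively, which is the case where ``observed=True`` is likely to matter most.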
314- For pivoting operations, this behavior is *already* controlled by the ``dropna`` keyword:
315-
316- .. ipython:: python
317-
318-    cat1 = pd.Categorical(["a", "a", "b", "b"],
319-                          categories=["a", "b", "z"], ordered=True)
320-    cat2 = pd.Categorical(["c", "d", "c", "d"],
321-                          categories=["c", "d", "y"], ordered=True)
322-    df = pd.DataFrame({"A": cat1, "B": cat2, "values": [1, 2, 3, 4]})
323-    df
324-
325-
326- .. code-block:: ipython
327-
328-    In [1]: pd.pivot_table(df, values='values', index=['A', 'B'], dropna=True)
329-
330-    Out[1]:
331-         values
332-    A B
333-    a c     1.0
334-      d     2.0
335-    b c     3.0
336-      d     4.0
337-
338-    In [2]: pd.pivot_table(df, values='values', index=['A', 'B'], dropna=False)
339-
340-    Out[2]:
341-         values
342-    A B
343-    a c     1.0
344-      d     2.0
345-      y     NaN
346-    b c     3.0
347-      d     4.0
348-      y     NaN
349-    z c     NaN
350-      d     NaN
351-      y     NaN
352-
353-
354- .. _whatsnew_0230.enhancements.window_raw:
355-
356- Rolling/Expanding.apply() accepts ``raw=False`` to pass a ``Series`` to the function
357- ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
358-
359- :func:`Series.rolling().apply() <.Rolling.apply>`, :func:`DataFrame.rolling().apply() <.Rolling.apply>`,
360- :func:`Series.expanding().apply() <.Expanding.apply>`, and :func:`DataFrame.expanding().apply() <.Expanding.apply>` have gained a ``raw=None`` parameter.
361- This is similar to :func:`DataFrame.apply`. If ``True``, this parameter sends an ``np.ndarray`` to the applied function; if ``False``, a ``Series`` is passed. The
362- default is ``None``, which preserves backward compatibility by behaving like ``True`` and sending an ``np.ndarray``.
363- In a future version the default will be changed to ``False``, sending a ``Series``. (:issue:`5071`, :issue:`20584`)
364-
365- .. ipython:: python
366-
367-    s = pd.Series(np.arange(5), np.arange(5) + 1)
368-    s
369-
370- Pass a ``Series``:
371-
372- .. ipython:: python
373-
374-    s.rolling(2, min_periods=1).apply(lambda x: x.iloc[-1], raw=False)
375-
376- Mimic the original behavior of passing an ndarray:
377-
378- .. ipython:: python
379-
380-    s.rolling(2, min_periods=1).apply(lambda x: x[-1], raw=True)
381-
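One practical difference between the two modes is that a ``Series`` window carries its index labels, while an ndarray window holds only the values. The sketch below illustrates this; the helper names ``last_label`` and ``last_value`` are hypothetical, and ``raw`` is passed explicitly so the result does not depend on the default.

```python
import numpy as np
import pandas as pd

# Values 0..4 with index labels 1..5, as in the example above.
s = pd.Series(np.arange(5), index=np.arange(5) + 1)


def last_label(window):
    # raw=False: window is a Series, so the index labels are available.
    return window.index[-1]


def last_value(window):
    # raw=True: window is a plain ndarray, so only positions exist.
    return window[-1]


labels = s.rolling(2, min_periods=1).apply(last_label, raw=False)
values = s.rolling(2, min_periods=1).apply(last_value, raw=True)

print(labels)
print(values)
```

When the function needs only the values, ``raw=True`` skips building a ``Series`` for every window and is usually the faster choice.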
382-
383- .. _whatsnew_0210.enhancements.limit_area:
384-
385-
386197 .. _whatsnew_0.23.0.contributors:
387198
388199 Contributors