Commit 8d390df

Merge remote-tracking branch 'upstream/master' into drop_py2_ci
2 parents 7827b71 + 14a2da1 commit 8d390df

8 files changed: +140 -82 lines changed

Makefile

Lines changed: 0 additions & 1 deletion
@@ -23,4 +23,3 @@ doc:
 	cd doc; \
 	python make.py clean; \
 	python make.py html
-	python make.py spellcheck

ci/code_checks.sh

Lines changed: 1 addition & 1 deletion
@@ -206,7 +206,7 @@ if [[ -z "$CHECK" || "$CHECK" == "doctests" ]]; then
 
     MSG='Doctests frame.py' ; echo $MSG
     pytest -q --doctest-modules pandas/core/frame.py \
-        -k"-axes -combine -itertuples -join -pivot_table -query -reindex -reindex_axis -round"
+        -k" -itertuples -join -reindex -reindex_axis -round"
     RET=$(($RET + $?)) ; echo $MSG "DONE"
 
     MSG='Doctests series.py' ; echo $MSG

doc/source/whatsnew/v0.24.1.rst

Lines changed: 7 additions & 5 deletions
@@ -50,16 +50,19 @@ The `sort` option for :meth:`Index.intersection` has changed in three ways.
 Fixed Regressions
 ~~~~~~~~~~~~~~~~~
 
-- Bug in :meth:`DataFrame.itertuples` with ``records`` orient raising an ``AttributeError`` when the ``DataFrame`` contained more than 255 columns (:issue:`24939`)
-- Bug in :meth:`DataFrame.itertuples` orient converting integer column names to strings prepended with an underscore (:issue:`24940`)
+- Fixed regression in :meth:`DataFrame.to_dict` with ``records`` orient raising an
+  ``AttributeError`` when the ``DataFrame`` contained more than 255 columns, or
+  wrongly converting column names that were not valid python identifiers (:issue:`24939`, :issue:`24940`).
 - Fixed regression in :func:`read_sql` when passing certain queries with MySQL/pymysql (:issue:`24988`).
 - Fixed regression in :class:`Index.intersection` incorrectly sorting the values by default (:issue:`24959`).
 - Fixed regression in :func:`merge` when merging an empty ``DataFrame`` with multiple timezone-aware columns on one of the timezone-aware columns (:issue:`25014`).
 - Fixed regression in :meth:`Series.rename_axis` and :meth:`DataFrame.rename_axis` where passing ``None`` failed to remove the axis name (:issue:`25034`)
+- Fixed regression in :func:`to_timedelta` with `box=False` incorrectly returning a ``datetime64`` object instead of a ``timedelta64`` object (:issue:`24961`)
 
-**Timedelta**
+.. _whatsnew_0241.bug_fixes:
 
-- Bug in :func:`to_timedelta` with `box=False` incorrectly returning a ``datetime64`` object instead of a ``timedelta64`` object (:issue:`24961`)
+Bug Fixes
+~~~~~~~~~
 
 **Reshaping**
 
@@ -69,7 +72,6 @@ Fixed Regressions
 
 - Fixed the warning for implicitly registered matplotlib converters not showing. See :ref:`whatsnew_0211.converters` for more (:issue:`24963`).
 
-
 **Other**
 
 - Fixed AttributeError when printing a DataFrame's HTML repr after accessing the IPython config object (:issue:`25036`)
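
The `DataFrame.to_dict` entry in the release notes above can be checked with a short script. This is a minimal sketch, assuming a pandas release that already contains the fix; the two frames and their column labels are invented for illustration and are not taken from the commit or the linked issues.

```python
import pandas as pd

# Hypothetical frame with more than 255 columns (integer labels 0..299).
wide = pd.DataFrame([list(range(300))])
wide_records = wide.to_dict(orient="records")

# Hypothetical frame whose column labels are not valid Python identifiers.
odd = pd.DataFrame({"my col": [1, 2], 0: [3, 4]})
odd_records = odd.to_dict(orient="records")

# Each record is a plain dict keyed by the original column labels.
print(len(wide_records[0]))
print(sorted(odd_records[0], key=str))
```

With the fix in place, both calls should return without raising, and the keys of each record should be the original labels rather than renamed strings.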

pandas/core/arrays/categorical.py

Lines changed: 2 additions & 5 deletions
@@ -2321,8 +2321,7 @@ def _values_for_factorize(self):
     @classmethod
     def _from_factorized(cls, uniques, original):
         return original._constructor(original.categories.take(uniques),
-                                     categories=original.categories,
-                                     ordered=original.ordered)
+                                     dtype=original.dtype)
 
     def equals(self, other):
         """
@@ -2674,9 +2673,7 @@ def _factorize_from_iterable(values):
     if is_categorical(values):
         if isinstance(values, (ABCCategoricalIndex, ABCSeries)):
             values = values._values
-        categories = CategoricalIndex(values.categories,
-                                      categories=values.categories,
-                                      ordered=values.ordered)
+        categories = CategoricalIndex(values.categories, dtype=values.dtype)
         codes = values.codes
     else:
         # The value of ordered is irrelevant since we don't use cat as such,
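
Both hunks in this file replace the `categories=`/`ordered=` keyword pair with a single `dtype=` argument. The sketch below illustrates why the two spellings are interchangeable, using only the public `pd.Categorical` and `pd.CategoricalIndex` constructors. The sample values are made up, and the snippet does not touch the private helpers changed in the diff.

```python
import pandas as pd

# Hypothetical ordered categorical with a non-alphabetical category order.
original = pd.Categorical(["b", "a", "b"], categories=["c", "b", "a"],
                          ordered=True)

# Spelling the categories and ordering out explicitly ...
explicit = pd.Categorical(["a", "b"], categories=original.categories,
                          ordered=original.ordered)

# ... or passing the CategoricalDtype, which carries both pieces of state.
via_dtype = pd.Categorical(["a", "b"], dtype=original.dtype)

# The same holds for CategoricalIndex, as used in _factorize_from_iterable.
index_via_dtype = pd.CategoricalIndex(original.categories,
                                      dtype=original.dtype)

print(explicit.dtype == via_dtype.dtype)
print(index_via_dtype.ordered)
```

Passing the dtype keeps the categories and the `ordered` flag together, so the two cannot drift apart at a call site.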

pandas/core/frame.py

Lines changed: 102 additions & 61 deletions
@@ -483,7 +483,7 @@ def axes(self):
         --------
         >>> df = pd.DataFrame({'col1': [1, 2], 'col2': [3, 4]})
         >>> df.axes
-        [RangeIndex(start=0, stop=2, step=1), Index(['coll', 'col2'],
+        [RangeIndex(start=0, stop=2, step=1), Index(['col1', 'col2'],
         dtype='object')]
         """
         return [self.index, self.columns]
@@ -3016,28 +3016,30 @@ def query(self, expr, inplace=False, **kwargs):
 
         Parameters
         ----------
-        expr : string
+        expr : str
             The query string to evaluate. You can refer to variables
             in the environment by prefixing them with an '@' character like
             ``@a + b``.
         inplace : bool
             Whether the query should modify the data in place or return
-            a modified copy
-
-            .. versionadded:: 0.18.0
-
-        kwargs : dict
+            a modified copy.
+        **kwargs
             See the documentation for :func:`pandas.eval` for complete details
             on the keyword arguments accepted by :meth:`DataFrame.query`.
 
+            .. versionadded:: 0.18.0
+
         Returns
         -------
-        q : DataFrame
+        DataFrame
+            DataFrame resulting from the provided query expression.
 
         See Also
         --------
-        pandas.eval
-        DataFrame.eval
+        eval : Evaluate a string describing operations on
+            DataFrame columns.
+        DataFrame.eval : Evaluate a string describing operations on
+            DataFrame columns.
 
         Notes
         -----
@@ -3076,9 +3078,23 @@ def query(self, expr, inplace=False, **kwargs):
 
         Examples
         --------
-        >>> df = pd.DataFrame(np.random.randn(10, 2), columns=list('ab'))
-        >>> df.query('a > b')
-        >>> df[df.a > df.b]  # same result as the previous expression
+        >>> df = pd.DataFrame({'A': range(1, 6), 'B': range(10, 0, -2)})
+        >>> df
+           A   B
+        0  1  10
+        1  2   8
+        2  3   6
+        3  4   4
+        4  5   2
+        >>> df.query('A > B')
+           A  B
+        4  5  2
+
+        The previous expression is equivalent to
+
+        >>> df[df.A > df.B]
+           A  B
+        4  5  2
         """
         inplace = validate_bool_kwarg(inplace, 'inplace')
         if not isinstance(expr, compat.string_types):
@@ -5142,8 +5158,7 @@ def _combine_const(self, other, func):
 
     def combine(self, other, func, fill_value=None, overwrite=True):
         """
-        Perform column-wise combine with another DataFrame based on a
-        passed function.
+        Perform column-wise combine with another DataFrame.
 
         Combines a DataFrame with `other` DataFrame using `func`
         to element-wise combine columns. The row and column indexes of the
51595174
fill_value : scalar value, default None
51605175
The value to fill NaNs with prior to passing any column to the
51615176
merge func.
5162-
overwrite : boolean, default True
5177+
overwrite : bool, default True
51635178
If True, columns in `self` that do not exist in `other` will be
51645179
overwritten with NaNs.
51655180
51665181
Returns
51675182
-------
5168-
result : DataFrame
5183+
DataFrame
5184+
Combination of the provided DataFrames.
51695185
51705186
See Also
51715187
--------
@@ -5209,15 +5225,15 @@ def combine(self, other, func, fill_value=None, overwrite=True):
         >>> df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]})
         >>> df2 = pd.DataFrame({'A': [1, 1], 'B': [None, 3]})
         >>> df1.combine(df2, take_smaller, fill_value=-5)
-           A   B
-        0  0  NaN
+           A    B
+        0  0 -5.0
         1  0  3.0
 
         Example that demonstrates the use of `overwrite` and behavior when
         the axis differ between the dataframes.
 
         >>> df1 = pd.DataFrame({'A': [0, 0], 'B': [4, 4]})
-        >>> df2 = pd.DataFrame({'B': [3, 3], 'C': [-10, 1],}, index=[1, 2])
+        >>> df2 = pd.DataFrame({'B': [3, 3], 'C': [-10, 1], }, index=[1, 2])
         >>> df1.combine(df2, take_smaller)
              A    B     C
         0  NaN  NaN   NaN
@@ -5232,7 +5248,7 @@ def combine(self, other, func, fill_value=None, overwrite=True):
 
         Demonstrating the preference of the passed in dataframe.
 
-        >>> df2 = pd.DataFrame({'B': [3, 3], 'C': [1, 1],}, index=[1, 2])
+        >>> df2 = pd.DataFrame({'B': [3, 3], 'C': [1, 1], }, index=[1, 2])
         >>> df2.combine(df1, take_smaller)
              A    B   C
         0  0.0  NaN NaN
@@ -5716,19 +5732,19 @@ def pivot(self, index=None, columns=None, values=None):
 
         This first example aggregates values by taking the sum.
 
-        >>> table = pivot_table(df, values='D', index=['A', 'B'],
+        >>> table = pd.pivot_table(df, values='D', index=['A', 'B'],
         ...                     columns=['C'], aggfunc=np.sum)
         >>> table
         C        large  small
         A   B
-        bar one      4      5
-            two      7      6
-        foo one      4      1
-            two    NaN      6
+        bar one    4.0    5.0
+            two    7.0    6.0
+        foo one    4.0    1.0
+            two    NaN    6.0
 
         We can also fill missing values using the `fill_value` parameter.
 
-        >>> table = pivot_table(df, values='D', index=['A', 'B'],
+        >>> table = pd.pivot_table(df, values='D', index=['A', 'B'],
         ...                     columns=['C'], aggfunc=np.sum, fill_value=0)
         >>> table
         C        large  small
@@ -5740,12 +5756,11 @@ def pivot(self, index=None, columns=None, values=None):
 
         The next example aggregates by taking the mean across multiple columns.
 
-        >>> table = pivot_table(df, values=['D', 'E'], index=['A', 'C'],
+        >>> table = pd.pivot_table(df, values=['D', 'E'], index=['A', 'C'],
         ...                     aggfunc={'D': np.mean,
         ...                              'E': np.mean})
         >>> table
-                          D         E
-                       mean      mean
+                          D         E
         A   C
         bar large  5.500000  7.500000
             small  5.500000  8.500000
@@ -5755,17 +5770,17 @@ def pivot(self, index=None, columns=None, values=None):
         We can also calculate multiple types of aggregations for any given
         value column.
 
-        >>> table = pivot_table(df, values=['D', 'E'], index=['A', 'C'],
+        >>> table = pd.pivot_table(df, values=['D', 'E'], index=['A', 'C'],
         ...                     aggfunc={'D': np.mean,
         ...                              'E': [min, max, np.mean]})
         >>> table
-                          D   E
-                       mean max      mean min
+                          D    E
+                       mean  max      mean  min
         A   C
-        bar large  5.500000   9  7.500000   6
-            small  5.500000   9  8.500000   8
-        foo large  2.000000   5  4.500000   4
-            small  2.333333   6  4.333333   2
+        bar large  5.500000  9.0  7.500000  6.0
+            small  5.500000  9.0  8.500000  8.0
+        foo large  2.000000  5.0  4.500000  4.0
+            small  2.333333  6.0  4.333333  2.0
         """
 
     @Substitution('')
@@ -6903,41 +6918,67 @@ def round(self, decimals=0, *args, **kwargs):
             columns not included in `decimals` will be left as is. Elements
             of `decimals` which are not columns of the input will be
             ignored.
+        *args
+            Additional keywords have no effect but might be accepted for
+            compatibility with numpy.
+        **kwargs
+            Additional keywords have no effect but might be accepted for
+            compatibility with numpy.
 
         Returns
         -------
-        DataFrame
+        DataFrame :
+            A DataFrame with the affected columns rounded to the specified
+            number of decimal places.
 
         See Also
         --------
-        numpy.around
-        Series.round
+        numpy.around : Round a numpy array to the given number of decimals.
+        Series.round : Round a Series to the given number of decimals.
 
         Examples
         --------
-        >>> df = pd.DataFrame(np.random.random([3, 3]),
-        ...                   columns=['A', 'B', 'C'], index=['first', 'second', 'third'])
+        >>> df = pd.DataFrame([(.21, .32), (.01, .67), (.66, .03), (.21, .18)],
+        ...                   columns=['dogs', 'cats'])
         >>> df
-                       A         B         C
-        first   0.028208  0.992815  0.173891
-        second  0.038683  0.645646  0.577595
-        third   0.877076  0.149370  0.491027
-        >>> df.round(2)
-                   A     B     C
-        first   0.03  0.99  0.17
-        second  0.04  0.65  0.58
-        third   0.88  0.15  0.49
-        >>> df.round({'A': 1, 'C': 2})
-                  A         B     C
-        first   0.0  0.992815  0.17
-        second  0.0  0.645646  0.58
-        third   0.9  0.149370  0.49
-        >>> decimals = pd.Series([1, 0, 2], index=['A', 'B', 'C'])
+           dogs  cats
+        0  0.21  0.32
+        1  0.01  0.67
+        2  0.66  0.03
+        3  0.21  0.18
+
+        By providing an integer each column is rounded to the same number
+        of decimal places
+
+        >>> df.round(1)
+           dogs  cats
+        0   0.2   0.3
+        1   0.0   0.7
+        2   0.7   0.0
+        3   0.2   0.2
+
+        With a dict, the number of places for specific columns can be
+        specfified with the column names as key and the number of decimal
+        places as value
+
+        >>> df.round({'dogs': 1, 'cats': 0})
+           dogs  cats
+        0   0.2   0.0
+        1   0.0   1.0
+        2   0.7   0.0
+        3   0.2   0.0
+
+        Using a Series, the number of places for specific columns can be
+        specfified with the column names as index and the number of
+        decimal places as value
+
+        >>> decimals = pd.Series([0, 1], index=['cats', 'dogs'])
         >>> df.round(decimals)
-                  A  B     C
-        first   0.0  1  0.17
-        second  0.0  1  0.58
-        third   0.9  0  0.49
+           dogs  cats
+        0   0.2   0.0
+        1   0.0   1.0
+        2   0.7   0.0
+        3   0.2   0.0
         """
         from pandas.core.reshape.concat import concat
 
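
The `combine` hunk above corrects the documented result of the `fill_value=-5` example from `NaN` to `-5.0`. The sketch below reproduces that call so the corrected value can be confirmed. The `take_smaller` helper is not part of the hunks shown here; its definition is an assumption chosen to match the name used in the docstring, keeping whichever column has the smaller sum.

```python
import pandas as pd


def take_smaller(s1, s2):
    # Assumed helper: keep the column whose values sum to the smaller total.
    return s1 if s1.sum() < s2.sum() else s2


df1 = pd.DataFrame({'A': [0, 0], 'B': [None, 4]})
df2 = pd.DataFrame({'A': [1, 1], 'B': [None, 3]})

# Missing values are filled with -5 before each column pair reaches
# take_smaller, so column B compares [-5, 4] against [-5, 3].
result = df1.combine(df2, take_smaller, fill_value=-5)
print(result)
```

Because the fill happens before `func` is called, the missing entry in `B` comes back as the fill value rather than `NaN`, which is what the corrected docstring output shows.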
pandas/core/series.py

Lines changed: 1 addition & 1 deletion
@@ -1120,7 +1120,7 @@ def repeat(self, repeats, axis=None):
 
         Returns
         -------
-        repeated_series : Series
+        Series
             Newly created Series with repeated elements.
 
         See Also
0 commit comments

Comments
 (0)