Description
Code Sample, a copy-pastable example if possible
In [1]: import pandas as pd
In [2]: import numpy as np
In [3]: s = pd.Series([
...: np.nan,
...: 1., np.nan,
...: 2., np.nan, np.nan,
...: 5., np.nan, np.nan, np.nan,
...: -1., np.nan, np.nan
...: ])
In [4]: s.interpolate(method='pad', limit_area='inside')
Out[4]:
0 NaN
1 1.0
2 1.0
3 2.0
4 2.0
5 2.0
6 5.0
7 5.0
8 5.0
9 5.0
10 -1.0
11 -1.0
12 -1.0
dtype: float64
# For method='linear' the `limit_area` kwarg works as expected
In [5]: s.interpolate(method='linear', limit_area='inside')
Out[5]:
0 NaN
1 1.0
2 1.5
3 2.0
4 3.0
5 4.0
6 5.0
7 3.5
8 2.0
9 0.5
10 -1.0
11 NaN
12 NaN
dtype: float64
Problem description
The kwargs limit_area and limit_direction for interpolate(), introduced in #16513, do not seem to have any effect when using the method pad. They work with other interpolation methods, e.g. linear.
There are two different pathways for interpolate depending on the selected method.
pandas/pandas/core/internals/blocks.py
Lines 1096 to 1116 in b8ad9da
    if m is not None:
        r = check_int_bool(self, inplace)
        if r is not None:
            return r
        return self._interpolate_with_fill(method=m, axis=axis,
                                           inplace=inplace, limit=limit,
                                           fill_value=fill_value,
                                           coerce=coerce,
                                           downcast=downcast)
    # validate the interp method
    m = missing.clean_interp_method(method, **kwargs)
    r = check_int_bool(self, inplace)
    if r is not None:
        return r
    return self._interpolate(method=m, index=index, values=values,
                             axis=axis, limit=limit,
                             limit_direction=limit_direction,
                             limit_area=limit_area,
                             fill_value=fill_value, inplace=inplace,
                             downcast=downcast, **kwargs)
For pad, ffill and bfill, _interpolate_with_fill() is used. It calls missing.interpolate_2d(), which does not seem to recognize the keywords limit_direction and limit_area, so they are silently ignored.
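To make the dispatch problem concrete, here is a toy model of the two pathways. It is only a sketch: `fill_path`, `interp_path` and `interpolate` are hypothetical stand-ins written for illustration, not the pandas internals, and the set of fill methods is limited to the three named above.

```python
# Toy model of the two code paths (hypothetical names, not pandas internals).
FILL_METHODS = {"pad", "ffill", "bfill"}


def fill_path(values, method, limit=None):
    """Stand-in for the fill-based pathway: it has no limit_area parameter."""
    return {"path": "fill", "method": method, "limit": limit}


def interp_path(values, method, limit=None,
                limit_direction="forward", limit_area=None):
    """Stand-in for the interpolation pathway: it receives both kwargs."""
    return {"path": "interp", "method": method, "limit": limit,
            "limit_direction": limit_direction, "limit_area": limit_area}


def interpolate(values, method="linear", limit=None,
                limit_direction="forward", limit_area=None):
    if method in FILL_METHODS:
        # limit_direction and limit_area are accepted by the caller-facing
        # signature but never forwarded, so they are dropped without an error.
        return fill_path(values, method, limit=limit)
    return interp_path(values, method, limit=limit,
                       limit_direction=limit_direction,
                       limit_area=limit_area)


with_kwarg = interpolate([], method="pad", limit_area="inside")
without_kwarg = interpolate([], method="pad")
print(with_kwarg == without_kwarg)
```

In this model the two calls are indistinguishable for `pad`, while for `linear` the value of `limit_area` reaches the worker function. That mirrors the behavior reported in the session above.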
I might be able to fix that during #25141, but maybe a fix needs more fundamental changes.
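Until this is fixed, a possible workaround is to forward-fill and then mask everything outside the span of valid values. The helper below is a minimal sketch under my own assumptions: `pad_inside` is a hypothetical name, and the result reflects my reading of what limit_area='inside' should mean for pad (NaNs are filled only when they are surrounded by valid values). It is not a confirmed specification.

```python
import numpy as np
import pandas as pd


def pad_inside(s: pd.Series) -> pd.Series:
    """Forward-fill NaNs only between the first and last valid value.

    Hypothetical helper approximating method='pad' with limit_area='inside'.
    """
    valid_pos = np.flatnonzero(s.notna().to_numpy())
    if valid_pos.size == 0:
        # Nothing to fill from: return an unchanged copy.
        return s.copy()
    # Boolean mask that is True from the first to the last valid position.
    inside = np.zeros(len(s), dtype=bool)
    inside[valid_pos[0]:valid_pos[-1] + 1] = True
    # Forward-fill everywhere, then blank out positions outside the mask.
    return s.ffill().where(inside)


s = pd.Series([
    np.nan,
    1., np.nan,
    2., np.nan, np.nan,
    5., np.nan, np.nan, np.nan,
    -1., np.nan, np.nan,
])
print(pad_inside(s))
```

With the series from the code sample, the leading NaN and the two trailing NaNs stay NaN, and the gaps in between are filled with the preceding valid value.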
Expected Output
Output of pd.show_versions()
INSTALLED VERSIONS
commit: eaacefd
python: 3.7.2.final.0
python-bits: 64
OS: Darwin
OS-release: 16.7.0
machine: x86_64
processor: i386
byteorder: little
LC_ALL: None
LANG: None
LOCALE: de_DE.UTF-8
pandas: 0.25.0.dev0+725.geaacefd09
pytest: 4.1.1
pip: 18.1
setuptools: 40.6.3
Cython: 0.29.2
numpy: 1.15.4
scipy: 1.2.0
pyarrow: 0.11.1
xarray: 0.11.0
IPython: 7.2.0
sphinx: 1.8.2
patsy: 0.5.1
dateutil: 2.7.5
pytz: 2018.9
blosc: None
bottleneck: 1.2.1
tables: 3.4.4
numexpr: 2.6.9
feather: None
matplotlib: 3.0.2
openpyxl: 2.5.12
xlrd: 1.2.0
xlwt: 1.3.0
xlsxwriter: 1.1.2
lxml.etree: 4.3.0
bs4: 4.7.1
html5lib: 1.0.1
sqlalchemy: 1.2.16
pymysql: None
psycopg2: None
jinja2: 2.10
s3fs: 0.2.0
fastparquet: 0.2.1
pandas_gbq: None
pandas_datareader: None
gcsfs: None