21 changes: 21 additions & 0 deletions doc/source/api.rst
@@ -81,6 +81,27 @@ Standard moving window functions
rolling_apply
rolling_quantile

Standard expanding window functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

.. currentmodule:: pandas.stats.moments

.. autosummary::
   :toctree: generated/

   expanding_count
   expanding_sum
   expanding_mean
   expanding_median
   expanding_var
   expanding_std
   expanding_corr
   expanding_cov
   expanding_skew
   expanding_kurt
   expanding_apply
   expanding_quantile

Exponentially-weighted moving window functions
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

75 changes: 74 additions & 1 deletion doc/source/computation.rst
@@ -192,7 +192,7 @@ accept the following arguments:
- ``window``: size of moving window
- ``min_periods``: threshold of non-null data points to require (otherwise
result is NA)
- ``freq``: optionally specify a :ref: `frequency string <timeseries.alias>`
- ``freq``: optionally specify a :ref:`frequency string <timeseries.alias>`
or :ref:`DateOffset <timeseries.offsets>` to pre-conform the data to.
Note that prior to pandas v0.8.0, a keyword argument ``time_rule`` was used
instead of ``freq`` that referred to the legacy time rule constants
@@ -288,6 +288,79 @@ columns using ``ix`` indexing:
@savefig rolling_corr_pairwise_ex.png width=4.5in
correls.ix[:, 'A', 'C'].plot()

Expanding window moment functions
---------------------------------
A common alternative to rolling statistics is to use an *expanding* window,
which yields the value of the statistic with all the data available up to that
point in time. As these calculations are a special case of rolling statistics,
they are implemented in pandas such that the following two calls are equivalent:

.. ipython:: python

   rolling_mean(df, window=len(df), min_periods=1)[:5]

   expanding_mean(df)[:5]
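
To make the equivalence concrete, here is a minimal pure-Python sketch of the two computations on a plain list. The helper names ``rolling_mean_sketch`` and ``expanding_mean_sketch`` are hypothetical and are not part of pandas; they only illustrate the semantics described above, not the library's implementation.

```python
import math


def rolling_mean_sketch(values, window, min_periods):
    """Mean of the trailing `window` values (hypothetical helper, not pandas).

    Emits NaN until at least `min_periods` non-NaN values are in the window.
    """
    out = []
    for i in range(len(values)):
        start = max(0, i - window + 1)
        valid = [v for v in values[start:i + 1] if not math.isnan(v)]
        enough = len(valid) >= max(min_periods, 1)
        out.append(sum(valid) / len(valid) if enough else math.nan)
    return out


def expanding_mean_sketch(values, min_periods=1):
    """Mean of every non-NaN value seen so far (hypothetical helper)."""
    out, total, count = [], 0.0, 0
    for v in values:
        if not math.isnan(v):
            total += v
            count += 1
        enough = count >= max(min_periods, 1)
        out.append(total / count if enough else math.nan)
    return out


data = [1.0, 2.0, 6.0, 3.0]

# A rolling window as long as the data never drops an observation,
# so it matches the expanding result at every position.
full_window = rolling_mean_sketch(data, window=len(data), min_periods=1)
expanding = expanding_mean_sketch(data)

print(full_window)  # [1.0, 1.5, 3.0, 3.0]
print(expanding)    # [1.0, 1.5, 3.0, 3.0]
```

The expanding version keeps a running total rather than re-summing a window, which is one reason to treat it as its own special case.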

Like the ``rolling_`` functions, the following functions are available in the
``pandas`` namespace and in ``pandas.stats.moments``.

.. csv-table::
   :header: "Function", "Description"
   :widths: 20, 80

   ``expanding_count``, Number of non-null observations
   ``expanding_sum``, Sum of values
   ``expanding_mean``, Mean of values
   ``expanding_median``, Arithmetic median of values
   ``expanding_min``, Minimum
   ``expanding_max``, Maximum
   ``expanding_std``, Unbiased standard deviation
   ``expanding_var``, Unbiased variance
   ``expanding_skew``, Unbiased skewness (3rd moment)
   ``expanding_kurt``, Unbiased kurtosis (4th moment)
   ``expanding_quantile``, Sample quantile (value at %)
   ``expanding_apply``, Generic apply
   ``expanding_cov``, Unbiased covariance (binary)
   ``expanding_corr``, Correlation (binary)
   ``expanding_corr_pairwise``, Pairwise correlation of DataFrame columns

Aside from not having a ``window`` parameter, these functions have the same
interfaces as their ``rolling_`` counterparts. As above, the parameters they
all accept are:

- ``min_periods``: threshold of non-null data points to require. Defaults to
  the minimum needed to compute the statistic. No ``NaN`` values will be output
  once ``min_periods`` non-null data points have been seen.
- ``freq``: optionally specify a :ref:`frequency string <timeseries.alias>`
  or :ref:`DateOffset <timeseries.offsets>` to pre-conform the data to.
  Note that prior to pandas v0.8.0, a keyword argument ``time_rule``, which
  referred to the legacy time rule constants, was used instead of ``freq``.

.. note::

   The ``rolling_`` and ``expanding_`` functions do not return ``NaN`` if
   there are at least ``min_periods`` non-null values in the current window.
   This differs from ``cumsum``, ``cumprod``, ``cummax``, and ``cummin``,
   which return ``NaN`` in the output wherever a ``NaN`` is encountered in
   the input.
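
The difference in ``NaN`` handling can be sketched in pure Python. The helpers ``expanding_sum_sketch`` and ``cumsum_sketch`` below are hypothetical names that model the behaviour described in the note; they are not the pandas implementations. The sketch assumes the cumulative sum keeps accumulating after a ``NaN`` while still emitting ``NaN`` at that position.

```python
import math


def expanding_sum_sketch(values, min_periods=1):
    """Running sum that skips NaN inputs (hypothetical helper, not pandas).

    Emits NaN only while fewer than `min_periods` non-NaN values were seen.
    """
    out, total, count = [], 0.0, 0
    for v in values:
        if not math.isnan(v):
            total += v
            count += 1
        out.append(total if count >= min_periods else math.nan)
    return out


def cumsum_sketch(values):
    """Running sum that emits NaN wherever the input is NaN (assumed model)."""
    out, total = [], 0.0
    for v in values:
        if math.isnan(v):
            out.append(math.nan)
        else:
            total += v
            out.append(total)
    return out


data = [1.0, math.nan, 2.0, 4.0]

exp_result = expanding_sum_sketch(data, min_periods=1)
cum_result = cumsum_sketch(data)
strict = expanding_sum_sketch(data, min_periods=2)

print(exp_result)  # [1.0, 1.0, 3.0, 7.0]
print(cum_result)  # [1.0, nan, 3.0, 7.0]
print(strict)      # [nan, nan, 3.0, 7.0]
```

With ``min_periods=2`` the output stays ``NaN`` until the second non-null value arrives and is fully populated afterwards, which matches the description of ``min_periods`` given above.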

An expanding window statistic will be more stable (and less responsive) than
its rolling window counterpart, because the increasing window size decreases
the relative impact of an individual data point. As an example, here is the
``expanding_mean`` output for the previous time series dataset:

.. ipython:: python
   :suppress:

   plt.close('all')

.. ipython:: python

   ts.plot(style='k--')

   @savefig expanding_mean_frame.png width=4.5in
   expanding_mean(ts).plot(style='k')
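
The stability claim can also be checked numerically. The following pure-Python sketch uses hypothetical helpers and made-up data, and is independent of pandas. It feeds a flat series with a single spike through an expanding mean and a rolling mean of window 4; the rolling helper averages over partial windows at the start, which is an assumption of this sketch.

```python
def expanding_mean_sketch(values):
    """Mean of all values seen so far (hypothetical helper, not pandas)."""
    out, total = [], 0.0
    for n, v in enumerate(values, start=1):
        total += v
        out.append(total / n)
    return out


def rolling_mean_sketch(values, window):
    """Mean of the trailing `window` values; partial windows at the start."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1):i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Illustrative data: flat at 1.0 with one spike of 11.0 at position 8.
data = [1.0] * 8 + [11.0] + [1.0] * 5

exp_mean = expanding_mean_sketch(data)
roll_mean = rolling_mean_sketch(data, window=4)

# Size of the jump at the spike: diluted over 9 points vs. over 4 points.
exp_jump = exp_mean[8] - exp_mean[7]
roll_jump = roll_mean[8] - roll_mean[7]

print(exp_jump < roll_jump)        # True
print(roll_mean[-1], exp_mean[-1])
```

The expanding mean moves less when the spike arrives, but it never forgets it, whereas the rolling mean returns to the baseline once the spike leaves the window.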

Exponentially weighted moment functions
---------------------------------------
