Commit 5d8c683

updating python versions
1 parent ebf995a commit 5d8c683

2 files changed: +68 -28 lines changed

examples/stochastic_volatility.py

Lines changed: 20 additions & 18 deletions
@@ -9,11 +9,11 @@
 from pymc.distributions.timeseries import *

 from scipy.sparse import csc_matrix
-from scipy import optimize
+from scipy import optimize

 # <markdowncell>

-# Asset prices have time-varying volatility (variance of day over day `returns`). In some periods, returns are highly vaiable, and in others very stable. Stochastic volatility models model this with a latent volatility variable, modeled as a stochastic process. The following model is similar to the one described in the No-U-Turn Sampler paper, Hoffman (2011) p21.
+# Asset prices have time-varying volatility (variance of day over day `returns`). In some periods, returns are highly variable, while in others very stable. Stochastic volatility models model this with a latent volatility variable, modeled as a stochastic process. The following model is similar to the one described in the No-U-Turn Sampler paper, Hoffman (2011) p21.
 #
 # $$ \sigma \sim Exponential(50) $$
 #
@@ -31,28 +31,27 @@

 # <codecell>

-model = Model()
+# Load 400 returns from the S&P 500.
+n = 400
+returns = np.genfromtxt("data/SP500.csv")[-n:]

 # <markdowncell>

 # Specifying the model in pymc mirrors its statistical specification.
 #
-# However, it is easier to sample the scale of the volatility process innovations, $\sigma $, on a log scale, so we create it using `TransformedVar` and use `logtransform`. `TransformedVar` creates one variable in the transformed space and one in the normal space. The one in the transformed space (here $log(\sigma) $) is the one over which sampling will occur, and the one in the normal space is the one to use throughout the rest of the model.
+# However, it is easier to sample the scale of the volatility process innovations, $\sigma$, on a log scale, so we create it using `TransformedVar` and use `logtransform`. `TransformedVar` creates one variable in the transformed space and one in the normal space. The one in the transformed space (here $\text{log}(\sigma) $) is the one over which sampling will occur, and the one in the normal space is the one to use throughout the rest of the model.
 #
 # It takes a variable name, a distribution and a transformation to use.

 # <codecell>

-n = 400
-returns = np.genfromtxt("data/SP500.csv")[-n:]
-
+model = Model()
 with model:
     sigma, log_sigma = model.TransformedVar('sigma', Exponential(1./.02, testval = .1),
-                                            logtransform)
+                                            logtransform)

     nu = Exponential('nu', 1./10)

-
     s = GaussianRandomWalk('s', sigma**-2, shape = n)

     r = T('r', nu, lam = exp(-2*s), observed = returns)
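To make the priors in this hunk more concrete, here is a minimal NumPy sketch that draws one synthetic return series from them. It is an illustration, not part of the commit. It assumes that `Exponential` takes a rate (so NumPy's `scale` is its reciprocal), that the second argument of `GaussianRandomWalk` is a precision (so the step standard deviation is `sigma`), that the walk starts from zero, and that `lam = exp(-2*s)` is the precision of the Student-t (so its scale is `exp(s)`). The function name `simulate_returns` is hypothetical.

```python
import numpy as np


def simulate_returns(n=400, seed=0):
    """Draw one synthetic series from the priors above (illustrative only)."""
    rng = np.random.default_rng(seed)

    # Exponential(1./.02) is assumed to be a rate; NumPy wants the scale 1/rate.
    sigma = rng.exponential(scale=0.02)
    # Exponential(1./10) for the degrees of freedom, again converted to a scale.
    nu = rng.exponential(scale=10.0)

    # Gaussian random walk: precision sigma**-2 means step sd equal to sigma.
    # Assumption: the walk starts from zero.
    s = np.cumsum(rng.normal(loc=0.0, scale=sigma, size=n))

    # Student-t returns with precision exp(-2*s), i.e. scale exp(s).
    returns = np.exp(s) * rng.standard_t(df=nu, size=n)
    return sigma, nu, s, returns


sigma, nu, s, returns = simulate_returns()
print(s.shape, returns.shape)
```

Simulating from the priors this way is one quick check that the parameterization (rate versus scale, precision versus standard deviation) matches what the model is meant to say.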
@@ -61,11 +60,11 @@

 # ## Fit Model
 #
-# To get a decent scaling matrix for the hamiltonaian sampler, we find the hessian at a point. The method `Model.d2logpc` gives us a Theano compiled function that returns the matrix of 2nd derivatives.
+# To get a decent scaling matrix for the Hamiltonian sampler, we find the Hessian at a point. The method `Model.d2logpc` gives us a `Theano` compiled function that returns the matrix of 2nd derivatives.
 #
-# However, the 2nd derivatives for the degrees of freedom parameter, `nu`, are negative and thus not very informative and make the matrix non-positive definite, so we replace that entry with a reasonable guess at the scale. The interactions between `log_sigma`/`nu` and `s` are also not very useful, so we set them to zero.
+# However, the 2nd derivatives for the degrees of freedom parameter, `nu`, are negative and thus not very informative and make the matrix non-positive definite, so we replace that entry with a reasonable guess at the scale. The interactions between `log_sigma`/`nu` and `s` are also not very useful, so we set them to zero.
 #
-# The hessian matrix is also very sparse, so we make it a sparse matrix for faster sampling.
+# The Hessian matrix is also very sparse, so we make it a sparse matrix for faster sampling.

 # <codecell>

@@ -87,32 +86,35 @@ def hessian(point, nusd):
 # <codecell>

 with model:
-    start = find_MAP(vars = [s], fmin = optimize.fmin_l_bfgs_b)
+    start = find_MAP(vars=[s], fmin = optimize.fmin_l_bfgs_b)

 # <markdowncell>

-# We do a short initial run to get near the right area, then start again using a new hessian at the new starting point to get faster sampling due to better scaling.
+# We do a short initial run to get near the right area, then start again using a new Hessian at the new starting point to get faster sampling due to better scaling.

 # <codecell>

 with model:
     step = HamiltonianMC(model.vars, hessian(start, 6))
     trace = sample(200, step, start, trace = model.vars + [sigma])

+    # Start next run at the last sampled position.
     start2 = trace.point(-1)
     step = HamiltonianMC(model.vars, hessian(start2, 6), path_length = 4.)
-    trace = sample(8000, step, trace = trace)
+    trace = sample(8000, step, trace=trace)

 # <codecell>

 #figsize(12,6)
 title(str(s))
-plot(trace[s][::10].T,'b', alpha = .01);
+plot(trace[s][::10].T,'b', alpha=.01)
+xlabel('time')
+ylabel('volatility')

 #figsize(12,6)
-traceplot(trace, model.vars[:-1]);
+traceplot(trace, model.vars[:-1])

-# <markdowncell>
+# <rawcell>

 # ## References
 #

examples/tutorial.py

Lines changed: 48 additions & 10 deletions
@@ -12,13 +12,24 @@

 # Model
 # -----
-# We consider the following generative model
+# Consider the following true generative model:
+#
+# $$ x_{true} \sim \textrm{Normal}(2,1) $$
+# $$ y_{true} \sim \textrm{Normal}(\textrm{exp}(x_{true}),1)$$
+# $$ z_{data} \sim \textrm{Normal}(x_{true} + y_{true},0.75)$$
+#
+# Where $x_{true}$ is a scalar, $y_{true}$ is a vector of length 2, and $z_{data}$ is a $2\times 20$ matrix.
+#
+# We can simulate this using Numpy:

 # <codecell>

+ndims = 2
+nobs = 20
+
 xtrue = normal(scale = 2., size = 1)
-ytrue = normal(loc = np.exp(xtrue), scale = 1, size = (2,1))
-zdata = normal(loc = xtrue + ytrue, scale = .75, size = (2, 20))
+ytrue = normal(loc = np.exp(xtrue), scale = 1, size = (ndims,1))
+zdata = normal(loc = xtrue + ytrue, scale = .75, size = (ndims, nobs))

 # <markdowncell>

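The shapes in this hunk rely on NumPy broadcasting, which the short check below spells out. It is a sketch that mirrors the added lines but uses a seeded `numpy.random.default_rng` generator (an assumption; the example itself calls a bare `normal`). It follows the code as written, where `xtrue` is drawn with mean 0 and standard deviation 2, which is not the same as the `Normal(2,1)` in the markdown formula.

```python
import numpy as np

rng = np.random.default_rng(0)  # seeded generator, an assumption for repeatability

ndims = 2
nobs = 20

# One draw, shape (1,): plays the role of the scalar x_true.
xtrue = rng.normal(scale=2.0, size=1)

# loc has shape (1,) and broadcasts against size=(ndims, 1): a column vector.
ytrue = rng.normal(loc=np.exp(xtrue), scale=1.0, size=(ndims, 1))

# xtrue + ytrue has shape (ndims, 1); it broadcasts across the nobs columns,
# so every observation in a row shares that row's mean.
zdata = rng.normal(loc=xtrue + ytrue, scale=0.75, size=(ndims, nobs))

print(xtrue.shape, ytrue.shape, (xtrue + ytrue).shape, zdata.shape)
```

Keeping `ytrue` as an `(ndims, 1)` column rather than a flat vector is what lets the same mean repeat along each row of `zdata`.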
@@ -27,7 +38,16 @@
 # <markdowncell>

 # Build Model
-# -----------
+# -----------
+#
+# Now we want to do inference assuming the following model:
+#
+# $$ x \sim \textrm{Normal}(0,1) $$
+# $$ y \sim \textrm{Normal}(\textrm{exp}(x),2)$$
+# $$ z \sim \textrm{Normal}(x + y,0.75)$$
+#
+# The aim here is to get posteriors over $x$ and $y$ given the data we have about $z$ (`zdata`).
+#
 # We create a new `Model` object, and do operations within its context. The `with` statement lets PyMC know this model is the current model of interest.
 #
 # We construct new random variables with the constructor for their prior distribution, such as `Normal`, while within a model context (inside the `with`). When you make a random variable, it is automatically added to the model. The constructor returns a Theano variable.
@@ -38,15 +58,20 @@

 with Model() as model:
     x = Normal('x', mu = 0., tau = 1)
-    y = Normal('y', mu = exp(x), tau = 2.**-2, shape = (2,1))
-
-    z = Normal('z', mu = x + y, tau = .75**-2, observed = zdata)
+    y = Normal('y', mu = exp(x), tau = 2.**-2, shape = (ndims,1)) # here, shape is telling us it's a vector rather than a scalar.
+    z = Normal('z', mu = x + y, tau = .75**-2, observed = zdata) # shape is inferred from zdata
+
+# <markdowncell>
+
+# A parenthetical note on the parameters for the normal. Variance is encoded as `tau`, indicating precision, which is simply inverse variance (so $\tau=\sigma^{-2}$). This is used because the gamma distribution is the conjugate prior for the precision of a normal distribution, so a gamma-distributed precision would have to be inverted to get the variance. Encoding in terms of precision saves the inversion step in cases where the precision is actually modeled using a gamma prior.

 # <markdowncell>

 # Fit Model
 # ---------
-# We need a starting point for our sampling. The `find_MAP` function finds the maximum a posteriori point (MAP), which is often a good choice for starting point. `find_MAP` uses an optimization algorithm to find the local maximum of the log posterior.
+# We need a starting point for our sampling. The `find_MAP` function finds the maximum a posteriori point (MAP), which is often a good choice for starting point. `find_MAP` uses an optimization algorithm (`scipy.optimize.fmin_l_bfgs_b`, or [BFGS](http://en.wikipedia.org/wiki/BFGS_method), by default) to find the local maximum of the log posterior.
+#
+# Note that this `with` construction is used again. Functions like `find_MAP` and `HamiltonianMC` need to have a model in their context. `with` activates the context of a particular model within its block.

 # <codecell>

@@ -59,7 +84,13 @@

 # <codecell>

-start
+print "MAP found:"
+print "x:", start['x']
+print "y:", start['y']
+
+print "Compare with true values:"
+print "ytrue", ytrue
+print "xtrue", xtrue

 # <markdowncell>

@@ -109,7 +140,7 @@

 # <codecell>

-traceplot(trace)
+traceplot(trace);

 # <markdowncell>

@@ -130,3 +161,10 @@
 # * Without a name argument, it simply constructs a distribution object and returns it. It won't construct a random variable. This object has properties like `logp` (density function) and `expectation`.
 # * With a name argument, it constructs a random variable using the distribution object as the prior distribution and inserts this random variable into the current model. Then the constructor returns the random variable.

+# <codecell>
+
+help(model)
+
+# <codecell>
+
+