
Example with two datasets

João Camacho edited this page Oct 11, 2021 · 2 revisions

The gpyrn package was made with the objective of analyzing radial velocity data together with activity indicators. As such, an important example to have is one where we perform an analysis of two datasets.


Again using ipython, we start by importing everything we will need

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import AutoMinorLocator
from gpyrn import meanfield, covfunc, meanfunc

The main difference now is that, instead of one dataset, we will create two

time = np.linspace(0, 100, 25)
y1 = 20*np.sin(2*np.pi*time / 31)
y1err = np.random.rand(25)
y2 = 25*np.sin(2*np.pi*time / 31 + 0.5*np.pi)
y2err = np.random.rand(25)

These are just two simple sine waves out of phase; the plot should look something like
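If you want to reproduce that figure yourself, a minimal sketch with matplotlib could look like this (the two-panel layout and styling here are just one choice, not the exact figure from the page):

```python
import numpy as np
import matplotlib.pyplot as plt

# Recreate the two datasets defined above
time = np.linspace(0, 100, 25)
y1 = 20*np.sin(2*np.pi*time / 31)
y1err = np.random.rand(25)
y2 = 25*np.sin(2*np.pi*time / 31 + 0.5*np.pi)
y2err = np.random.rand(25)

# One panel per dataset, sharing the time axis
fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
ax1.errorbar(time, y1, y1err, fmt='.k')
ax1.set_ylabel('y1')
ax2.errorbar(time, y2, y2err, fmt='.k')
ax2.set_ylabel('y2')
ax2.set_xlabel('time')
# plt.show() to display the figure
```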


Having them, we now create our gprn. Things get a bit trickier now that we have two datasets. Generalizing to n datasets, the order in which you pass them to the gprn is time, y1, y1err, y2, y2err, ..., yn, ynerr. In our case we do

gprn = meanfield.inference(1, time, y1, y1err, y2, y2err)

Now we need to define the nodes and weights accordingly. We will keep using one node, but the number of weights is always #nodes × #datasets. That means we will now have two weights. We will also have two mean functions and two jitters (one for each dataset).
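The counting rule above is easy to sanity-check in plain Python, without touching gpyrn at all (the helper function below is just for illustration, it is not part of the library):

```python
def gprn_component_counts(n_nodes, n_datasets):
    """How many components a GPRN with the given sizes needs."""
    return {
        'nodes': n_nodes,
        'weights': n_nodes * n_datasets,  # one weight per node-dataset pair
        'means': n_datasets,              # one mean function per dataset
        'jitters': n_datasets,            # one jitter term per dataset
    }

# Our example: 1 node, 2 datasets -> 2 weights, 2 means, 2 jitters
counts = gprn_component_counts(1, 2)
```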

nodes = [covfunc.Periodic(5, 31, 0.5)]
weight = [covfunc.SquaredExponential(5, 5), covfunc.SquaredExponential(10, 10)]
means = [meanfunc.Constant(0), meanfunc.Constant(0)]
jitter = [0.5, 0.5]

The order in which you write the weights, means and jitters is important! For this example (1 node only), the first weight connects to the first dataset and the second weight connects to the second dataset. The same goes for the means and jitters.

If you had a third dataset you would need a third weight connecting to it. Everything is connected like
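To make that ordering concrete, here is a sketch of how the arguments would grow for a hypothetical third dataset (y3 and y3err below are placeholder arrays for illustration, not part of this example):

```python
import numpy as np

time = np.linspace(0, 100, 25)
# Placeholder datasets, for illustration only
y1, y1err = np.zeros(25), np.ones(25)
y2, y2err = np.zeros(25), np.ones(25)
y3, y3err = np.zeros(25), np.ones(25)

# Arguments always follow the pattern time, y1, y1err, ..., yn, ynerr,
# so 3 datasets give 1 + 2*3 = 7 positional data arguments:
args = (time, y1, y1err, y2, y2err, y3, y3err)

# With 1 node and 3 datasets you would then need 3 weights,
# 3 means and 3 jitters, and would build the gprn as:
# gprn = meanfield.inference(1, *args)
```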


Now that everything is set, let us calculate the ELBO

elbo, m, v = gprn.ELBOcalc(nodes, weight, means, jitter, iterations=5000, mu='init', var='init')
print('ELBO =', elbo)

You should get something like (the exact value depends on the randomly generated error bars)

ELBO = -350.7112788794546

You can, of course, also calculate its predictive mean

tstar = np.linspace(time.min(), time.max(), 1000)
a, _, _ = gprn.Prediction(nodes, weight, means, jitter, tstar, m, v)

And then you could plot it. However, I want to see how the node and weights influence the predictive. For that you use the option separate=True

aa, _, _, bb = gprn.Prediction(nodes, weight, means, jitter, tstar, m, v, separate=True)

It now returns the predictive mean, which I called 'aa', and the predictive node and weights, which I called 'bb'. Plotting it all together

fig = plt.figure(constrained_layout=True, figsize=(7, 7))
axs = fig.subplot_mosaic([['predictive 1', 'weight 1'],
                         ['predictive 1', 'node'],
                         ['predictive 2', 'node'],
                         ['predictive 2', 'weight 2'],],)

axs['predictive 1'].set(xlabel='', ylabel='y1')
axs['predictive 1'].errorbar(time, y1, y1err, fmt='.k')
axs['predictive 1'].plot(tstar, a[0].T, '-r')
axs['predictive 1'].xaxis.set_minor_locator(AutoMinorLocator(5))
axs['predictive 1'].yaxis.set_minor_locator(AutoMinorLocator(5))
axs['predictive 1'].grid(which='major', alpha=0.5)
axs['predictive 1'].grid(which='minor', alpha=0.2)
axs['predictive 2'].set(xlabel='', ylabel='y2')
axs['predictive 2'].errorbar(time, y2, y2err, fmt='.k')
axs['predictive 2'].plot(tstar, a[1].T, '-r')
axs['predictive 2'].xaxis.set_minor_locator(AutoMinorLocator(5))
axs['predictive 2'].yaxis.set_minor_locator(AutoMinorLocator(5))
axs['predictive 2'].grid(which='major', alpha=0.5)
axs['predictive 2'].grid(which='minor', alpha=0.2)

axs['weight 1'].set(xlabel='', ylabel='y1 weight')
axs['weight 1'].plot(tstar, bb[1][0].T, '-b')
axs['node'].set(xlabel='', ylabel='Node')
axs['node'].plot(tstar, bb[0].T, '-b')
axs['weight 2'].set(xlabel='', ylabel='y2 weight')
axs['weight 2'].plot(tstar, bb[1][1].T, '-b')

You should get something like this

It is not a perfect fit, but considering I chose the node and weight parameters at random, it is not that bad.


