[WIP] Ch 7 #42

Open
canyon289 wants to merge 4 commits into aloctavodia:master from canyon289:ch_7

@canyon289
Collaborator

I have questions on exercises 2 and 4 in the notebook. The rest are up for review.

@AlexAndorra

Thank you Ravin! Here are my thoughts, from what I understood:

  • Exercise 2: Yes, I think the generated values are the range of Y values from the multivariate normal. The goal of the exercise is probably to show that the range of Y values increases when increasing the number of samples obtained from the GP prior, because the variation in the samples mechanically goes up (which is the case here: the range approx. doubles from the book's example)

  • Exercise 4: I think there is a typo in the book's question, and I guess the intended question is: "Re-run the model model_reg and get new plots but using as test_points X_new = np.linspace(np.floor(x.min()), 20, 100)[:,None]". In other words, get posterior predictive samples from the fitted model (and plot them), but instead of stopping at the data's maximum, compute them up to x=20, where we don't have any observed data:

X_new = np.linspace(np.floor(x.min()), 20, 100)[:,None]

with model_reg:
    # conditional distribution evaluated over new input locations
    f_pred = gp.conditional("f_pred", X_new)
    # samples from the posterior predictive distribution evaluated at the X_new values
    pred_samples = pm.sample_posterior_predictive(trace_reg, vars=[f_pred])

_, ax = plt.subplots(figsize=(12,5))
ax.plot(X_new, pred_samples["f_pred"].T, "C1-", alpha=0.3)
ax.plot(X, y, "ko");

The resulting plot gives us an estimate of the uncertainty in our model, which depends on both the data and the model's specification.

  • Exercises 7 and 8: LGTM. Just one question: what is the purpose of the ϵ parameter? Because I don't see it used anywhere in the model.
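To make the Exercise 4 point about uncertainty outside the observed range more concrete, here is a minimal sketch in plain NumPy. It is not the book's PyMC3 model_reg: the data (x_obs, y_obs), the fixed length scale ell, and the noise level noise_sd are all hypothetical values chosen for illustration, whereas the book's model infers its hyperparameters. It only shows how the standard GP conditional formulas give a posterior standard deviation that stays small near the data and returns to the prior value far from it.

```python
import numpy as np


def exp_quad_kernel(a, b, ell=1.0):
    """Exponentiated quadratic kernel between 1D arrays a and b."""
    return np.exp(-((a[:, None] - b[None, :]) ** 2) / (2 * ell**2))


rng = np.random.default_rng(1)

# Hypothetical observations on [0, 10]; not the data used in the book.
x_obs = np.linspace(0, 10, 15)
noise_sd = 0.1  # assumed observation noise
y_obs = np.sin(x_obs) + rng.normal(0, noise_sd, size=x_obs.size)

# Same construction as in the exercise, extending to x = 20.
x_new = np.linspace(np.floor(x_obs.min()), 20, 100)

K = exp_quad_kernel(x_obs, x_obs) + noise_sd**2 * np.eye(x_obs.size)
K_s = exp_quad_kernel(x_obs, x_new)   # shape (15, 100)
K_ss = exp_quad_kernel(x_new, x_new)  # shape (100, 100)

# Conditional (posterior) mean and covariance of f at x_new.
post_mean = K_s.T @ np.linalg.solve(K, y_obs)
post_cov = K_ss - K_s.T @ np.linalg.solve(K, K_s)
post_sd = np.sqrt(np.clip(np.diag(post_cov), 0, None))

inside = post_sd[x_new <= x_obs.max()]      # region covered by data
outside = post_sd[x_new > x_obs.max() + 3]  # several length scales away

print("max posterior sd inside the data range:", inside.max())
print("min posterior sd far outside the data: ", outside.min())
```

With these assumed settings the posterior standard deviation far beyond the last observation is close to the prior standard deviation of 1, which is the widening of the predictive samples one would expect to see in the plot when going up to x=20.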

Everything else LGTM! Thank you for this hard and useful work 👏
PyMCheers!

@aloctavodia
Owner

Adding a few comments to @AlexAndorra's answers:

Exercise 2: The range does not really increase with the number of samples; instead, the real range becomes more evident. The allowed values always come from the same multivariate Gaussian, but this is not easy to see when the number of samples is low. I would recommend using semitransparent lines and the same color for all the lines.
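A small sketch of this point, assuming a zero-mean GP prior with an exponentiated quadratic kernel and a length scale of 1 (these settings and all names below are illustrative, not taken from the notebook). Every draw comes from the same multivariate Gaussian; only the number of draws inspected changes.

```python
import numpy as np


def exp_quad_kernel(x, x_prime, ell=1.0):
    """Exponentiated quadratic kernel between 1D arrays x and x_prime."""
    return np.exp(-((x[:, None] - x_prime[None, :]) ** 2) / (2 * ell**2))


rng = np.random.default_rng(0)

test_points = np.linspace(0, 10, 200)
cov = exp_quad_kernel(test_points, test_points)
cov += 1e-6 * np.eye(test_points.size)  # small jitter for numerical stability

# One fixed distribution: 200 draws from the same multivariate Gaussian.
draws = rng.multivariate_normal(np.zeros(test_points.size), cov, size=200)

# Observed range when looking at only the first n of those draws.
observed_range = {}
for n in (2, 20, 200):
    subset = draws[:n]
    observed_range[n] = subset.max() - subset.min()
    print(f"{n:>3} draws: observed range = {observed_range[n]:.2f}")

# The marginal standard deviation is the same at every test point,
# regardless of how many draws are plotted.
print("marginal sd:", np.sqrt(np.diag(cov)).round(3).min())
```

The observed range can only grow as more draws are included, while the covariance that generated them never changes. To see this visually, the draws could be plotted with something like plt.plot(test_points, draws[:n].T, "C0", alpha=0.1), using a single color and semitransparent lines as suggested.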

Exercise 4: Alex's interpretation is right, sorry about the typo. Even knowing about it, it was really hard for me to find it! This is why proofreading by "others" is so important when writing something.

One more comment:

Exercise 7: notice that for this example it is not possible to compute a good linear boundary.

@AlexAndorra

Oh ok! Thanks for correcting my misunderstanding Osvaldo 😃

@canyon289
Collaborator Author

I added Exercise 3. If someone has time, it could use a double check to make sure it's correct!

@aloctavodia
Owner

Exercise 3 is OK!

3 participants