Improvements on multidimensional notebook #1876

daniel-saunders-phil · 2025-08-06T22:56:41Z

The current multi-dimensional notebook has a couple of issues:

It has not been hierarchical since "Updating MMM Budget allocation examples, functionalities and dependencies" (#1849), but the text still discusses hierarchies.
The data is copied across hierarchies, which causes difficult-to-resolve posterior correlations.
There is an unnecessary sigma_sigma parameter.

This PR addresses those issues and offers a number of other improvements:

Adds a new data set specific to the multi-dimensional class. The synthetic data was generated by the same MMM, which ensures at least decent performance.
Demos how to build a non-centered, positive-valued random variable. This is a useful trick for hierarchical MMMs because the channels are usually only weakly informative, which usually calls for a non-centered parameterization.
Removes the changepoint trend stuff. Change points are mixtures, and mixtures can be quite hard to estimate. So I think it might be smarter to keep this notebook narrow and focus on hierarchies and dims.
Adds a parameter recovery section to the notebook.
Adds a script to reproduce the synthetic data whenever we want and to save off the true parameter values for use in the recovery section.
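
The thread does not show how the notebook builds its non-centered positive variable, so the following is only a minimal sketch of one common construction: draw standard-normal offsets and push a shifted, scaled copy through `exp`, which keeps every group-level value positive. The function name, `mu_log`, `sigma_log`, and the numeric values are assumptions made for illustration, and plain Python stands in for a PyMC model.

```python
import math
import random


def noncentered_positive(mu_log, sigma_log, n_groups, rng):
    """Return positive group-level values built from standard-normal offsets.

    Non-centered form: the sampler-facing quantity is z ~ Normal(0, 1), and the
    positive parameter is the deterministic transform exp(mu_log + sigma_log * z).
    """
    z = [rng.gauss(0.0, 1.0) for _ in range(n_groups)]  # standard-normal offsets
    return [math.exp(mu_log + sigma_log * z_i) for z_i in z]


# Hypothetical values: four geographies sharing one location and one scale.
rng = random.Random(0)
betas = noncentered_positive(mu_log=-1.0, sigma_log=0.5, n_groups=4, rng=rng)
```

When `sigma_log` shrinks toward zero, every group collapses onto `exp(mu_log)`, while the offsets `z` keep the same unit scale. That separation of scale from offsets is the reason this form tends to sample more easily when the data say little about each group.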

It's not finished yet because I'm still getting some pesky divergences. The cause is a bit unclear atm.

📚 Documentation preview 📚: https://pymc-marketing--1876.org.readthedocs.build/en/1876/

…book, all the hierarchical parameters were removed in favour of fully pooled parameters over geographies. However the text still discusses hierarchies so I'm restoring them.

…between partial, full and no pooling models. We now have examples of all three so it's a bit more of a friendly introduction.

…native prior_predictive method, I think it's better to call pymc's prior predictive method. The advantage is that y_original_scale is easily recovered, so it feels more consistent with the way we fetch y_original_scale later in the notebook. It is also confusing to go through the work of adding y_original_scale a few cells above and then make people fetch the scales all over again.

…example data exactly duplicated channels for each geography, which would cause intense posterior correlations. New example data is generated from the multi-dimensional mmm. Also removed the change points - those are mixtures and are inherently difficult to fit. I think we should keep this notebook more narrowly focused on dimensions and hierarchies.

…on because we'd need a zero sum normal and the prior class isn't designed to interact with zero sum normals very well (I would like to be able to add two Prior classes together so we can have a separate mean.)
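
The difference between no, full and partial pooling mentioned above can be illustrated with group means. This is a rough sketch rather than the notebook's model: it uses the textbook normal-normal shrinkage weights, treats the within-group standard deviation `sigma` and the between-group standard deviation `tau` as known, and plugs in the grand mean for the population mean. The function name and the example data are hypothetical.

```python
from statistics import fmean


def pooled_estimates(groups, sigma=1.0, tau=1.0):
    """Return (no_pooling, full_pooling, partial_pooling) estimates of group means."""
    # No pooling: each geography is estimated from its own data only.
    no_pool = {g: fmean(y) for g, y in groups.items()}
    # Full pooling: one shared estimate that ignores group membership.
    full_pool = fmean([v for y in groups.values() for v in y])
    # Partial pooling: precision-weighted compromise between the two.
    partial = {}
    for g, y in groups.items():
        w_data = len(y) / sigma**2   # precision contributed by the group's data
        w_prior = 1.0 / tau**2       # precision contributed by the population
        partial[g] = (w_data * no_pool[g] + w_prior * full_pool) / (w_data + w_prior)
    return no_pool, full_pool, partial


groups = {"geo_a": [1.0, 2.0, 3.0], "geo_b": [10.0]}
no_pool, full_pool, partial = pooled_estimates(groups)
```

Groups with few observations (here `geo_b`) are pulled more strongly toward the shared estimate than groups with many. In a full Bayesian model the population mean and `tau` would be estimated rather than fixed.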

review-notebook-app · 2025-08-06T22:56:45Z

Check out this pull request on ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.

Powered by ReviewNB

daniel-s-tccc · 2025-08-08T19:51:24Z

Alright, I got the model to converge again. The optimizer outputs are a bit messed up though. I'll have to figure that out next.

daniel-saunders-phil · 2025-08-08T22:00:26Z

@cetagostini Hey mate, if you have a chance to review, that would be much appreciated.

I got a bit cavalier with ripping out features that were making it unduly difficult to fit the model so open to discussing those choices. But hey, it fits in 70 seconds now 😃

Also, I couldn't find an idata compression small enough, and the idata is no longer synced with the yaml file. Lemme know if that's a blocker and I'll try again.

review-notebook-app · 2025-08-09T07:13:37Z

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2025-08-09T07:13:37Z
----------------------------------------------------------------

Line #12.        extend_idata=True,

Shall we use nutpie?

cetagostini

I like the notebook a lot. Thanks for all the help here, mate; this is an amazing contribution that makes the notebook look better. The small details we can log as issues and solve soon.

With respect to nutpie, I'd say it's optional, and up to you if you want to change it 🙌🏻

Amazing @daniel-saunders-phil 🔥

juanitorduz · 2025-08-10T20:20:12Z

@daniel-saunders-phil @cetagostini I suggest we merge #1832 and then you can rebase and merge this one? @daniel-s-tccc do we also need to update the other notebooks that rely on this example (i.e. budget allocation)?

cetagostini · 2025-08-10T20:21:29Z

@juanitorduz you are right; that should only require re-running the budget allocation and allocation assessment notebooks.

cetagostini

Let's update the notebooks and continue, but the work has been amazing!

daniel-s-tccc · 2025-08-11T21:18:44Z

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2025-08-09T07:13:37Z
----------------------------------------------------------------

Line #12.        extend_idata=True,

Shall we use nutpie?

Nutpie (numba) is broken on the MMM class right now. The error is pretty obscure, so I haven't had time to figure out why. I would prefer to use the default sampler because I don't want to point people toward nutpie until we resolve that.

codecov · 2025-08-11T21:57:45Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 52.47%. Comparing base (bca55e4) to head (9cc75f6).
⚠️ Report is 1 commits behind head on main.

❗ There is a different number of reports uploaded between BASE (bca55e4) and HEAD (9cc75f6).

HEAD has 4 uploads less than BASE

Flag | BASE (bca55e4) | HEAD (9cc75f6)
     | 23             | 19

Additional details and impacted files

@@             Coverage Diff             @@
##             main    #1876       +/-   ##
===========================================
- Coverage   91.86%   52.47%   -39.39%     
===========================================
  Files          64       64               
  Lines        7546     7546               
===========================================
- Hits         6932     3960     -2972     
- Misses        614     3586     +2972

☔ View full report in Codecov by Sentry.

daniel-s-tccc · 2025-08-11T22:08:35Z

I'm working through the rebase and it's pretty time-consuming. @carlosagostini @juanitorduz

The trouble is that everything is so tightly coupled - the text in the optimizer notebook calls out specific behaviours that the old model exhibited, or relies on features of the old dataset. Plus the diffs on notebooks are challenging to parse, so I cannot quite tell what I need to keep track of and what is new in Carlos' work.

What do we think about decoupling the multi-dimensional mmm notebook from the optimizer notebooks? It would be easy to do - we just don't save the result of the multi-dimensional notebook on each run. Then we can use the old idata, old data, and old yaml to run the optimizer notebooks, and their results would be preserved. It would help make the notebook suite a bit more modular - you can improve one notebook without having to rewrite all the others.

cetagostini · 2025-08-12T08:33:09Z

@daniel-s-tccc I definitely see the benefit of separating the notebooks. At the same time, I'd be worried that if they are too detached we could introduce issues. Ideally, our models in production can use all notebook functions.

I'll jump into your branch, take a look at your issues, and try to give you a hand!

review-notebook-app · 2025-08-12T08:53:25Z

View / edit / reply to this conversation on ReviewNB

cetagostini commented on 2025-08-12T08:53:24Z
----------------------------------------------------------------

With respect to the previous one, the channel estimates look quite similar. From a storytelling perspective, I'm not sure about the new model: both channels are massively uncertain, and we don't have clear winners. The previous model definitely showed different adstocks and a preference for x1.

Could we set better params for x1? And perhaps take signal away from x2, so its values sit only in the lower range, bringing uncertainty to it?

daniel-saunders-phil · 2025-08-14T00:50:21Z

Hey hey, I saw the notebook today and we are on track!

I think, if we add:

Parameter recovery

Recommendations to decrease the size of .nc file

We are good to go! 🔥

I agree, that's a good plan to finish it. Parameter recovery is hitting some bumps so I'm gonna keep digging to figure out what is happening.
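
The parameter recovery workflow discussed in this thread (generate data from known parameters, save the true values, then compare estimates against them) can be sketched in miniature as follows. This is not the PR's script: the linear model, the least-squares fit standing in for a posterior summary, the file name, and all parameter values are assumptions chosen to keep the example small.

```python
import json
import random
import tempfile
from pathlib import Path


def simulate(true_params, n_obs, rng):
    """Draw (x, y) pairs from y = intercept + slope * x + Gaussian noise."""
    xs = [rng.uniform(0.0, 1.0) for _ in range(n_obs)]
    ys = [
        true_params["intercept"]
        + true_params["slope"] * x
        + rng.gauss(0.0, true_params["noise_sd"])
        for x in xs
    ]
    return xs, ys


def fit_ols(xs, ys):
    """Ordinary least squares for one predictor (stand-in for a fitted model)."""
    n = len(xs)
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return {"intercept": y_bar - slope * x_bar, "slope": slope}


# Hypothetical ground truth used to generate the synthetic data.
true_params = {"intercept": 1.0, "slope": 2.0, "noise_sd": 0.1}
xs, ys = simulate(true_params, n_obs=500, rng=random.Random(42))

# Save the truth next to the data so a later recovery step can reload it.
with tempfile.TemporaryDirectory() as tmp:
    artifact = Path(tmp) / "true_parameters.json"  # hypothetical file name
    artifact.write_text(json.dumps(true_params))
    saved_truth = json.loads(artifact.read_text())

estimates = fit_ols(xs, ys)
recovery_error = {name: estimates[name] - saved_truth[name] for name in estimates}
```

In a Bayesian notebook the comparison would more naturally ask whether each true value falls inside the corresponding posterior interval, rather than looking at a point-estimate error.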

daniel-saunders-phil · 2025-08-14T18:16:34Z

Alright guys, I think we are good here :)

juanitorduz

Small comments on the script ;)

scripts/data_generators/multidimensional_data_generation.py

review-notebook-app · 2025-08-15T21:06:18Z

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2025-08-15T21:06:17Z
----------------------------------------------------------------

Can we remove the output ...

  File "c:\Users\dsaun\miniforge3\envs\pymc-marketing-dev\Lib\importlib\__init__.py", line 128, in reload
    raise ModuleNotFoundError(f"spec not found for the module {name!r}", name=name)
ModuleNotFoundError: spec not found for the module 'cutils_ext'

review-notebook-app · 2025-08-15T21:06:18Z

View / edit / reply to this conversation on ReviewNB

juanitorduz commented on 2025-08-15T21:06:18Z
----------------------------------------------------------------

are we missing some bar plot on the lower right panel?

daniel-saunders-phil commented on 2025-08-15T21:58:27Z
----------------------------------------------------------------

The optimizer just didn't put any money there because it's the worst channel. I agree it looks weird so I pushed the budget up a bit.
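
Why a weak channel can receive nothing at a small budget, and something once the budget grows, can be seen in a toy allocator. This is not the pymc-marketing optimizer: it is a greedy sketch that assumes an exponential saturation curve `beta * (1 - exp(-lam * spend))`, and the channel parameters are invented for illustration.

```python
import math


def response(beta, lam, spend):
    """Hypothetical saturating response curve for one channel."""
    return beta * (1.0 - math.exp(-lam * spend))


def greedy_allocate(betas, lams, n_steps, step=0.1):
    """Spend the budget in small increments, each on the best marginal gain."""
    counts = [0] * len(betas)
    for _ in range(n_steps):
        gains = [
            response(b, l, (c + 1) * step) - response(b, l, c * step)
            for b, l, c in zip(betas, lams, counts)
        ]
        best = max(range(len(gains)), key=gains.__getitem__)
        counts[best] += 1
    return [c * step for c in counts]


betas = [3.0, 2.0, 1.0, 0.2]   # the last channel is by far the weakest
lams = [1.0, 1.0, 1.0, 1.0]

small_budget = greedy_allocate(betas, lams, n_steps=40)    # total spend 4.0
large_budget = greedy_allocate(betas, lams, n_steps=100)   # total spend 10.0
```

With the small budget every increment is better spent on the stronger channels, so the weakest one stays at zero. Once those channels saturate enough that their marginal gain drops below the weak channel's first increment, extra budget starts flowing to it.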

juanitorduz · 2025-08-15T21:07:47Z

Thanks @daniel-saunders-phil :) I left some minor comments. I think there are some open comments from @carlosagostini (or are they resolved?), once we close them, let's merge this one! I like the new content a lot! Thanks again for making it much better 🙌

daniel-saunders-phil · 2025-08-15T21:58:28Z

The optimizer just didn't put any money there because it's the worst channel. I agree it looks weird so I pushed the budget up a bit.

View entire conversation on ReviewNB

daniel-saunders-phil · 2025-08-15T22:00:42Z

@juanitorduz tidied up a few things. I think we are all clear on Carlos' suggestions - they mostly pertained to the optimizer notebook except the two items he put in a checklist for me.

cetagostini · 2025-08-18T17:53:34Z

@juanitorduz tidied up a few things. I think we are all clear on Carlos' suggestions - they mostly pertained to the optimizer notebook except the two items he put in a checklist for me.

Yes, we are good to go here from my side!

daniel-saunders-phil · 2025-08-18T19:45:31Z

@juanitorduz tidied up a few things. I think we are all clear on Carlos' suggestions - they mostly pertained to the optimizer notebook except the two items he put in a checklist for me.

Yes, we are good to go here from my side!

When you have a second, could you approve? I think it's blocked until you say yes because you put in a request for changes last week.

juanitorduz · 2025-08-18T19:47:28Z

I approved! I think we need @carlosagostini 's blessing :)

juanitorduz · 2025-08-18T19:58:03Z

yay! Thanks @daniel-saunders-phil

daniel-saunders-phil added 6 commits August 5, 2025 17:37

Make beta a hierarchical parameter. In a previous version of the note…

2f016dd

…book, all the hierarchical parameters were removed in favour of fully pooled parameters over geographies. However the text still discusses hierarchies so I'm restoring them.

rewrote some of the body to be explicit about what the difference is …

c8d753a

…between partial, full and no pooling models. We now have examples of all three so it's a bit more of a friendly introduction.

removed unnecessary sigma_sigma parameter.

292e17f

checkmarking progress - I'm blocked from improving the parameterizati…

0a13dec

…on because we'd need a zero sum normal and the prior class isn't designed to interact with zero sum normals very well (I would like to be able to add two Prior classes together so we can have a separate mean.)

daniel-saunders-phil marked this pull request as draft August 6, 2025 22:56

github-actions bot added docs Improvements or additions to documentation MMM labels Aug 6, 2025

daniel-saunders-phil changed the title ~~Correct multi dimensional nb~~ Improvements on multidimensional notebook Aug 7, 2025

daniel-saunders-phil added 2 commits August 8, 2025 11:52

Update notebook - model converges now.

9656c8c

write text for dataset.

61abba2

optimizer is working properly again - just had to increase the budget…

e8e3118

… to be appropriate to the scale of this dataset.

daniel-saunders-phil marked this pull request as ready for review August 8, 2025 21:56

Merge branch 'main' into correct_multi_dimensional_nb

c62feb2

juanitorduz requested a review from cetagostini August 9, 2025 07:10

juanitorduz added this to the 0.16.0 milestone Aug 9, 2025

cetagostini approved these changes Aug 9, 2025

cetagostini requested changes Aug 10, 2025

add a script to generate data and save off a parameter recovery artif…

efa1444

…act. Update notebook with code for parameter recovery.

daniel-saunders-phil and others added 4 commits August 14, 2025 10:55

tweaks to get the mmm to sample without divergences.

72a6500

Merge branch 'correct_multi_dimensional_nb' of https://github.com/pym…

bf90ccf

…c-labs/pymc-marketing into correct_multi_dimensional_nb

add true parameters to git.

5ceff7b

Merge branch 'main' into correct_multi_dimensional_nb

eca4d97

daniel-saunders-phil requested review from cetagostini and juanitorduz August 14, 2025 18:15

juanitorduz requested changes Aug 15, 2025

scripts/data_generators/multidimensional_data_generation.py (outdated, resolved)

scripts/data_generators/multidimensional_data_generation.py (outdated, resolved)

daniel-saunders-phil added 2 commits August 15, 2025 14:55

add type hints and update doc string style to match style guide.

c980501

add a bit to the optimizer budget so all four channels get an allocat…

706b52a

…ion. Get rid of the reloader error. Remove stray channel variance diagnostic I left in.

Merge branch 'main' into correct_multi_dimensional_nb

a3d3779

daniel-saunders-phil and others added 2 commits August 15, 2025 15:00

Merge branch 'correct_multi_dimensional_nb' of https://github.com/pym…

e84b048

…c-labs/pymc-marketing into correct_multi_dimensional_nb

Merge branch 'main' into correct_multi_dimensional_nb

b80e354

Merge branch 'main' into correct_multi_dimensional_nb

80c93fa

juanitorduz approved these changes Aug 18, 2025

daniel-saunders-phil enabled auto-merge (squash) August 18, 2025 19:44

cetagostini approved these changes Aug 18, 2025

daniel-saunders-phil merged commit cfb4171 into main Aug 18, 2025
9 checks passed

daniel-saunders-phil deleted the correct_multi_dimensional_nb branch August 18, 2025 19:56
