Conversation

cetagostini
Contributor

@cetagostini cetagostini commented Oct 3, 2025

Description

Upper-funnel modeling is coming up more and more often, and it is becoming important to know how to model activities that don't directly impact the target, or whose effect is mediated by other channels. Here, I show how to do this with a simple causal approach, which anyone can follow once the DAG has been identified.

Related Issue

Checklist


📚 Documentation preview 📚: https://pymc-marketing--1971.org.readthedocs.build/en/1971/

@cetagostini cetagostini self-assigned this Oct 3, 2025
@github-actions github-actions bot added docs Improvements or additions to documentation MMM labels Oct 3, 2025
@cetagostini cetagostini added enhancement New feature or request maintenance causal inference and removed docs Improvements or additions to documentation MMM labels Oct 3, 2025
@cetagostini cetagostini added this to the 0.17.0 milestone Oct 3, 2025
@juanitorduz
Collaborator

This looks very exciting @carlosagostini ! I will provide a deep review once it is ready 🚀 !

@cetagostini cetagostini added the docs Improvements or additions to documentation label Oct 5, 2025
@github-actions github-actions bot added the MMM label Oct 5, 2025
@cetagostini cetagostini marked this pull request as ready for review October 8, 2025 19:55
@daniel-saunders-phil
Contributor

daniel-saunders-phil commented Oct 9, 2025

I'm feeling very old-man-yells-at-cloud, but I left reviews on the notebooknb thingy. They have not appeared here yet, but I hope they do 🤷

Anyway, I have two big picture suggestions:

  1. In linear DAGs, you can just calculate the coefficient of x4 on x1 by hand. I think that would provide a clearer target to recover than the full time-series contribution computed through counterfactuals. I see the appeal in non-linear cases but if you can focus on one number, it's much easier to track success (and understand the causes of failure).

Suppose we had a system like:

x3 = 4*x4 + Norm(0,1)
x2 = 2*x3 + Norm(0,1)
x1 = 0.25*x2 + Norm(0,1)

that implies

x2 = 8*x4 + Norm(0,2) + Norm(0,1)
x1 = 2*x4 + Norm(0,0.5) + Norm(0,0.25) + Norm(0,1)

and you can simplify all the noise terms too (independent noise adds in variance, so the sd is sqrt(0.5^2 + 0.25^2 + 1^2) ≈ 1.15):

x1 = 2*x4 + Norm(0,1.15)

  2. Try to put a little bit of the theory behind the correct approach earlier in the notebook. If the reader doesn't understand backdoor-criterion thinking, I don't know if they would walk away from this notebook understanding why your model works and another doesn't. If they do understand the backdoor criterion, they don't really need the first estimation example to get your point. They can see from the graph alone that putting all variables in the model will give the wrong result. So that reader would just care more about how to code the correct answer. Either way, explaining what approach you intend to take and why would make this more digestible (and maybe shorter!)
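Both suggestions can be checked with a quick simulation of the toy chain x4 -> x3 -> x2 -> x1. This is only a sketch, not code from the notebook: it assumes x4 ~ Norm(0, 1) (the comment does not specify a distribution for x4), reads the second argument of Norm as a standard deviation, and uses plain NumPy least squares instead of an MMM. Regressing x1 on x4 alone should recover the total effect of 2 with a residual sd of roughly 1.15, while adding the mediator x2 to the regression should push the x4 coefficient towards 0.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Assumption: x4 ~ Norm(0, 1); the original comment leaves x4 unspecified.
x4 = rng.normal(0.0, 1.0, size=n)
x3 = 4.0 * x4 + rng.normal(0.0, 1.0, size=n)
x2 = 2.0 * x3 + rng.normal(0.0, 1.0, size=n)
x1 = 0.25 * x2 + rng.normal(0.0, 1.0, size=n)

# Total effect: regress x1 on x4 only (no mediators in the model).
total_slope, total_intercept = np.polyfit(x4, x1, deg=1)
resid_sd = np.std(x1 - (total_slope * x4 + total_intercept))

# Independent noise terms add in variance, not in standard deviation.
expected_sd = np.sqrt(0.5**2 + 0.25**2 + 1.0**2)

# "Everything in the model": conditioning on the mediator x2 blocks the path.
design = np.column_stack([np.ones(n), x4, x2])
coefs, *_ = np.linalg.lstsq(design, x1, rcond=None)
x4_coef_with_mediator = coefs[1]
x2_coef = coefs[2]

print(f"total effect of x4 on x1: {total_slope:.3f} (by hand: 2)")
print(f"residual sd: {resid_sd:.3f} (by hand: {expected_sd:.3f})")
print(f"x4 coefficient when x2 is included: {x4_coef_with_mediator:.3f}")
print(f"x2 coefficient when x2 is included: {x2_coef:.3f} (by hand: 0.25)")
```

Having a single target number (the slope of 2) makes it easy to see whether a given model specification recovers the causal effect or not.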

Labels

causal inference docs Improvements or additions to documentation enhancement New feature or request maintenance MMM

3 participants