Commit b16a03c

Add tutorial for hillstrom dataset (#65)
1 parent 07ecc7d commit b16a03c

6 files changed

+294
-0
lines changed


docs/source/index.rst

Lines changed: 1 addition & 0 deletions
@@ -43,6 +43,7 @@ For theoretical foundations, see:
4343

4444
installation
4545
get_started
46+
tutorials
4647
api_reference
4748
contributing
4849

docs/source/tutorials.rst

Lines changed: 14 additions & 0 deletions
@@ -0,0 +1,14 @@
1+
Tutorials: Analyzing Famous Randomized Control Trials
2+
=====================================================
3+
4+
This section provides tutorials showing how to use the ``dte_adj`` library to analyze distributional treatment effects in well-known randomized controlled trials. The examples look beyond average treatment effects to show how an intervention shifts the entire outcome distribution.
5+
6+
Available Tutorials
7+
-------------------
8+
9+
.. toctree::
10+
:maxdepth: 1
11+
12+
tutorials/hillstrom
13+
14+
Each tutorial works through a real dataset from a randomized experiment and includes complete code examples, visualizations, and interpretations of the estimated distributional treatment effects.
Lines changed: 279 additions & 0 deletions
@@ -0,0 +1,279 @@
1+
Hillstrom Email Marketing Experiment
2+
====================================
3+
4+
The Hillstrom email marketing dataset is a classic example from digital marketing, involving 64,000 customers randomly assigned to receive either a men's merchandise email, women's merchandise email, or no email (control). This experiment allows us to examine which email campaign strategy is most effective using revenue as the outcome.
5+
6+
**Background**: Kevin Hillstrom provided this dataset to demonstrate email marketing analytics. Customers who purchased within the last 12 months were randomly divided into three groups to test targeted email campaigns against a control group.
7+
8+
**Research Question**: Which email campaign performed best - the men's version or the women's version - and how do the effects vary across the revenue distribution?
9+
10+
Data Setup and Loading
11+
~~~~~~~~~~~~~~~~~~~~~~~
12+
13+
.. code-block:: python
14+
15+
import numpy as np
16+
import pandas as pd
17+
import matplotlib.pyplot as plt
18+
from sklearn.linear_model import LinearRegression
19+
from sklearn.preprocessing import LabelEncoder
20+
import dte_adj
21+
from dte_adj.plot import plot
22+
23+
# Load the real Hillstrom dataset
24+
url = "http://www.minethatdata.com/Kevin_Hillstrom_MineThatData_E-MailAnalytics_DataMiningChallenge_2008.03.20.csv"
25+
df = pd.read_csv(url)
26+
27+
print(f"Dataset shape: {df.shape}")
28+
print(f"Average spend by segment:\n{df.groupby('segment')['spend'].mean()}")
29+
30+
# Prepare the data for dte_adj analysis
31+
# Create treatment indicator: 0=No E-Mail, 1=Mens E-Mail, 2=Womens E-Mail
32+
treatment_mapping = {'No E-Mail': 0, 'Mens E-Mail': 1, 'Womens E-Mail': 2}  # labels as spelled in the CSV
33+
D = df['segment'].map(treatment_mapping).values
34+
35+
# Use spend as the outcome variable (revenue)
36+
revenue = df['spend'].values
37+
38+
zip_code_mapping = {'Surburban': 0, 'Rural': 1, 'Urban': 2}  # 'Surburban' is the spelling used in the original data
39+
channel_mapping = {'Phone': 0, 'Web': 1, 'Multichannel': 2}
40+
41+
# Create feature matrix
42+
features = pd.DataFrame({
43+
'recency': df['recency'],
44+
'history': df['history'],
45+
'history_segment': df['history_segment'].map(lambda s: int(s[0])),
46+
'mens': df['mens'],
47+
'women': df['women'],
48+
'zip_code': df['zip_code'].map(zip_code_mapping),
49+
'newbie': df['newbie'],
50+
'channel': df['channel'].map(channel_mapping)
51+
})
52+
53+
X = features.values
54+
55+
print(f"\nDataset size: {len(D):,} customers")
56+
print(f"Control group (No Email): {(D==0).sum():,} ({(D==0).mean():.1%})")
57+
print(f"Men's Email group: {(D==1).sum():,} ({(D==1).mean():.1%})")
58+
print(f"Women's Email group: {(D==2).sum():,} ({(D==2).mean():.1%})")
59+
print("\nAverage Spend by Treatment:")
60+
print(f"No Email: ${revenue[D==0].mean():.2f}")
61+
print(f"Men's Email: ${revenue[D==1].mean():.2f}")
62+
print(f"Women's Email: ${revenue[D==2].mean():.2f}")
63+
64+
# Also show conversion rates
65+
print("\nConversion Rates:")
66+
print(f"No Email: {df[df['segment']=='No E-Mail']['conversion'].mean():.3f}")
67+
print(f"Men's Email: {df[df['segment']=='Mens E-Mail']['conversion'].mean():.3f}")
68+
print(f"Women's Email: {df[df['segment']=='Womens E-Mail']['conversion'].mean():.3f}")
69+
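The integer codes assigned to ``zip_code`` and ``channel`` above impose an ordering that a linear model will read as a numeric scale. One possible alternative is one-hot encoding, sketched below on a small made-up frame rather than the Hillstrom data. The column names mirror the dataset, but the values and the ``toy`` and ``X_toy`` names are illustrative assumptions.

```python
import pandas as pd

# Toy stand-in for a few Hillstrom rows (values are made up for illustration)
toy = pd.DataFrame({
    "recency": [1, 6, 10, 3],
    "zip_code": ["Surburban", "Rural", "Urban", "Rural"],
    "channel": ["Phone", "Web", "Multichannel", "Web"],
})

# One indicator column per category; drop_first removes one level per variable
# so the indicators are not perfectly collinear with a model intercept
encoded = pd.get_dummies(
    toy, columns=["zip_code", "channel"], drop_first=True, dtype=float
)

X_toy = encoded.to_numpy()
print(encoded.columns.tolist())
print(X_toy.shape)
```

Whether this changes the adjusted estimates in practice depends on the model: tree-based learners are less sensitive to the coding than ``LinearRegression``.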
70+
Comparing Men's vs Women's Email Campaigns
71+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
72+
73+
.. code-block:: python
74+
75+
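# Head count within the two email arms only (recoded: 0 = Men's, 1 = Women's).
# The estimators below are still fit on all three arms so that the control
# comparisons later in the tutorial can reuse them.
D_email = D[D > 0] - 1
print(f"Email campaign comparison sample: {len(D_email):,} customers")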
76+
print(f"Men's Email: {(D_email==0).sum():,}")
77+
print(f"Women's Email: {(D_email==1).sum():,}")
78+
79+
# Initialize estimators for email comparison
80+
simple_email = dte_adj.SimpleDistributionEstimator()
81+
ml_email = dte_adj.AdjustedDistributionEstimator(
82+
LinearRegression(),
83+
folds=5
84+
)
85+
86+
# Fit estimators
87+
simple_email.fit(X, D, revenue)
88+
ml_email.fit(X, D, revenue)
89+
90+
# Define revenue evaluation points
91+
revenue_locations = np.linspace(0, 500, 51)
92+
93+
# Compute DTE: Women's vs Men's email campaigns
94+
dte_simple, lower_simple, upper_simple = simple_email.predict_dte(
95+
target_treatment_arm=2, # Women's email
96+
control_treatment_arm=1, # Men's email (as "control")
97+
locations=revenue_locations,
98+
variance_type="moment"
99+
)
100+
101+
dte_ml, lower_ml, upper_ml = ml_email.predict_dte(
102+
target_treatment_arm=2, # Women's email
103+
control_treatment_arm=1, # Men's email
104+
locations=revenue_locations,
105+
variance_type="moment"
106+
)
107+
108+
Distribution Treatment Effects Analysis
109+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
110+
111+
.. code-block:: python
112+
113+
# Visualize the distribution treatment effects using dte_adj's built-in plot function
114+
115+
# Simple estimator
116+
plot(revenue_locations, dte_simple, lower_simple, upper_simple,
117+
title="Email Campaign Comparison: Women's vs Men's (Simple Estimator)",
118+
xlabel="Spending ($)", ylabel="Distribution Treatment Effect")
119+
120+
# ML-adjusted estimator
121+
plot(revenue_locations, dte_ml, lower_ml, upper_ml,
122+
title="Email Campaign Comparison: Women's vs Men's (ML-Adjusted Estimator)",
123+
xlabel="Spending ($)", ylabel="Distribution Treatment Effect")
124+
125+
# Statistical summary
126+
positive_dte = (dte_ml > 0).mean()
127+
significant_dte = ((lower_ml > 0) | (upper_ml < 0)).mean()
128+
129+
print("\nDistributional Analysis Results:")
130+
print(f"Locations where DTE > 0 (Women's CDF above Men's): {positive_dte:.1%}")
131+
print(f"Statistically significant differences: {significant_dte:.1%}")
132+
print(f"Average DTE: {dte_ml.mean():.3f}")
133+
134+
The analysis produces the following distribution treatment effects visualization:
135+
136+
.. image:: ../_static/hillstorm_dte.png
137+
:alt: Hillstrom Email Marketing DTE Analysis
138+
:width: 800px
139+
:align: center
140+
141+
**Interpreting the Results**: The plot shows the distribution treatment effects (DTE) comparing Women's vs Men's email campaigns across different spending levels. Key observations:
142+
143+
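- **Positive DTE values** (above the zero line) indicate that, with the DTE defined as the difference in cumulative distribution functions (Women's minus Men's), customers in the Women's email arm are more likely to spend at or below that level than customers in the Men's arm; negative values indicate that the Women's campaign shifts spending above that level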
144+
- **Confidence intervals** (shaded areas) show statistical uncertainty - where intervals don't cross zero, effects are statistically significant
145+
- **Heterogeneous effects** across spending distribution reveal that campaign effectiveness varies by customer spending levels
146+
- **ML-adjusted estimator** (bottom panel) typically provides more precise estimates with tighter confidence intervals than the simple estimator (top panel)
147+
148+
The distributional analysis reveals nuanced patterns that would be missed by simply comparing average spending between campaigns.
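To make the quantity on the vertical axis concrete, the sketch below computes an unadjusted DTE by hand as a difference of empirical CDFs. It is a minimal illustration on toy numbers under the assumption that the DTE at a location is ``P(Y <= y)`` in the target arm minus ``P(Y <= y)`` in the comparison arm. The helper names ``empirical_cdf`` and ``simple_dte`` are hypothetical and are not part of ``dte_adj``.

```python
import numpy as np

def empirical_cdf(y, locations):
    """Share of observations at or below each location."""
    y = np.asarray(y, dtype=float)
    locations = np.asarray(locations, dtype=float)
    return (y[:, None] <= locations[None, :]).mean(axis=0)

def simple_dte(y_target, y_control, locations):
    """Difference in empirical CDFs: target arm minus comparison arm."""
    return empirical_cdf(y_target, locations) - empirical_cdf(y_control, locations)

# Toy spend values: the target arm spends less than the comparison arm
toy_target = np.array([0.0, 0.0, 10.0])
toy_control = np.array([0.0, 5.0, 20.0])
toy_locations = np.array([0.0, 10.0, 20.0])

toy_dte = simple_dte(toy_target, toy_control, toy_locations)
print(toy_dte)
```

In this toy case the target arm spends less, so its CDF lies above the comparison arm's and the DTE is positive at the lower locations. A campaign that raises spending produces the opposite sign.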
149+
150+
Revenue Category Analysis with PTE
151+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
152+
153+
.. code-block:: python
154+
155+
# Compute Probability Treatment Effects
156+
pte_simple, pte_lower_simple, pte_upper_simple = simple_email.predict_pte(
157+
target_treatment_arm=2, # Women's email
158+
control_treatment_arm=1, # Men's email
159+
locations=revenue_locations,
160+
variance_type="moment"
161+
)
162+
163+
pte_ml, pte_lower_ml, pte_upper_ml = ml_email.predict_pte(
164+
target_treatment_arm=2, # Women's email
165+
control_treatment_arm=1, # Men's email
166+
locations=revenue_locations,
167+
variance_type="moment"
168+
)
169+
170+
# Visualize PTE results using dte_adj's plot function with bar chart
171+
172+
# Simple estimator
173+
plot(revenue_locations[:-1], pte_simple, pte_lower_simple, pte_upper_simple,
174+
chart_type="bar",
175+
title="Spending Category Effects: Women's vs Men's (Simple Estimator)",
176+
xlabel="Spending Category", ylabel="Probability Treatment Effect", color="purple")
177+
178+
# ML-adjusted estimator
179+
plot(revenue_locations[:-1], pte_ml, pte_lower_ml, pte_upper_ml,
180+
chart_type="bar",
181+
title="Spending Category Effects: Women's vs Men's (ML-Adjusted Estimator)",
182+
xlabel="Spending Category", ylabel="Probability Treatment Effect")
183+
184+
The Probability Treatment Effects analysis produces the following visualization:
185+
186+
.. image:: ../_static/hillstorm_pte.png
187+
:alt: Hillstrom Email Marketing PTE Analysis
188+
:width: 800px
189+
:align: center
190+
191+
**Interpreting the PTE Results**: The bar charts show probability treatment effects across different spending intervals, revealing which spending ranges are most affected by the Women's vs Men's email campaigns:
192+
193+
- **Positive bars** indicate spending ranges where Women's email campaign increases the probability of customers spending in that range compared to Men's email
194+
- **Negative bars** show ranges where customers are more likely to land under the Men's email campaign than under the Women's
195+
- **Error bars** represent confidence intervals - bars that don't cross zero are statistically significant
196+
- **Different patterns** between simple (top) and ML-adjusted (bottom) estimators show how machine learning adjustment can provide more precise estimates
197+
198+
**Key PTE Findings**:
199+
200+
1. **Low spending ranges** ($0-$25): Women's campaign may be more effective at driving small purchases
201+
2. **Medium spending ranges** ($25-$100): Effects vary, showing differential campaign effectiveness
202+
3. **High spending ranges** ($100+): Reveals which campaign is better at generating high-value customers
203+
4. **Statistical significance**: Confidence intervals show where differences are reliable vs. due to chance
204+
205+
This granular analysis helps marketers understand not just which campaign generates more revenue overall, but specifically which spending behaviors each campaign drives.
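The relationship between the evaluation points and the bars can be sketched by hand. The example below assumes that a PTE is the difference, between arms, in the probability of falling in each interval ``(edge[j], edge[j+1]]``. The half-open convention and the helper names are assumptions for illustration and may differ from what ``dte_adj`` does internally.

```python
import numpy as np

def interval_probabilities(y, edges):
    """P(edges[j] < Y <= edges[j+1]) for each consecutive pair of edges."""
    y = np.asarray(y, dtype=float)
    edges = np.asarray(edges, dtype=float)
    cdf = (y[:, None] <= edges[None, :]).mean(axis=0)
    return np.diff(cdf)

def simple_pte(y_target, y_control, edges):
    """Difference in interval probabilities: target arm minus comparison arm."""
    return interval_probabilities(y_target, edges) - interval_probabilities(y_control, edges)

# Toy spend values and three intervals of width 10
toy_edges = np.array([0.0, 10.0, 20.0, 30.0])
toy_target = np.array([5.0, 15.0, 15.0, 25.0])
toy_control = np.array([5.0, 5.0, 15.0, 25.0])

toy_pte = simple_pte(toy_target, toy_control, toy_edges)
print(toy_pte)
```

With ``n`` edges there are ``n - 1`` intervals, which is why the plotting calls above pass ``revenue_locations[:-1]``. Note that under this convention customers who spend exactly zero fall outside the first interval when the lowest edge is 0.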
206+
207+
Control vs Email Campaigns Analysis
208+
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
209+
210+
.. code-block:: python
211+
212+
dte_mens_ctrl, lower_mens_ctrl, upper_mens_ctrl = simple_email.predict_dte(
213+
target_treatment_arm=1, control_treatment_arm=0,
214+
locations=revenue_locations, variance_type="moment"
215+
)
216+
217+
dte_women_ctrl, lower_women_ctrl, upper_women_ctrl = simple_email.predict_dte(
218+
target_treatment_arm=2, control_treatment_arm=0,
219+
locations=revenue_locations, variance_type="moment"
220+
)
221+
222+
# Visualize both campaigns vs control using dte_adj's plot function
223+
224+
# Men's vs Control
225+
plot(revenue_locations, dte_mens_ctrl, lower_mens_ctrl, upper_mens_ctrl,
226+
title="Men's Email Campaign vs Control",
227+
xlabel="Spending ($)", ylabel="Distribution Treatment Effect", color="purple")
228+
229+
# Women's vs Control
230+
plot(revenue_locations, dte_women_ctrl, lower_women_ctrl, upper_women_ctrl,
231+
title="Women's Email Campaign vs Control",
232+
xlabel="Spending ($)", ylabel="Distribution Treatment Effect")
233+
234+
The control vs email campaigns analysis produces the following comparison:
235+
236+
.. image:: ../_static/hillstorm_dte_control.png
237+
:alt: Hillstrom Email Campaigns vs Control Analysis
238+
:width: 800px
239+
:align: center
240+
241+
**Interpreting the Control Comparison Results**: These side-by-side plots show how each email campaign performs against the no-email control group across different spending levels:
242+
243+
**Men's Email vs Control (Top Panel)**:
244+
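- **Positive DTE values** indicate that customers who received the Men's email are more likely to spend at or below those levels than customers who received no email; negative values indicate that the Men's email shifts spending above those levels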
245+
- **Distribution pattern** shows where Men's email is most effective in driving customer spending
246+
- **Confidence intervals** reveal statistical significance of the treatment effects
247+
248+
**Women's Email vs Control (Bottom Panel)**:
249+
- **Comparative effectiveness** can be assessed by comparing the magnitude and patterns of effects
250+
- **Different spending ranges** may show varying campaign effectiveness
251+
- **Statistical significance** indicated by confidence intervals not crossing zero
252+
253+
**Key Control Analysis Findings**:
254+
255+
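1. **Campaign Effectiveness**: Both campaigns improve on no email, confirming that email marketing drives incremental spending; because the DTE is a difference in cumulative distributions, this improvement appears as negative DTE values (fewer customers at or below a given spending level)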
256+
257+
2. **Differential Patterns**: The shape and magnitude of effects differ between campaigns, revealing:
258+
- Which campaign has stronger overall effects
259+
- Different spending ranges where each campaign excels
260+
- Varying confidence in treatment effects across spending levels
261+
262+
3. **Business Implications**:
263+
- **ROI Assessment**: Compare effect sizes to determine which campaign provides better return on investment
264+
- **Customer Segmentation**: Identify spending ranges where each campaign is most/least effective
265+
- **Resource Allocation**: Data-driven decisions on campaign budget allocation
266+
267+
4. **Statistical Rigor**: Confidence intervals provide guidance on where observed differences are statistically reliable vs. potentially due to sampling variation
268+
269+
This analysis answers the fundamental question: "Do email campaigns work?" and more importantly, "Which one works better and for which customer segments?"
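The significance reading used throughout (an interval that excludes zero) can be illustrated with a simple normal-approximation interval for a difference of two independent empirical CDFs. This is a sketch under that assumption and is not necessarily the formula behind ``variance_type="moment"`` in ``dte_adj``. The function name ``dte_with_ci`` and the toy arrays are hypothetical.

```python
import numpy as np

def dte_with_ci(y_target, y_control, locations, z=1.96):
    """Unadjusted DTE with a normal-approximation confidence interval."""
    y_target = np.asarray(y_target, dtype=float)
    y_control = np.asarray(y_control, dtype=float)
    locations = np.asarray(locations, dtype=float)
    f1 = (y_target[:, None] <= locations[None, :]).mean(axis=0)
    f0 = (y_control[:, None] <= locations[None, :]).mean(axis=0)
    dte = f1 - f0
    # Each CDF value is a sample proportion, so its variance is p(1-p)/n
    se = np.sqrt(f1 * (1 - f1) / len(y_target) + f0 * (1 - f0) / len(y_control))
    return dte, dte - z * se, dte + z * se

# Toy arms: half of the target arm buys, a tenth of the comparison arm buys
toy_target = np.r_[np.zeros(50), np.full(50, 100.0)]
toy_control = np.r_[np.zeros(90), np.full(10, 100.0)]
toy_locations = np.array([0.0, 50.0, 100.0])

dte, lower, upper = dte_with_ci(toy_target, toy_control, toy_locations)
significant = (lower > 0) | (upper < 0)
print(dte, significant)
```

At the highest location both CDFs equal one, so the estimated difference and its standard error are both zero and the point is not flagged as significant.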
270+
271+
**Key Findings**: Using the real Hillstrom dataset with 64,000 customers, the distributional analysis goes beyond average comparisons to show how treatment effects vary across the entire spending distribution, indicating which spending ranges respond to each campaign type. This illustrates how distribution treatment effect analysis can surface heterogeneous responses in digital marketing experiments.
272+
273+
Next Steps
274+
~~~~~~~~~~
275+
276+
- Try with your own randomized experiment data
277+
- Experiment with different ML models (XGBoost, Neural Networks) for adjustment
278+
- Explore stratified estimators for covariate-adaptive randomization designs
279+
- Use multi-task learning (``is_multi_task=True``) for computational efficiency with many locations
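For readers who want to see what swapping the adjustment model involves, the following is a minimal sketch of a cross-fitted, regression-adjusted CDF estimate on synthetic data. It assumes the common form "mean prediction over all units plus mean residual within the arm" and is not taken from the ``dte_adj`` source. The function ``adjusted_cdf`` and the simulated ``X``, ``D``, ``Y`` are hypothetical.

```python
import numpy as np
from sklearn.base import clone
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

def adjusted_cdf(X, D, Y, arm, location, model, folds=5, seed=0):
    """Cross-fitted, regression-adjusted estimate of P(Y <= location) in `arm`."""
    indicator = (Y <= location).astype(float)
    in_arm = D == arm
    predictions = np.zeros(len(Y))
    splitter = KFold(n_splits=folds, shuffle=True, random_state=seed)
    for train_idx, test_idx in splitter.split(X):
        # Fit only on training-fold units in the arm, predict for the held-out fold
        train_in_arm = train_idx[in_arm[train_idx]]
        fitted = clone(model).fit(X[train_in_arm], indicator[train_in_arm])
        predictions[test_idx] = fitted.predict(X[test_idx])
    # Mean prediction over everyone, plus the mean residual within the arm
    return predictions.mean() + (indicator[in_arm] - predictions[in_arm]).mean()

# Synthetic experiment: treatment shifts the outcome up by one unit
rng = np.random.default_rng(0)
n = 2000
X = rng.normal(size=(n, 2))
D = rng.integers(0, 2, size=n)
Y = X[:, 0] + D + rng.normal(scale=0.5, size=n)

results = {}
for name, model in [("linear", LinearRegression()),
                    ("boosting", GradientBoostingRegressor(random_state=0))]:
    f1 = adjusted_cdf(X, D, Y, arm=1, location=0.5, model=model)
    f0 = adjusted_cdf(X, D, Y, arm=0, location=0.5, model=model)
    results[name] = f1 - f0  # adjusted DTE at the chosen location
print(results)
```

Any scikit-learn style regressor with ``fit`` and ``predict`` can be dropped into the loop; how much precision it gains depends on how well the covariates predict the outcome.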
