321 changes: 321 additions & 0 deletions docs/source/tutorials/hillstrom.rst
@@ -332,6 +332,327 @@ The side-by-side bar charts show probability treatment effects across different

**Conclusion**: Using the real Hillstrom dataset with 64,000 customers, the distributional analysis reveals nuanced patterns in how email campaigns affect customer spending. The analysis goes beyond simple average comparisons to show how treatment effects vary across the entire spending distribution, providing insights into which customer segments respond best to different campaign types. This demonstrates the power of distribution treatment effect analysis for understanding heterogeneous responses in digital marketing experiments.

Subgroup Analysis by Purchase History
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

Beyond comparing email campaigns overall, we can examine how campaign effectiveness varies by customer purchase history. This analysis segments customers based on their past purchasing behavior:

- **Male Purchaser Segment** (``mens=1``): Customers who previously purchased men's merchandise (35,266 customers, 55.1%)
- **Female Purchaser Segment** (``womens=1``): Customers who previously purchased women's merchandise (35,182 customers, 55.0%)

Comment on lines +338 to +342 (Copilot AI, Apr 20, 2026): The subgroup labels "Male Purchaser Segment" / "Female Purchaser Segment" are misleading here: the ``mens`` / ``womens`` columns indicate prior purchases in men's/women's merchandise categories, not customer gender. Consider renaming the text to "Men's merchandise purchasers" and "Women's merchandise purchasers" to avoid confusion for readers.

Collaborator reply: @yasui-salmon I think this comment is legit

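Note that these segments overlap (6,448 customers purchased both categories), so a customer can appear in both analyses. The segment names refer to the merchandise category a customer previously bought from (the ``mens`` and ``womens`` columns), not to the customer's gender.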
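
As a quick sanity check, the counts quoted above can be combined with inclusion-exclusion. This is plain arithmetic on the numbers in the text; the variable names are ours and nothing here touches the dataset itself:

.. code-block:: python

    # Counts quoted in the text above
    n_total = 64_000
    n_mens = 35_266      # customers with mens == 1
    n_womens = 35_182    # customers with womens == 1
    n_both = 6_448       # customers in both segments

    # Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
    n_either = n_mens + n_womens - n_both

    print(f"In at least one segment: {n_either:,}")
    print(f"Men's merchandise share: {100 * n_mens / n_total:.1f}%")
    print(f"Women's merchandise share: {100 * n_womens / n_total:.1f}%")

The union equals the full sample of 64,000, which is consistent with every customer having bought from at least one of the two categories.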

**Research Question**: Does the effectiveness of men's vs women's email campaigns vary by the type of merchandise customers have historically purchased?

Defining Subgroups
^^^^^^^^^^^^^^^^^^^

.. code-block:: python

    # Define subgroup masks based on purchase history
    male_purchasers = (df['mens'] == 1)
    female_purchasers = (df['womens'] == 1)

    print(f"Male purchaser segment: {male_purchasers.sum():,} customers")
    print(f"Female purchaser segment: {female_purchasers.sum():,} customers")
    print(f"Overlap: {(male_purchasers & female_purchasers).sum():,} customers")

Comment on lines +352 to +355 (Copilot AI, Apr 20, 2026): In the code block, the mask variables are named ``male_purchasers`` / ``female_purchasers``, but they actually represent purchasers of men's/women's merchandise categories. Renaming these variables to something like ``mens_purchasers`` / ``womens_purchasers`` would make the example clearer and consistent with the underlying columns.

Average Treatment Effects by Subgroup
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Let's first compute the average treatment effects (ATEs) to quantify the overall impact:

.. code-block:: python

    # Compute ATEs for each campaign-subgroup combination
    # (treatment arms in D: 0 = No Email, 1 = Men's Email, 2 = Women's Email)

    # Women's Email Campaign
    ate_women_male = (revenue[(D == 2) & male_purchasers].mean() -
                      revenue[(D == 0) & male_purchasers].mean())
    ate_women_female = (revenue[(D == 2) & female_purchasers].mean() -
                        revenue[(D == 0) & female_purchasers].mean())

    # Men's Email Campaign
    ate_men_male = (revenue[(D == 1) & male_purchasers].mean() -
                    revenue[(D == 0) & male_purchasers].mean())
    ate_men_female = (revenue[(D == 1) & female_purchasers].mean() -
                      revenue[(D == 0) & female_purchasers].mean())

    print("Average Treatment Effects by Subgroup:")
    print("\nWomen's Email Campaign:")
    print(f"  Male Purchasers: ATE = ${ate_women_male:.4f}")
    print(f"  Female Purchasers: ATE = ${ate_women_female:.4f}")
    print("\nMen's Email Campaign:")
    print(f"  Male Purchasers: ATE = ${ate_men_male:.4f}")
    print(f"  Female Purchasers: ATE = ${ate_men_female:.4f}")

Expected output::

    Average Treatment Effects by Subgroup:

    Women's Email Campaign:
      Male Purchasers: ATE = $0.2564
      Female Purchasers: ATE = $0.5442

    Men's Email Campaign:
      Male Purchasers: ATE = $0.8966
      Female Purchasers: ATE = $0.8412

These results reveal important patterns:

- **Women's Email Campaign**: The effect is roughly 2× larger for female purchasers ($0.54) than for male purchasers ($0.26)
- **Men's Email Campaign**: The effect is similarly strong in both segments ($0.84 for female purchasers, $0.90 for male purchasers)

While these averages provide a useful summary, they don't tell us *how* customer spending distributions change. The distributional and probability treatment effect analyses that follow reveal the complete picture of campaign effectiveness.
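
The point estimates above come without uncertainty. As a rough illustration of how a standard error could be attached to a difference in means, here is a minimal sketch on synthetic data. The helper ``diff_in_means`` and the toy arrays ``y_toy`` and ``d_toy`` are hypothetical names introduced for this example; they are not part of ``dte_adj``, and the numbers they produce have nothing to do with the Hillstrom results:

.. code-block:: python

    import numpy as np

    def diff_in_means(y, d, treated, control=0):
        """Difference in means with an unequal-variance standard error (hypothetical helper)."""
        y_t = y[d == treated]
        y_c = y[d == control]
        ate = y_t.mean() - y_c.mean()
        se = np.sqrt(y_t.var(ddof=1) / len(y_t) + y_c.var(ddof=1) / len(y_c))
        return ate, se

    # Synthetic spending data: mostly zeros, with a purchase rate that depends on the arm
    rng = np.random.default_rng(0)
    n = 6000
    d_toy = rng.integers(0, 3, size=n)                  # arms 0, 1, 2
    p_buy = np.array([0.05, 0.10, 0.08])[d_toy]         # assumed purchase rates per arm
    buys = rng.random(n) < p_buy
    y_toy = np.where(buys, rng.gamma(shape=2.0, scale=50.0, size=n), 0.0)

    ate, se = diff_in_means(y_toy, d_toy, treated=1)
    print(f"ATE = {ate:.4f}, approx. 95% CI = [{ate - 1.96 * se:.4f}, {ate + 1.96 * se:.4f}]")

To apply the same idea to a subgroup, the arrays would be restricted with the corresponding mask before calling the helper, in the same way the subgroup ATEs are computed above.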

Distribution Treatment Effects: Women's Email Campaign
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Beyond the average effects, let's examine how the Women's Email campaign shifts the entire spending distribution for each subgroup:

.. code-block:: python

    # Analyze male purchaser segment
    estimator_male = dte_adj.SimpleDistributionEstimator()
    estimator_male.fit(X[male_purchasers], D[male_purchasers], revenue[male_purchasers])

    # Analyze female purchaser segment
    estimator_female = dte_adj.SimpleDistributionEstimator()
    estimator_female.fit(X[female_purchasers], D[female_purchasers], revenue[female_purchasers])

    # Define evaluation points
    locations = np.linspace(0, 500, 51)

    # Compute DTE for Women's Email vs Control in each subgroup
    dte_women_male, lower_women_male, upper_women_male = estimator_male.predict_dte(
        target_treatment_arm=2,  # Women's Email
        control_treatment_arm=0,  # No Email
        locations=locations,
        variance_type="moment"
    )

    dte_women_female, lower_women_female, upper_women_female = estimator_female.predict_dte(
        target_treatment_arm=2,  # Women's Email
        control_treatment_arm=0,  # No Email
        locations=locations,
        variance_type="moment"
    )

    # Visualize side-by-side
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))

    plot(locations, dte_women_male, lower_women_male, upper_women_male,
         title="Women's Email vs Control\nMale Purchaser Segment",
         xlabel="Spending ($)", ylabel="Distribution Treatment Effect",
         color="purple", ax=ax1)
    ax1.axhline(y=0, color='black', linestyle='--', linewidth=0.8, alpha=0.5)

    plot(locations, dte_women_female, lower_women_female, upper_women_female,
         title="Women's Email vs Control\nFemale Purchaser Segment",
         xlabel="Spending ($)", ylabel="Distribution Treatment Effect",
         color="green", ax=ax2)
    ax2.axhline(y=0, color='black', linestyle='--', linewidth=0.8, alpha=0.5)

    plt.tight_layout()
    plt.show()

.. image:: ../_static/hillstorm_subgroup_women_dte.png
   :alt: Women's Email Campaign Subgroup Analysis
   :width: 800px
   :align: center

**Key Finding for Women's Email Campaign**: The distributional treatment effects reveal that women's email campaigns are significantly more effective for the female purchaser segment (right panel) compared to the male purchaser segment (left panel). The DTE curves show that women's emails reduce the probability of low spending levels (negative DTE at lower thresholds) for female purchasers, indicating a shift toward higher spending. In contrast, the male purchaser segment shows minimal or non-significant effects across most of the spending distribution, with confidence intervals overlapping zero.
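
To make the sign convention concrete, here is a minimal sketch of an unadjusted empirical DTE on toy data. The helpers ``empirical_cdf`` and ``empirical_dte`` and the toy arrays are hypothetical names introduced for illustration; they are not part of ``dte_adj``, and they ignore covariates and confidence intervals:

.. code-block:: python

    import numpy as np

    def empirical_cdf(y, locations):
        """Share of observations at or below each location."""
        return (y[:, None] <= locations[None, :]).mean(axis=0)

    def empirical_dte(y, d, treated, control, locations):
        """DTE(y) = F_treated(y) - F_control(y), with no covariate adjustment."""
        return (empirical_cdf(y[d == treated], locations)
                - empirical_cdf(y[d == control], locations))

    # Toy spending data: arm 0 = control, arm 1 = treated
    y_toy = np.array([0.0, 0.0, 0.0, 50.0, 0.0, 0.0, 30.0, 80.0])
    d_toy = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    locs = np.array([0.0, 40.0, 100.0])

    dte = empirical_dte(y_toy, d_toy, treated=1, control=0, locations=locs)
    print(dte)

In this toy sample, 75% of control customers and 50% of treated customers spend $0, so the DTE at the $0 threshold is -0.25. A negative value at a low threshold therefore means that fewer treated customers sit at or below that spending level, which is the pattern described for the female purchaser segment.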

Distribution Treatment Effects: Men's Email Campaign
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Now let's examine how the Men's Email campaign affects spending distributions:

.. code-block:: python

    # Compute DTE for Men's Email vs Control in each subgroup
    dte_men_male, lower_men_male, upper_men_male = estimator_male.predict_dte(
        target_treatment_arm=1,  # Men's Email
        control_treatment_arm=0,  # No Email
        locations=locations,
        variance_type="moment"
    )

    dte_men_female, lower_men_female, upper_men_female = estimator_female.predict_dte(
        target_treatment_arm=1,  # Men's Email
        control_treatment_arm=0,  # No Email
        locations=locations,
        variance_type="moment"
    )

    # Visualize side-by-side
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))

    plot(locations, dte_men_male, lower_men_male, upper_men_male,
         title="Men's Email vs Control\nMale Purchaser Segment",
         xlabel="Spending ($)", ylabel="Distribution Treatment Effect",
         color="purple", ax=ax1)
    ax1.axhline(y=0, color='black', linestyle='--', linewidth=0.8, alpha=0.5)

    plot(locations, dte_men_female, lower_men_female, upper_men_female,
         title="Men's Email vs Control\nFemale Purchaser Segment",
         xlabel="Spending ($)", ylabel="Distribution Treatment Effect",
         color="green", ax=ax2)
    ax2.axhline(y=0, color='black', linestyle='--', linewidth=0.8, alpha=0.5)

    plt.tight_layout()
    plt.show()

.. image:: ../_static/hillstorm_subgroup_men_dte.png
   :alt: Men's Email Campaign Subgroup Analysis
   :width: 800px
   :align: center

**Key Finding for Men's Email Campaign**: In contrast to women's email campaigns, men's email campaigns show consistent effectiveness across both purchase history segments. The DTE curves in both panels show similar patterns, with negative values at lower spending levels indicating reduced probability of low spending for both male and female purchasers. This suggests that men's emails have broad appeal regardless of whether customers historically purchased men's or women's merchandise.

Probability Treatment Effects: Women's Email Campaign
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

While DTE shows how cumulative distributions shift, Probability Treatment Effects (PTE) reveal which specific spending intervals are most affected by the campaign. PTE measures the change in probability mass within each spending category:

.. code-block:: python

    # Compute PTE for Women's Email vs Control in each subgroup
    pte_locations = np.insert(locations, 0, -1)  # Add -1 at beginning for intervals

    pte_women_male, pte_lower_women_male, pte_upper_women_male = estimator_male.predict_pte(
        target_treatment_arm=2,  # Women's Email
        control_treatment_arm=0,  # No Email
        locations=pte_locations,
        variance_type="moment"
    )

    pte_women_female, pte_lower_women_female, pte_upper_women_female = estimator_female.predict_pte(
        target_treatment_arm=2,  # Women's Email
        control_treatment_arm=0,  # No Email
        locations=pte_locations,
        variance_type="moment"
    )

    # Visualize side-by-side with bar charts
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))

    plot(locations, pte_women_male, pte_lower_women_male, pte_upper_women_male,
         chart_type="bar",
         title="Women's Email vs Control\nMale Purchaser Segment",
         xlabel="Spending Category ($)", ylabel="Probability Treatment Effect",
         color="purple", ax=ax1)
    ax1.axhline(y=0, color='black', linestyle='--', linewidth=0.8, alpha=0.5)

    plot(locations, pte_women_female, pte_lower_women_female, pte_upper_women_female,
         chart_type="bar",
         title="Women's Email vs Control\nFemale Purchaser Segment",
         xlabel="Spending Category ($)", ylabel="Probability Treatment Effect",
         color="green", ax=ax2)
    ax2.axhline(y=0, color='black', linestyle='--', linewidth=0.8, alpha=0.5)

    plt.tight_layout()
    plt.show()

.. image:: ../_static/hillstorm_subgroup_women_pte.png
   :alt: Women's Email Campaign PTE Subgroup Analysis
   :width: 800px
   :align: center

**Interval-Specific Insights**: The PTE bar charts reveal the mechanism behind the average treatment effect. For female purchasers (right panel), women's emails significantly reduce the probability of zero spending (non-purchasers converting to purchasers), which is the primary driver of the positive ATE. However, no significant increase in high spending categories is observed. For male purchasers (left panel), the effects are much smaller and less consistent, confirming the limited impact suggested by the ATE and DTE analyses.
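
The role of the zero-spending bin can be illustrated with a minimal sketch of an unadjusted empirical PTE on toy data. The helper ``empirical_pte`` and the toy arrays are hypothetical names introduced here; this shows the idea of differencing interval probabilities and is not the ``dte_adj`` implementation. Prepending ``-1`` to the evaluation points, as in the code above, creates an interval ``(-1, 0]`` that captures customers who spend nothing:

.. code-block:: python

    import numpy as np

    def empirical_pte(y, d, treated, control, edges):
        """PTE for each interval (edges[j-1], edges[j]], with no covariate adjustment.

        Assumes `edges` is sorted in increasing order.
        """
        def interval_probs(values):
            cdf = (values[:, None] <= edges[None, :]).mean(axis=0)
            return np.diff(cdf)  # probability mass between consecutive edges
        return interval_probs(y[d == treated]) - interval_probs(y[d == control])

    # Toy spending data: arm 0 = control, arm 1 = treated
    y_toy = np.array([0.0, 0.0, 0.0, 50.0, 0.0, 0.0, 30.0, 80.0])
    d_toy = np.array([0, 0, 0, 0, 1, 1, 1, 1])
    locs = np.array([0.0, 40.0, 100.0])
    edges = np.insert(locs, 0, -1)  # intervals: (-1, 0], (0, 40], (40, 100]

    pte = empirical_pte(y_toy, d_toy, treated=1, control=0, edges=edges)
    print(pte)

In this toy sample the first interval has a PTE of -0.25 (fewer zero spenders under treatment) and the ``(0, 40]`` interval has a PTE of +0.25, so the probability mass leaving the zero bin reappears among small purchases. When the edges cover the whole range of the data, the interval effects sum to zero.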

Probability Treatment Effects: Men's Email Campaign
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Let's examine which spending categories are most affected by men's email campaigns:

.. code-block:: python

    # Compute PTE for Men's Email vs Control in each subgroup
    pte_men_male, pte_lower_men_male, pte_upper_men_male = estimator_male.predict_pte(
        target_treatment_arm=1,  # Men's Email
        control_treatment_arm=0,  # No Email
        locations=pte_locations,
        variance_type="moment"
    )

    pte_men_female, pte_lower_men_female, pte_upper_men_female = estimator_female.predict_pte(
        target_treatment_arm=1,  # Men's Email
        control_treatment_arm=0,  # No Email
        locations=pte_locations,
        variance_type="moment"
    )

    # Visualize side-by-side with bar charts
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(15, 6))

    plot(locations, pte_men_male, pte_lower_men_male, pte_upper_men_male,
         chart_type="bar",
         title="Men's Email vs Control\nMale Purchaser Segment",
         xlabel="Spending Category ($)", ylabel="Probability Treatment Effect",
         color="purple", ax=ax1)
    ax1.axhline(y=0, color='black', linestyle='--', linewidth=0.8, alpha=0.5)

    plot(locations, pte_men_female, pte_lower_men_female, pte_upper_men_female,
         chart_type="bar",
         title="Men's Email vs Control\nFemale Purchaser Segment",
         xlabel="Spending Category ($)", ylabel="Probability Treatment Effect",
         color="green", ax=ax2)
    ax2.axhline(y=0, color='black', linestyle='--', linewidth=0.8, alpha=0.5)

    plt.tight_layout()
    plt.show()

.. image:: ../_static/hillstorm_subgroup_men_pte.png
   :alt: Men's Email Campaign PTE Subgroup Analysis
   :width: 800px
   :align: center

**Interval-Specific Insights**: Men's email campaigns show similar PTE patterns across both segments (left and right panels). The key mechanism is twofold: (1) a significant reduction in the probability of zero spending (converting non-purchasers to purchasers), and (2) an increased probability of spending in the $40-100 range. This dual effect, combining purchase conversion with mid-range spending increases, occurs consistently across both male and female purchaser segments, confirming the broad effectiveness of men's campaigns.

Key Insights from Subgroup Analysis
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Combining Average Treatment Effects (ATE), Distribution Treatment Effects (DTE), and Probability Treatment Effects (PTE) provides a comprehensive understanding of campaign effectiveness:

**1. Campaign Targeting Effectiveness (from ATE)**

- Women's email campaigns show roughly 2× stronger average effects for female purchasers ($0.54) than for male purchasers ($0.26)
- Men's email campaigns show similarly strong effects in both segments ($0.84 for female purchasers, $0.90 for male purchasers)
- This suggests women's campaigns benefit from precise targeting, while men's campaigns have broader appeal

**2. Distributional Shifts Beyond Averages (from DTE)**

- For women's emails, the female purchaser segment shows negative DTE at lower spending thresholds, indicating a systematic shift away from low-spending behavior
- Male purchasers show minimal distributional changes from women's emails, with confidence intervals overlapping zero at most thresholds
- Men's emails produce similar distributional patterns across both segments, confirming broad effectiveness

**3. Spending Category Changes (from PTE)**

- PTE analysis reveals *which specific spending intervals* change in response to campaigns, particularly identifying the mechanisms behind average effects
- **Women's emails**: For female purchasers, the primary effect is converting non-purchasers to purchasers (significant reduction in zero spending probability). No significant increase in high spending categories was observed.
- **Men's emails**: Show a dual mechanism across both segments: (1) converting non-purchasers to purchasers (zero spending reduction), and (2) increasing purchases in the $40-100 range
- PTE enables identification of behavioral change mechanisms that are invisible in average treatment effects alone. Here it indicates that lift comes primarily from purchase conversion (customers moving from zero to positive spending) rather than from spending increases among existing purchasers
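
One way to see how conversion and spending among purchasers each contribute to an average effect is the accounting identity E[Y] = P(Y > 0) × E[Y | Y > 0]. The sketch below splits a difference in means along those lines on toy data. The helper ``decompose_ate`` is a hypothetical name, not part of ``dte_adj``, and the second term compares purchasers across arms, so it is descriptive rather than causal (the treatment changes who becomes a purchaser):

.. code-block:: python

    import numpy as np

    def decompose_ate(y_treated, y_control):
        """Split a difference in means into a conversion part and a spend-per-purchaser part.

        Uses E[Y] = p * m with p = P(Y > 0) and m = E[Y | Y > 0], so that
        ATE = (p1 - p0) * m1 + p0 * (m1 - m0).
        Assumes each arm contains at least one purchaser.
        """
        p1, p0 = (y_treated > 0).mean(), (y_control > 0).mean()
        m1, m0 = y_treated[y_treated > 0].mean(), y_control[y_control > 0].mean()
        conversion_part = (p1 - p0) * m1
        spend_part = p0 * (m1 - m0)
        return conversion_part, spend_part

    # Toy arms: treatment doubles the purchase rate but leaves spend per purchaser unchanged
    y_c = np.array([0.0, 0.0, 0.0, 60.0])
    y_t = np.array([0.0, 0.0, 60.0, 60.0])

    conversion_part, spend_part = decompose_ate(y_t, y_c)
    print(f"Conversion part: {conversion_part:.2f}")
    print(f"Spend-per-purchaser part: {spend_part:.2f}")
    print(f"Total difference in means: {y_t.mean() - y_c.mean():.2f}")

In this toy example the whole difference in means of 15.00 comes from the conversion part, which mirrors the pattern the PTE charts suggest for women's emails among female purchasers.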

**4. Strategic Implications**

Based on these findings, several practical implications emerge:

- **For Women's Campaigns**: Target customers with history of purchasing women's merchandise to maximize ROI. The PTE analysis reveals that effectiveness comes primarily from converting non-purchasers to purchasers among female purchaser segments, rather than increasing spending among existing buyers.
- **For Men's Campaigns**: Deploy broadly as they produce consistent positive effects across diverse customer segments. Both male and female purchasers show both purchase conversion and mid-range spending increases, suggesting broader appeal.
- **Resource Allocation**: Prioritize precise targeting for category-specific content (women's emails), and deploy broadly appealing content (men's emails) widely.

**5. Methodological Value**

This three-tier analysis demonstrates why distributional methods matter:

- **ATE alone** would show that both campaigns have positive effects, but with varying magnitudes across subgroups
- **Adding DTE** reveals *how* spending distributions shift, not just average changes
- **Adding PTE** pinpoints *which spending categories* are most affected, enabling precise business decisions

By examining effects at average, distributional, and interval-specific levels, we gain actionable insights that would be invisible to traditional mean-comparison approaches. This demonstrates the power of distribution treatment effect methods for understanding heterogeneous responses in digital marketing experiments.
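
For reference, the three quantities used in this section can be written in potential-outcome notation. The symbols are our assumptions, since this section does not fix notation: :math:`Y(w)` is spending under campaign arm :math:`w`, :math:`Y(0)` is spending under no email, and :math:`y_{j-1} < y_j` are consecutive evaluation points:

.. math::

   \mathrm{ATE} = E[Y(w)] - E[Y(0)]

   \mathrm{DTE}(y) = P(Y(w) \le y) - P(Y(0) \le y)

   \mathrm{PTE}(y_{j-1}, y_j) = P(y_{j-1} < Y(w) \le y_j) - P(y_{j-1} < Y(0) \le y_j) = \mathrm{DTE}(y_j) - \mathrm{DTE}(y_{j-1})

The last equality shows why the two views agree: each PTE bar is the change in the DTE curve between two neighboring thresholds.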

For the complete reproducible code including helper functions and visualizations, see `example/hillstrom.ipynb <https://github.com/CyberAgentAILab/python-dte-adjustment/blob/main/example/hillstrom.ipynb>`_.

Next Steps
~~~~~~~~~~
