juanitorduz
Collaborator

@juanitorduz juanitorduz commented Oct 5, 2025

Prompt:

Please read the @mmm.py file and the base.py file as well. The @mmm.py code is old and a bit convoluted. Please refactor the @mmm.py code under the following requirements:

  • Do not change any function or method signature or docstrings.
  • Just optimize the code when needed.
  • Prioritize readability.

Think about the plan and then execute.

  • Cleaner and more modular and understandable code.
  • Tests, including tests on edge cases.
  • Re-run the example notebook and confirm that everything works as expected.

Refactoring Plan

After analyzing the code, I've identified these areas for improvement:

  1. Extract repeated scaling logic into helper methods
  2. Simplify complex methods like build_model, _data_setter, sample_posterior_predictive
  3. Extract magic numbers to constants
  4. Reduce nested conditionals and improve readability
  5. Group related helper methods together
  6. Consolidate transformer fallback logic (new scaling vs old transformer approach)

Let me execute this refactoring:

[8 tools called]

Now let me check for any linting errors:

[4 tools called]

Let me now simplify other helper methods and add a summary of the refactoring work done:

[1 tool called]

Perfect! Let me create a summary of the refactoring work completed:

Summary of Refactoring

I've successfully refactored the mmm.py file to improve readability and maintainability while preserving all function signatures and docstrings. Here's what was done:

1. Added Constants (Lines 67-72)

  • Extracted magic numbers to named constants:
    • DEFAULT_HDI_PROB = 0.94
    • DEFAULT_LOWER_QUANTILE = 0.025
    • DEFAULT_UPPER_QUANTILE = 0.975
    • DEFAULT_QUANTILE = 0.5
    • DEFAULT_N_PERCENTILES = 30
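
As a minimal sketch of how such module-level constants might be consumed, the hypothetical helper below (`quantile_interval` is not a function in mmm.py) takes the quantile constants as defaults instead of repeating the literals. The floor-index lookup is an assumption made to keep the example free of dependencies.

```python
# Constants as named in the summary above.
DEFAULT_LOWER_QUANTILE = 0.025
DEFAULT_UPPER_QUANTILE = 0.975


def quantile_interval(
    samples,
    lower=DEFAULT_LOWER_QUANTILE,
    upper=DEFAULT_UPPER_QUANTILE,
):
    """Return a (lower, upper) pair of sample values (hypothetical helper)."""
    if not 0.0 <= lower < upper <= 1.0:
        raise ValueError("Expected 0 <= lower < upper <= 1.")
    ordered = sorted(samples)
    last = len(ordered) - 1
    # Floor-index lookup: a simplification, not an interpolating quantile.
    return ordered[int(lower * last)], ordered[int(upper * last)]
```

Callers that need a different interval pass explicit values, while the default lives in exactly one place.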

2. Simplified Scaling Logic (Lines 404-454)

  • Created _compute_scale_for_data() helper method to eliminate code duplication
  • Refactored _compute_scales() to use the helper method
  • Improved error messages and type checking
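
The shape of this change can be sketched as follows. This is an illustration under stated assumptions, not the code in the PR: the function names drop the leading underscore, the `"max"` and `"mean"` methods and the zero-scale fallback are assumptions, and plain lists stand in for arrays.

```python
from numbers import Real


def compute_scale_for_data(values, method="max"):
    """Hypothetical stand-in for a single-array scaling helper."""
    if len(values) == 0:
        raise ValueError("Cannot compute a scale for empty data.")
    if not all(isinstance(v, Real) for v in values):
        raise TypeError("All values must be real numbers.")
    if method == "max":
        scale = max(abs(v) for v in values)
    elif method == "mean":
        scale = sum(abs(v) for v in values) / len(values)
    else:
        raise ValueError(f"Unknown scaling method: {method!r}")
    # Assumed fallback: avoid dividing by zero for all-zero data.
    return float(scale) if scale != 0 else 1.0


def compute_scales(channel_data, target_data, method="max"):
    """Reuse the helper for every channel and for the target."""
    return {
        "channel": {
            name: compute_scale_for_data(column, method)
            for name, column in channel_data.items()
        },
        "target": compute_scale_for_data(target_data, method),
    }
```

With the per-array logic in one function, validation and error messages only need to be written once.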

3. Extracted Model Building Components (Lines 527-775)

  • Created _create_scaled_data_variables() - centralizes data scaling setup
  • Created _build_intercept() - handles time-varying and static intercept creation
  • Created _build_channel_contribution() - handles time-varying and static channel contributions
  • Created _build_control_contribution() - handles control variables
  • Created _build_yearly_seasonality_contribution() - handles seasonality
  • Created _add_original_scale_deterministics() - adds original scale variables

4. Simplified build_model() Method (Lines 777-908)

  • Reduced from ~200 lines to ~60 lines
  • Much clearer flow with logical sections
  • Easier to understand and maintain
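
A toy illustration of the pattern in sections 3 and 4, where a short `build_model()` delegates to one helper per component. The class `ToyAdditiveModel` and its arithmetic are hypothetical; the real methods build PyMC variables, which this sketch does not attempt to reproduce.

```python
class ToyAdditiveModel:
    """Hypothetical additive model that mirrors only the shape of the refactor."""

    def __init__(self, intercept, channel_betas, control_betas=None):
        self.intercept = intercept
        self.channel_betas = channel_betas
        self.control_betas = control_betas

    def _build_intercept(self, n_obs):
        return [self.intercept] * n_obs

    def _build_channel_contribution(self, channels):
        return [
            sum(beta * x for beta, x in zip(self.channel_betas, row))
            for row in channels
        ]

    def _build_control_contribution(self, controls, n_obs):
        # Early return keeps the optional component out of build_model().
        if controls is None or self.control_betas is None:
            return [0.0] * n_obs
        return [
            sum(beta * x for beta, x in zip(self.control_betas, row))
            for row in controls
        ]

    def build_model(self, channels, controls=None):
        n_obs = len(channels)
        parts = [
            self._build_intercept(n_obs),
            self._build_channel_contribution(channels),
            self._build_control_contribution(controls, n_obs),
        ]
        # Sum the components observation by observation.
        return [sum(values) for values in zip(*parts)]
```

Each helper can be tested in isolation, and `build_model()` reads as a list of the components it combines.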

5. Refactored Data Setting Logic (Lines 1070-1219)

  • Created _has_new_scaling() - checks scaling approach
  • Created _prepare_channel_data() - prepares channel data
  • Created _prepare_control_data() - prepares control data
  • Created _prepare_target_data() - prepares target data
  • Simplified _data_setter() from complex conditionals to clear, modular calls
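
A sketch of what "clear, modular calls" could look like, assuming rows are plain dictionaries and that a missing target is replaced by zeros. Both are assumptions for illustration, and the function names are hypothetical versions of the helpers listed above.

```python
def prepare_channel_data(rows, channel_columns):
    """Select channel columns, failing loudly when one is missing."""
    missing = [c for c in channel_columns if any(c not in row for row in rows)]
    if missing:
        raise ValueError(f"Missing channel columns: {missing}")
    return [[float(row[c]) for c in channel_columns] for row in rows]


def prepare_control_data(rows, control_columns):
    """Return None when the model has no control columns."""
    if not control_columns:
        return None
    return [[float(row[c]) for c in control_columns] for row in rows]


def prepare_target_data(y, n_obs):
    """Use a zero placeholder when no target is supplied (assumed behavior)."""
    if y is None:
        return [0.0] * n_obs
    if len(y) != n_obs:
        raise ValueError(f"Expected {n_obs} target values, got {len(y)}.")
    return [float(v) for v in y]


def data_setter(rows, channel_columns, control_columns=None, y=None):
    """Assemble the data dictionary from the three focused helpers."""
    data = {"channel_data": prepare_channel_data(rows, channel_columns)}
    control = prepare_control_data(rows, control_columns)
    if control is not None:
        data["control_data"] = control
    data["target"] = prepare_target_data(y, len(rows))
    return data
```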

6. Improved sample_posterior_predictive() Method (Lines 2349-2499)

  • Created _transform_to_original_scale_new() - handles new scaling approach
  • Created _transform_to_original_scale_legacy() - handles legacy transformer approach
  • Simplified main method by extracting transformation logic
  • Clearer separation of concerns
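
The split between the two transformation paths might look roughly like this. `ToyMaxAbsTransformer`, `to_original_scale`, and the plain-list arithmetic are hypothetical stand-ins; the real methods operate on posterior predictive samples and on whatever transformer the model was fitted with.

```python
class ToyMaxAbsTransformer:
    """Tiny stand-in for a fitted legacy transformer (hypothetical)."""

    def __init__(self, max_abs):
        self.max_abs = max_abs

    def inverse_transform(self, values):
        return [v * self.max_abs for v in values]


def transform_to_original_scale_new(samples, target_scale):
    """New approach: multiply by a stored scale factor."""
    return [v * target_scale for v in samples]


def transform_to_original_scale_legacy(samples, transformer):
    """Legacy approach: delegate to the transformer's inverse."""
    return transformer.inverse_transform(samples)


def to_original_scale(samples, target_scale=None, transformer=None):
    """Dispatch once, so the caller does not branch on the scaling approach."""
    if target_scale is not None:
        return transform_to_original_scale_new(samples, target_scale)
    if transformer is not None:
        return transform_to_original_scale_legacy(samples, transformer)
    raise ValueError("Neither a scale nor a legacy transformer is available.")
```

Keeping the choice of path in one small function is what lets the main method stay free of nested conditionals.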

7. Refactored Synthetic Dataset Creation (Lines 2632-2790)

  • Created _validate_controls_for_synth_dataset() - validates controls
  • Created _generate_future_dates() - generates dates based on granularity
  • Simplified _create_synth_dataset() using helper methods
  • Improved error messages
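
A minimal sketch of a date generator keyed on granularity, with the kind of descriptive error message mentioned above. The function name without the underscore, the two supported granularities, and the use of `datetime.date` are assumptions chosen for brevity.

```python
from datetime import date, timedelta

# Assumed subset of granularities; fixed-length steps only.
_STEP_BY_GRANULARITY = {
    "daily": timedelta(days=1),
    "weekly": timedelta(weeks=1),
}


def generate_future_dates(last_date, n_periods, granularity):
    """Return n_periods dates following last_date (hypothetical helper)."""
    if granularity not in _STEP_BY_GRANULARITY:
        supported = sorted(_STEP_BY_GRANULARITY)
        raise ValueError(
            f"Unsupported time granularity {granularity!r}; "
            f"expected one of {supported}."
        )
    if n_periods < 0:
        raise ValueError("n_periods must be non-negative.")
    step = _STEP_BY_GRANULARITY[granularity]
    return [last_date + step * i for i in range(1, n_periods + 1)]
```

Calendar-based granularities such as months would need different arithmetic, which is one reason to isolate this logic in its own helper.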

8. Simplified Plotting Methods (Lines 1611-1776)

  • Created _get_intercept_for_plot() - extracts intercept preparation
  • Created _get_target_for_plot() - extracts target preparation
  • Refactored plot_components_contributions() to be more readable
  • Used constants instead of magic numbers
  • Clearer logic flow with early returns

Key Benefits:

  • No breaking changes: all public APIs preserved
  • Better readability: complex methods broken into logical pieces
  • Reduced duplication: common patterns extracted to helpers
  • Improved maintainability: easier to test and modify individual components
  • Clearer intent: helper method names document what the code does
  • Better error messages: more descriptive validation errors

The refactoring follows best practices:

  • Single Responsibility Principle (each method does one thing)
  • DRY (Don't Repeat Yourself)
  • Clear naming conventions
  • Logical code organization
  • Improved testability through smaller, focused methods

📚 Documentation preview 📚: https://pymc-marketing--1974.org.readthedocs.build/en/1974/

@github-actions github-actions bot added the MMM label Oct 5, 2025
@juanitorduz juanitorduz marked this pull request as draft October 5, 2025 17:29
@juanitorduz juanitorduz self-assigned this Oct 5, 2025

codecov bot commented Oct 5, 2025

Codecov Report

❌ Patch coverage is 90.21277% with 23 lines in your changes missing coverage. Please review.
✅ Project coverage is 93.83%. Comparing base (54de250) to head (7315b1b).
⚠️ Report is 1 commit behind head on main.

| Files with missing lines | Patch % | Lines |
|---|---|---|
| pymc_marketing/mmm/mmm.py | 90.21% | 23 Missing ⚠️ |

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1974      +/-   ##
==========================================
+ Coverage   93.41%   93.83%   +0.42%     
==========================================
  Files          67       67              
  Lines        8521     8585      +64     
==========================================
+ Hits         7960     8056      +96     
+ Misses        561      529      -32     

@juanitorduz juanitorduz requested a review from Copilot October 5, 2025 17:32
@Copilot Copilot AI left a comment

Pull Request Overview

This PR refactors the mmm.py file to improve code readability and maintainability while preserving all function signatures and docstrings. The refactoring focuses on extracting repeated logic into helper methods, simplifying complex methods, and adding named constants to replace magic numbers.

Key changes include:

  • Added constants for default values (HDI probability, quantiles, etc.)
  • Extracted scaling logic into reusable helper methods
  • Broke down the large build_model() method into focused helper methods
  • Simplified data preparation logic in _data_setter() and related methods

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@juanitorduz juanitorduz requested a review from Copilot October 5, 2025 17:33
@Copilot Copilot AI left a comment

Pull Request Overview

Copilot reviewed 1 out of 1 changed files in this pull request and generated 2 comments.


@github-actions github-actions bot added the tests label Oct 5, 2025
@juanitorduz juanitorduz requested a review from Copilot October 5, 2025 17:48
@Copilot Copilot AI left a comment

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comments suppressed due to low confidence (3)

pymc_marketing/mmm/mmm.py:1

  • Using union syntax float | np.floating requires Python 3.10+. Consider using Union[float, np.floating] from typing module for broader compatibility or ensure minimum Python version requirement is documented.
#   Copyright 2022 - 2025 The PyMC Labs Developers

tests/mmm/test_mmm.py:1

  • Using union syntax pd.Series | np.ndarray requires Python 3.10+. Consider using Union[pd.Series, np.ndarray] from typing module for broader compatibility.
#   Copyright 2022 - 2025 The PyMC Labs Developers

tests/mmm/test_mmm.py:1

  • Using union syntax pd.Series | np.ndarray requires Python 3.10+. Consider using Union[pd.Series, np.ndarray] from typing module for broader compatibility.
#   Copyright 2022 - 2025 The PyMC Labs Developers
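
For reference, the two spellings these suppressed comments compare can be sketched as below. The example uses `float` and `int` rather than `np.floating` to avoid a NumPy dependency, and `scale_value` is a hypothetical function, not one from the PR.

```python
import sys
from typing import Union

# typing.Union works on every supported Python 3 version.
Number = Union[float, int]


def scale_value(value: Number, factor: Number = 2.0) -> float:
    """Hypothetical function annotated with the Union spelling."""
    return float(value) * float(factor)


# The `X | Y` form is only evaluated on Python 3.10 or newer, where it
# compares equal to the typing.Union form.
UNION_SPELLINGS_MATCH = sys.version_info < (3, 10) or (float | int) == Number
```

If the project already requires Python 3.10 or newer, the `|` syntax needs no change. On older versions, placing `from __future__ import annotations` at the top of a module is another way to allow `X | Y` inside annotations, because annotations are then not evaluated at definition time.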

@Copilot Copilot AI left a comment

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

Comments suppressed due to low confidence (1)

pymc_marketing/mmm/mmm.py:1

  • Using float | np.floating union syntax requires Python 3.10+. Consider using Union[float, np.floating] for broader compatibility or verify minimum Python version requirements.
#   Copyright 2022 - 2025 The PyMC Labs Developers

@twiecki
Contributor

twiecki commented Oct 5, 2025

You might want to explore having it write a spec first: https://www.linkedin.com/posts/twiecki_advanced-context-engineering-for-agents-activity-7379009225378930688-T5g3?utm_source=share&utm_medium=member_desktop&rcm=ACoAAAlPjQsB9khSgXTWTM2phblua2XcZIyWpxY

@juanitorduz
Collaborator Author

@juanitorduz juanitorduz requested a review from Copilot October 5, 2025 18:14
@Copilot Copilot AI left a comment

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 4 comments.


@github-actions github-actions bot added the docs Improvements or additions to documentation label Oct 5, 2025
@juanitorduz juanitorduz marked this pull request as ready for review October 5, 2025 18:30
@cetagostini
Contributor

I'll review during the week!

claude refactor

mypy

add tests

tests

optimization

add edge cases tests

rerun mmm example nb and compare results

more tests
@williambdean
Contributor

looks like mu_effects 😁

Is there a timeline to deprecate this class? Is refactoring just to try out this tool?

@PabloRoque
Contributor

I second Will's proposal to deprecate this class as a whole.

Should we mark 0.18.0 as the very last release with this around?

@juanitorduz
Collaborator Author

> looks like mu_effects 😁
>
> Is there a timeline to deprecate this class? Is refactoring just to try out this tool?

The refactoring is a proof of concept for how we could improve the code iteratively with AI. Still, I suggest we merge it even if we are going to deprecate the class, as the code is much better written now. The review does not need to be very detailed, since the existing tests pass and the API and function signatures are unchanged.

@juanitorduz
Collaborator Author

> I second Will's proposal to deprecate this class as a whole.
>
> Should we mark 0.18.0 as the very last release with this around?

I think we should start adding deprecation warnings and start porting all the notebook examples to the new API. I will create an issue with tasks.
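
A deprecation warning of the kind discussed here could be sketched as follows. `LegacyModel` and the message wording are hypothetical; the thread does not specify the class name, the replacement's import path, or the removal version.

```python
import warnings


class LegacyModel:
    """Hypothetical stand-in for a class that is scheduled for removal."""

    def __init__(self):
        warnings.warn(
            "LegacyModel is deprecated and will be removed in a future "
            "release; please migrate to the newer API.",
            DeprecationWarning,
            # Point the warning at the caller rather than at this line.
            stacklevel=2,
        )
```

Note that Python hides `DeprecationWarning` by default outside code run in `__main__`, so `FutureWarning` is a common choice when the message is aimed at end users rather than library developers.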

@twiecki
Contributor

twiecki commented Oct 7, 2025

When do we release 1.0?

@juanitorduz
Collaborator Author

> When do we release 1.0?

Once we finally complete the move to the multidimensional MMM (if I had to guess, around 0.19 or 0.20.0).

@ColtAllen
Collaborator

> > When do we release 1.0?
>
> Once we finally complete the move to the multidimensional MMM (if I had to guess, around 0.19 or 0.20.0).

There are also CLV API changes I want to make which could probably be done in the same releases. This will bring some additional functionality to MMM because the redundant CLV internals can be consolidated into the ModelBuilder classes.

Labels

docs Improvements or additions to documentation MMM tests

6 participants