306 refactor data generators #331
JanTeichertKluge merged 12 commits into JanTeichertKluge/issue272
Pull Request Overview
This pull request refactors the data generator modules for the interactive regression models and updates documentation examples and imports across multiple modules. Key changes include the addition of new data generator functions under the doubleml/irm/datasets package, modifications to docstrings and sample code block formatting in multiple files, and adjustment of import paths in tests and examples.
Reviewed Changes
Copilot reviewed 73 out of 73 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| doubleml/irm/datasets/dgp_irm_data.py | New data generator for interactive regression models. |
| doubleml/irm/datasets/dgp_iivm_data.py | New data generator for interactive IV models. |
| doubleml/irm/datasets/dgp_heterogeneous_data.py | New data generator for heterogeneous treatment effects. |
| doubleml/irm/datasets/dgp_confounded_irm_data.py | New data generator for confounded IRM models with a proposed minor improvement to a logical condition. |
| doubleml/irm/cvar.py, doubleml/irm/apos.py, doubleml/double_ml.py, doubleml/did/* | Updated documentation examples and import paths. |
| doubleml/datasets/, doubleml/data/, .github/ISSUE_TEMPLATE/bug_report.yml | Updated example snippets’ formatting and module import paths. |
Comments suppressed due to low confidence (3)
doubleml/double_ml.py:1170
- The example code block formatting appears to concatenate two import statements on one line; please split them into separate lines to ensure proper formatting in documentation.
>>> import numpy as np >>> import doubleml as dml
doubleml/double_ml.py:1286
- The code snippet delimiter and the following command are merged on one line; adjust the formatting to ensure each command is on a separate line for clarity.
-------- >>> import numpy as np
.github/ISSUE_TEMPLATE/bug_report.yml:26
- The code block opening in the issue template is misformatted due to extra spacing; ensure the markdown syntax for code blocks is correctly placed on its own line.
Please provide a short reproducible code snippet. Example: ```python
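The two docstring comments above matter mainly for doctests. As a minimal sketch using only the standard-library `doctest` parser, and assuming a docstring really did contain the merged form quoted above (the review only flags it with low confidence), the merged line is parsed as a single example instead of two:

```python
import doctest

# Merged form, as quoted in the suppressed comment above.
merged = """
>>> import numpy as np >>> import doubleml as dml
"""

# Split form, with one prompt per line.
split = """
>>> import numpy as np
>>> import doubleml as dml
"""

parser = doctest.DocTestParser()
merged_examples = parser.get_examples(merged)
split_examples = parser.get_examples(split)

# The merged line yields one example whose source still contains ">>>".
print(len(merged_examples), len(split_examples))  # 1 2
print(repr(merged_examples[0].source))
```

Nothing is imported or executed here; the parser only splits the text into examples, so the sketch runs without NumPy or DoubleML installed.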
        1 / (m_long * (1 - m_long))
    )
    cf_d_atte = (np.mean(propensity_ratio_long) - np.mean(propensity_ratio_short)) / np.mean(propensity_ratio_long)
    if (beta_a == 0) | (gamma_a == 0):
Consider using the logical 'or' operator instead of the bitwise '|' for scalar comparisons to improve code clarity (e.g., 'if beta_a == 0 or gamma_a == 0:').
    - if (beta_a == 0) | (gamma_a == 0):
    + if (beta_a == 0) or (gamma_a == 0):
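To illustrate the suggestion, here is a small sketch with hypothetical integer values for `beta_a` and `gamma_a` (the real values come from the data generator's arguments). With parentheses, `|` and `or` give the same result for plain Python scalars. The difference shows up when the parentheses are dropped, because `|` binds more tightly than `==`:

```python
# Hypothetical scalar values, chosen only for illustration.
beta_a, gamma_a = 0, 1

# With parentheses, both operators agree for plain Python scalars.
with_bitwise = (beta_a == 0) | (gamma_a == 0)
with_logical = (beta_a == 0) or (gamma_a == 0)

# Without parentheses, | is evaluated before ==, so the first line is the
# chained comparison beta_a == (0 | gamma_a) == 0.
unparenthesized_bitwise = beta_a == 0 | gamma_a == 0
unparenthesized_logical = beta_a == 0 or gamma_a == 0

print(with_bitwise, with_logical)                        # True True
print(unparenthesized_bitwise, unparenthesized_logical)  # False True
```

The `or` form only applies to scalars. If `beta_a` or `gamma_a` were NumPy arrays with more than one element, `or` would raise a `ValueError` and the element-wise `|` would be the right choice.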
All tests passed except for RDD (not tested at all, and not changed) and one DID test:
    >   agg_weights[selected_unique_e_values >= 0] = 1 / np.sum(selected_unique_e_values >= 0)
    E   numpy.core._exceptions._UFuncInputCastingError: Cannot cast ufunc 'greater_equal' input 0 from dtype('<m8[M]') to dtype('<m8') with casting rule 'same_kind'

    doubleml\did\utils\_aggregation.py:203: UFuncTypeError
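The error message indicates that a month-unit timedelta array (`<m8[M]`) is being compared with a value that has no time unit. One possible workaround, sketched here with made-up data rather than taken from the DoubleML code, is to compare against a zero that carries the same unit. The array `e_values` is a hypothetical stand-in for `selected_unique_e_values`:

```python
import numpy as np

# Hypothetical stand-in for selected_unique_e_values: exposure times in months.
e_values = np.array([-2, -1, 0, 1, 2], dtype="timedelta64[M]")

# Compare against a zero with the same unit instead of a bare integer 0.
zero = np.timedelta64(0, "M")
mask = e_values >= zero

# Equal weights on the non-negative exposure times, as in the failing line.
agg_weights = np.zeros(e_values.shape[0])
agg_weights[mask] = 1 / np.sum(mask)

print(mask)
print(agg_weights)
```

Converting the values to integer month counts with `e_values.astype("int64")` before comparing with `0` would be another option. Whether either change is appropriate depends on how `_aggregation.py` handles non-datetime inputs, which this report does not show.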