Skip to content

Move all step-based GS functionality to nodes#94

Closed
lena-kashtelyan wants to merge 107 commits intofacebookincubator:mainfrom
lena-kashtelyan:export-D80128678
Closed

Move all step-based GS functionality to nodes#94
lena-kashtelyan wants to merge 107 commits intofacebookincubator:mainfrom
lena-kashtelyan:export-D80128678

Conversation

@lena-kashtelyan
Copy link

Summary:
X-link: facebook/Ax#4746

X-link: facebook/Ax#4731

A lot going on here:

  1. Make GenerationStep a factory for GenerationNode by replacing its __init__(self, ...) constructor (which would have to return a GStep) with a __new__(cls, ...) construtor, which can return a GNode (magic, thanks Devmate!)
  2. Adapt storage: stop storing steps and just treat them as nodes. No step-only fields will be saved going forward. Backward compatibility is handled though.
  3. Change a bazillion tests and checks in downstream applications.

NOTE: need to remember next steps here: https://www.internalfb.com/diff/D86066476?dst_version_fbid=732904879835890&transaction_fbid=1400151778204976, cc mgarrard

sorrybigdiff

Reviewed By: mgarrard

Differential Revision: D80128678

TaXxER and others added 30 commits October 1, 2025 13:26
…t.py, thresholding.py, and torchscript_modules.py
…th Apple M1/M2 hardware, resulting in segmentation faults. We ignore affected tests.
…conduct_file_exists

Adding Code of Conduct file
… is not release yet on pypi. Here we undo the tuning change to make it compatible with 1.1.2.
- Fix spelling and grammar issues throughout documentation
- Enhance clarity with concrete examples and expanded acronyms
- Correct MCE definition from 'Maximum Calibration Error' to 'Multicalibration Error'
- Fix all code examples to match actual DataFrame-based API
- Add academic references for main paper (KDD 2026), metrics paper, and applications paper
- Update all documentation files with accurate method signatures and parameter names

Files modified:
- README.md: Add proper citations, fix code example API
- pyproject.toml: Fix subject-verb agreement
- website/docs/: Update all doc pages with correct API usage
- website/src/pages/index.js: Update homepage with correct citations and code example
The internal and external repositories are out of sync. This Pull Request attempts to brings them back in sync by patching the GitHub repository. Please carefully review this patch. You must disable ShipIt for your project in order to merge this pull request. DO NOT IMPORT this pull request. Instead, merge it directly on GitHub using the MERGE BUTTON. Re-enable ShipIt after merging.
The internal and external repositories are out of sync. This Pull Request attempts to brings them back in sync by patching the GitHub repository. Please carefully review this patch. You must disable ShipIt for your project in order to merge this pull request. DO NOT IMPORT this pull request. Instead, merge it directly on GitHub using the MERGE BUTTON. Re-enable ShipIt after merging.
The internal and external repositories are out of sync. This Pull Request attempts to brings them back in sync by patching the GitHub repository. Please carefully review this patch. You must disable ShipIt for your project in order to merge this pull request. DO NOT IMPORT this pull request. Instead, merge it directly on GitHub using the MERGE BUTTON. Re-enable ShipIt after merging.
Summary:
Pull Request resolved: facebookincubator/MCGrad#24

We're starting to see some "running out of disk space" errors on the machines that build Sphinx. We have plenty of budget, so let's just throw a bigger machine.

facebookincubator/MCGrad#18

Reviewed By: Lorenzo-Perini

Differential Revision: D88486737

fbshipit-source-id: 3ab67c816a15a6d54bc330106e962b3e1f315e70
@meta-cla meta-cla bot added the CLA Signed This label is managed by the Meta Open Source bot. label Jan 7, 2026
@meta-codesync
Copy link

meta-codesync bot commented Jan 7, 2026

@lena-kashtelyan has exported this pull request. If you are a Meta employee, you can view the originating Diff in D80128678.

flinder and others added 12 commits January 8, 2026 01:30
Summary:
Pull Request resolved: facebookincubator/MCGrad#92

Applied the following code quality fixes:

### `tuning.py`

1. **Wrapped long comment** (line 101) - Split the 121-char comment into two lines for readability
2. **Replaced `assert` with proper exception** (line 166) - Changed `assert df_val is not None` to `raise ValueError(...)` for proper runtime validation instead of debug assertions
3. **Converted string concatenation to f-string** (line 221) - Changed `"lightgbm_autotuning" + str(uuid.uuid4())[:8]` to `f"lightgbm_autotuning_{uuid.uuid4().hex[:8]}"`

### `test_tuning.py`

1. **Fixed PascalCase variable** - `BoTorch_count` → `botorch_count` (Google style: snake_case for variables)
2. **Converted `.format()` to f-strings** - Updated 3 assertion messages to use f-strings
3. **Fixed data slice bug** - Changed `sample_data.iloc[:80]` to `sample_data.iloc[:40]` (the fixture only has 50 samples, so `[:80]` was incorrectly returning all rows instead of an 80/20 split)

Reviewed By: TaXxER

Differential Revision: D90259815

fbshipit-source-id: 5361978a57efd1a7d29111c364df763151b9fae4
Summary:
Pull Request resolved: facebookincubator/MCGrad#93

* Add missing docstring to public data class `ParameterConfig`
* Add :returns: documentation `tune_mcgrad_parameters`
* Add minimal explanation to internal helper `_suppress_logger`

Reviewed By: TaXxER

Differential Revision: D90260538

fbshipit-source-id: cd0ffe4e1e474f9fbd79191a94497f8f1be809c4
…#90)

Summary:
Pull Request resolved: facebookincubator/MCGrad#90

Remove old matplotlib based plotting functions.

Update website documentation examples to use the correct functions.

Remove dependency on matplotlib

Reviewed By: Lorenzo-Perini

Differential Revision: D90255309

fbshipit-source-id: d54784de98bf9f5e15f94706c023d9f0bf250411
Summary:
Pull Request resolved: facebookincubator/MCGrad#89

## Summary of Changes

### [`**plotting.py**`](command:code-compose.open?%5B%22%2Fdata%2Fsandcastle%2Fboxes%2Ffbsource%2Ffbcode%2Fmulticalibration%2Fplotting.py%22%2Cnull%5D "/data/sandcastle/boxes/fbsource/fbcode/multicalibration/plotting.py")

**Google Python Style Guide fixes:**

- **Imports**: Reorganized to follow import grouping conventions and use explicit imports (`Literal` type added)
- **Type annotations**: Added `BinningMethod` and `SegmentQuantity` type literals for stricter typing
- **Module-level constants**: Added `_MARKER_COLOR_CALIBRATED`, `_MARKER_COLOR_MISCALIBRATED`, `_VALID_SEGMENT_QUANTITIES`, and `_LEARNING_CURVE_COLORS` with leading underscore
- **Assertions replaced with proper exceptions**: Changed `assert binning_method in [...]` to use `_get_binning_function()` that raises `ValueError` with a descriptive message
- **Variable naming**: Renamed `sig_diff` → `is_miscalibrated`, `group1/group2` → `calibrated_bins/miscalibrated_bins`, `dicts` → `rows`, `this_dict` → `row`, `test_performance` → `valid_perf`, `extra_evaluation_due_to_early_stopping` → `extra_eval`
- **Magic values extracted**: Moved hardcoded color strings to named constants
- **DRY violations**: Extracted helper function `_get_binning_function()` instead of duplicating binning method selection logic
- **Long function refactored**: Broke `plot_learning_curve()` into smaller helper functions `_add_mce_threshold_annotations()` and `_update_x_axis_ticks()`
- **Unnecessary complexity removed**: Replaced the manual list-building loop in `plot_calibration_curve_by_segment` with cleaner vectorized operations
- **Redundant comments removed**: Cleaned up comments that just stated what the code does
- **Docstring formatting**: Fixed line wrapping for readability

### [`**test_plotting.py**`](command:code-compose.open?%5B%22%2Fdata%2Fsandcastle%2Fboxes%2Ffbsource%2Ffbcode%2Fmulticalibration%2Ftests%2Ftest_plotting.py%22%2Cnull%5D "/data/sandcastle/boxes/fbsource/fbcode/multicalibration/tests/test_plotting.py")

- **DRY**: Added `mce_with_all_segments` fixture to avoid repeated MCE creation
- **DRY**: Added `mcgrad_training_df` fixture and `_fit_mcgrad_model()` helper function
- **Test consolidation**: Combined similar tests into parameterized tests using `pytest.mark.parametrize`
- **Assertion update**: Updated test to expect `ValueError` instead of `AssertionError` after changing from assert to explicit exception

Reviewed By: Lorenzo-Perini

Differential Revision: D90255341

fbshipit-source-id: 1c9d509e7dfd1a860b4e92940b218e71d252ca98
Summary:
Pull Request resolved: facebookincubator/MCGrad#88

The public plotting functions were missing documentation, making API usage unclear. Added reST docstrings to `plot_global_calibration_curve` and `plot_calibration_curve_by_segment`, and fixed the incorrect docstring in `plot_segment_calibration_errors` which referenced "Kuiper statistic" instead of the actual ECCE quantities and was missing the `segment_ecces_absolute` option.

Reviewed By: Lorenzo-Perini

Differential Revision: D90255531

fbshipit-source-id: 0631f80ad1f282ce53ec3c768cab8034c7063340
…ze (#98)

Summary:
Pull Request resolved: facebookincubator/MCGrad#98

Replaced the hardcoded "mcboost" dictionary key in BaseMCGrad's serialize and deserialize methods with a configurable class attribute _SERIALIZATION_KEY.

This removes the "mcboost" name from the open-source MCGrad codebase.

Reviewed By: flinder

Differential Revision:
D90320797

Privacy Context Container: L1334583

fbshipit-source-id: 82103de43b57b0a8473e49cf305fee05bf6e445f
Summary:
Pull Request resolved: facebookincubator/MCGrad#96

Improve documentation coverage for the metrics module ahead of OSS release. The module had many public functions lacking docstrings, making it difficult for users to understand parameter usage and expected behavior without reading the implementation.

This change adds complete reST-formatted docstrings to 30+ public functions and fixes incomplete parameter documentation in existing docstrings (ndcg_score, kuiper_calibration, calibration_free_normalized_entropy).

Reviewed By: Lorenzo-Perini, TaXxER

Differential Revision: D90304975

fbshipit-source-id: 32775ed20d02d27a9a615028861a49abe7a73b8d
Summary:
Pull Request resolved: facebookincubator/MCGrad#95

Changes:

**segmentation.py**

* Group constants at top
* Remove redundant .astype(np.bool_)
* Fix type hint ordering

**test_segmentation.py**

* Add _count_segments() helper
* Fix stale comment
* Use [value] * 4 pattern

Reviewed By: TaXxER

Differential Revision: D90303064

fbshipit-source-id: 373b5eeef269c5acfe0b9f4a25974a51a65806c8
Summary:
Pull Request resolved: facebookincubator/MCGrad#97

Several public functions in segmentation.py lacked documentation, making it harder to understand their purpose and usage. This adds reST-formatted docstrings to enable Sphinx API documentation generation.

Reviewed By: TaXxER

Differential Revision: D90305727

fbshipit-source-id: 97209e697c7aa286c29cc96740a7355b6fb64493
Summary:
Pull Request resolved: facebookincubator/MCGrad#84

Code quality improvements for utils.py following Google Python style guide and Python Enhancement Proposals (PEPs):

**Type annotations (Python 3.9+ style) - PEP 585:**
- Removed unused imports (`Dict`, `Tuple` from typing)
- Updated `Dict` to lowercase `dict` for Python 3.9+ style
- Updated `Tuple` to lowercase `tuple` for Python 3.9+ style

**Replaced assert statements with proper exceptions - PEP 8/Google Style:**
- Production code should not use `assert` for validation since asserts can be disabled with `-O` flag
- Converted 2 `assert` statements to `raise ValueError()` in:
  - `positive_label_proportion`: NaN validation for predictions
  - `make_unjoined`: shape validation for x and y arrays

**Extracted magic numbers to module-level constants - PEP 8:**
- Added `BIN_EPSILON` (1e-8) for bin boundary calculations
- Added `MIN_LOGIT_EPSILON` (1e-304) for logit function epsilon
- Updated `logit()`, `make_equispaced_bins()`, `make_equisized_bins()`, and `BinningMethodInterface` Protocol to use these constants as defaults

**Added missing type annotations - PEP 484:**
- Added type annotations to `OrdinalEncoderWithUnknownSupport.__init__()`, `fit()`, `transform()`, `deserialize()`
- Added type annotations to `TrainTestSplitWrapper.split()` and `NoopSplitterWrapper.split()`
- Added type annotations to `check_range()` function

**PEP 257 docstring fix:**
- Converted inline comment to proper docstring for `check_range()` function

**PEP 8 line length fix:**
- Refactored long logger.info() f-string in `unshrink()` to use %-formatting with line breaks (also aligns with Google style for logging)

**Test cleanup:**
- Removed unused `# pyre-fixme` comment that was no longer suppressing any type errors
- Removed unnecessary (and wrong) type annotations from test function signature (project convention)

Reviewed By: flinder

Differential Revision: D90167695

fbshipit-source-id: d76067a4a93698cf79351ccda9400d2984f81f7a
Summary:
Pull Request resolved: facebookincubator/MCGrad#100

utils.py wasn't pyre-strict. Here we make the changes to make it pyre-strict.

Reviewed By: flinder

Differential Revision: D90245591

fbshipit-source-id: 66abdd157f3f206e1a5320b929d86545bddb57cd
Summary: Pull Request resolved: facebookincubator/MCGrad#101

Reviewed By: flinder

Differential Revision: D90390331

Pulled By: Lorenzo-Perini

fbshipit-source-id: 1883cda06187c5dc9def1b07ec8cdd5bb0267cc9
Summary:

X-link: facebook/Ax#4746

X-link: facebook/Ax#4731

A lot going on here:

1. Make `GenerationStep` a factory for `GenerationNode` by replacing its `__init__(self, ...)` constructor (which would have to return a `GStep`) with a `__new__(cls, ...)` construtor, which can return a `GNode` (magic, thanks Devmate!)
2. Adapt storage: stop storing steps and just treat them as nodes. No step-only fields will be saved going forward. Backward compatibility is handled though.
3. Change a bazillion tests and checks in downstream applications.

NOTE: need to remember next steps here: https://www.internalfb.com/diff/D86066476?dst_version_fbid=732904879835890&transaction_fbid=1400151778204976, cc mgarrard 

sorrybigdiff

Reviewed By: mgarrard

Differential Revision: D80128678
lena-kashtelyan pushed a commit to lena-kashtelyan/Ax that referenced this pull request Jan 12, 2026
Summary:
X-link: facebookincubator/MCGrad#94

Pull Request resolved: facebook#4746

Pull Request resolved: facebook#4731

A lot going on here:

1. Make `GenerationStep` a factory for `GenerationNode` by replacing its `__init__(self, ...)` constructor (which would have to return a `GStep`) with a `__new__(cls, ...)` construtor, which can return a `GNode` (magic, thanks Devmate!)
2. Adapt storage: stop storing steps and just treat them as nodes. No step-only fields will be saved going forward. Backward compatibility is handled though.
3. Change a bazillion tests and checks in downstream applications.

NOTE: need to remember next steps here: https://www.internalfb.com/diff/D86066476?dst_version_fbid=732904879835890&transaction_fbid=1400151778204976, cc mgarrard

sorrybigdiff

Reviewed By: mgarrard

Differential Revision: D80128678
lena-kashtelyan pushed a commit to lena-kashtelyan/Ax that referenced this pull request Jan 14, 2026
Summary:
X-link: facebookincubator/MCGrad#94

Pull Request resolved: facebook#4746

Pull Request resolved: facebook#4731

A lot going on here:

1. Make `GenerationStep` a factory for `GenerationNode` by replacing its `__init__(self, ...)` constructor (which would have to return a `GStep`) with a `__new__(cls, ...)` construtor, which can return a `GNode` (magic, thanks Devmate!)
2. Adapt storage: stop storing steps and just treat them as nodes. No step-only fields will be saved going forward. Backward compatibility is handled though.
3. Change a bazillion tests and checks in downstream applications.

NOTE: need to remember next steps here: https://www.internalfb.com/diff/D86066476?dst_version_fbid=732904879835890&transaction_fbid=1400151778204976, cc mgarrard

sorrybigdiff

Differential Revision: D80128678
lena-kashtelyan pushed a commit to lena-kashtelyan/Ax that referenced this pull request Jan 15, 2026
Summary:
X-link: facebookincubator/MCGrad#94

Pull Request resolved: facebook#4746

Pull Request resolved: facebook#4731

A lot going on here:

1. Make `GenerationStep` a factory for `GenerationNode` by replacing its `__init__(self, ...)` constructor (which would have to return a `GStep`) with a `__new__(cls, ...)` construtor, which can return a `GNode` (magic, thanks Devmate!)
2. Adapt storage: stop storing steps and just treat them as nodes. No step-only fields will be saved going forward. Backward compatibility is handled though.
3. Change a bazillion tests and checks in downstream applications.

NOTE: need to remember next steps here: https://www.internalfb.com/diff/D86066476?dst_version_fbid=732904879835890&transaction_fbid=1400151778204976, cc mgarrard

sorrybigdiff

Reviewed By: lena-kashtelyan

Differential Revision: D80128678
lena-kashtelyan pushed a commit to lena-kashtelyan/Ax that referenced this pull request Jan 20, 2026
Summary:
X-link: facebookincubator/MCGrad#94

Pull Request resolved: facebook#4746

Pull Request resolved: facebook#4731

A lot going on here:

1. Make `GenerationStep` a factory for `GenerationNode` by replacing its `__init__(self, ...)` constructor (which would have to return a `GStep`) with a `__new__(cls, ...)` construtor, which can return a `GNode` (magic, thanks Devmate!)
2. Adapt storage: stop storing steps and just treat them as nodes. No step-only fields will be saved going forward. Backward compatibility is handled though.
3. Change a bazillion tests and checks in downstream applications.

NOTE: need to remember next steps here: https://www.internalfb.com/diff/D86066476?dst_version_fbid=732904879835890&transaction_fbid=1400151778204976, cc mgarrard

sorrybigdiff

Differential Revision: D80128678
lena-kashtelyan pushed a commit to lena-kashtelyan/Ax that referenced this pull request Jan 20, 2026
Summary:
X-link: facebookincubator/MCGrad#94

Pull Request resolved: facebook#4746

Pull Request resolved: facebook#4731

A lot going on here:

1. Make `GenerationStep` a factory for `GenerationNode` by replacing its `__init__(self, ...)` constructor (which would have to return a `GStep`) with a `__new__(cls, ...)` construtor, which can return a `GNode` (magic, thanks Devmate!)
2. Adapt storage: stop storing steps and just treat them as nodes. No step-only fields will be saved going forward. Backward compatibility is handled though.
3. Change a bazillion tests and checks in downstream applications.

NOTE: need to remember next steps here: https://www.internalfb.com/diff/D86066476?dst_version_fbid=732904879835890&transaction_fbid=1400151778204976, cc mgarrard

sorrybigdiff

Differential Revision: D80128678
meta-codesync bot pushed a commit to facebook/Ax that referenced this pull request Jan 20, 2026
Summary:
X-link: facebookincubator/MCGrad#94

Pull Request resolved: #4746

Pull Request resolved: #4731

A lot going on here:

1. Make `GenerationStep` a factory for `GenerationNode` by replacing its `__init__(self, ...)` constructor (which would have to return a `GStep`) with a `__new__(cls, ...)` construtor, which can return a `GNode` (magic, thanks Devmate!)
2. Adapt storage: stop storing steps and just treat them as nodes. No step-only fields will be saved going forward. Backward compatibility is handled though.
3. Change a bazillion tests and checks in downstream applications.

NOTE: need to remember next steps here: https://www.internalfb.com/diff/D86066476?dst_version_fbid=732904879835890&transaction_fbid=1400151778204976, cc mgarrard

sorrybigdiff

Reviewed By: mgarrard

Differential Revision: D80128678

fbshipit-source-id: 32b4d01d5cd246135ee63c9ebc69c9bb4a580abe
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

CLA Signed This label is managed by the Meta Open Source bot. fb-exported meta-exported

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants