feat: make rate_matching degradation factors configurable#615

Open
liyuanzhe1991 wants to merge 1 commit into ai-dynamo:main from liyuanzhe1991:feat/configurable-rate-matching

Conversation

Contributor

@liyuanzhe1991 liyuanzhe1991 commented Mar 18, 2026

Convert module-level constants _RATE_MATCHING_PREFILL_DEGRADATION_FACTOR and _RATE_MATCHING_DECODE_DEGRADATION_FACTOR into configurable instance attributes on DisaggInferenceSession with a dedicated setter method. Propagate these parameters through TaskConfig.advanced_tuning_config and disagg_pareto() kwargs, eliminating the need for monkey-patching.
Add unit tests covering default values, setter behavior, and end-to-end parameter forwarding from TaskConfig to disagg_pareto.
Made-with: Cursor

Overview:

Previously, _RATE_MATCHING_PREFILL_DEGRADATION_FACTOR (0.9) and _RATE_MATCHING_DECODE_DEGRADATION_FACTOR (0.92) were module-level constants in picking.py, imported by inference_session.py at module load time. Because a from-import binds a copy of the name into the importing module, the factors could not be overridden at runtime: experiment scripts that set picking._RATE_MATCHING_*_DEGRADATION_FACTOR = 1.0 had no effect on the already-imported bindings inside DisaggInferenceSession.
This PR converts these constants into configurable instance attributes with a proper setter API (set_rate_matching_degradation_factors), and threads the values through TaskConfig.advanced_tuning_config → TaskRunner.run_disagg() → disagg_pareto() → DisaggInferenceSession, making them fully configurable without monkey-patching.
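The failure mode is plain Python import semantics and can be reproduced in isolation. A minimal sketch (the module names below are stand-ins built with types.ModuleType, not the real aiconfigurator files): a from-import copies the binding into the importing module, so rebinding the attribute on the source module later changes nothing for the consumer.

```python
import sys
import types

# Hypothetical stand-ins for picking.py and inference_session.py; this only
# reproduces the import mechanics, not the real modules.
picking = types.ModuleType("picking")
picking._RATE_MATCHING_PREFILL_DEGRADATION_FACTOR = 0.9
sys.modules["picking"] = picking

session = types.ModuleType("inference_session")
exec(
    "from picking import _RATE_MATCHING_PREFILL_DEGRADATION_FACTOR\n"
    "def effective_factor():\n"
    "    return _RATE_MATCHING_PREFILL_DEGRADATION_FACTOR\n",
    session.__dict__,
)

# The attempted monkey-patch from the experiment scripts:
picking._RATE_MATCHING_PREFILL_DEGRADATION_FACTOR = 1.0

# The consumer still sees the value captured at import time.
print(session.effective_factor())  # → 0.9
```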

Details:

  • inference_session.py: Add _rate_matching_prefill_degradation_factor and _rate_matching_decode_degradation_factor as instance attributes initialized in DisaggInferenceSession.__init__ (defaulting to the module constants). Add a set_rate_matching_degradation_factors() setter. Replace all internal references to the module constants with the self._rate_matching_* attributes in _get_disagg_summary_df and _find_best_result_under_constraints.
  • pareto_analysis.py: disagg_pareto() now accepts rate_matching_prefill_degradation_factor and rate_matching_decode_degradation_factor via **kwargs, and calls disagg_sess.set_rate_matching_degradation_factors() when either is provided.
  • task.py: Add rate_matching_prefill_degradation_factor: None and rate_matching_decode_degradation_factor: None to the default advanced_tuning_config. TaskRunner.run_disagg() forwards these values to disagg_pareto().
  • test_inference_session.py: Add TestRateMatchingDegradationFactors class (5 tests): default values, setter with both/partial args, and end-to-end impact on tokens/s/gpu output.
  • test_task.py: Add default-value assertions in test_taskconfig_disagg_default. Add TestRateMatchingFactorsForwarding class (3 tests): None forwarding, custom values, and partial override.
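The attribute-plus-setter pattern described above can be sketched as follows. Only the constant values, attribute names, and setter name come from the PR; the constructor arguments (databases, backends) are elided, so this is an illustrative shape, not the merged code:

```python
# Defaults mirror the former module-level constants in picking.py.
_RATE_MATCHING_PREFILL_DEGRADATION_FACTOR = 0.9
_RATE_MATCHING_DECODE_DEGRADATION_FACTOR = 0.92

class DisaggInferenceSession:
    def __init__(self):  # real constructor arguments elided for brevity
        # Instance attributes start at the old constant values, so behavior
        # is unchanged unless the setter is called.
        self._rate_matching_prefill_degradation_factor = (
            _RATE_MATCHING_PREFILL_DEGRADATION_FACTOR
        )
        self._rate_matching_decode_degradation_factor = (
            _RATE_MATCHING_DECODE_DEGRADATION_FACTOR
        )

    def set_rate_matching_degradation_factors(
        self,
        prefill_degradation_factor: float = _RATE_MATCHING_PREFILL_DEGRADATION_FACTOR,
        decode_degradation_factor: float = _RATE_MATCHING_DECODE_DEGRADATION_FACTOR,
    ) -> None:
        # Omitting an argument resets that factor to its default value.
        self._rate_matching_prefill_degradation_factor = prefill_degradation_factor
        self._rate_matching_decode_degradation_factor = decode_degradation_factor
```

With defaults in the signature, a partial call such as set_rate_matching_degradation_factors(prefill_degradation_factor=1.0) overrides one factor while leaving the other at its default.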

Where should the reviewer start?

  • src/aiconfigurator/sdk/inference_session.py — the core change: new instance attributes and setter method (lines 175–202), and the two call sites that now use self._rate_matching_* (lines 220–221, 765–766).
  • src/aiconfigurator/sdk/pareto_analysis.py — kwargs extraction and conditional setter call (lines 239–247).
  • src/aiconfigurator/sdk/task.py — default config addition (lines 494–495) and forwarding in run_disagg (lines 1346–1351).

Related Issues:

  • Relates to: disagg rate-matching configurability improvement

Summary by CodeRabbit

  • New Features
    • Added customizable rate-matching degradation factors for disaggregated inference optimization. Users can now configure prefill and decode degradation factors through task configuration or runtime API calls to fine-tune performance analysis.

Signed-off-by: Yuanzhe Li <yuanli@nvidia.com>

copy-pr-bot bot commented Mar 18, 2026

This pull request requires additional validation before any workflows can run on NVIDIA's runners.



coderabbitai bot commented Mar 18, 2026

Walkthrough

The pull request introduces runtime-configurable rate-matching degradation factors for disaggregated inference sessions. New instance-level attributes replace hardcoded constants, with a public setter method enabling customization. Changes propagate through the inference session, Pareto analysis, and task configuration layers, with comprehensive test coverage validating the forwarding mechanism across all components.

Changes

  • Inference Session Core Logic (src/aiconfigurator/sdk/inference_session.py): Added _rate_matching_prefill_degradation_factor and _rate_matching_decode_degradation_factor instance fields initialized from the module constants. Introduced the public set_rate_matching_degradation_factors() method for runtime customization. Updated internal methods to use the instance attributes instead of the constants.
  • Configuration & Analysis Threading (src/aiconfigurator/sdk/task.py, src/aiconfigurator/sdk/pareto_analysis.py): Extended the task config with optional degradation factor fields in advanced_tuning_config. Threaded the factors through disaggregated worker config construction and the Pareto analysis invocation, with conditional parameter passing for rate-matching customization.
  • Test Coverage (tests/unit/sdk/task/test_task.py, tests/unit/sdk/test_inference_session.py): Test suites validating degradation factor defaults, custom value forwarding from task config to Pareto analysis, partial overrides, and propagation into inference result comparison logic.

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Poem

🐰 Degradation factors now dance at runtime's command,
No longer trapped in constants, but malleable and grand!
Through sessions, tasks, and analysis they thread with grace,
Customizable throughput metrics in every place.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

  • Docstring Coverage ⚠️ Warning: Docstring coverage is 78.95%, below the required threshold of 80.00%. Resolution: write docstrings for the functions missing them.

✅ Passed checks (2 passed)

  • Title check ✅ Passed: The title directly and clearly describes the main change, making rate_matching degradation factors configurable, which is the primary objective of the PR.
  • Description check ✅ Passed: The description follows the required template with all sections completed: Overview explains the motivation, Details breaks down changes by file with line references, "Where should the reviewer start?" identifies key files, and Related Issues is included.



@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
src/aiconfigurator/sdk/inference_session.py (1)

187-203: Add input validation for degradation factors in the public setter.

Line 201 and Line 202 currently accept invalid values (e.g., <= 0 or NaN), which can silently distort or eliminate valid rate-matching outcomes.

Proposed guardrails:

```diff
 def set_rate_matching_degradation_factors(
     self,
     prefill_degradation_factor: float = _RATE_MATCHING_PREFILL_DEGRADATION_FACTOR,
     decode_degradation_factor: float = _RATE_MATCHING_DECODE_DEGRADATION_FACTOR,
 ):
@@
+    for name, value in (
+        ("prefill_degradation_factor", prefill_degradation_factor),
+        ("decode_degradation_factor", decode_degradation_factor),
+    ):
+        if (
+            not isinstance(value, (int, float))
+            or isinstance(value, bool)
+            or pd.isna(value)
+            or value <= 0
+        ):
+            raise ValueError(f"{name} must be a positive finite number, got {value!r}")
+
     self._rate_matching_prefill_degradation_factor = prefill_degradation_factor
     self._rate_matching_decode_degradation_factor = decode_degradation_factor
```
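A standalone version of the suggested guardrail, using math.isfinite instead of pd.isna to avoid a pandas dependency. This is a sketch of the reviewer's proposal, not the merged code; the helper name is hypothetical:

```python
import math

def validate_degradation_factor(name: str, value) -> float:
    """Raise ValueError unless value is a positive, finite real number."""
    # bool is a subclass of int, so reject it explicitly before the
    # numeric checks; also reject NaN/Inf and non-positive values.
    if (
        isinstance(value, bool)
        or not isinstance(value, (int, float))
        or not math.isfinite(value)
        or value <= 0
    ):
        raise ValueError(f"{name} must be a positive finite number, got {value!r}")
    return float(value)
```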

ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

Run ID: fbc04f3e-d4f4-4c89-b021-69e8a4157eb9

📥 Commits

Reviewing files that changed from the base of the PR and between 57857b5 and 60ee9e9.

📒 Files selected for processing (5)
  • src/aiconfigurator/sdk/inference_session.py
  • src/aiconfigurator/sdk/pareto_analysis.py
  • src/aiconfigurator/sdk/task.py
  • tests/unit/sdk/task/test_task.py
  • tests/unit/sdk/test_inference_session.py

```python
disagg_sess = DisaggInferenceSession(prefill_database, prefill_backend, decode_database, decode_backend)
disagg_sess.set_latency_correction_scales(prefill_latency_correction_scale, decode_latency_correction_scale)

rate_matching_prefill = kwargs.pop("rate_matching_prefill_degradation_factor", None)
```
Contributor


should we make it to 1.0 by default?
