feat: add --service-tier-dist for per-request service_tier distribution #675

Open

ajcasagrande wants to merge 3 commits into main from ajc/vip-only
Conversation

@ajcasagrande
Contributor

@ajcasagrande ajcasagrande commented Feb 11, 2026

Allows users to specify a distribution of service_tier values (e.g. default:50;flex:30;priority:20) that get sampled per turn and sent in OpenAI chat/completions payloads. The response service_tier is extracted into record metadata for downstream analysis.

Summary by CodeRabbit

  • New Features

    • Added a CLI option to specify a distribution of service tiers for API requests (format: tier:percentage;tier:percentage).
    • Service tier is now sent with requests when present and captured in response metrics.
  • Validation

    • CLI option is disallowed alongside explicit per-request service_tier and restricted to chat/completions endpoints.
  • Tests

    • Added unit and endpoint tests covering distribution parsing, sampling, payloads, and response extraction.
  • Documentation

    • CLI docs updated with the new option and usage examples.
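
As a sketch of what the feature enables, a request payload might carry a tier sampled from the configured distribution. The helper and payload fields below are illustrative only, not the PR's actual code:

```python
# Hypothetical sketch: sample a service_tier per request from the
# --service-tier-dist percentages and attach it to a chat payload.
import random

def sample_tier(dist: dict[str, float]) -> str:
    """Pick one tier name with probability proportional to its percentage."""
    tiers = list(dist)
    return random.choices(tiers, weights=[dist[t] for t in tiers], k=1)[0]

dist = {"default": 50, "flex": 30, "priority": 20}  # "default:50;flex:30;priority:20"
payload = {
    "model": "my-model",
    "messages": [{"role": "user", "content": "hello"}],
    "service_tier": sample_tier(dist),  # sampled per turn
}
```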

Signed-off-by: Anthony Casagrande <acasagrande@nvidia.com>
@github-actions

github-actions bot commented Feb 11, 2026

Try out this PR

Quick install:

pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@808d652a48346325d7a339dc757fa6b39efc1ea5

Recommended with virtual environment (using uv):

uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@808d652a48346325d7a339dc757fa6b39efc1ea5

Last updated for commit: 808d652

@ajcasagrande
Contributor Author

Note: this does not yet add support for grouping results by service_tier; I think that fits better with my MetricsAccumulator refactoring.

@coderabbitai

coderabbitai bot commented Feb 11, 2026

Walkthrough

Adds service-tier distribution support: a CLI option to specify tier probabilities, models and parser for distributions with sampling, config validators and accessors, endpoints updated to send/receive service_tier, dataset composer assigns sampled tiers to turns, metadata recorded, and tests/docs added.

Changes

Cohort / File(s) Summary
Configuration
src/aiperf/common/config/input_config.py, src/aiperf/common/config/user_config.py
Added --service-tier-dist CLI field and accessor, mutual-exclusivity and endpoint-type validators (CHAT/COMPLETIONS restriction).
Distribution Model
src/aiperf/common/models/service_tier_distribution.py
New ServiceTierEntry, ServiceTierDistribution, and ServiceTierDistributionParser implementing validated parsing, probability-sum checks, cumulative distribution, and O(log n) sampling.
Data Models
src/aiperf/common/models/dataset_models.py, src/aiperf/common/models/record_models.py
Added optional service_tier field to Turn and MetricRecordMetadata, propagated in copy/creation paths.
Endpoints
src/aiperf/endpoints/openai_chat.py, src/aiperf/endpoints/openai_completions.py
Include service_tier from turns in outgoing payloads and extract service_tier from responses into ParsedResponse metadata.
Dataset Composer
src/aiperf/dataset/composer/base.py
Initialize service-tier distribution from config and sample/assign turn.service_tier during turn finalization.
Record Processing
src/aiperf/records/record_processor_service.py
Extract service_tier from response metadata when building metric record metadata.
Docs
docs/cli_options.md
Documented --service-tier-dist format and accepted tier names with examples.
Tests
tests/unit/common/models/test_service_tier_distribution.py, tests/unit/endpoints/test_openai_chat_completions.py, tests/unit/endpoints/test_openai_completions.py
Added unit tests for distribution parsing/validation/sampling and endpoint payload/response metadata behavior.
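
The parse -> cumulative distribution -> O(log n) sampling flow described for the distribution model can be sketched as follows. This is a minimal illustration; the names are hypothetical and the PR's actual ServiceTierDistributionParser API may differ:

```python
# Illustrative sketch of parsing "tier:pct;tier:pct" into cumulative
# percentages, then sampling with a binary search (O(log n) per draw).
import bisect
import random

def parse_distribution(spec: str) -> tuple[list[str], list[float]]:
    """Parse "tier:pct;tier:pct" into tier names and cumulative percentages."""
    tiers: list[str] = []
    cumulative: list[float] = []
    total = 0.0
    for part in spec.split(";"):
        name, pct = part.split(":")
        total += float(pct)
        tiers.append(name.strip())
        cumulative.append(total)
    if abs(total - 100.0) > 1e-6:
        raise ValueError(f"percentages must sum to 100, got {total}")
    return tiers, cumulative

def sample(tiers: list[str], cumulative: list[float]) -> str:
    """Draw uniformly in [0, 100) and binary-search the cumulative sums."""
    r = random.random() * cumulative[-1]
    return tiers[bisect.bisect_right(cumulative, r)]

tiers, cum = parse_distribution("default:50;flex:30;priority:20")
```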

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~25 minutes


🚥 Pre-merge checks | ✅ 2 passed | ❌ 1 failed

❌ Failed checks (1 warning)
  • Docstring Coverage — ⚠️ Warning: docstring coverage is 55.36%, below the required threshold of 80.00%. Resolution: write docstrings for the functions that are missing them.

✅ Passed checks (2 passed)
  • Description Check — ✅ Passed: check skipped; CodeRabbit's high-level summary is enabled.
  • Title Check — ✅ Passed: the title clearly and specifically describes the main change: adding a new CLI option --service-tier-dist for sampling service_tier values per request from a distribution.



No actionable comments were generated in the recent review. 🎉

🧹 Recent nitpick comments
src/aiperf/common/models/service_tier_distribution.py (1)

63-91: Nit: redundant list() conversion on Line 86.

_validate_probability_sum accepts list[ServiceTierEntry] but self._entries is already a tuple of the same type. Since the function only iterates via sum(), accepting Sequence or just passing the tuple would avoid the copy.

Proposed fix

Either widen the type hint of _validate_probability_sum:

-def _validate_probability_sum(entries: list[ServiceTierEntry]) -> None:
+def _validate_probability_sum(entries: list[ServiceTierEntry] | tuple[ServiceTierEntry, ...]) -> None:

and drop the conversion:

-        _validate_probability_sum(list(self._entries))
+        _validate_probability_sum(self._entries)

Or use Sequence[ServiceTierEntry] from collections.abc.



@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 4

🤖 Fix all issues with AI agents
In `@docs/cli_options.md`:
- Around lines 251-253: Make the docs and validation consistent. Either relax the wording in the CLI docs and in the Field(description=...) used in input_config.py to say "Common tiers" or "Example tiers" (listing auto, default, flex, scale, priority), or add whitelist validation in ServiceTierEntry that accepts only those specific strings. Whichever approach is chosen, keep the Field(description=...) in input_config.py and the --service-tier-dist entry in the CLI docs in agreement with ServiceTierEntry's validation so behavior and documentation match.

In `@src/aiperf/common/config/input_config.py`:
- Around lines 377-384: Add an explicit return type to get_service_tier_distribution, annotating it as returning Optional[ServiceTierDistribution] (e.g. def get_service_tier_distribution(self) -> Optional["ServiceTierDistribution"]:) and import typing.Optional. Either import ServiceTierDistribution at the top of the module from aiperf.common.models.service_tier_distribution, or use a forward-reference string to avoid the top-level import. Keep the existing local use of ServiceTierDistributionParser.parse and return the parsed ServiceTierDistribution or None.

In `@src/aiperf/endpoints/openai_chat.py`:
- Around lines 205-210: The truthiness check "if service_tier := json_obj.get('service_tier')" in the block that builds metadata silently drops falsy but valid values (e.g., an empty string). Change the check to mirror format_payload's behavior by testing for None (i.e., "is not None") so service_tier is included whenever it exists in json_obj, then return ParsedResponse(perf_ns=response.perf_ns, data=data, usage=usage, metadata=metadata) as before.

In `@src/aiperf/endpoints/openai_completions.py`:
- Around lines 93-98: The truthiness check "if service_tier := json_obj.get('service_tier'):" mismatches the explicit None check used elsewhere (e.g., format_payload). Change it to "if json_obj.get('service_tier') is not None" (or assign first, then check "is not None") so service_tier is included only when present while falsy-but-valid values (e.g., an empty string) are preserved; update the block that builds metadata and returns ParsedResponse accordingly.
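
The pitfall both endpoint comments describe can be seen in a few lines (json_obj and metadata are hypothetical stand-ins for the endpoint code):

```python
# Minimal illustration of the walrus-truthiness pitfall flagged above.
json_obj = {"service_tier": ""}  # key present, but the value is falsy

metadata: dict[str, str] = {}
if tier := json_obj.get("service_tier"):  # truthiness check: drops ""
    metadata["service_tier"] = tier
assert "service_tier" not in metadata

tier = json_obj.get("service_tier")
if tier is not None:  # explicit None check: keeps ""
    metadata["service_tier"] = tier
assert metadata["service_tier"] == ""
```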
🧹 Nitpick comments (1)
src/aiperf/common/models/service_tier_distribution.py (1)

89-91: Consider using lazy lambda for the debug log.

Per the coding guidelines, expensive logs should use lambda: logger.debug(lambda: f"..."). While this only runs once at construction time, maintaining consistency with the guideline avoids accidental f-string evaluation when debug logging is disabled.

Suggested change
-        logger.debug(
-            f"Created service tier distribution with {len(self._entries)} entries: {self}"
-        )
+        logger.debug(
+            lambda: f"Created service tier distribution with {len(self._entries)} entries: {self}"
+        )

As per coding guidelines: src/**/*.py: Use lambda for expensive logs: self.debug(lambda: f"{self._x()}").

@codecov

codecov bot commented Feb 11, 2026

Contributor
Regarding multi-turn: does anyone switch service tiers mid-conversation?

I would think not, in which case sampling per-conversation seems to make more sense than per-turn.

Thoughts?
