feat: add session step replay dataset loader and session-level metrics #743

Open
ajcasagrande wants to merge 2 commits into main from ajc/session-step-replay

@ajcasagrande
Contributor

Add support for replaying captured agent sessions with candidate prompt selection. Includes SessionStepReplayDatasetLoader, session metrics processor, deterministic candidate selection via credit-issued random seeds, and completions endpoint multi-turn support.
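The description mentions deterministic candidate selection driven by a random seed issued with each credit. The PR's actual code is not shown in this thread, so the following is only a minimal sketch of the general idea, assuming an integer seed per credit and a flat list of candidate prompts; `select_candidate` and its parameters are hypothetical names, not identifiers from aiperf.

```python
import random
from typing import Sequence


def select_candidate(candidates: Sequence[str], seed: int) -> str:
    """Pick one candidate prompt deterministically from a seed.

    Hypothetical helper: the name and signature are assumptions made
    for illustration, not the PR's API.
    """
    if not candidates:
        raise ValueError("candidates must not be empty")
    # A private Random instance keeps the choice independent of the
    # global RNG state, so the same seed always yields the same pick.
    rng = random.Random(seed)
    return candidates[rng.randrange(len(candidates))]


prompts = ["candidate A", "candidate B", "candidate C"]
first = select_candidate(prompts, seed=1234)
again = select_candidate(prompts, seed=1234)
assert first == again  # replaying with the same seed selects the same prompt
```

Seeding a local generator rather than the module-level one means concurrent workers would not perturb each other's selections, which is one plausible reason to issue the seed alongside the credit.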

@github-actions
github-actions bot commented Mar 7, 2026

Try out this PR

Quick install:

pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@a7eefa2d0c2485c00ff10c1ba7dbf2d0901eb929

Recommended with virtual environment (using uv):

uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@a7eefa2d0c2485c00ff10c1ba7dbf2d0901eb929

Last updated for commit: a7eefa2

@github-actions bot added the feat label on Mar 7, 2026
@ajcasagrande force-pushed the ajc/session-step-replay branch from 5336ffb to f0127f6 on March 7, 2026 04:46
@ajcasagrande force-pushed the ajc/conversationp-context-mode branch from 87946ec to 59b6e84 on March 7, 2026 04:46
Introduce ConversationContextMode enum (accumulate_all, drop_responses,
standalone) to control how prior turns are accumulated in multi-turn
conversations. Modes resolve with conversation > dataset default >
accumulate_all precedence. Standalone replaces turn_list with only the
current turn; drop_responses skips storing assistant responses.

Signed-off-by: Anthony Casagrande <acasagrande@nvidia.com>
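The commit message above describes three context modes and a precedence rule (conversation, then dataset default, then accumulate_all). A minimal sketch of that behaviour follows, assuming turns are plain strings held in a list; only the enum name and its three values come from the commit message, while `resolve_mode`, `request_turns`, and `store_response` are hypothetical helpers and not the PR's functions.

```python
from enum import Enum
from typing import List, Optional


class ConversationContextMode(str, Enum):
    ACCUMULATE_ALL = "accumulate_all"
    DROP_RESPONSES = "drop_responses"
    STANDALONE = "standalone"


def resolve_mode(
    conversation_mode: Optional[ConversationContextMode],
    dataset_default: Optional[ConversationContextMode],
) -> ConversationContextMode:
    """Apply the precedence: conversation > dataset default > accumulate_all."""
    if conversation_mode is not None:
        return conversation_mode
    if dataset_default is not None:
        return dataset_default
    return ConversationContextMode.ACCUMULATE_ALL


def request_turns(
    turn_list: List[str], current_turn: str, mode: ConversationContextMode
) -> List[str]:
    """Build the turns sent with the current request."""
    if mode is ConversationContextMode.STANDALONE:
        # Standalone replaces the turn list with only the current turn.
        return [current_turn]
    return turn_list + [current_turn]


def store_response(
    turn_list: List[str], response: str, mode: ConversationContextMode
) -> List[str]:
    """Return the turn list carried forward after a response arrives."""
    if mode is ConversationContextMode.ACCUMULATE_ALL:
        return turn_list + [response]
    # drop_responses skips storing assistant responses. For standalone the
    # list is replaced on the next turn anyway, so skipping the response
    # here is an assumption of this sketch.
    return turn_list


mode = resolve_mode(None, ConversationContextMode.DROP_RESPONSES)
turns = request_turns([], "user turn 1", mode)
turns = store_response(turns, "assistant reply 1", mode)
turns = request_turns(turns, "user turn 2", mode)
assert turns == ["user turn 1", "user turn 2"]
```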
@ajcasagrande force-pushed the ajc/conversationp-context-mode branch from 59b6e84 to 27536c2 on March 7, 2026 04:47
Add support for replaying captured agent sessions with candidate prompt
selection. Includes SessionReplayDatasetLoader, session metrics processor,
deterministic candidate selection via credit-issued random seeds, and
completions endpoint multi-turn support.

Signed-off-by: Anthony Casagrande <acasagrande@nvidia.com>
@ajcasagrande force-pushed the ajc/session-step-replay branch from f0127f6 to a7eefa2 on March 7, 2026 04:51
@codecov
codecov bot commented Mar 7, 2026

Codecov Report

❌ Patch coverage is 93.85246% with 15 lines in your changes missing coverage. Please review.

| Files with missing lines | Patch % | Lines |
| --- | --- | --- |
| src/aiperf/workers/inference_client.py | 35.29% | 9 Missing and 2 partials ⚠️ |
| src/aiperf/dataset/loader/session_step_replay.py | 96.29% | 2 Missing ⚠️ |
| ...st_processors/session_metrics_results_processor.py | 96.66% | 0 Missing and 2 partials ⚠️ |

@ajcasagrande force-pushed the ajc/conversationp-context-mode branch from 27536c2 to 9544238 on March 16, 2026 01:35
Base automatically changed from ajc/conversationp-context-mode to main March 18, 2026 18:22