…ture Eliminate the composer abstraction layer, consolidating dataset composition logic directly into DatasetManager and the loaders themselves. This reduces indirection and makes the data flow easier to follow.

- Remove composer/ directory (base, custom, synthetic, synthetic_rankings)
- Replace base_loader.py and base_public_dataset.py with simplified base.py
- Add file/ and synthetic/ loader subdirectories
- Extract model_selection_strategies, output_tokens_sampler, public_datasets
- Update plugin registry with new loader categories and schema
- Update docs and tests to reflect new architecture

Signed-off-by: Anthony Casagrande <acasagrande@nvidia.com>
Loaders now receive a DatasetBackingStoreProtocol and write conversations as they're built, eliminating the ~2x peak memory from materializing the full dataset in memory before writing to mmap.

Signed-off-by: Anthony Casagrande <acasagrande@nvidia.com>
…tions Loaders are now pure producers — load() yields finalized Conversation objects instead of pushing them into a backing store. DatasetManager drives storage and metadata collection via async for.

Signed-off-by: Anthony Casagrande <acasagrande@nvidia.com>
…dependencies Move session_id_generator, prompt_generator, model_selector, output_tokens_sampler, and seq_distribution into LoaderContext. Loaders now access shared state via self.ctx instead of holding individual references, reducing constructor complexity and coupling.

Signed-off-by: Anthony Casagrande <acasagrande@nvidia.com>
Replace the hardcoded PublicDataset dataclass registry with a proper public_dataset plugin category. PublicDatasetLoader now delegates to file loaders specified in plugin metadata, unifying public datasets with the rest of the plugin architecture. Also fixes sampling strategy checks to consistently use model_fields_set.

Signed-off-by: Anthony Casagrande <acasagrande@nvidia.com>
Try out this PR

Quick install:

```shell
pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@e00b7c6410bb3eb6d442d88177d816fb0bd36198
```

Recommended with virtual environment (using uv):

```shell
uv venv --python 3.12 && source .venv/bin/activate
uv pip install --upgrade --force-reinstall git+https://github.com/ai-dynamo/aiperf.git@e00b7c6410bb3eb6d442d88177d816fb0bd36198
```

Last updated for commit:
Walkthrough

This pull request refactors the dataset loading architecture from a composer-based system to a loader-based system. It renames the CLI flag …

Changes
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~45 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Actionable comments posted: 12
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
src/aiperf/dataset/loader/random_pool.py (1)
88-198: ⚠️ Potential issue | 🟠 Major

Use async I/O in `parse_and_validate()` to avoid blocking the event loop.

The `parse_and_validate()` method is called synchronously within the async `load()` method (BaseFileLoader line 44), blocking the event loop while reading files. Both `_validate_path()` and `_load_dataset_from_file()` use `open()` for synchronous file I/O.

Convert to async by either:

- Making `parse_and_validate()` async and using `aiofiles` for file operations, or
- Using `asyncio.to_thread()` to execute blocking I/O off the event loop

Per coding guidelines: "Use async/await for ALL I/O operations - no blocking calls."
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/aiperf/dataset/loader/random_pool.py` around lines 88 - 198, Change parse_and_validate to be asynchronous and ensure all blocking file operations are moved off the event loop: make parse_and_validate an async def and either (preferred minimal-change) call the existing blocking helpers (_validate_path and _load_dataset_from_file) via asyncio.to_thread, or refactor file reads in _validate_path and _load_dataset_from_file to use aiofiles and await them. Update can_load_directory and any callers that expect parse_and_validate to be sync (e.g., BaseFileLoader.load) to await the new async parse_and_validate, and ensure ValidationError behavior is preserved; reference the methods RandomPoolDatasetLoader.parse_and_validate, RandomPoolDatasetLoader._validate_path, and RandomPoolDatasetLoader._load_dataset_from_file when making the changes.

src/aiperf/dataset/loader/mooncake_trace.py (1)
96-101: ⚠️ Potential issue | 🟠 Major

Use async file I/O to prevent blocking the event loop in `parse_and_validate`.

`parse_and_validate()` is called directly from the async `load()` method (BaseFileLoader line 44) without awaiting, blocking the event loop with synchronous `open()`. Replace with `aiofiles.open()` for non-blocking file reads, or use `asyncio.to_thread()` to offload the operation if refactoring to async is not feasible. The codebase uses both patterns (see aiofiles in post_processors and asyncio.to_thread in dataset_manager, record_processor_service).

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/aiperf/dataset/loader/mooncake_trace.py` around lines 96 - 101, parse_and_validate currently uses synchronous open(self.filename) which blocks the event loop because load() in BaseFileLoader calls it from async code; change the file read to non-blocking I/O by using aiofiles.open in parse_and_validate (or wrap the existing blocking loop in asyncio.to_thread) so each line is read asynchronously, then call MooncakeTrace.model_validate_json for each non-empty stripped line as before; ensure you import aiofiles and update parse_and_validate to be async (or if using asyncio.to_thread, run the whole parsing lambda in to_thread) and keep references to self.filename and MooncakeTrace.model_validate_json unchanged.
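Both comments above converge on the same fix; a minimal sketch of the `asyncio.to_thread` variant follows, using a hypothetical `_parse_jsonl` helper rather than the project's actual methods:

```python
import asyncio
import json


def _parse_jsonl(path: str) -> list[dict]:
    # Blocking read and parse; safe here because it runs on a worker thread.
    records = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if line:
                records.append(json.loads(line))
    return records


async def parse_and_validate(path: str) -> list[dict]:
    # Offload the synchronous helper so the event loop stays responsive.
    return await asyncio.to_thread(_parse_jsonl, path)
```

The `to_thread` route keeps the existing synchronous helpers intact, which is the minimal-change option; the `aiofiles` route instead makes the read itself awaitable.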
🧹 Nitpick comments (19)
tests/unit/common/models/test_sequence_distribution.py (1)
687-697: Extract shared `MockLoader` to a class-level fixture to eliminate duplication

`MockLoader` is defined identically in both test methods. Factor it out to a class-level `@pytest.fixture` (or a simple nested class) to honor DRY and reduce maintenance surface.

♻️ Proposed refactor
```diff
 class TestSequenceCaching:
     """Test turn-level sequence length caching for ISL/OSL consistency."""
+
+    @pytest.fixture
+    def MockLoader(self):
+        from aiperf.dataset.loader.base import BaseDatasetLoader
+        from aiperf.common.models import Conversation
+        from aiperf.plugin.enums import DatasetSamplingStrategy
+
+        class _MockLoader(BaseDatasetLoader):
+            def load(self) -> list[Conversation]:
+                return []
+
+            @classmethod
+            def can_load(cls, _config) -> bool:
+                return False
+
+            @classmethod
+            def get_preferred_sampling_strategy(cls) -> DatasetSamplingStrategy:
+                return DatasetSamplingStrategy.SEQUENTIAL
+
+        return _MockLoader

-    def test_turn_sequence_caching(self):
+    def test_turn_sequence_caching(self, MockLoader):
         ...
-        # Create mock loader
-        class MockLoader(BaseDatasetLoader):
-            ...
-
-    def test_different_turns_get_different_cache_entries(self):
+    def test_different_turns_get_different_cache_entries(self, MockLoader):
         ...
-        # Create mock loader
-        class MockLoader(BaseDatasetLoader):
-            ...
```

Also applies to: 739-749
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/common/models/test_sequence_distribution.py` around lines 687 - 697, MockLoader is duplicated in two tests; extract it out as a shared class-level fixture or a nested class to follow DRY. Create a single reusable MockLoader (subclassing BaseDatasetLoader) that implements load, can_load, and get_preferred_sampling_strategy, and then reference that fixture/class from both test methods instead of redefining it; ensure the symbol names remain MockLoader, load, can_load, and get_preferred_sampling_strategy so existing tests pick it up.

tests/unit/dataset/conftest.py (1)
75-87: `empty_dataset_manager` doesn't need to be `async`.

This fixture has no `await` calls. While pytest-asyncio handles async fixtures fine, marking it `async` is misleading. Consider making it a regular `def` for clarity.

Suggested change
```diff
 @pytest.fixture
-async def empty_dataset_manager(user_config: UserConfig) -> DatasetManager:
+def empty_dataset_manager(user_config: UserConfig) -> DatasetManager:
     """Create a DatasetManager instance with empty dataset (no backing store/client)."""
     manager = DatasetManager(
         service_config=ServiceConfig(),
```
Verify each finding against the current code and only fix it if needed. In `@tests/unit/dataset/conftest.py` around lines 75 - 87, The fixture empty_dataset_manager is declared async but contains no awaits; change its definition from async def empty_dataset_manager(...) to a regular def so it's a synchronous pytest fixture. Update the signature while keeping the return type DatasetManager and ensure the body that constructs DatasetManager, sets manager.dataset_metadata using DatasetMetadata and DatasetSamplingStrategy.SEQUENTIAL, and returns manager remains unchanged.

src/aiperf/dataset/loader/context.py (1)
23-24: Empty `TYPE_CHECKING` block is dead code.

The `if TYPE_CHECKING: pass` block serves no purpose. Remove it or populate it with the conditional imports it was intended for.

Suggested removal
```diff
-if TYPE_CHECKING:
-    pass
-
```
Verify each finding against the current code and only fix it if needed. In `@src/aiperf/dataset/loader/context.py` around lines 23 - 24, Remove the dead `if TYPE_CHECKING: pass` block in the module and either delete it entirely or replace it with the intended conditional imports; locate the `if TYPE_CHECKING` block in the context loader module (symbol: TYPE_CHECKING) and remove the no-op, or populate it with any forward-only imports required by type hints (e.g., the specific classes/types used only for typing), ensuring no runtime imports are introduced.

src/aiperf/dataset/dataset_manager.py (1)
144-192: `_generate_input_payloads` fetches every conversation individually — potential N+1 concern.

Line 161 calls `await self._dataset_client.get_conversation(conv_meta.conversation_id)` inside a loop over all conversations. If the dataset client involves any I/O per call (e.g., mmap seek + deserialization), this could be slow for large datasets. Consider whether a batch retrieval API would be beneficial.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/aiperf/dataset/dataset_manager.py` around lines 144 - 192, _generate_input_payloads currently performs an N+1 fetch by calling self._dataset_client.get_conversation(conv_meta.conversation_id) inside the loop over self.dataset_metadata.conversations; change this to either (a) add and use a batch retrieval method on the dataset client (e.g., get_conversations(conversation_ids: list[str])) and fetch all conversations in one call before iterating, or (b) if batching is not supported, collect all conversation IDs then concurrently fetch them with asyncio.gather to avoid serial I/O; update usage sites in _generate_input_payloads (replace per-loop get_conversation calls) and keep the rest of the logic (session_payloads_map, RequestInfo creation, endpoint.format_payload) unchanged.

src/aiperf/dataset/loader/synthetic/__init__.py (1)
4-6: Consider also exporting the concrete synthetic loaders from this subpackage.

`SyntheticMultiModalLoader` and `SyntheticRankingsLoader` live in sibling modules under `synthetic/` but are not re-exported here. Consumers who navigate directly to `aiperf.dataset.loader.synthetic` would not find them. The parent `loader/__init__.py` re-exports them, but making the subpackage self-contained is idiomatic.

♻️ Proposed addition
```diff
 from aiperf.dataset.loader.synthetic.base import BaseSyntheticLoader
+from aiperf.dataset.loader.synthetic.multimodal import SyntheticMultiModalLoader
+from aiperf.dataset.loader.synthetic.rankings import SyntheticRankingsLoader

-__all__ = ["BaseSyntheticLoader"]
+__all__ = ["BaseSyntheticLoader", "SyntheticMultiModalLoader", "SyntheticRankingsLoader"]
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/aiperf/dataset/loader/synthetic/__init__.py` around lines 4 - 6, The synthetic subpackage __init__ currently only exports BaseSyntheticLoader, which hides the concrete loaders; update src/aiperf/dataset/loader/synthetic/__init__.py to also import and include SyntheticMultiModalLoader and SyntheticRankingsLoader in __all__ so that consumers can access those classes directly from aiperf.dataset.loader.synthetic (reference the symbols SyntheticMultiModalLoader, SyntheticRankingsLoader and BaseSyntheticLoader when modifying the module exports).

tests/unit/common/config/test_user_config_mooncake_trace.py (1)
43-108: Consider `@pytest.mark.parametrize` for the repeated `_count_dataset_entries` scenarios

Five tests in `TestMooncakeTraceRequestCount` (lines 43–145) share identical scaffolding — same `@patch` decorators, same `UserConfig`/`InputConfig` construction, only `mock_file_content` and the asserted count differ. Per coding guidelines this pattern should use `@pytest.mark.parametrize`.

♻️ Sketch of a parametrized replacement
```diff
-    @patch("pathlib.Path.exists", return_value=True)
-    @patch("pathlib.Path.is_file", return_value=True)
-    def test_mooncake_trace_uses_dataset_size(self, mock_is_file, mock_exists):
-        ...
-    @patch("pathlib.Path.exists", return_value=True)
-    @patch("pathlib.Path.is_file", return_value=True)
-    def test_mooncake_trace_skips_empty_lines(self, mock_is_file, mock_exists):
-        ...
-    # (and three more near-identical tests)
+    @pytest.mark.parametrize(
+        "file_content,expected_count",
+        [
+            (
+                '{"input_length": 100, "hash_ids": [1], "timestamp": 1000}\n'
+                '{"input_length": 200, "hash_ids": [2], "timestamp": 2000}\n'
+                '{"input_length": 300, "hash_ids": [3], "timestamp": 3000}\n',
+                3,
+            ),
+            (  # empty lines skipped
+                '{"input_length": 100}\n\n \n{"input_length": 200}\n{"input_length": 300}\n',
+                3,
+            ),
+            ("", 0),  # empty file
+        ],
+    )
+    @patch("pathlib.Path.exists", return_value=True)
+    @patch("pathlib.Path.is_file", return_value=True)
+    def test_count_dataset_entries_mooncake_trace_returns_expected_count(
+        self, mock_is_file, mock_exists, file_content, expected_count
+    ):
+        config = UserConfig(
+            endpoint=EndpointConfig(model_names=["test-model"]),
+            loadgen=LoadGeneratorConfig(request_count=999),
+            input=InputConfig(
+                file="/fake/path/test.jsonl",
+                dataset_type=DatasetLoaderType.MOONCAKE_TRACE,
+            ),
+        )
+        with patch("builtins.open", mock_open(read_data=file_content)):
+            assert config._count_dataset_entries() == expected_count
```

As per coding guidelines: "Use `@pytest.mark.parametrize` for data-driven tests."
Verify each finding against the current code and only fix it if needed. In `@tests/unit/common/config/test_user_config_mooncake_trace.py` around lines 43 - 108, Multiple tests (test_mooncake_trace_uses_dataset_size, test_mooncake_trace_skips_empty_lines, test_mooncake_trace_empty_file_returns_zero) duplicate the same setup and only vary mock_file_content and expected count; refactor them into a single parametrized test using pytest.mark.parametrize that supplies (mock_file_content, expected_count) and reuses the existing patches for pathlib.Path.exists and is_file and the same UserConfig/InputConfig construction, then call config._count_dataset_entries() and assert equality for each parameter set.

src/aiperf/plugin/schema/schemas.py (1)
314-334: `PublicDatasetMetadata` (and the whole file) uses `pydantic.BaseModel` instead of `AIPerfBaseModel`

All eight classes in this file, including the new `PublicDatasetMetadata`, inherit directly from `pydantic.BaseModel`. Per coding guidelines, `src/**/*.py` data models should use `AIPerfBaseModel`.

If this file is intentionally exempt (e.g., to avoid circular imports with the plugin bootstrap path), consider adding a module-level comment to document the rationale so future contributors don't inadvertently add `AIPerfBaseModel`-based classes here.

As per coding guidelines: "Use `AIPerfBaseModel` for data models and `BaseConfig` for configuration."
Verify each finding against the current code and only fix it if needed. In `@src/aiperf/plugin/schema/schemas.py` around lines 314 - 334, PublicDatasetMetadata and the other model classes in this module inherit from pydantic.BaseModel instead of the project's AIPerfBaseModel; change the base class for PublicDatasetMetadata (and the other seven classes) from BaseModel to AIPerfBaseModel so they follow the coding guideline for data models (refer to class PublicDatasetMetadata and other model class definitions), and if you intentionally must avoid AIPerfBaseModel (e.g., to prevent import cycles with the plugin bootstrap), add a clear module-level comment explaining the exemption and rationale so future contributors understand why BaseModel was used.

src/aiperf/dataset/loader/models.py (2)
254-256: Consider renaming `CustomDatasetT` to align with the new naming convention.

The TypeVar is still named `CustomDatasetT` while the enum it relates to has been renamed from `CustomDatasetType` to `DatasetLoaderType`. A name like `DatasetModelT` would be more consistent.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/aiperf/dataset/loader/models.py` around lines 254 - 256, Rename the TypeVar CustomDatasetT to follow the updated naming convention (e.g., DatasetModelT) so it aligns with DatasetLoaderType; update the declaration that currently reads CustomDatasetT = TypeVar("CustomDatasetT", bound=SingleTurn | MultiTurn | RandomPool | MooncakeTrace) to use the new identifier (DatasetModelT) and adjust any references/usages of CustomDatasetT across the module to the new name to ensure consistency with DatasetLoaderType and avoid unresolved names.
24-24: `type` fields lack `Field(description=...)` per coding guidelines.

The `type` discriminator fields on lines 24, 103, 130, and 206 are modified but don't use `Field(description=...)`. While these are structural discriminators, the guideline applies to every Pydantic field.

Example fix for line 24

```diff
-    type: Literal[DatasetLoaderType.SINGLE_TURN] = DatasetLoaderType.SINGLE_TURN
+    type: Literal[DatasetLoaderType.SINGLE_TURN] = Field(DatasetLoaderType.SINGLE_TURN, description="Discriminator for single-turn dataset type.")
```

As per coding guidelines: `src/**/*.py`: Add `Field(description="...")` on EVERY Pydantic field.
Verify each finding against the current code and only fix it if needed. In `@src/aiperf/dataset/loader/models.py` at line 24, The Pydantic `type` discriminator fields (e.g., the field declared as `type: Literal[DatasetLoaderType.SINGLE_TURN] = DatasetLoaderType.SINGLE_TURN`) must include a Field(...) with a description per guidelines; update each `type` declaration (the ones at the locations using `DatasetLoaderType.SINGLE_TURN` and the other discriminator literals around lines 103, 130, and 206) to use Field(description="...") giving a short description of the discriminator, and ensure `Field` is imported from `pydantic` if not already present.docs/dataset-loaders.md (1)
428-430: Minor: Missing horizontal rule before "Synthetic Loaders" section.

All other top-level loader sections (single_turn, multi_turn, random_pool, mooncake_trace, sharegpt) end with a `---` separator. The sharegpt section at line 428 is missing one before the "## Synthetic Loaders" heading at line 430.

Proposed fix

```diff
 **Default sampling strategy:** Sequential

+---
+
 ## Synthetic Loaders
```
Verify each finding against the current code and only fix it if needed. In `@docs/dataset-loaders.md` around lines 428 - 430, Add the missing horizontal rule separator before the "## Synthetic Loaders" heading: insert a line with `---` immediately after the "Default sampling strategy: Sequential" (or at the end of the sharegpt section) so the "## Synthetic Loaders" heading is preceded by the same `---` top-level separator used elsewhere.

tests/unit/dataset/test_dataset_manager.py (1)
91-98: Suppress Ruff ARG001 for mock helper parameters.

Ruff flags `args` and `kwargs` as unused (ARG001). These are intentionally ignored since this is a patch replacement for `BaseFileLoader.load`, which receives `self` when called as an instance method. Consider adding an inline suppression or using `_` prefixes.

Suggested fix

```diff
-async def _mock_load_dummy_conversation(*args, **kwargs):
+async def _mock_load_dummy_conversation(*_args, **_kwargs):
```
Verify each finding against the current code and only fix it if needed. In `@tests/unit/dataset/test_dataset_manager.py` around lines 91 - 98, The helper function _mock_load_dummy_conversation is intentionally written to accept the instance args passed when patching BaseFileLoader.load but Ruff flags unused args/kwargs (ARG001); update the signature to accept `_ *args` and `**_kwargs` (or rename to `_args, **_kwargs`) or add an inline noqa comment (e.g., `# noqa: ARG001`) to the function definition so Ruff no longer complains while preserving its behavior when used as a replacement for BaseFileLoader.load.

src/aiperf/dataset/loader/multi_turn.py (1)
116-138: Blocking file I/O in `parse_and_validate` runs on the event loop.

`open(self.filename)` at line 127 is synchronous blocking I/O, called from the `async def load()` pipeline in `BaseFileLoader`. For large JSONL files this could block the event loop. This is a pre-existing pattern carried from the old `load_dataset` and is an architectural constraint of the `BaseFileLoader` design. Consider wrapping with `asyncio.to_thread` or using `aiofiles` if this becomes a bottleneck.

As per coding guidelines, "Use async/await for ALL I/O operations - no `time.sleep`, no blocking calls."
Verify each finding against the current code and only fix it if needed. In `@src/aiperf/dataset/loader/multi_turn.py` around lines 116 - 138, parse_and_validate is performing synchronous file I/O by calling open(self.filename) inside an async pipeline (see async def load in BaseFileLoader); change the file reading to run off the event loop (either wrap the blocking work in asyncio.to_thread or switch to an async file reader like aiofiles) so parsing remains non-blocking, i.e., move the loop that reads lines and calls MultiTurn.model_validate_json into a to_thread call or replace open/read with aiofiles and await the reads while keeping session_id generation via self.ctx.session_id_generator.next() intact.

src/aiperf/dataset/loader/synthetic/multimodal.py (1)
28-37: Video modality excluded from the "at least one payload" validation.

The constructor validates that at least one of `prompt`, `image`, or `audio` is enabled, but `include_video` is not checked. A user enabling only video generation would hit this error. The warning at lines 102–106 has the same omission. If video-only usage is not a supported configuration, this is fine — but consider documenting that explicitly.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/aiperf/dataset/loader/synthetic/multimodal.py` around lines 28 - 37, The constructor's "at least one payload" validation and the subsequent warning block omit include_video, causing video-only configs to be rejected; update the validation in the class constructor to include self.include_video in the boolean check (so the ValueError only fires when prompt, image, audio, and video are all False), and update the corresponding warning message block to include include_video in its condition and text so video-only generation is accepted and the error/warning message accurately lists video as a valid enabled modality.

src/aiperf/dataset/loader/synthetic/base.py (3)
37-38: Stale "composer" naming in RNG derivation keys.

The seed strings `"composer.conversation.turn_count"` and `"composer.conversation.turn_delay"` are leftovers from the old composer architecture. They still work functionally but are inconsistent with the loader-based refactoring. Note: changing them would alter RNG sequences for users relying on seeded reproducibility.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/aiperf/dataset/loader/synthetic/base.py` around lines 37 - 38, The RNG derivation keys use stale "composer" prefixes; update the seed strings passed to rng.derive for _turn_sampler_rng and _delay_sampler_rng from "composer.conversation.turn_count" and "composer.conversation.turn_delay" to the new loader-consistent keys (e.g. "loader.conversation.turn_count" and "loader.conversation.turn_delay") by changing the arguments in the calls that create self._turn_sampler_rng and self._delay_sampler_rng; be mindful this will change RNG sequences for seeded runs.
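The key-derivation scheme at stake can be illustrated with a small sketch; `derive_rng` here is a hashed-seed stand-in for the project's `rng.derive`, not its actual implementation:

```python
import hashlib
import random


def derive_rng(root_seed: int, key: str) -> random.Random:
    # Mix the root seed with a string key so each consumer gets an
    # independent but reproducible stream; renaming a key (e.g.
    # "composer.*" -> "loader.*") yields a different derived sequence,
    # which is why the rename breaks seeded reproducibility.
    digest = hashlib.sha256(f"{root_seed}:{key}".encode()).digest()
    return random.Random(int.from_bytes(digest[:8], "big"))
```

Same seed and key always reproduce the same stream, while a renamed key produces a different one — exactly the trade-off the comment above flags.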
107-118: Inconsistent null-guard on video vs. image/audio payload generation.
`_generate_video_payloads` (line 116) guards with `if data:` before appending, but `_generate_image_payloads` and `_generate_audio_payloads` append unconditionally. If a generator can return a falsy value, the same guard should be applied consistently; otherwise, the guard is dead code.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/aiperf/dataset/loader/synthetic/base.py` around lines 107 - 118, The video loader currently checks for falsy returns before appending (in _generate_video_payloads -> video_generator.generate + video.contents.append) while _generate_image_payloads and _generate_audio_payloads append unconditionally; make these consistent: either remove the `if data:` guard from _generate_video_payloads or add the same `if data:` guard to _generate_image_payloads and _generate_audio_payloads around their generator calls (image_generator.generate / audio_generator.generate) and their contents.append calls, depending on whether generators can legitimately return falsy values—update all three methods so they use the same null-guard behavior.
138-143: Consider standardizing modality check logic for consistency.
`include_image` and `include_audio` both check `.mean > 0`, but `include_video` uses `bool(width and height)`. While this works correctly (video width/height are `int | None`, not config objects), the inconsistent checking pattern makes the logic harder to follow. Either document why video differs from other modalities, or refactor to use a consistent pattern.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/aiperf/dataset/loader/synthetic/base.py` around lines 138 - 143, The modality checks are inconsistent: include_image/include_audio use ".mean > 0" while include_video checks width/height truthiness; update include_video to follow the same pattern or explicitly document the difference. Modify the include_video property to first try the same mean-based check (e.g., video_cfg = self.ctx.config.input.video; return bool(getattr(video_cfg, "mean", 0) > 0) or fallback to bool(video_cfg.width and video_cfg.height)), keeping references to include_video and self.ctx.config.input.video.width/height so behavior is consistent and clear across modalities.

tests/unit/dataset/loader/test_can_load.py (2)
19-31: Missing error wrapping compared to production `_infer_type`.

The production `DatasetManager._infer_type()` (from the relevant snippet at `dataset_manager.py` lines 369-422) wraps the explicit type resolution in `try/except (ValueError, KeyError)` to produce a user-friendly error message for invalid type values. This test helper on line 24 does not — `DatasetLoaderType(data["type"])` will raise a raw `ValueError` for unknown type strings. This is fine for test inputs, but worth noting the behavioral divergence.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/dataset/loader/test_can_load.py` around lines 19 - 31, The test helper _infer_type differs from production by not wrapping explicit type resolution; adjust _infer_type to catch ValueError and KeyError raised by DatasetLoaderType(data["type"]) or plugins.get_class(...) and re-raise a ValueError with a user-friendly message (mirroring DatasetManager._infer_type) that includes the invalid type string, then proceed to call LoaderClass.can_load(data, filename) and raise the existing error if it cannot handle the data; reference _infer_type, DatasetLoaderType, plugins.get_class, and LoaderClass.can_load when locating the change.
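The wrapping described above can be sketched as follows; the enum values are an illustrative subset, not the project's full `DatasetLoaderType`:

```python
from enum import Enum


class DatasetLoaderType(str, Enum):
    # Illustrative subset of the real enum.
    SINGLE_TURN = "single_turn"
    MOONCAKE_TRACE = "mooncake_trace"


def infer_explicit_type(data: dict) -> DatasetLoaderType:
    # Wrap the raw ValueError/KeyError in a user-friendly message,
    # mirroring what the production _infer_type reportedly does.
    try:
        return DatasetLoaderType(data["type"])
    except (ValueError, KeyError) as e:
        valid = ", ".join(t.value for t in DatasetLoaderType)
        raise ValueError(
            f"Invalid dataset type {data.get('type')!r}; expected one of: {valid}"
        ) from e
```

The `from e` clause preserves the original exception for debugging while surfacing the list of valid values to the user.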
19-46: Test helper `_infer_type` duplicates production logic from `DatasetManager._infer_type()`.

This standalone helper mirrors the production inference logic for testability without instantiating `DatasetManager`. This is a reasonable pattern, but the duplicate can drift from the production implementation over time. Consider adding a comment noting the source of truth, or in a future iteration, extracting the inference logic into a shared utility that both `DatasetManager` and tests can use.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@tests/unit/dataset/loader/test_can_load.py` around lines 19 - 46, The test helper _infer_type duplicates production logic from DatasetManager._infer_type; to prevent drift, add a clear comment in the test above _infer_type that points to DatasetManager._infer_type as the source of truth (mention DatasetLoaderType, plugins.iter_all and PluginType.DATASET_LOADER), and optionally note that the logic should be extracted to a shared utility in the future; if you prefer immediate consolidation, move the inference code into a small internal utility function (used by both DatasetManager._infer_type and the test) and update references in the test to call that utility instead of duplicating logic.

src/aiperf/dataset/loader/single_turn.py (1)
90-110: Blocking file I/O in `parse_and_validate`.

Line 101 uses synchronous `open(self.filename)`, which violates the coding guideline requiring async I/O for all operations. However, the parent `BaseFileLoader.load()` calls this synchronously within an async generator, so this appears to be an intentional architectural decision for file-based loaders (file parsing is CPU-bound and typically fast).

If large files are expected, consider wrapping in `asyncio.to_thread` or using `aiofiles` in a future iteration.

As per coding guidelines, `src/**/*.py`: "Use async/await for ALL I/O operations - no `time.sleep`, no blocking calls".
Verify each finding against the current code and only fix it if needed. In `@src/aiperf/dataset/loader/single_turn.py` around lines 90 - 110, parse_and_validate is doing blocking file I/O with open(self.filename) which violates the async I/O guideline; make it async and perform file reads without blocking the event loop. Change parse_and_validate to async def parse_and_validate(...) and either (preferred) use aiofiles.open(self.filename) and iterate lines with async for (preserving the SingleTurn.model_validate_json and session id generation), or wrap the synchronous read in await asyncio.to_thread(read_and_parse_fn, self.filename) if you prefer a minimal change; also update the caller (e.g., BaseFileLoader.load) to await parse_and_validate so the async flow remains correct and ensure you preserve usage of self.ctx.session_id_generator.next() and returned dict[str, list[SingleTurn]].
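A sketch of how the async conversion could combine with the pure-producer `load()` design this PR introduces; `JsonlLoader` and its helper are hypothetical stand-ins, not the project's classes:

```python
import asyncio
import json
from collections.abc import AsyncIterator


class JsonlLoader:
    def __init__(self, filename: str) -> None:
        self.filename = filename

    def _parse(self) -> list[dict]:
        # Synchronous parse; executed off the event loop by load().
        with open(self.filename) as f:
            return [json.loads(line) for line in f if line.strip()]

    async def load(self) -> AsyncIterator[dict]:
        # Parse on a worker thread, then yield finalized records so the
        # caller (e.g., a manager) can drive storage via `async for`.
        for record in await asyncio.to_thread(self._parse):
            yield record
```

A consumer would simply iterate `async for record in loader.load():`, matching the "DatasetManager drives storage via async for" design described in the commit messages.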
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@mkdocs.yml`:
- Around line 28-33: The navigation has a naming collision: two "Development"
entries (one linking to Development.md under Reference and one as a top-level
section containing dev/creating-dataset-loaders.md). Rename the top-level
section to a unique label (e.g., "Plugin Development", "Extending AIPerf", or
"Contributing") or move dev/creating-dataset-loaders.md under the existing
"Development" entry in mkdocs.yml, updating the nav key for the section that
contains dev/creating-dataset-loaders.md and ensuring the other link to
Development.md remains unchanged.
In `@src/aiperf/dataset/dataset_manager.py`:
- Around line 334-368: The _infer_dataset_type function performs blocking file
I/O via open(file_path) which can block the event loop when called from async
paths (_configure_dataset → _create_loader → _create_file_loader); change the
implementation to perform the file read on a thread (use asyncio.to_thread) or
convert _infer_dataset_type/_create_file_loader to async and use aiofiles so the
loop isn't blocked; specifically, move the logic that opens and iterates lines
(and calls load_json_str) into a callable passed to asyncio.to_thread (or an
aiofiles-based async read) and keep the rest of _infer_type(filename=...,
data=...) unchanged so callers like _create_loader/_configure_dataset remain
non-blocking.
In `@src/aiperf/dataset/loader/context.py`:
- Around line 121-124: The property seq_distribution is missing a return type
hint; change its signature to return SequenceLengthDistribution | None and add
SequenceLengthDistribution to the import list from aiperf.common.models so the
property (def seq_distribution(self) -> SequenceLengthDistribution | None) and
its use of self._seq_distribution are correctly typed.
In `@src/aiperf/dataset/loader/sharegpt.py`:
- Around line 59-61: The synchronous file read and JSON parsing in
parse_and_validate/load_json_str are blocking the async event loop called from
BaseFileLoader.load; change the implementation so the blocking work runs off the
event loop — either make parse_and_validate async using aiofiles or (preferred
to match repo patterns) call the blocking parts (open/read and load_json_str)
inside asyncio.to_thread from the loader's async entrypoint (e.g., within
ShareGPT loader's parse_and_validate or from BaseFileLoader.load) so that the
file open/read and JSON parsing happen on a threadpool instead of blocking the
event loop.
In `@src/aiperf/dataset/model_selection_strategies.py`:
- Around line 32-40: RoundRobinModelSelectionStrategy.__init__ (and likewise
RandomModelSelectionStrategy.__init__ and
ShuffleModelSelectionStrategy.__init__) must defensively validate the incoming
model_names list to prevent empty lists; add a guard that checks if model_names
is truthy and has length > 0 and raise a clear ValueError (or TypeError if not a
list) with a helpful message when invalid, so select() cannot raise
ZeroDivisionError/IndexError at runtime; update the constructors to validate
model_names before assigning to self._model_names and initializing self._counter
or shuffle state.
In `@src/aiperf/dataset/public_datasets.py`:
- Around line 78-99: The async function _download uses blocking filesystem calls
on cache_filepath (Path.exists(), cache_filepath.parent.mkdir, and
open(...).write) which can block the event loop; change these to async
equivalents by either using aiofiles for file write (open/write) and using
asyncio.to_thread for Path.exists() and mkdir (or use asyncio.to_thread for all
three) so all filesystem I/O in _download is non-blocking; ensure you reference
AIPERF_DATASET_CACHE_DIR and cache_filepath in the changes and preserve the
existing error handling that raises DatasetLoaderError on save failures.
- Around line 64-68: Remove the unused fragile classmethod
get_preferred_sampling_strategy from the dataset loader base in
public_datasets.py; callers already use the instance-level delegation path
(get_delegate_preferred_sampling_strategy) or call the method on concrete loader
instances, so deleting the classmethod eliminates the risky class-level contract
without changing runtime behavior—remove the def
get_preferred_sampling_strategy(cls) -> DatasetSamplingStrategy: ... block and
keep existing instance methods like get_delegate_preferred_sampling_strategy and
concrete implementations intact.
- Around line 39-49: The _download() helper currently performs blocking
filesystem calls (Path.mkdir and built-in open/write) from async code called by
_get_delegate(); change _download() to create directories with
asyncio.to_thread(Path.mkdir, parents=True, exist_ok=True) and perform file
writes using aiofiles (await aiofiles.open(...) and await file.write(...)) so
all I/O is non-blocking, and keep awaiting the async operations before returning
the local path used by _get_delegate().
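For illustration, here is one stdlib-only shape for a non-blocking cache write using `asyncio.to_thread`; the helper name and `payload` parameter are hypothetical, not the actual `_download` signature:

```python
import asyncio
from pathlib import Path


async def save_to_cache(cache_filepath: Path, payload: bytes) -> Path:
    """Write downloaded bytes to the cache without blocking the event loop."""
    # Path.exists() and mkdir() are blocking syscalls; run them on a thread.
    exists = await asyncio.to_thread(cache_filepath.exists)
    if not exists:
        await asyncio.to_thread(
            cache_filepath.parent.mkdir, parents=True, exist_ok=True
        )
        # The blocking file write also runs off the event loop.
        await asyncio.to_thread(cache_filepath.write_bytes, payload)
    return cache_filepath
```

An `aiofiles`-based variant would replace the `write_bytes` call with `await (await aiofiles.open(...)).write(...)`; either keeps the event loop free during filesystem I/O.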
In `@tests/unit/common/config/test_input_config.py`:
- Around line 138-148: The test docstring in
test_dataset_type_without_file_raises_error still references the old validator
name `validate_custom_dataset_file`; update the docstring to use the new
validator name (e.g., the current validator on InputConfig) so it reflects the
refactor—locate the test function `test_dataset_type_without_file_raises_error`
and replace the stale reference with the new validator identifier while keeping
the rest of the docstring intact.
In `@tests/unit/common/models/test_sequence_distribution.py`:
- Line 682: Remove the unused Conversation import from the test module: in the
import line that currently reads "from aiperf.common.models import Conversation,
Turn" used by test_turn_sequence_caching and
test_different_turns_get_different_cache_entries, drop Conversation so only Turn
is imported; update the import statement accordingly to silence the linter and
leave test logic (Turn usage) unchanged.
- Line 679: The test method names do not follow the required
test_<function>_<scenario>_<expected> convention; rename the methods to be
explicit: change test_turn_sequence_caching to
test_get_turn_sequence_lengths_same_turn_id_returns_cached_value and change
test_different_turns_get_different_cache_entries to
test_get_turn_sequence_lengths_distinct_turn_ids_cached_independently so the
function under test (get_turn_sequence_lengths) and the scenario/expected
outcome are encoded in the name; update any references/imports or test markers
that use those names accordingly (e.g., in
tests/unit/common/models/test_sequence_distribution.py where
test_turn_sequence_caching and test_different_turns_get_different_cache_entries
are defined).
- Around line 699-705: The test is calling nonexistent loader methods
`_get_turn_sequence_lengths` and `_clear_turn_cache` on MockLoader; fix by using
the correct public API on the context or implement the methods on the loader:
either change the test to call the LoaderContext public method
`ctx.get_turn_sequence_lengths(turn_id)` (and use the context to clear any cache
instead of calling `_clear_turn_cache`) and ensure `_seq_distribution` and
`_turn_sequence_cache` are set on the `LoaderContext`, or add explicit
implementations of `_get_turn_sequence_lengths` and `_clear_turn_cache` on
`BaseDatasetLoader` that delegate to the loader's `LoaderContext` storage
(`_seq_distribution`, `_turn_sequence_cache`) so the existing test calls
succeed; update references to `MockLoader`, `BaseDatasetLoader`,
`LoaderContext`, `ctx.get_turn_sequence_lengths`, `_seq_distribution`, and
`_turn_sequence_cache` accordingly.
---
Outside diff comments:
In `@src/aiperf/dataset/loader/mooncake_trace.py`:
- Around line 96-101: parse_and_validate currently uses synchronous
open(self.filename) which blocks the event loop because load() in BaseFileLoader
calls it from async code; change the file read to non-blocking I/O by using
aiofiles.open in parse_and_validate (or wrap the existing blocking loop in
asyncio.to_thread) so each line is read asynchronously, then call
MooncakeTrace.model_validate_json for each non-empty stripped line as before;
ensure you import aiofiles and update parse_and_validate to be async (or if
using asyncio.to_thread, run the whole parsing lambda in to_thread) and keep
references to self.filename and MooncakeTrace.model_validate_json unchanged.
In `@src/aiperf/dataset/loader/random_pool.py`:
- Around line 88-198: Change parse_and_validate to be asynchronous and ensure
all blocking file operations are moved off the event loop: make
parse_and_validate an async def and either (preferred minimal-change) call the
existing blocking helpers (_validate_path and _load_dataset_from_file) via
asyncio.to_thread, or refactor file reads in _validate_path and
_load_dataset_from_file to use aiofiles and await them. Update
can_load_directory and any callers that expect parse_and_validate to be sync
(e.g., BaseFileLoader.load) to await the new async parse_and_validate, and
ensure ValidationError behavior is preserved; reference the methods
RandomPoolDatasetLoader.parse_and_validate,
RandomPoolDatasetLoader._validate_path, and
RandomPoolDatasetLoader._load_dataset_from_file when making the changes.
---
Nitpick comments:
In `@docs/dataset-loaders.md`:
- Around line 428-430: Add the missing horizontal rule separator before the "##
Synthetic Loaders" heading: insert a line with `---` immediately after the
"Default sampling strategy: Sequential" (or at the end of the sharegpt section)
so the "## Synthetic Loaders" heading is preceded by the same `---` top-level
separator used elsewhere.
In `@src/aiperf/dataset/dataset_manager.py`:
- Around line 144-192: _generate_input_payloads currently performs an N+1 fetch
by calling self._dataset_client.get_conversation(conv_meta.conversation_id)
inside the loop over self.dataset_metadata.conversations; change this to either
(a) add and use a batch retrieval method on the dataset client (e.g.,
get_conversations(conversation_ids: list[str])) and fetch all conversations in
one call before iterating, or (b) if batching is not supported, collect all
conversation IDs then concurrently fetch them with asyncio.gather to avoid
serial I/O; update usage sites in _generate_input_payloads (replace per-loop
get_conversation calls) and keep the rest of the logic (session_payloads_map,
RequestInfo creation, endpoint.format_payload) unchanged.
In `@src/aiperf/dataset/loader/context.py`:
- Around line 23-24: Remove the dead `if TYPE_CHECKING: pass` block in the
module and either delete it entirely or replace it with the intended conditional
imports; locate the `if TYPE_CHECKING` block in the context loader module
(symbol: TYPE_CHECKING) and remove the no-op, or populate it with any
forward-only imports required by type hints (e.g., the specific classes/types
used only for typing), ensuring no runtime imports are introduced.
In `@src/aiperf/dataset/loader/models.py`:
- Around line 254-256: Rename the TypeVar CustomDatasetT to follow the updated
naming convention (e.g., DatasetModelT) so it aligns with DatasetLoaderType;
update the declaration that currently reads CustomDatasetT =
TypeVar("CustomDatasetT", bound=SingleTurn | MultiTurn | RandomPool |
MooncakeTrace) to use the new identifier (DatasetModelT) and adjust any
references/usages of CustomDatasetT across the module to the new name to ensure
consistency with DatasetLoaderType and avoid unresolved names.
- Line 24: The Pydantic `type` discriminator fields (e.g., the field declared as
`type: Literal[DatasetLoaderType.SINGLE_TURN] = DatasetLoaderType.SINGLE_TURN`)
must include a Field(...) with a description per guidelines; update each `type`
declaration (the ones at the locations using `DatasetLoaderType.SINGLE_TURN` and
the other discriminator literals around lines 103, 130, and 206) to use
Field(description="...") giving a short description of the discriminator, and
ensure `Field` is imported from `pydantic` if not already present.
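A sketch of the requested shape, using a plain string literal in place of `DatasetLoaderType.SINGLE_TURN` (assumes Pydantic v2):

```python
from typing import Literal

from pydantic import BaseModel, Field


class SingleTurnDataset(BaseModel):
    """Illustrative model whose discriminator carries a Field description."""

    type: Literal["single_turn"] = Field(
        default="single_turn",
        description="Discriminator selecting the single-turn dataset loader.",
    )
```

The same pattern applies to the other discriminator literals mentioned in the comment.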
In `@src/aiperf/dataset/loader/multi_turn.py`:
- Around line 116-138: parse_and_validate is performing synchronous file I/O by
calling open(self.filename) inside an async pipeline (see async def load in
BaseFileLoader); change the file reading to run off the event loop (either wrap
the blocking work in asyncio.to_thread or switch to an async file reader like
aiofiles) so parsing remains non-blocking, i.e., move the loop that reads lines
and calls MultiTurn.model_validate_json into a to_thread call or replace
open/read with aiofiles and await the reads while keeping session_id generation
via self.ctx.session_id_generator.next() intact.
In `@src/aiperf/dataset/loader/single_turn.py`:
- Around line 90-110: parse_and_validate is doing blocking file I/O with
open(self.filename) which violates the async I/O guideline; make it async and
perform file reads without blocking the event loop. Change parse_and_validate to
async def parse_and_validate(...) and either (preferred) use
aiofiles.open(self.filename) and iterate lines with async for (preserving the
SingleTurn.model_validate_json and session id generation), or wrap the
synchronous read in await asyncio.to_thread(read_and_parse_fn, self.filename) if
you prefer a minimal change; also update the caller (e.g., BaseFileLoader.load)
to await parse_and_validate so the async flow remains correct and ensure you
preserve usage of self.ctx.session_id_generator.next() and returned dict[str,
list[SingleTurn]].
In `@src/aiperf/dataset/loader/synthetic/__init__.py`:
- Around line 4-6: The synthetic subpackage __init__ currently only exports
BaseSyntheticLoader, which hides the concrete loaders; update
src/aiperf/dataset/loader/synthetic/__init__.py to also import and include
SyntheticMultiModalLoader and SyntheticRankingsLoader in __all__ so that
consumers can access those classes directly from aiperf.dataset.loader.synthetic
(reference the symbols SyntheticMultiModalLoader, SyntheticRankingsLoader and
BaseSyntheticLoader when modifying the module exports).
In `@src/aiperf/dataset/loader/synthetic/base.py`:
- Around line 37-38: The RNG derivation keys use stale "composer" prefixes;
update the seed strings passed to rng.derive for _turn_sampler_rng and
_delay_sampler_rng from "composer.conversation.turn_count" and
"composer.conversation.turn_delay" to the new loader-consistent keys (e.g.
"loader.conversation.turn_count" and "loader.conversation.turn_delay") by
changing the arguments in the calls that create self._turn_sampler_rng and
self._delay_sampler_rng; be mindful this will change RNG sequences for seeded
runs.
- Around line 107-118: The video loader currently checks for falsy returns
before appending (in _generate_video_payloads -> video_generator.generate +
video.contents.append) while _generate_image_payloads and
_generate_audio_payloads append unconditionally; make these consistent: either
remove the `if data:` guard from _generate_video_payloads or add the same `if
data:` guard to _generate_image_payloads and _generate_audio_payloads around
their generator calls (image_generator.generate / audio_generator.generate) and
their contents.append calls, depending on whether generators can legitimately
return falsy values—update all three methods so they use the same null-guard
behavior.
- Around line 138-143: The modality checks are inconsistent:
include_image/include_audio use ".mean > 0" while include_video checks
width/height truthiness; update include_video to follow the same pattern or
explicitly document the difference. Modify the include_video property to first
try the same mean-based check (e.g., video_cfg = self.ctx.config.input.video;
return bool(getattr(video_cfg, "mean", 0) > 0) or fallback to
bool(video_cfg.width and video_cfg.height)), keeping references to include_video
and self.ctx.config.input.video.width/height so behavior is consistent and clear
across modalities.
In `@src/aiperf/dataset/loader/synthetic/multimodal.py`:
- Around line 28-37: The constructor's "at least one payload" validation and the
subsequent warning block omit include_video, causing video-only configs to be
rejected; update the validation in the class constructor to include
self.include_video in the boolean check (so the ValueError only fires when
prompt, image, audio, and video are all False), and update the corresponding
warning message block to include include_video in its condition and text so
video-only generation is accepted and the error/warning message accurately lists
video as a valid enabled modality.
In `@src/aiperf/plugin/schema/schemas.py`:
- Around line 314-334: PublicDatasetMetadata and the other model classes in this
module inherit from pydantic.BaseModel instead of the project's AIPerfBaseModel;
change the base class for PublicDatasetMetadata (and the other seven classes)
from BaseModel to AIPerfBaseModel so they follow the coding guideline for data
models (refer to class PublicDatasetMetadata and other model class definitions),
and if you intentionally must avoid AIPerfBaseModel (e.g., to prevent import
cycles with the plugin bootstrap), add a clear module-level comment explaining
the exemption and rationale so future contributors understand why BaseModel was
used.
In `@tests/unit/common/config/test_user_config_mooncake_trace.py`:
- Around line 43-108: Multiple tests (test_mooncake_trace_uses_dataset_size,
test_mooncake_trace_skips_empty_lines,
test_mooncake_trace_empty_file_returns_zero) duplicate the same setup and only
vary mock_file_content and expected count; refactor them into a single
parametrized test using pytest.mark.parametrize that supplies
(mock_file_content, expected_count) and reuses the existing patches for
pathlib.Path.exists and is_file and the same UserConfig/InputConfig
construction, then call config._count_dataset_entries() and assert equality for
each parameter set.
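A sketch of the consolidation; the entry-counting helper below is a stand-in for the real `UserConfig`/`_count_dataset_entries` setup, which would stay as in the existing tests:

```python
import pytest


def count_dataset_entries(file_content: str) -> int:
    """Stand-in for config._count_dataset_entries: count non-empty lines."""
    return sum(1 for line in file_content.splitlines() if line.strip())


@pytest.mark.parametrize(
    ("mock_file_content", "expected_count"),
    [
        ('{"a": 1}\n{"b": 2}\n', 2),      # normal file
        ('{"a": 1}\n\n\n{"b": 2}\n', 2),  # blank lines skipped
        ("", 0),                           # empty file
    ],
)
def test_count_dataset_entries(mock_file_content: str, expected_count: int) -> None:
    assert count_dataset_entries(mock_file_content) == expected_count
```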
In `@tests/unit/common/models/test_sequence_distribution.py`:
- Around line 687-697: MockLoader is duplicated in two tests; extract it out as
a shared class-level fixture or a nested class to follow DRY. Create a single
reusable MockLoader (subclassing BaseDatasetLoader) that implements load,
can_load, and get_preferred_sampling_strategy, and then reference that
fixture/class from both test methods instead of redefining it; ensure the symbol
names remain MockLoader, load, can_load, and get_preferred_sampling_strategy so
existing tests pick it up.
In `@tests/unit/dataset/conftest.py`:
- Around line 75-87: The fixture empty_dataset_manager is declared async but
contains no awaits; change its definition from async def
empty_dataset_manager(...) to a regular def so it's a synchronous pytest
fixture. Update the signature while keeping the return type DatasetManager and
ensure the body that constructs DatasetManager, sets manager.dataset_metadata
using DatasetMetadata and DatasetSamplingStrategy.SEQUENTIAL, and returns
manager remains unchanged.
In `@tests/unit/dataset/loader/test_can_load.py`:
- Around line 19-31: The test helper _infer_type differs from production by not
wrapping explicit type resolution; adjust _infer_type to catch ValueError and
KeyError raised by DatasetLoaderType(data["type"]) or plugins.get_class(...) and
re-raise a ValueError with a user-friendly message (mirroring
DatasetManager._infer_type) that includes the invalid type string, then proceed
to call LoaderClass.can_load(data, filename) and raise the existing error if it
cannot handle the data; reference _infer_type, DatasetLoaderType,
plugins.get_class, and LoaderClass.can_load when locating the change.
- Around line 19-46: The test helper _infer_type duplicates production logic
from DatasetManager._infer_type; to prevent drift, add a clear comment in the
test above _infer_type that points to DatasetManager._infer_type as the source
of truth (mention DatasetLoaderType, plugins.iter_all and
PluginType.DATASET_LOADER), and optionally note that the logic should be
extracted to a shared utility in the future; if you prefer immediate
consolidation, move the inference code into a small internal utility function
(used by both DatasetManager._infer_type and the test) and update references in
the test to call that utility instead of duplicating logic.
In `@tests/unit/dataset/test_dataset_manager.py`:
- Around line 91-98: The helper function _mock_load_dummy_conversation is
intentionally written to accept the instance args passed when patching
BaseFileLoader.load but Ruff flags unused args/kwargs (ARG001); update the
signature to accept `*_args` and `**_kwargs` (underscore-prefixed so Ruff treats them as intentionally unused)
or add an inline noqa comment (e.g., `# noqa: ARG001`) to the function
definition so Ruff no longer complains while preserving its behavior when used
as a replacement for BaseFileLoader.load.
```yaml
- Dataset Loaders: dataset-loaders.md
- Benchmark Datasets: benchmark_datasets.md
- Feature Comparison: genai-perf-feature-comparison.md
- Development: Development.md
- Development:
    - Creating Dataset Loaders: dev/creating-dataset-loaders.md
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
echo "=== Reading mkdocs.yml to verify navigation structure ==="
cat -n mkdocs.yml
echo ""
echo "=== Checking for dataset-loaders.md ==="
find docs -name "dataset-loaders.md" -o -name "dataset-loaders.md" 2>/dev/null || echo "Not found"
echo ""
echo "=== Checking for creating-dataset-loaders.md ==="
find docs -name "creating-dataset-loaders.md" 2>/dev/null || echo "Not found"
echo ""
echo "=== Checking full 'dev/' directory contents ==="
find docs/dev -type f -name "*.md" 2>/dev/null || echo "dev directory not found or empty"
echo ""
echo "=== Checking all doc files with 'loader' in name ==="
find docs -type f -name "*loader*" 2>/dev/null || echo "No files with 'loader' found"
```

Repository: ai-dynamo/aiperf
Naming collision in navigation structure.
Lines 31 and 32-33 create two navigation entries named "Development" at different hierarchy levels: one under "Reference" pointing to Development.md, and one as a top-level section containing Creating Dataset Loaders. This results in duplicate "Development" items in the navigation menu, creating confusion for users. Consider renaming the new top-level section (e.g., Plugin Development:, Extending AIPerf:, or Contributing:) or merging its content under an existing section.
Both referenced documentation files (dataset-loaders.md and dev/creating-dataset-loaders.md) exist and will not cause build failures.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@mkdocs.yml` around lines 28 - 33, The navigation has a naming collision: two
"Development" entries (one linking to Development.md under Reference and one as
a top-level section containing dev/creating-dataset-loaders.md). Rename the
top-level section to a unique label (e.g., "Plugin Development", "Extending
AIPerf", or "Contributing") or move dev/creating-dataset-loaders.md under the
existing "Development" entry in mkdocs.yml, updating the nav key for the section
that contains dev/creating-dataset-loaders.md and ensuring the other link to
Development.md remains unchanged.
```python
def _infer_dataset_type(self, file_path: str) -> DatasetLoaderType:
    """Infer the dataset type from the input file.

    Queries all registered loaders to check if they can handle the data format.

    Args:
        file_path: Path to the JSONL file or directory

    Returns:
        DatasetLoaderType if successfully inferred

    Raises:
        ValueError: If no loader can handle the data format
    """
    path = Path(file_path)

    # If it's a directory, use path-based detection only
    if path.is_dir():
        return self._infer_type(data=None, filename=file_path)

    # For files, read first non-empty line and use both content and path detection
    try:
        with open(file_path) as f:
            for line in f:
                if not (line := line.strip()):
                    continue
                data = load_json_str(line)
                return self._infer_type(data=data, filename=file_path)
    except ValueError as e:
        self.exception(
            f"Error inferring dataset type from file: {file_path}: {e!r}"
        )
        raise

    raise ValueError(f"Empty file: {file_path}. Cannot infer dataset type.")
```
🧩 Analysis chain
🏁 Script executed:

```bash
# First, locate the file and check its existence
find . -name "dataset_manager.py" -type f
```

🏁 Script executed:

```bash
# Read the relevant section to verify the code
sed -n '334,368p' src/aiperf/dataset/dataset_manager.py
```

🏁 Script executed:

```bash
# Check the _create_file_loader method to see if it calls _infer_dataset_type
rg -n '_create_file_loader|_infer_dataset_type' src/aiperf/dataset/dataset_manager.py -A 10 -B 2
```

🏁 Script executed:

```bash
# Check _create_loader to see if it's async and calls _create_file_loader
rg -n '_create_loader' src/aiperf/dataset/dataset_manager.py -A 15 -B 2
```

🏁 Script executed:

```bash
# Check _configure_dataset to see if it calls _create_loader
rg -n '_configure_dataset' src/aiperf/dataset/dataset_manager.py -A 30 -B 2 | head -80
```

🏁 Script executed:

```bash
# Verify the method signatures (async vs sync)
rg -n 'async def.*_configure_dataset|async def.*_create_loader|def _create_file_loader|def _infer_dataset_type' src/aiperf/dataset/dataset_manager.py
```

🏁 Script executed:

```bash
# Check if asyncio.to_thread is already used in the file
rg -n 'asyncio\.to_thread|aiofiles' src/aiperf/dataset/dataset_manager.py
```
Wrap blocking file I/O in asyncio.to_thread for event loop compatibility.
The open(file_path) at line 356 performs synchronous file I/O. This method is called from _configure_dataset (async) → _create_loader (async) → _create_file_loader (sync) → _infer_dataset_type (sync), running on the event loop thread and blocking it during file reading.
The file already uses asyncio.to_thread at line 136 for the tokenizer (a similar blocking operation). Apply the same pattern here: either wrap the file read in asyncio.to_thread or make _infer_dataset_type and _create_file_loader async and use aiofiles or asyncio.to_thread.
Per coding guidelines: "Use async/await for ALL I/O operations - no blocking calls."
🧰 Tools
🪛 Ruff (0.15.1)
[warning] 368-368: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/aiperf/dataset/dataset_manager.py` around lines 334 - 368, The
_infer_dataset_type function performs blocking file I/O via open(file_path)
which can block the event loop when called from async paths (_configure_dataset
→ _create_loader → _create_file_loader); change the implementation to perform
the file read on a thread (use asyncio.to_thread) or convert
_infer_dataset_type/_create_file_loader to async and use aiofiles so the loop
isn't blocked; specifically, move the logic that opens and iterates lines (and
calls load_json_str) into a callable passed to asyncio.to_thread (or an
aiofiles-based async read) and keep the rest of _infer_type(filename=...,
data=...) unchanged so callers like _create_loader/_configure_dataset remain
non-blocking.
```python
@property
def seq_distribution(self):
    """The sequence distribution for ISL/OSL pairing, or None."""
    return self._seq_distribution
```
🧩 Analysis chain
🏁 Script executed:

```bash
#!/bin/bash
# Read the specific lines in the file
sed -n '110,140p' src/aiperf/dataset/loader/context.py | cat -n
```

🏁 Script executed:

```bash
#!/bin/bash
# Search for get_sequence_distribution definition and its return type
rg "def get_sequence_distribution" --type py -A 5
```

🏁 Script executed:

```bash
#!/bin/bash
# Find where _seq_distribution is assigned in the context.py file
rg "_seq_distribution" src/aiperf/dataset/loader/context.py -B 2 -A 2
```

🏁 Script executed:

```bash
#!/bin/bash
# Find DistributionParser class and its parse method
rg "class DistributionParser" --type py -A 20
```

🏁 Script executed:

```bash
#!/bin/bash
# Also search for parse method definition within DistributionParser
rg "def parse" src/aiperf/common/models/sequence_distribution.py -B 2 -A 10
```

🏁 Script executed:

```bash
#!/bin/bash
# Check imports at the top of context.py
head -50 src/aiperf/dataset/loader/context.py | cat -n
```

🏁 Script executed:

```bash
#!/bin/bash
# Search for SequenceLengthDistribution imports in the codebase
rg "from.*SequenceLengthDistribution|import.*SequenceLengthDistribution" --type py
```

🏁 Script executed:

```bash
#!/bin/bash
# Check how SequenceLengthDistribution is used in other files
rg "SequenceLengthDistribution" --type py -B 2 | head -40
```

🏁 Script executed:

```bash
#!/bin/bash
# Check if SequenceLengthDistribution is exported from aiperf.common.models
grep -n "SequenceLengthDistribution" src/aiperf/common/models/__init__.py
```

🏁 Script executed:

```bash
#!/bin/bash
# Check the structure of aiperf.common.models
cat src/aiperf/common/models/__init__.py
```
Add return type annotation to seq_distribution property.
Per coding guidelines, all functions require type hints on parameters and return types. The correct return type is SequenceLengthDistribution | None (imported from aiperf.common.models), not the generic object | None.
Suggested fix:

```diff
 @property
-def seq_distribution(self):
+def seq_distribution(self) -> SequenceLengthDistribution | None:
     """The sequence distribution for ISL/OSL pairing, or None."""
     return self._seq_distribution
```

Add SequenceLengthDistribution to the existing import from aiperf.common.models on line 11.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/aiperf/dataset/loader/context.py` around lines 121 - 124, The property
seq_distribution is missing a return type hint; change its signature to return
SequenceLengthDistribution | None and add SequenceLengthDistribution to the
import list from aiperf.common.models so the property (def
seq_distribution(self) -> SequenceLengthDistribution | None) and its use of
self._seq_distribution are correctly typed.
```python
with open(self.filename) as f:
    content = f.read()
dataset = load_json_str(content)
```
🧩 Analysis chain
🏁 Script executed:

```bash
# First, find the parse_and_validate method in sharegpt.py
rg -n "def parse_and_validate" src/aiperf/dataset/loader/sharegpt.py -A 10
```

🏁 Script executed:

```bash
# Check the class structure and async context
rg -n "class.*Loader\|async def\|def parse_and_validate" src/aiperf/dataset/loader/sharegpt.py -B 2 -A 5
```

🏁 Script executed:

```bash
# Find existing async filesystem patterns in the repo
rg -n "aiofiles|to_thread|run_in_executor|async.*open|read_text" src/aiperf -t py | head -30
```

🏁 Script executed:

```bash
# Check how the loader is instantiated and called
rg -n "ShareGPT|parse_and_validate" src/aiperf --type py -B 2 -A 2 | head -50
```

🏁 Script executed:

```bash
# Check the base loader class to understand the architecture
find src/aiperf/dataset/loader -name "*.py" -exec basename {} \; | sort
```

🏁 Script executed:

```bash
# Examine BaseFileLoader class to understand the architecture
rg -n "class BaseFileLoader" src/aiperf/dataset/loader -A 50
```

🏁 Script executed:

```bash
# Find where parse_and_validate is called
rg -n "parse_and_validate" src/aiperf --type py -B 3 -A 3
```

🏁 Script executed:

```bash
# Check how loaders are invoked - look for load/initialize patterns
rg -n "def load|async def load|\.parse_and_validate" src/aiperf/dataset/loader/base.py -A 5
```
Blocking file read will block the async event loop in the loader pipeline.
The open().read() at lines 59-60 performs synchronous blocking I/O, and parse_and_validate() is called directly (without await or wrapping) from the async load() method in BaseFileLoader (line 44 in base.py). This will block the event loop.
Recommended fix: either wrap the parse_and_validate() call with asyncio.to_thread() in the base loader's load() method, or convert parse_and_validate() to async using aiofiles. The repo already uses asyncio.to_thread() in similar contexts (e.g., record_processor_service.py, dataset_manager.py).
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/aiperf/dataset/loader/sharegpt.py` around lines 59 - 61, The synchronous
file read and JSON parsing in parse_and_validate/load_json_str are blocking the
async event loop called from BaseFileLoader.load; change the implementation so
the blocking work runs off the event loop — either make parse_and_validate async
using aiofiles or (preferred to match repo patterns) call the blocking parts
(open/read and load_json_str) inside asyncio.to_thread from the loader's async
entrypoint (e.g., within ShareGPT loader's parse_and_validate or from
BaseFileLoader.load) so that the file open/read and JSON parsing happen on a
threadpool instead of blocking the event loop.
```python
def __init__(self, model_names: list[str]) -> None:
    self._model_names = model_names
    self._counter = 0

def select(self, turn: Turn) -> str:
    """Select the next model in round-robin order."""
    model_name = self._model_names[self._counter % len(self._model_names)]
    self._counter += 1
    return model_name
```
🧩 Analysis chain

🏁 Scripts executed to verify the finding:

```shell
# First, let's locate and examine the model_selection_strategies.py file
find . -name "model_selection_strategies.py" -type f

# Once we find the file, read its full content to understand all three strategies
cat -n src/aiperf/dataset/model_selection_strategies.py

# Search for where ModelSelectionStrategy is instantiated
rg -n "ModelSelectionStrategy\(" --type=py -B 2 -A 2

# Check _create_model_selector or similar factory methods
rg -n "_create_model_selector" --type=py -A 15

# Search for validation of model_names in config or dataset context
rg -n "model_names" --type=py -B 3 -A 3 | head -100

# Check EndpointConfig definition and validation
rg -n "class EndpointConfig" --type=py -A 30

# Search for model_names field definition in EndpointConfig
rg -n "model_names" --type=py src/aiperf/common/config | head -50

# Check if there's any validation in EndpointConfig
fd -e py -x grep -l "EndpointConfig" {} \; | head -5 | xargs -I {} grep -l "model_names" {}

# Let's find the actual EndpointConfig definition
find . -name "*.py" -type f -exec grep -l "class EndpointConfig" {} \;

# Read the full EndpointConfig to check for validation
cat -n src/aiperf/common/config/endpoint_config.py

# Check if there's validation for empty model_names in config_validators
grep -n "parse_str_or_list" src/aiperf/common/config/config_validators.py -A 10

# Check if there are any tests for empty model_names in strategies
rg -n "model_selection_strategies" --type=py tests/ -l

# Check for any tests that might use empty model_names
rg -n "model_names\s*=\s*\[\s*\]" --type=py

# Double-check parse_str_or_list to see if it validates non-empty
cat -n src/aiperf/common/config/config_validators.py | head -50
```
All three strategies crash on empty model_names, with no upstream validation to prevent it.
RoundRobinModelSelectionStrategy raises ZeroDivisionError (line 38), RandomModelSelectionStrategy raises IndexError (line 52), and ShuffleModelSelectionStrategy raises IndexError after a no-op reshuffle (line 78). While EndpointConfig.model_names is a required field, there is no min_length constraint (unlike urls which uses min_length=1), and the validator parse_str_or_list does not enforce non-empty lists. Since these are public plugin-registered strategies, adding defensive validation in each constructor would prevent cryptic runtime errors.
🛡️ Proposed fix: validate in constructors
For example, in RoundRobinModelSelectionStrategy:
```diff
 def __init__(self, model_names: list[str]) -> None:
+    if not model_names:
+        raise ValueError("model_names must not be empty")
     self._model_names = model_names
     self._counter = 0
```

Apply the same guard to RandomModelSelectionStrategy.__init__ and ShuffleModelSelectionStrategy.__init__.
🧰 Tools
🪛 Ruff (0.15.1)
[warning] 36-36: Unused method argument: turn
(ARG002)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/aiperf/dataset/model_selection_strategies.py` around lines 32 - 40,
RoundRobinModelSelectionStrategy.__init__ (and likewise
RandomModelSelectionStrategy.__init__ and
ShuffleModelSelectionStrategy.__init__) must defensively validate the incoming
model_names list to prevent empty lists; add a guard that checks if model_names
is truthy and has length > 0 and raise a clear ValueError (or TypeError if not a
list) with a helpful message when invalid, so select() cannot raise
ZeroDivisionError/IndexError at runtime; update the constructors to validate
model_names before assigning to self._model_names and initializing self._counter
or shuffle state.
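A self-contained sketch of the round-robin case with the proposed guard in place. The class name echoes the strategy under review but this is an illustration, not the project's implementation (in particular, the `turn` parameter is dropped here for brevity).

```python
class RoundRobinSketch:
    """Round-robin model selection with defensive input validation."""

    def __init__(self, model_names: list[str]) -> None:
        if not model_names:
            # Fail fast with a clear message instead of a cryptic
            # ZeroDivisionError later in select().
            raise ValueError("model_names must not be empty")
        self._model_names = model_names
        self._counter = 0

    def select(self) -> str:
        # Cycle through the list: index wraps via modulo.
        name = self._model_names[self._counter % len(self._model_names)]
        self._counter += 1
        return name
```

With `["a", "b"]`, three calls yield `a`, `b`, `a`; constructing with `[]` raises immediately at configuration time rather than mid-benchmark.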
```python
async def _download(name: str, url: str, remote_filename: str) -> Path:
    """Download a public dataset (or return cached path)."""
    cache_filepath = AIPERF_DATASET_CACHE_DIR / remote_filename

    if cache_filepath.exists():
        return cache_filepath

    http_client = AioHttpClient(timeout=Environment.DATASET.PUBLIC_DATASET_TIMEOUT)
    try:
        record: RequestRecord = await http_client.get_request(
            url, headers={"Accept": "application/json"}
        )
        content = record.responses[0].text
    finally:
        await http_client.close()

    try:
        cache_filepath.parent.mkdir(parents=True, exist_ok=True)
        with open(cache_filepath, "w") as f:
            f.write(content)
    except Exception as e:
        raise DatasetLoaderError(f"Error saving dataset to local cache: {e}") from e
```
🧩 Analysis chain

🏁 Scripts executed to verify the finding:

```shell
# Check the actual file and surrounding context
cat -n src/aiperf/dataset/public_datasets.py

# Search for async filesystem patterns in the codebase
rg -n "aiofiles|to_thread|run_in_executor|asyncio\.to_thread|write_text|read_text" src/aiperf --type py

# Find where _download is called from to understand the context
rg -n "_download\(|from.*public_datasets import" src/aiperf --type py -A 2

# Check for BaseFileLoader or similar async patterns in the dataset module
rg -n "async def.*open|async.*write|async.*mkdir|class.*FileLoader" src/aiperf/dataset --type py
```
Use async file operations for filesystem I/O inside the async function.
The _download function at lines 78–99 is async but uses blocking filesystem calls (Path.exists(), mkdir(), open().write()), which stall the event loop during dataset downloads. The codebase has established patterns: use aiofiles (as seen in raw_record_writer_processor.py, buffered_jsonl_writer_mixin.py, memory_map_utils.py) or asyncio.to_thread() (as used throughout for blocking operations like inference_result_parser.py, dataset_manager.py).
Rewrite the file operations (lines 82, 95–97) to use aiofiles or wrap with asyncio.to_thread().
🧰 Tools
🪛 Ruff (0.15.1)
[warning] 78-78: Unused function argument: name
(ARG001)
[warning] 99-99: Avoid specifying long messages outside the exception class
(TRY003)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/aiperf/dataset/public_datasets.py` around lines 78 - 99, The async
function _download uses blocking filesystem calls on cache_filepath
(Path.exists(), cache_filepath.parent.mkdir, and open(...).write) which can
block the event loop; change these to async equivalents by either using aiofiles
for file write (open/write) and using asyncio.to_thread for Path.exists() and
mkdir (or use asyncio.to_thread for all three) so all filesystem I/O in
_download is non-blocking; ensure you reference AIPERF_DATASET_CACHE_DIR and
cache_filepath in the changes and preserve the existing error handling that
raises DatasetLoaderError on save failures.
```diff
 def test_dataset_type_without_file_raises_error():
     """
-    Test that setting custom_dataset_type without a file raises ValidationError.
+    Test that setting dataset_type without a file raises ValidationError.

     This validates the validate_custom_dataset_file model validator.
     """
     with pytest.raises(ValidationError) as exc:
-        InputConfig(custom_dataset_type=CustomDatasetType.SINGLE_TURN, file=None)
+        InputConfig(dataset_type=DatasetLoaderType.SINGLE_TURN, file=None)

-    assert "Custom dataset type requires --input-file to be provided" in str(exc.value)
+    assert "Dataset type requires --input-file to be provided" in str(exc.value)
```
Stale docstring reference to old validator name.
Line 142 still references validate_custom_dataset_file, which was renamed in this refactor. Update to match the new validator name.
📝 Suggested fix

```diff
 def test_dataset_type_without_file_raises_error():
     """
     Test that setting dataset_type without a file raises ValidationError.

-    This validates the validate_custom_dataset_file model validator.
+    This validates the validate_dataset_type_file model validator.
     """
```

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/common/config/test_input_config.py` around lines 138 - 148, The
test docstring in test_dataset_type_without_file_raises_error still references
the old validator name `validate_custom_dataset_file`; update the docstring to
use the new validator name (e.g., the current validator on InputConfig) so it
reflects the refactor—locate the test function
`test_dataset_type_without_file_raises_error` and replace the stale reference
with the new validator identifier while keeping the rest of the docstring
intact.
```diff
@@ -679,30 +679,39 @@ class TestSequenceCaching:
     def test_turn_sequence_caching(self):
```
Test names don't follow the required test_<function>_<scenario>_<expected> convention
Both test method names omit the scenario and expected-outcome components mandated by the project conventions.
Suggested renames:
- `test_turn_sequence_caching` → `test_get_turn_sequence_lengths_same_turn_id_returns_cached_value`
- `test_different_turns_get_different_cache_entries` → `test_get_turn_sequence_lengths_distinct_turn_ids_cached_independently`
As per coding guidelines: "Name tests as test_<function>_<scenario>_<expected> (e.g., test_parse_config_missing_field_raises_error)".
Also applies to: 731-731
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/common/models/test_sequence_distribution.py` at line 679, The test
method names do not follow the required test_<function>_<scenario>_<expected>
convention; rename the methods to be explicit: change test_turn_sequence_caching
to test_get_turn_sequence_lengths_same_turn_id_returns_cached_value and change
test_different_turns_get_different_cache_entries to
test_get_turn_sequence_lengths_distinct_turn_ids_cached_independently so the
function under test (get_turn_sequence_lengths) and the scenario/expected
outcome are encoded in the name; update any references/imports or test markers
that use those names accordingly (e.g., in
tests/unit/common/models/test_sequence_distribution.py where
test_turn_sequence_caching and test_different_turns_get_different_cache_entries
are defined).
```diff
-from aiperf.common.models import Turn
-from aiperf.dataset.composer.base import BaseDatasetComposer
+from aiperf.common.models import Conversation, Turn
```
Conversation is imported but never used in either test method
Both test_turn_sequence_caching (line 682) and test_different_turns_get_different_cache_entries (line 734) import Conversation from aiperf.common.models, but neither method instantiates it — only Turn is used. Remove the unused import to silence the linter.
🛠️ Proposed fix

```diff
-from aiperf.common.models import Conversation, Turn
+from aiperf.common.models import Turn
```

(apply to both test methods)
Also applies to: 734-734
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/common/models/test_sequence_distribution.py` at line 682, Remove
the unused Conversation import from the test module: in the import line that
currently reads "from aiperf.common.models import Conversation, Turn" used by
test_turn_sequence_caching and test_different_turns_get_different_cache_entries,
drop Conversation so only Turn is imported; update the import statement
accordingly to silence the linter and leave test logic (Turn usage) unchanged.
```diff
 # Set up loader with distribution
 loader = MockLoader.__new__(MockLoader)
 dist = DistributionParser.parse("128,64:50;256,128:50")
-composer._seq_distribution = dist
-composer._turn_sequence_cache = {}
+loader._seq_distribution = dist
+loader._turn_sequence_cache = {}
 # Use the global RNG instead of _seq_rng
-composer._rng = rng.derive("test_composer")
+loader._rng = rng.derive("test_loader")
```
🧩 Analysis chain

🏁 Scripts executed to verify the finding:

```shell
# First, find the file structure and locate BaseDatasetLoader
find . -type f -name "*.py" | grep -E "(loader|dataset)" | head -20

# Search for BaseDatasetLoader class definition and __init__
rg -n "class BaseDatasetLoader" --type py -A 50

# Search for the two methods being tested
rg -n "def _get_turn_sequence_lengths" --type py -A 20
rg -n "def _clear_turn_cache" --type py -A 15

# Also check what MockLoader looks like and where it's defined
rg -n "class MockLoader" --type py -B 2 -A 30

# Search for _get_turn_sequence_lengths method
rg -n "def _get_turn_sequence_lengths" --type py -A 20 -B 2

# Search for _clear_turn_cache method
rg -n "def _clear_turn_cache" --type py -A 15 -B 2

# Also check if these methods are defined in BaseDatasetLoader
rg -n "_get_turn_sequence_lengths\|_clear_turn_cache" --type py -B 5 -A 3

# Read the test file lines 680-730 to see the full test setup and execution
sed -n '680,730p' tests/unit/common/models/test_sequence_distribution.py

# Read the second test around lines 740-780
sed -n '740,780p' tests/unit/common/models/test_sequence_distribution.py

# Search everywhere for these method names
rg -n "get_turn_sequence_lengths" --type py
rg -n "clear_turn_cache" --type py

# Check the full BaseDatasetLoader class to see all its methods
cat -n src/aiperf/dataset/loader/base.py | head -100

# Check if there's a mixin that might provide these methods
rg -n "class.*Mixin" src/aiperf/dataset/loader/ --type py -A 5

# Check LoaderContext.get_turn_sequence_lengths
cat -n src/aiperf/dataset/loader/context.py | head -150

# Search for _seq_distribution usage in BaseDatasetLoader or context
rg -n "_seq_distribution" --type py -B 2 -A 2
```
Test calls non-existent methods _get_turn_sequence_lengths and _clear_turn_cache on MockLoader
These methods do not exist in BaseDatasetLoader or any parent class. The test instantiates MockLoader via __new__ and manually assigns _seq_distribution, _turn_sequence_cache, and _rng, then immediately calls loader._get_turn_sequence_lengths(turn_id) at lines 712, 713, 714, 728, 768, 769 and loader._clear_turn_cache(turn_id) at lines 725 and 780, which will fail with AttributeError.
Note that _seq_distribution and _turn_sequence_cache belong to LoaderContext, not to loaders. The public method is ctx.get_turn_sequence_lengths(), not a loader method. If these private methods are intended as new functionality on BaseDatasetLoader, they must be implemented first.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@tests/unit/common/models/test_sequence_distribution.py` around lines 699 -
705, The test is calling nonexistent loader methods `_get_turn_sequence_lengths`
and `_clear_turn_cache` on MockLoader; fix by using the correct public API on
the context or implement the methods on the loader: either change the test to
call the LoaderContext public method `ctx.get_turn_sequence_lengths(turn_id)`
(and use the context to clear any cache instead of calling `_clear_turn_cache`)
and ensure `_seq_distribution` and `_turn_sequence_cache` are set on the
`LoaderContext`, or add explicit implementations of `_get_turn_sequence_lengths`
and `_clear_turn_cache` on `BaseDatasetLoader` that delegate to the loader's
`LoaderContext` storage (`_seq_distribution`, `_turn_sequence_cache`) so the
existing test calls succeed; update references to `MockLoader`,
`BaseDatasetLoader`, `LoaderContext`, `ctx.get_turn_sequence_lengths`,
`_seq_distribution`, and `_turn_sequence_cache` accordingly.
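For orientation, the per-turn caching behavior the tests mean to exercise looks roughly like the sketch below, kept on a context object as the review recommends. Every name here (`SeqCacheContext`, `get_turn_sequence_lengths`, `clear_turn_cache`, the candidate lengths) is an assumption for illustration, not the project's actual `LoaderContext` API.

```python
import random


class SeqCacheContext:
    """Illustrative context that caches sampled sequence lengths per turn id."""

    def __init__(self, seed: int = 0) -> None:
        self._rng = random.Random(seed)
        self._turn_sequence_cache: dict[str, tuple[int, int]] = {}

    def get_turn_sequence_lengths(self, turn_id: str) -> tuple[int, int]:
        # Same turn_id always returns the cached draw; distinct turn ids
        # get independent draws, which is the property the tests assert.
        if turn_id not in self._turn_sequence_cache:
            input_len = self._rng.choice([128, 256])
            output_len = input_len // 2
            self._turn_sequence_cache[turn_id] = (input_len, output_len)
        return self._turn_sequence_cache[turn_id]

    def clear_turn_cache(self, turn_id: str) -> None:
        # Drop a single entry; missing ids are a no-op.
        self._turn_sequence_cache.pop(turn_id, None)
```

If the tests are rewritten against the context rather than the loader, they would set up this object directly and call the public methods, avoiding `__new__`-based partial construction of `MockLoader`.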