-|**Load Generator**|`src/inference_endpoint/load_generator/`| Central orchestrator: `BenchmarkSession` owns the lifecycle, `Scheduler` controls timing, `LoadGenerator` issues queries |
-|**Endpoint Client**|`src/inference_endpoint/endpoint_client/`| Multi-process HTTP workers communicating via ZMQ IPC. `HTTPEndpointClient` is the main entry point |
-|**Dataset Manager**|`src/inference_endpoint/dataset_manager/`| Loads pickle, HuggingFace, JSONL datasets. `Dataset` base class with `load_sample()`/`num_samples()` interface |
-|**Metrics**|`src/inference_endpoint/metrics/`|`EventRecorder` writes to SQLite, `MetricsReporter` reads and aggregates (QPS, latency, TTFT, TPOT) |
-|**Config**|`src/inference_endpoint/config/`| Pydantic-based YAML schema (`schema.py`), ruleset registry for MLCommons compliance, `RuntimeSettings` for runtime state |
-|**CLI**|`src/inference_endpoint/config/cli.py`| cyclopts-based, auto-generated from `schema.py` Pydantic models. Flat shorthands via `cyclopts.Parameter(name=...)`|
+|**Load Generator**|`src/inference_endpoint/load_generator/`| Central orchestrator: `BenchmarkSession` owns the lifecycle, `Scheduler` controls timing, `LoadGenerator` issues queries |
+|**Endpoint Client**|`src/inference_endpoint/endpoint_client/`| Multi-process HTTP workers communicating via ZMQ IPC. `HTTPEndpointClient` is the main entry point |
+|**Dataset Manager**|`src/inference_endpoint/dataset_manager/`| Loads pickle, HuggingFace, JSONL datasets. `Dataset` base class with `load_sample()`/`num_samples()` interface |
+|**Metrics**|`src/inference_endpoint/metrics/`|`EventRecorder` writes to SQLite, `MetricsReporter` reads and aggregates (QPS, latency, TTFT, TPOT) |
+|**Config**|`src/inference_endpoint/config/`| Pydantic-based YAML schema (`schema.py`), ruleset registry for MLCommons compliance, `RuntimeSettings` for runtime state |
+|**CLI**|`src/inference_endpoint/cli.py`, `commands/benchmark/cli.py`| cyclopts-based, auto-generated from `schema.py` Pydantic models. Flat shorthands via `cyclopts.Parameter(alias=...)`|
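
For readers unfamiliar with the Dataset Manager row, the `load_sample()`/`num_samples()` interface it mentions might look roughly like the sketch below. Only the class name `Dataset` and the two method names come from the table; the signatures, the `JsonlDataset` subclass, and its constructor argument are hypothetical and are not taken from the repository.

```python
import json
from abc import ABC, abstractmethod
from typing import Any


class Dataset(ABC):
    """Hypothetical base class; only the two method names come from the table."""

    @abstractmethod
    def load_sample(self, index: int) -> Any:
        """Return the sample stored at position `index`."""

    @abstractmethod
    def num_samples(self) -> int:
        """Return how many samples the dataset holds."""


class JsonlDataset(Dataset):
    """Illustrative JSONL-backed dataset (assumed name, not the project's class)."""

    def __init__(self, lines: list[str]) -> None:
        # Parse one JSON object per non-blank line.
        self._records = [json.loads(line) for line in lines if line.strip()]

    def load_sample(self, index: int) -> Any:
        return self._records[index]

    def num_samples(self) -> int:
        return len(self._records)
```

Under this assumption, a load generator could iterate `range(ds.num_samples())` and call `ds.load_sample(i)` without knowing which on-disk format backs the dataset.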