Further instructions on the final submission will be published as the deadline approaches.

MLCommons provides students with a [Submission UI](https://submissions-ui.mlcommons.org/index), where they can upload the generated **.tar** file using their assigned submission ID.

The deadline for submitting results is 6:00 PM CDT on Monday, November 17, 2025.

Alternatively, students may use the Submission CLI provided through the MLCFlow automation. To do this, first follow the installation steps in this [guide](../../../install/index.md). After installing, follow the instructions under [**Upload the final submission**](https://docs.mlcommons.org/inference/submission/#upload-the-final-submission).
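The Submission UI expects a single **.tar** archive. A generic packaging sketch (the `mlperf_results` directory name and any internal layout required by the submission checker are assumptions; consult the official submission instructions for the exact structure):

```bash
mkdir -p mlperf_results            # no-op if the directory already exists
tar -cf submission.tar mlperf_results/
tar -tf submission.tar             # verify the archive contents before uploading
```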
This will download the full preprocessed dataset file (`mlperf_deepseek_r1_dataset_4388_fp8_eval.pkl`) and the calibration dataset file (`mlperf_deepseek_r1_calibration_dataset_500_fp8_eval.pkl`).
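These files are standard Python pickles. A minimal sketch of the load pattern, using a stand-in file (the real datasets' internal schema is not documented in this excerpt, so the keys below are hypothetical):

```python
import pickle

# Hypothetical stand-in for the real dataset file; the actual
# mlperf_deepseek_r1_dataset_4388_fp8_eval.pkl has its own schema.
sample = {"prompts": ["What is 2 + 2?"], "max_tokens": [8]}
with open("sample_dataset.pkl", "wb") as f:
    pickle.dump(sample, f)

# The same load pattern applies to the downloaded .pkl files.
with open("sample_dataset.pkl", "rb") as f:
    data = pickle.load(f)
print(sorted(data.keys()))  # → ['max_tokens', 'prompts']
```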
To specify a custom download directory, use the `-d` flag.
**NOTE**: The `sglang` backend uses `sglang==0.5.4` installed into the `lmsysorg/sglang:v0.5.2-cu129-b200` base image.
## Backend-Specific Setup
After launching any Docker container, run the setup script (`setup.sh`), which automatically detects your backend.
The setup script creates a virtual environment and configures it differently based on the backend:
#### All Backends
- Virtual environment is **activated** after `setup.sh`
- Activate backend-specific venv using `source .venv_[pytorch|vllm|sglang]/bin/activate`
- All commands are to be run using the virtual environment
The reference implementation includes full support for MLPerf inference benchmarks.
### Running MLPerf Benchmarks
#### Offline Scenario
```bash
(.venv_BACKEND) $ python run_mlperf.py \
--mode offline \
    --input-file <input_dataset>.pkl \
    --output-dir mlperf_results
```
#### Server Scenario
```bash
(.venv_BACKEND) $ python run_mlperf.py \
--mode server \
--input-file <input_dataset>.pkl \
--output-dir mlperf_results
```
#### Interactive Scenario
```bash
(.venv_BACKEND) $ python run_mlperf.py \
--mode interactive \
--input-file <input_dataset>.pkl \
--output-dir mlperf_results
```
**NOTE:** To enable Speculative Decoding for the SGLang backend, toggle `BACKEND_REGISTRY['sglang']['enable_speculative_decode']` in `utils/backend_registry.py` (disabled by default).
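In registry terms the toggle is just a dict entry; a minimal sketch (only the `enable_speculative_decode` key comes from the docs, and the surrounding structure of `BACKEND_REGISTRY` is an assumption):

```python
# Hypothetical excerpt of utils/backend_registry.py -- only the
# 'enable_speculative_decode' key is taken from the documentation.
BACKEND_REGISTRY = {
    "sglang": {
        "enable_speculative_decode": False,  # default: disabled
    },
}

# Editing the file so this value is True enables speculative decoding:
BACKEND_REGISTRY["sglang"]["enable_speculative_decode"] = True
print(BACKEND_REGISTRY["sglang"]["enable_speculative_decode"])  # → True
```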
#### PyTorch Backend for MLPerf
PyTorch backend uses distributed execution with `torchrun` and `run_mlperf_mpi.py`.
> **Note**: For PyTorch backend, use the `_mpi` versions with `torchrun`. For vLLM and SGLang backends, use the single-process versions without `_mpi`.
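A sketch of the distributed invocation. Both the process count and the assumption that `run_mlperf_mpi.py` takes the same flags as `run_mlperf.py` are guesses; adjust to your hardware and the script's actual interface:

```bash
# 8 processes per node is a placeholder; match your GPU count.
torchrun --nproc_per_node=8 run_mlperf_mpi.py \
    --mode offline \
    --input-file <input_dataset>.pkl \
    --output-dir mlperf_results
```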
## Speculative Decoding
For the DeepSeek-R1 Interactive Scenario, users can enable the Speculative Decoding optimization for the SGLang backend by setting the `enable_speculative_decode` flag to `True` in `language/deepseek-r1/utils/backend_registry.py`.
When enabled, the SGLang backend will run the allowed configuration per the [Inference Policies](https://github.com/mlcommons/inference_policies/blob/master/inference_rules.adoc) (appendix-speculative-decoding).