
Commit a546a16

update code
Signed-off-by: guangli.bao <[email protected]>
Parent: d246dee

4 files changed: 16 additions & 22 deletions


docs/datasets.md

Lines changed: 14 additions & 20 deletions
````diff
@@ -221,32 +221,26 @@ benchmark_generative_text(data=data, ...)
 - For lists of items, all elements must be of the same type.
 - A processor/tokenizer is only required if `GUIDELLM__PREFERRED_PROMPT_TOKENS_SOURCE="local"` or `GUIDELLM__PREFERRED_OUTPUT_TOKENS_SOURCE="local"` is set in the environment. In this case, the processor/tokenizer must be specified using the `--processor` argument. If not set, the processor/tokenizer will be set to the model passed in or retrieved from the server.
 
-
 ### ShareGPT Datasets
 
 You can use ShareGPT_V3_unfiltered_cleaned_split.json as a benchmark dataset.
 
-1. Download and prepare the ShareGPT dataset
-You can specify the proportion of data to process by providing a number between 0 and 1 as an argument to the script.
+#### Example Commands
 
-```bash
-cd src/guidellm/utils
-pip install -r requirements.txt
-bash prepare_sharegpt_data.sh 1
-```
+Download and prepare the ShareGPT dataset. You can specify the proportion of data to process by providing a number between 0 and 1 as an argument to the script.
 
-In this example, 1 indicates processing 100% of the dataset. You can adjust this value as needed.
+```bash
+cd src/guidellm/utils && pip install -r requirements.txt && bash prepare_sharegpt_data.sh 1
+```
 
-Conda env Recommanded to install libs.
+In this example, 1 indicates processing 100% of the dataset. You can adjust this value as needed. A Conda environment is recommended for installing the required libraries.
 
-2. Run the benchmark
-Example:
-
-```bash
-guidellm benchmark \
-  --target "http://localhost:8000" \
-  --rate-type "throughput" \
-  --data-args '{"prompt_column": "value", "split": "train"}' \
-  --max-requests 10 \
-  --data "/${local_path}/ShareGPT.json"
-```
+```bash
+guidellm benchmark \
+  --target "http://localhost:8000" \
+  --rate-type "throughput" \
+  --data-args '{"prompt_column": "value", "split": "train"}' \
+  --max-requests 10 \
+  --data "/${local_path}/ShareGPT.json"
+```
````
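The context note above says a processor/tokenizer is only needed when the token sources are set to "local". A hedged sketch of that configuration, reusing the benchmark command from this diff; the `--processor` value is a placeholder model id, not something this commit prescribes:

```bash
# Only needed when token counts are computed locally (see the note above).
export GUIDELLM__PREFERRED_PROMPT_TOKENS_SOURCE="local"
export GUIDELLM__PREFERRED_OUTPUT_TOKENS_SOURCE="local"

guidellm benchmark \
  --target "http://localhost:8000" \
  --rate-type "throughput" \
  --data-args '{"prompt_column": "value", "split": "train"}' \
  --max-requests 10 \
  --processor "mistralai/Mistral-7B-Instruct-v0.3" \
  --data "/${local_path}/ShareGPT.json"
```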
src/guidellm/utils/prepare_sharegpt_data.sh

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,4 +1,4 @@
 #!/bin/bash
 
 wget https://huggingface.co/datasets/anon8231489123/ShareGPT_Vicuna_unfiltered/resolve/main/ShareGPT_V3_unfiltered_cleaned_split.json
-python3 sharegpt_data_preprocessing.py --parse $1
+python3 preprocessing_sharegpt_data.py --parse $1
```
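Since the wrapper forwards its first positional argument to `--parse`, a fractional run follows directly from the docs above; a usage sketch (the 0.5 value is illustrative):

```bash
cd src/guidellm/utils
# Download ShareGPT and preprocess only half of the conversations.
bash prepare_sharegpt_data.sh 0.5
```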
src/guidellm/utils/requirements.txt

Lines changed: 1 addition & 1 deletion

```diff
@@ -1,4 +1,4 @@
 tqdm
 pandas
 openai
-pyyaml
+pyyaml
```

The `pyyaml` entry is unchanged in content; the paired deletion and addition apparently reflect a newline fix at the end of the file.
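The docs above recommend a Conda environment for these libraries; a minimal sketch (the environment name and Python version are arbitrary choices, not from this commit):

```bash
# Isolated env for the ShareGPT preprocessing utilities.
conda create -n guidellm-prep python=3.11 -y
conda activate guidellm-prep
pip install -r src/guidellm/utils/requirements.txt  # tqdm, pandas, openai, pyyaml
```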
