This document describes the datasets that AIPerf can use to generate stimulus. Support for additional datasets is under development, so check back often.

| Dataset | Support | Data Source |
|---|---|---|
| Synthetic Text | ✅ | Synthetically generated text prompts drawn from Shakespeare's works |
| Synthetic Audio | ✅ | Synthetically generated audio samples |
| Synthetic Images | ✅ | Synthetically generated image samples |
| Custom Data | ✅ | Your own JSONL file, passed with `--input-file your_file.jsonl --custom-dataset-type single_turn` |
| Mooncake | ✅ | Mooncake trace file, passed with `--input-file your_trace_file.jsonl --custom-dataset-type mooncake_trace` |
| ShareGPT | ✅ | ShareGPT conversations, selected with `--public-dataset sharegpt` |
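
For the Custom Data row, the following is a minimal sketch of how a single-turn JSONL file might be produced. It assumes each line is a standalone JSON object with a `text` field holding the prompt; that field name and the helper `write_single_turn_jsonl` are illustrative assumptions rather than a confirmed schema, so check the AIPerf custom dataset documentation before relying on them.

```python
import json
from pathlib import Path


def write_single_turn_jsonl(prompts, path):
    """Write one JSON object per line (JSONL).

    The "text" key is an assumed field name for the prompt,
    not a schema confirmed by this document.
    """
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for prompt in prompts:
            # json.dumps escapes embedded newlines, so each record stays on one line.
            f.write(json.dumps({"text": prompt}, ensure_ascii=False) + "\n")
    return path


if __name__ == "__main__":
    write_single_turn_jsonl(
        ["What is deep learning?", "Summarize Hamlet in one sentence."],
        "your_file.jsonl",
    )
```

The resulting `your_file.jsonl` could then be supplied through the `--input-file your_file.jsonl --custom-dataset-type single_turn` options listed in the table.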