This is the user interface that users will interact with.

By following these steps, you will be able to serve your models using the web UI. You can open your browser and chat with a model now.

If the models do not show up, try restarting the Gradio web server.

## Launch Chatbot Arena (side-by-side battle UI)

Currently, Chatbot Arena is powered by FastChat. Here is how you can launch an instance of Chatbot Arena locally.

FastChat supports popular API-based models such as OpenAI, Anthropic, Gemini, Mistral, and more. To add a custom API, please refer to the model support [doc](./docs/model_support.md). Below we take OpenAI models as an example.

Create a JSON configuration file `api_endpoint.json` with the API endpoints of the models you want to serve, for example:
```
{
    "gpt-4o-2024-05-13": {
        "model_name": "gpt-4o-2024-05-13",
        "api_base": "https://api.openai.com/v1",
        "api_type": "openai",
        "api_key": [Insert API Key],
        "anony_only": false
    }
}
```

For Anthropic models, specify `"api_type": "anthropic_message"` with your Anthropic key. Similarly, for Gemini models, specify `"api_type": "gemini"`. More details can be found in [api_provider.py](https://github.com/lm-sys/FastChat/blob/main/fastchat/serve/api_provider.py).
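
For instance, an Anthropic entry might look roughly like the following. This is only a sketch that mirrors the OpenAI example above; the model name and key are placeholders, and the exact set of fields each provider needs should be checked against [api_provider.py](https://github.com/lm-sys/FastChat/blob/main/fastchat/serve/api_provider.py).

```
"claude-3-5-sonnet-20240620": {
    "model_name": "claude-3-5-sonnet-20240620",
    "api_type": "anthropic_message",
    "api_key": [Insert Anthropic API Key],
    "anony_only": false
}
```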

To serve your own model using local GPUs, follow the instructions in [Serving with Web GUI](#serving-with-web-gui).
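
Before launching, it can help to sanity-check `api_endpoint.json`. The script below is a minimal sketch based only on the example config above (the field names come from that example; everything else is illustrative), and it assumes the `[Insert API Key]` placeholder has already been replaced so the file is valid JSON:

```python
import json

# Fields used by the sample entries above; adjust if your providers need a different set.
REQUIRED_FIELDS = {"model_name", "api_type", "api_key"}

with open("api_endpoint.json") as f:
    endpoints = json.load(f)

for name, cfg in endpoints.items():
    missing = REQUIRED_FIELDS - cfg.keys()
    if missing:
        print(f"{name}: missing fields {sorted(missing)}")
    else:
        print(f"{name}: looks ok ({cfg['api_type']})")
```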

#### (Optional): Advanced Features, Scalability, Third Party UI

- You can register multiple model workers to a single controller, which can be used for serving a single model with higher throughput or serving multiple models at the same time. When doing so, please allocate different GPUs and ports for different model workers, as sketched below.
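
For example, two workers serving different models on different GPUs and ports could be launched like this (an illustrative sketch; the model paths, ports, and controller address are placeholders for your own setup):

```console
# worker 0
CUDA_VISIBLE_DEVICES=0 python3 -m fastchat.serve.model_worker --model-path lmsys/vicuna-7b-v1.5 --controller-address http://localhost:21001 --port 31000 --worker-address http://localhost:31000
# worker 1
CUDA_VISIBLE_DEVICES=1 python3 -m fastchat.serve.model_worker --model-path lmsys/fastchat-t5-3b-v1.0 --controller-address http://localhost:21001 --port 31001 --worker-address http://localhost:31001
```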

We have pre-generated several category classifier benchmarks and ground truths. You can download them (with [`git-lfs`](https://git-lfs.com) installed) to the directory `classify/` by running

```console
# cd into classify/ and then copy the label_bench directory to the current directory
> cp -r categories-benchmark-eval/label_bench .
```

Your label_bench directory should follow the structure:

```markdown
├── label_bench/
│   ├── creative_writing_bench/
│   │   ├── data/
│   │   │   └── llama-v3p1-70b-instruct.json
│   │   └── test.json
│   ├── ...
│   ├── your_bench_name/
│   │   ├── data/
│   │   │   ├── your_classifier_data_1.json
│   │   │   ├── your_classifier_data_2.json
│   │   │   └── ...
│   │   └── test.json (your ground truth)
└── ...
```
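
To double-check that a new bench folder matches this layout before running the evaluation scripts, a small helper like the one below can be used. It is only a minimal sketch based on the tree above: it assumes `label_bench/` sits in the current directory and that each bench needs a `data/` folder of classifier outputs plus a `test.json` ground truth.

```python
from pathlib import Path

root = Path("label_bench")

for bench in sorted(p for p in root.iterdir() if p.is_dir()):
    # Classifier outputs live under data/, the ground truth is test.json.
    data_dir = bench / "data"
    data_files = list(data_dir.glob("*.json")) if data_dir.is_dir() else []
    has_truth = (bench / "test.json").is_file()
    status = "ok" if data_files and has_truth else "incomplete"
    print(f"{bench.name}: {status} "
          f"({len(data_files)} data file(s), test.json={'found' if has_truth else 'missing'})")
```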

## How to evaluate your category classifier?

To test your new classifier for a new category, first make sure you have created the category child class in `category.py`. Then, to generate classification labels, make the necessary edits in `config.yaml` and run

```console
python label.py --config config.yaml --testing
```

Then, add your new category bench to `tag_names` in `display_score.py`. After making sure that you also have a correctly formatted ground truth JSON file, you can report the performance of your classifier by running

```console
python display_score.py --bench <your_bench>
```

If you want to check out conflicts between your classifier and ground truth, use