
Commit 33d60c9 (parent fd6e375): update documentation

File tree: 1 file changed (+26 −0 lines)


README.md

@@ -237,6 +237,32 @@ This is the user interface that users will interact with.

By following these steps, you will be able to serve your models using the web UI. You can open your browser and chat with a model now.

If the models do not show up, try to restart the Gradio web server.


## Launch Chatbot Arena

Currently, Chatbot Arena is powered by FastChat. Here is how you can launch an instance of Chatbot Arena locally.

Create a file `api_endpoint.json` and record the API endpoints of the models you want to serve, for example:
```
{
    "gpt-4o-2024-05-13": {
        "model_name": "gpt-4o-2024-05-13",
        "api_base": "https://api.openai.com/v1",
        "api_type": "openai",
        "api_key": [Insert API Key],
        "anony_only": false
    }
}
```
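Note that `[Insert API Key]` is a placeholder: JSON requires the value to be a quoted string, so the file will not parse until you substitute a real key. As a sketch (the heredoc and the throwaway key below are illustrations, not part of FastChat), you can write the file and sanity-check it with Python's stdlib `json.tool` before launching:

```shell
# Write the endpoint file with a dummy key (placeholder only -- replace with a real one).
cat > api_endpoint.json <<'EOF'
{
    "gpt-4o-2024-05-13": {
        "model_name": "gpt-4o-2024-05-13",
        "api_base": "https://api.openai.com/v1",
        "api_type": "openai",
        "api_key": "sk-placeholder",
        "anony_only": false
    }
}
EOF

# Validate: json.tool exits non-zero on malformed JSON.
python3 -m json.tool api_endpoint.json > /dev/null && echo "api_endpoint.json is valid JSON"
```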

Then make sure to export `OPENAI_API_KEY` with your OpenAI key. For Anthropic models, set `api_type` to `"anthropic_message"` and export `ANTHROPIC_API_KEY` with your Anthropic key. For Gemini models, set `api_type` to `"gemini"` and export `GEMINI_API_KEY` with your Gemini key. For additional information, you can refer to `fastchat/serve/api_provider.py` for implementation details of other model types.
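For example, the environment variables can be exported in the shell that will run the server (the values below are placeholders, not real keys):

```shell
# Placeholder values for illustration; substitute the keys from your provider accounts.
export OPENAI_API_KEY="sk-placeholder"
export ANTHROPIC_API_KEY="sk-ant-placeholder"
export GEMINI_API_KEY="gemini-placeholder"
```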

If you want to serve your own model using local GPUs, follow the instructions in [Serving with Web GUI](#serving-with-web-gui).

To launch a Gradio web server, run `gradio_web_server_multi.py`:
```
cd fastchat/serve
python gradio_web_server_multi.py --port 8080 --share --register-api-endpoint-file api_endpoint.json
```

#### (Optional): Advanced Features, Scalability, Third Party UI
- You can register multiple model workers to a single controller, which can be used for serving a single model with higher throughput or serving multiple models at the same time. When doing so, please allocate different GPUs and ports for different model workers.
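As a sketch, two workers for two different models might be launched as follows, pinning each worker to its own GPU and port. The model names, ports, and the controller address `http://localhost:21001` are illustrative assumptions based on FastChat's standard worker flags; adjust them to your setup.

```shell
# Worker 0 on GPU 0 (assumes a controller is already running on port 21001).
CUDA_VISIBLE_DEVICES=0 python3 -m fastchat.serve.model_worker \
    --model-path lmsys/vicuna-7b-v1.5 \
    --controller http://localhost:21001 \
    --port 31000 --worker http://localhost:31000

# Worker 1 on GPU 1, with a distinct port and worker address.
CUDA_VISIBLE_DEVICES=1 python3 -m fastchat.serve.model_worker \
    --model-path lmsys/fastchat-t5-3b-v1.0 \
    --controller http://localhost:21001 \
    --port 31001 --worker http://localhost:31001
```

Each worker registers itself with the controller, which then routes requests for the corresponding model name to the right worker.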
