
Commit c90f9ca — merge (2 parents: b65c565 + 05b9305)


41 files changed: +5543 -955 lines

.gitignore

Lines changed: 5 additions & 0 deletions

@@ -29,3 +29,8 @@ tests/state_of_the_union.txt
 
 # Build
 build
+
+# Image data
+serve_images
+val2014
+vqa_examples

README.md

Lines changed: 6 additions & 6 deletions

@@ -1,9 +1,9 @@
 # FastChat
-| [**Demo**](https://chat.lmsys.org/) | [**Discord**](https://discord.gg/HSWAKCrnFx) | [**X**](https://x.com/lmsysorg) |
+| [**Demo**](https://lmarena.ai/) | [**Discord**](https://discord.gg/HSWAKCrnFx) | [**X**](https://x.com/lmsysorg) |
 
 FastChat is an open platform for training, serving, and evaluating large language model based chatbots.
-- FastChat powers Chatbot Arena (https://chat.lmsys.org/), serving over 10 million chat requests for 70+ LLMs.
-- Chatbot Arena has collected over 500K human votes from side-by-side LLM battles to compile an online [LLM Elo leaderboard](https://leaderboard.lmsys.org).
+- FastChat powers Chatbot Arena ([lmarena.ai](https://lmarena.ai)), serving over 10 million chat requests for 70+ LLMs.
+- Chatbot Arena has collected over 1.5M human votes from side-by-side LLM battles to compile an online [LLM Elo leaderboard](https://lmarena.ai/?leaderboard).
 
 FastChat's core features include:
 - The training and evaluation code for state-of-the-art models (e.g., Vicuna, MT-Bench).

@@ -26,7 +26,7 @@ FastChat's core features include:
 
 </details>
 
-<a href="https://chat.lmsys.org"><img src="assets/demo_narrow.gif" width="70%"></a>
+<a href="https://lmarena.ai"><img src="assets/demo_narrow.gif" width="70%"></a>
 
 ## Contents
 - [Install](#install)

@@ -97,7 +97,7 @@ You can use the commands below to chat with them. They will automatically downlo
 
 ## Inference with Command Line Interface
 
-<a href="https://chat.lmsys.org"><img src="assets/screenshot_cli.png" width="70%"></a>
+<a href="https://lmarena.ai"><img src="assets/screenshot_cli.png" width="70%"></a>
 
 (Experimental Feature: You can specify `--style rich` to enable rich text output and better text streaming quality for some non-ASCII content. This may not work properly on certain terminals.)

@@ -202,7 +202,7 @@ export FASTCHAT_USE_MODELSCOPE=True
 
 ## Serving with Web GUI
 
-<a href="https://chat.lmsys.org"><img src="assets/screenshot_gui.png" width="70%"></a>
+<a href="https://lmarena.ai"><img src="assets/screenshot_gui.png" width="70%"></a>
 
 To serve using the web UI, you need three main components: web servers that interface with users, model workers that host one or more models, and a controller to coordinate the webserver and model workers. You can learn more about the architecture [here](docs/server_arch.md).

docker/Dockerfile

Lines changed: 1 addition & 1 deletion

@@ -4,4 +4,4 @@ RUN apt-get update -y && apt-get install -y python3.9 python3.9-distutils curl
 RUN curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
 RUN python3.9 get-pip.py
 RUN pip3 install fschat
-RUN pip3 install fschat[model_worker,webui] pydantic==1.10.13
+RUN pip3 install fschat[model_worker,webui]

docs/arena.md

Lines changed: 34 additions & 1 deletion

@@ -1,5 +1,5 @@
 # Chatbot Arena
-Chatbot Arena is an LLM benchmark platform featuring anonymous, randomized battles, available at https://chat.lmsys.org.
+Chatbot Arena is an LLM benchmark platform featuring anonymous, randomized battles, available at https://lmarena.ai.
 We invite the entire community to join this benchmarking effort by contributing your votes and models.
 
 ## How to add a new model

@@ -13,3 +13,36 @@ If you have a model hosted by a 3rd party API provider or yourself, please give
 ### Method 2: Hosted by LMSYS
 1. Contribute the code to support this model in FastChat by submitting a pull request. See [instructions](model_support.md).
 2. After the model is supported, we will try to schedule some compute resources to host the model in the arena. However, due to the limited resources we have, we may not be able to serve every model. We will select the models based on popularity, quality, diversity, and other factors.
+
+## How to launch vision arena
+
+1. Run `python3 -m fastchat.serve.controller` to start the controller and begin registering local model workers and API-provided workers.
+2. Run `python3 -m fastchat.serve.sglang_worker --model-path <model-path> --tokenizer-path <tokenizer-path>` to run local vision-language models. Currently supported models include the LLaVA and Yi-VL series.
+3. If you are using a 3rd-party model from an API provider (e.g., GPT-4-V, Gemini 1.5), please follow the instructions in [model_support.md](model_support.md) to add a JSON file `api_endpoints.json`.
+4. Run the Gradio server with the `--vision-arena` flag on.
+5. To store images in a remote directory, add the flag `--use-remote-storage`.
+6. To allow sampling of random questions, add `--random-questions metadata_sampled.json`. Check the sections below for how to generate this file.
+
+Example command:
+```
+python3 -m fastchat.serve.gradio_web_server_multi --share --register-api-endpoint-file api_endpoints.json --vision-arena --use-remote-storage --random-questions metadata_sampled.json
+```
+
+### NSFW and CSAM Detection
+1. Adding the NSFW endpoint and API key: add the following environment variables to run the NSFW moderation filter for images:
+   - `AZURE_IMG_MODERATION_ENDPOINT`: the endpoint where the NSFW moderator is hosted (e.g., https://{endpoint}/contentmoderator/moderate/v1.0/ProcessImage/Evaluate). Change `endpoint` to your own.
+   - `AZURE_IMG_MODERATION_API_KEY`: your API key for this endpoint.
+2. Adding the CSAM API key:
+   - `PHOTODNA_API_KEY`: the API key for the CSAM detector endpoint.
+
+Example in `~/.bashrc`:
+```
+export AZURE_IMG_MODERATION_ENDPOINT=https://<endpoint>/contentmoderator/moderate/v1.0/ProcessImage/Evaluate
+export AZURE_IMG_MODERATION_API_KEY=<api-key>
+export PHOTODNA_API_KEY=<api-key>
+```
+
+### Adding Random Samples for VQA
+We provide random samples of example images, drawn from datasets including DocVQA, RealWorldQA, ChartQA, and VizWiz-VQA, for users to interact with.
+1. Download the images and generate the random-questions file by running `python fastchat/serve/vision/create_vqa_examples_dir.py`.
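The NSFW environment variables above feed an HTTP call against an Azure Content Moderator `ProcessImage/Evaluate` endpoint. As a rough sketch of what such a call looks like (illustrative only, not FastChat's actual code; `build_moderation_request` and the placeholder endpoint/key values are hypothetical):

```python
import os
import urllib.request

# Placeholder values standing in for a real deployment's configuration.
os.environ.setdefault(
    "AZURE_IMG_MODERATION_ENDPOINT",
    "https://example.cognitiveservices.azure.com/contentmoderator/moderate/v1.0/ProcessImage/Evaluate",
)
os.environ.setdefault("AZURE_IMG_MODERATION_API_KEY", "placeholder-key")


def build_moderation_request(image_bytes: bytes) -> urllib.request.Request:
    """Build (but do not send) a POST request for the NSFW image filter."""
    return urllib.request.Request(
        os.environ["AZURE_IMG_MODERATION_ENDPOINT"],
        data=image_bytes,
        headers={
            # Azure Cognitive Services expects the subscription key in this header.
            "Ocp-Apim-Subscription-Key": os.environ["AZURE_IMG_MODERATION_API_KEY"],
            "Content-Type": "image/jpeg",
        },
        method="POST",
    )


req = build_moderation_request(b"\xff\xd8\xff\xe0")  # JPEG magic bytes as a stand-in
print(req.get_method())  # POST
```

Sending the request (e.g. via `urllib.request.urlopen`) returns a JSON verdict; the sketch stops before the network call so it can run without credentials.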

docs/dashinfer_integration.md

Lines changed: 23 additions & 0 deletions

@@ -0,0 +1,23 @@
+# dash-infer Integration
+[DashInfer](https://github.com/modelscope/dash-infer) is a high-performance inference engine optimized for CPU environments, delivering substantial speedups for LLM inference. It supports models including Llama, Qwen, and ChatGLM, making it a versatile, performant worker for FastChat. DashInfer shows significant performance gains on both Intel x64 and ARMv9 processors, and is well suited to deploying large language models in resource-constrained environments or scenarios where CPU inference is preferred over GPU acceleration.
+
+## Instructions
+1. Install dash-infer.
+```
+pip install dashinfer
+```
+
+2. When you launch a model worker, replace the normal worker (`fastchat.serve.model_worker`) with the dash-infer worker (`fastchat.serve.dashinfer_worker`). All other commands, such as the controller, Gradio web server, and OpenAI API server, stay the same.
+```
+python3 -m fastchat.serve.dashinfer_worker --model-path qwen/Qwen-7B-Chat --revision=master /path/to/dashinfer-model-generation-config.json
+```
+Here is an example:
+```
+python3 -m fastchat.serve.dashinfer_worker --model-path qwen/Qwen-7B-Chat --revision=master dash-infer/examples/python/model_config/config_qwen_v10_7b.json
+```
+
+If you use an already downloaded model, replace the model path with a local one and choose a conversation template via the `--conv-template` option:
+```
+python3 -m fastchat.serve.dashinfer_worker --model-path ~/.cache/modelscope/hub/qwen/Qwen-7B-Chat --conv-template qwen-7b-chat /path/to/dashinfer-model-generation-config.json
+```
+All available conversation templates are listed in [fastchat/conversation.py](../fastchat/conversation.py).

docs/dataset_release.md

Lines changed: 1 addition & 0 deletions

@@ -2,5 +2,6 @@
 We release the following datasets based on our projects and websites.
 
 - [LMSYS-Chat-1M: A Large-Scale Real-World LLM Conversation Dataset](https://huggingface.co/datasets/lmsys/lmsys-chat-1m)
+- [LMSYS-Human-Preference-55k](https://huggingface.co/datasets/lmsys/lmsys-arena-human-preference-55k)
 - [Chatbot Arena Conversation Dataset](https://huggingface.co/datasets/lmsys/chatbot_arena_conversations)
 - [MT-bench Human Annotation Dataset](https://huggingface.co/datasets/lmsys/mt_bench_human_judgments)

docs/model_support.md

Lines changed: 10 additions & 1 deletion

@@ -116,12 +116,21 @@ For custom protocols, implementation of a streaming generator in [fastchat/serve
     "api_type": "openai",
     "api_base": "https://api.openai.com/v1",
     "api_key": "sk-******",
-    "anony_only": false
+    "anony_only": false,
+    "recommended_config": {
+      "temperature": 0.7,
+      "top_p": 1.0
+    },
+    "text-arena": true,
+    "vision-arena": false
   }
 }
 ```
 - "api_type" can be one of the following: openai, anthropic, gemini, mistral, yandexgpt or reka. For custom APIs, add a new type and implement it accordingly.
 - "anony_only" indicates whether to display this model in anonymous mode only.
+- "recommended_config" indicates the recommended generation parameters for temperature and top_p.
+- "text-arena" indicates whether the model should be displayed in the Text Arena.
+- "vision-arena" indicates whether the model should be displayed in the Vision Arena.
 
 2. Launch the Gradio web server with the argument `--register api_endpoints.json`:
 ```
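The optional flags above lend themselves to a quick sanity check when the registration file is loaded. A minimal sketch, assuming the file layout shown in the diff (the loader below is illustrative, not FastChat's actual implementation, and the `example-model` key is made up):

```python
import json

REQUIRED_FIELDS = {"api_type", "api_base", "api_key"}
KNOWN_API_TYPES = {"openai", "anthropic", "gemini", "mistral", "yandexgpt", "reka"}


def load_endpoints(raw: str) -> dict:
    """Parse api_endpoints.json and fill in defaults for the optional flags."""
    endpoints = json.loads(raw)
    for name, cfg in endpoints.items():
        missing = REQUIRED_FIELDS - cfg.keys()
        if missing:
            raise ValueError(f"{name}: missing fields {sorted(missing)}")
        if cfg["api_type"] not in KNOWN_API_TYPES:
            raise ValueError(f"{name}: unknown api_type {cfg['api_type']!r}")
        # Optional knobs fall back to defaults when absent.
        cfg.setdefault("anony_only", False)
        cfg.setdefault("recommended_config", {"temperature": 0.7, "top_p": 1.0})
        cfg.setdefault("text-arena", True)
        cfg.setdefault("vision-arena", False)
    return endpoints


raw = """{
  "example-model": {
    "api_type": "openai",
    "api_base": "https://api.openai.com/v1",
    "api_key": "sk-******",
    "anony_only": false
  }
}"""
cfg = load_endpoints(raw)["example-model"]
print(cfg["text-arena"], cfg["vision-arena"])  # True False
```

Defaulting `text-arena` to true and `vision-arena` to false mirrors the example entry above, where only the text arena is enabled.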

fastchat/constants.py

Lines changed: 15 additions & 2 deletions

@@ -7,19 +7,32 @@
 
 REPO_PATH = os.path.dirname(os.path.dirname(__file__))
 
+# Survey Link URL (to be removed)
+SURVEY_LINK = """<div style='text-align: center; margin: 20px 0;'>
+    <div style='display: inline-block; border: 2px solid #DE3163; padding: 10px; border-radius: 5px;'>
+        <span style='color: #DE3163; font-weight: bold;'>We would love your feedback! Fill out <a href='https://docs.google.com/forms/d/e/1FAIpQLSfKSxwFOW6qD05phh4fwYjk8q0YV1VQe_bmK0_qOVTbC66_MA/viewform?usp=sf_link' style='color: #DE3163; text-decoration: underline;'>this short survey</a> to tell us what you like about the arena, what you don't like, and what you want to see in the future.</span>
+    </div>
+</div>"""
+
 ##### For the gradio web server
 SERVER_ERROR_MSG = (
     "**NETWORK ERROR DUE TO HIGH TRAFFIC. PLEASE REGENERATE OR REFRESH THIS PAGE.**"
 )
+TEXT_MODERATION_MSG = (
+    "$MODERATION$ YOUR TEXT VIOLATES OUR CONTENT MODERATION GUIDELINES."
+)
+IMAGE_MODERATION_MSG = (
+    "$MODERATION$ YOUR IMAGE VIOLATES OUR CONTENT MODERATION GUIDELINES."
+)
 MODERATION_MSG = "$MODERATION$ YOUR INPUT VIOLATES OUR CONTENT MODERATION GUIDELINES."
 CONVERSATION_LIMIT_MSG = "YOU HAVE REACHED THE CONVERSATION LENGTH LIMIT. PLEASE CLEAR HISTORY AND START A NEW CONVERSATION."
 INACTIVE_MSG = "THIS SESSION HAS BEEN INACTIVE FOR TOO LONG. PLEASE REFRESH THIS PAGE."
 SLOW_MODEL_MSG = "⚠️ Both models will show the responses all at once. Please stay patient as it may take over 30 seconds."
-RATE_LIMIT_MSG = "**RATE LIMIT OF THIS MODEL IS REACHED. PLEASE COME BACK LATER OR USE BATTLE MODE (the 1st tab).**"
+RATE_LIMIT_MSG = "**RATE LIMIT OF THIS MODEL IS REACHED. PLEASE COME BACK LATER OR USE <span style='color: red; font-weight: bold;'>[BATTLE MODE](https://lmarena.ai)</span> (the 1st tab).**"
 # Maximum input length
 INPUT_CHAR_LEN_LIMIT = int(os.getenv("FASTCHAT_INPUT_CHAR_LEN_LIMIT", 12000))
 BLIND_MODE_INPUT_CHAR_LEN_LIMIT = int(
-    os.getenv("FASTCHAT_BLIND_MODE_INPUT_CHAR_LEN_LIMIT", 24000)
+    os.getenv("FASTCHAT_BLIND_MODE_INPUT_CHAR_LEN_LIMIT", 30000)
 )
 # Maximum conversation turns
 CONVERSATION_TURN_LIMIT = 50
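The limit constants in this file all follow one pattern: a hard-coded default that an environment variable can override per deployment. A small standalone illustration of how the raised blind-mode default (30000) interacts with that override (a sketch, not the module itself):

```python
import os


def char_limit(env_var: str, default: int) -> int:
    # os.getenv returns a string when the variable is set, hence the int() cast;
    # the default is used as-is otherwise.
    return int(os.getenv(env_var, default))


# With no variable set, the new default applies.
os.environ.pop("FASTCHAT_BLIND_MODE_INPUT_CHAR_LEN_LIMIT", None)
print(char_limit("FASTCHAT_BLIND_MODE_INPUT_CHAR_LEN_LIMIT", 30000))  # 30000

# A deployment can still dial the limit back without touching the code.
os.environ["FASTCHAT_BLIND_MODE_INPUT_CHAR_LEN_LIMIT"] = "24000"
print(char_limit("FASTCHAT_BLIND_MODE_INPUT_CHAR_LEN_LIMIT", 30000))  # 24000
```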
