Commit ffa782e

Merge branch 'main' into support_sharegpt

2 parents d904a7e + a4bdbb5

File tree: 11 files changed (+538 −307 lines)

.github/workflows/main.yml

Lines changed: 1 addition & 1 deletion

```diff
@@ -217,7 +217,7 @@ jobs:
       uses: peaceiris/actions-gh-pages@v3
       with:
         github_token: ${{ secrets.GITHUB_TOKEN }}
-        publish_dir: ./ui/out
+        publish_dir: ./src/ui/out
         destination_dir: ui/dev
         keep_files: false
         user_name: ${{ github.actor }}
```

.github/workflows/nightly.yml

Lines changed: 1 addition & 1 deletion

```diff
@@ -238,7 +238,7 @@ jobs:
       uses: peaceiris/actions-gh-pages@v3
       with:
         github_token: ${{ secrets.GITHUB_TOKEN }}
-        publish_dir: ./ui/out
+        publish_dir: ./src/ui/out
         destination_dir: ui/nightly
         keep_files: false
         user_name: ${{ github.actor }}
```

.github/workflows/release-candidate.yml

Lines changed: 1 addition & 1 deletion

```diff
@@ -282,7 +282,7 @@ jobs:
       uses: peaceiris/actions-gh-pages@v3
       with:
         github_token: ${{ secrets.GITHUB_TOKEN }}
-        publish_dir: ./ui/out
+        publish_dir: ./src/ui/out
         destination_dir: ui/release/latest
         keep_files: false
         user_name: ${{ github.actor }}
```

.github/workflows/release.yml

Lines changed: 1 addition & 1 deletion

```diff
@@ -281,7 +281,7 @@ jobs:
       uses: peaceiris/actions-gh-pages@v3
       with:
         github_token: ${{ secrets.GITHUB_TOKEN }}
-        publish_dir: ./ui/out
+        publish_dir: ./src/ui/out
         destination_dir: ui/latest
         keep_files: false
         user_name: ${{ github.actor }}
```

docs/backends.md

Lines changed: 18 additions & 0 deletions

A new section is added after the existing pointer to the [TGI Documentation](https://huggingface.co/docs/text-generation-inference/index):

### 3. llama.cpp

[llama.cpp](https://github.com/ggml-org/llama.cpp) provides a lightweight, OpenAI-compatible server through its [llama-server](https://github.com/ggml-org/llama.cpp/blob/master/tools/server) tool.

To start a llama.cpp server with the gpt-oss-20b model, you can use the following command:

```bash
llama-server -hf ggml-org/gpt-oss-20b-GGUF --alias gpt-oss-20b --ctx-size 0 --jinja -ub 2048 -b 2048
```

Note that the alias `gpt-oss-20b` is provided for the model name because `guidellm` uses it to retrieve model metadata in JSON format, and such metadata is not included in GGUF model repositories. A simple workaround is to download the metadata files from the safetensors repository and place them in a local directory named after the alias:

```bash
huggingface-cli download openai/gpt-oss-20b --include "*.json" --local-dir gpt-oss-20b/
```

Now you can run `guidellm` as usual, and it will fetch the model metadata from the local directory.

The file's existing closing section is unchanged:

## Expanding Backend Support

GuideLLM is an open platform, and we encourage contributions to extend its backend support. Whether it's adding new server implementations, integrating with Python-based backends, or enhancing existing capabilities, your contributions are welcome. For more details on how to contribute, see the [CONTRIBUTING.md](https://github.com/vllm-project/guidellm/blob/main/CONTRIBUTING.md) file.

pdm.toml

Lines changed: 2 additions & 0 deletions

```diff
@@ -1,2 +1,4 @@
+[strategy]
+update = "reuse"
 [lock]
 format = "pylock"
```

pylock.toml

Lines changed: 507 additions & 299 deletions (large diff not rendered)

pyproject.toml

Lines changed: 4 additions & 1 deletion

```diff
@@ -62,6 +62,10 @@ dependencies = [
 ]
 
 [project.optional-dependencies]
+recommended = [
+    "tiktoken>=0.11.0", # For OpenAI tokenizer
+    "blobfile>=3.1.0", # For OpenAI tokenizer
+]
 dev = [
     # build
     "build>=1.0.0",
@@ -102,7 +106,6 @@ dev = [
     "mkdocs-linkcheck~=1.0.6",
 ]
 
-# For PEP 735 compliant tools
 [dependency-groups]
 dev = [ "guidellm[dev]" ]
```
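The new `recommended` extra groups the optional OpenAI-tokenizer dependencies (`tiktoken`, `blobfile`). Assuming the package is installed from PyPI under the name `guidellm`, the extra could be pulled in with:

```shell
pip install "guidellm[recommended]"
```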

src/guidellm/backend/openai.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -688,7 +688,7 @@ def _extract_completions_delta_content(
         return data["choices"][0]["text"]
 
     if type_ == "chat_completions":
-        return data["choices"][0]["delta"]["content"]
+        return data.get("choices", [{}])[0].get("delta", {}).get("content")
 
     raise ValueError(f"Unsupported type: {type_}")
```
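The switch from direct indexing to chained `.get()` calls tolerates stream chunks that omit `choices`, `delta`, or `content`. A minimal standalone sketch (the function below is a simplified stand-in for illustration, not guidellm's actual method):

```python
def extract_chat_delta_content(data: dict):
    """Return the delta text from a chat completions stream chunk, or None.

    Chained .get() calls with safe defaults mean a chunk missing
    "choices", "delta", or "content" yields None instead of raising KeyError.
    """
    return data.get("choices", [{}])[0].get("delta", {}).get("content")


# Normal streaming chunk: the delta carries content.
print(extract_chat_delta_content({"choices": [{"delta": {"content": "Hello"}}]}))  # Hello

# Role-only delta (common in the first/last chunks): direct indexing raised KeyError here.
print(extract_chat_delta_content({"choices": [{"delta": {"role": "assistant"}}]}))  # None

# Chunk with no "choices" at all (e.g. a usage-only chunk).
print(extract_chat_delta_content({"usage": {"total_tokens": 42}}))  # None
```

One edge remains: an explicitly empty `"choices": []` list would still raise `IndexError`, since the `[{}]` default only applies when the key is absent entirely.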

src/guidellm/presentation/data_models.py

Lines changed: 1 addition & 1 deletion

```diff
@@ -190,7 +190,7 @@ class TabularDistributionSummary(DistributionSummary):
     """
 
     @computed_field
-    def percentile_rows(self) -> list[dict[str, float]]:
+    def percentile_rows(self) -> list[dict[str, Union[str, float]]]:
         rows = [
             {"percentile": name, "value": value}
             for name, value in self.percentiles.model_dump().items()
```
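The widened annotation matches the actual row shape: each row pairs a string percentile name with a float value, so the dict values are `Union[str, float]`, not `float` alone. A minimal sketch of the same construction (a plain function for illustration, not the real pydantic `computed_field`):

```python
from typing import Union


def percentile_rows(percentiles: dict[str, float]) -> list[dict[str, Union[str, float]]]:
    """Flatten a {name: value} percentile mapping into table rows.

    Each row mixes a str under "percentile" with a float under "value",
    which is why the value type must be Union[str, float].
    """
    return [{"percentile": name, "value": value} for name, value in percentiles.items()]


rows = percentile_rows({"p50": 12.5, "p90": 30.1, "p99": 55.0})
print(rows[0])  # {'percentile': 'p50', 'value': 12.5}
```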
