Commit 80a6709

Merge branch 'implement-ui-build-pipeline' into save-as-html

2 parents: e9972e0 + 7131e21

File tree

2 files changed: +25 -6 lines changed

.github/workflows/development.yml

Lines changed: 19 additions & 0 deletions

@@ -12,6 +12,8 @@ jobs:
         python: ["3.9", "3.13"]
     steps:
       - uses: actions/checkout@v4
+        with:
+          ref: "${{ github.event.pull_request.merge_commit_sha }}"
       - name: Set up Python
         uses: actions/setup-python@v5
         with:
@@ -28,6 +30,8 @@ jobs:
     steps:
       - name: Check out code
         uses: actions/checkout@v3
+        with:
+          ref: "${{ github.event.pull_request.merge_commit_sha }}"

       - name: Set up Node.js 22
         uses: actions/setup-node@v4
@@ -47,6 +51,8 @@ jobs:
         python: ["3.9", "3.13"]
     steps:
       - uses: actions/checkout@v4
+        with:
+          ref: "${{ github.event.pull_request.merge_commit_sha }}"
       - name: Set up Python
         uses: actions/setup-python@v5
         with:
@@ -63,6 +69,8 @@ jobs:
     steps:
       - name: Check out code
         uses: actions/checkout@v3
+        with:
+          ref: "${{ github.event.pull_request.merge_commit_sha }}"

       - name: Set up Node.js 22
         uses: actions/setup-node@v4
@@ -82,6 +90,8 @@ jobs:
         python: ["3.9"]
     steps:
       - uses: actions/checkout@v4
+        with:
+          ref: "${{ github.event.pull_request.merge_commit_sha }}"
       - name: Set up Python
         uses: actions/setup-python@v5
         with:
@@ -98,6 +108,8 @@ jobs:
     steps:
       - name: Check out code
         uses: actions/checkout@v3
+        with:
+          ref: "${{ github.event.pull_request.merge_commit_sha }}"

       - name: Set up Node.js 22
         uses: actions/setup-node@v4
@@ -117,6 +129,8 @@ jobs:
         python: ["3.9", "3.13"]
     steps:
       - uses: actions/checkout@v4
+        with:
+          ref: "${{ github.event.pull_request.merge_commit_sha }}"
       - name: Set up Python
         uses: actions/setup-python@v5
         with:
@@ -152,6 +166,8 @@ jobs:
         python: ["3.9", "3.13"]
     steps:
       - uses: actions/checkout@v4
+        with:
+          ref: "${{ github.event.pull_request.merge_commit_sha }}"
       - name: Set up Python
         uses: actions/setup-python@v5
         with:
@@ -168,6 +184,8 @@ jobs:
     steps:
       - name: Check out code
         uses: actions/checkout@v3
+        with:
+          ref: "${{ github.event.pull_request.merge_commit_sha }}"

       - name: Set up Node.js 22
         uses: actions/setup-node@v4
@@ -190,6 +208,7 @@ jobs:
         uses: actions/checkout@v4
         with:
           fetch-depth: 0
+          ref: "${{ github.event.pull_request.merge_commit_sha }}"
       - name: Set up Python
         uses: actions/setup-python@v5
         with:
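Every hunk in this change applies the same two-line addition under an existing actions/checkout step, so each job checks out the merge commit GitHub creates for the pull request rather than the head of the source branch. A minimal standalone sketch of the resulting pattern (the job name below is illustrative, not taken from the workflow):

```yaml
# Illustrative job showing the checkout pattern added in this commit:
# on pull_request events, check out the PR's merge commit so CI tests
# the merged result instead of the branch head alone.
jobs:
  quality-checks:  # hypothetical job name
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          ref: "${{ github.event.pull_request.merge_commit_sha }}"
```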

README.md

Lines changed: 6 additions & 6 deletions

@@ -64,7 +64,7 @@ vllm serve "neuralmagic/Meta-Llama-3.1-8B-Instruct-quantized.w4a16"

 For more information on starting a vLLM server, see the [vLLM Documentation](https://docs.vllm.ai/en/latest/serving/openai_compatible_server.html).

-For information on starting other supported inference servers or platforms, see the [Supported Backends documentation](https://github.com/neuralmagic/guidellm/blob/main/docs/backends.md).
+For information on starting other supported inference servers or platforms, see the [Supported Backends Documentation](https://github.com/neuralmagic/guidellm/blob/main/docs/backends.md).

 #### 2. Run a GuideLLM Benchmark

@@ -92,21 +92,21 @@ After the evaluation is completed, GuideLLM will summarize the results into thre

 The sections will look similar to the following: <img alt="Sample GuideLLM benchmark output" src="https://raw.githubusercontent.com/neuralmagic/guidellm/main/docs/assets/sample-output.png" />

-For more details about the metrics and definitions, please refer to the [Metrics documentation](https://github.com/neuralmagic/guidellm/blob/main/docs/metrics.md).
+For more details about the metrics and definitions, please refer to the [Metrics Documentation](https://github.com/neuralmagic/guidellm/blob/main/docs/metrics.md).

 #### 4. Explore the Results File

 By default, the full results, including complete statistics and request data, are saved to a file `benchmarks.json` in the current working directory. This file can be used for further analysis or reporting, and additionally can be reloaded into Python for further analysis using the `guidellm.benchmark.GenerativeBenchmarksReport` class. You can specify a different file name and extension with the `--output` argument.

-For more details about the supported output file types, please take a look at the [Outputs documentation](https://github.com/neuralmagic/guidellm/blob/main/docs/outputs.md).
+For more details about the supported output file types, please take a look at the [Outputs Documentation](https://github.com/neuralmagic/guidellm/blob/main/docs/outputs.md).

 #### 5. Use the Results

 The results from GuideLLM are used to optimize your LLM deployment for performance, resource efficiency, and cost. By analyzing the performance metrics, you can identify bottlenecks, determine the optimal request rate, and select the most cost-effective hardware configuration for your deployment.

 For example, when deploying a chat application, we likely want to ensure that our time to first token (TTFT) and inter-token latency (ITL) are under certain thresholds to meet our service level objectives (SLOs) or service level agreements (SLAs). For example, setting TTFT to 200ms and ITL 25ms for the sample data provided in the example above, we can see that even though the server is capable of handling up to 13 requests per second, we would only be able to meet our SLOs for 99% of users at a request rate of 3.5 requests per second. If we relax our constraints on ITL to 50 ms, then we can meet the TTFT SLA for 99% of users at a request rate of approximately 10 requests per second.

-For further details on determining the optimal request rate and SLOs, refer to the [SLOs documentation](https://github.com/neuralmagic/guidellm/blob/main/docs/service_level_objectives.md).
+For further details on determining the optimal request rate and SLOs, refer to the [SLOs Documentation](https://github.com/neuralmagic/guidellm/blob/main/docs/service_level_objectives.md).

 ### Configurations

@@ -197,7 +197,7 @@ Alternatively, in config.py update the ENV_REPORT_MAPPING used as the asset base

 ### Documentation

-Our comprehensive documentation offers detailed guides and resources to help you maximize the benefits of GuideLLM. Whether just getting started or looking to dive deeper into advanced topics, you can find what you need in our [documentation](https://github.com/neuralmagic/guidellm/blob/main/docs).
+Our comprehensive documentation offers detailed guides and resources to help you maximize the benefits of GuideLLM. Whether just getting started or looking to dive deeper into advanced topics, you can find what you need in our [Documentation](https://github.com/neuralmagic/guidellm/blob/main/docs).

 ### Core Docs

@@ -222,7 +222,7 @@ We appreciate contributions to the code, examples, integrations, documentation,

 ### Releases

-Visit our [GitHub Releases page](https://github.com/neuralmagic/guidellm/releases) and review the release notes to stay updated with the latest releases.
+Visit our [GitHub Releases Page](https://github.com/neuralmagic/guidellm/releases) and review the release notes to stay updated with the latest releases.

 ### License

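Step 5 of the README text in the diff above describes checking TTFT and ITL percentiles against SLO thresholds. A minimal hypothetical illustration of such a percentile check (the function, its name, and all sample numbers below are made up for illustration; this is not part of GuideLLM's API):

```python
# Hypothetical sketch of the SLO check described in step 5: does the
# 99th-percentile latency of a set of per-request samples stay at or
# under a threshold? Sample values are invented for illustration.

def meets_slo(samples_ms, threshold_ms, quantile=0.99):
    """Return True if the `quantile`-level sample is <= threshold_ms."""
    ordered = sorted(samples_ms)
    # Nearest-rank index for the requested quantile, clamped to the list.
    idx = min(len(ordered) - 1, int(quantile * len(ordered)))
    return ordered[idx] <= threshold_ms

ttft_samples = [120, 150, 180, 190, 210]  # hypothetical TTFT values, ms
itl_samples = [18, 20, 22, 24, 30]        # hypothetical ITL values, ms

print(meets_slo(ttft_samples, 200))  # False: p99 TTFT here is 210 ms
print(meets_slo(itl_samples, 25))    # False at 25 ms
print(meets_slo(itl_samples, 50))    # True when relaxed to 50 ms
```

In practice, the percentile values themselves come from the benchmark output summarized in step 3, so no manual sampling is needed; this sketch only shows the comparison logic.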