Commit d09f685

Merge branch 'update-docs' into implement-base-ui-app
2 parents c23c674 + 1e29968 commit d09f685

3 files changed

+5
-5
lines changed

README.md

Lines changed: 2 additions & 2 deletions
@@ -92,7 +92,7 @@ After the evaluation is completed, GuideLLM will summarize the results into thre
 
 The sections will look similar to the following: <img alt="Sample GuideLLM benchmark output" src="https://raw.githubusercontent.com/neuralmagic/guidellm/main/docs/assets/sample-output.png" />
 
-For more details about the metrics and definitions, please refer to the [Metrics documentation](https://raw.githubusercontent.com/neuralmagic/guidellm/main/docs/metrics.md).
+For more details about the metrics and definitions, please refer to the [Metrics documentation](https://github.com/neuralmagic/guidellm/blob/main/docs/metrics.md).
 
 #### 4. Explore the Results File
 
@@ -106,7 +106,7 @@ The results from GuideLLM are used to optimize your LLM deployment for performan
 
 For example, when deploying a chat application, we likely want to ensure that our time to first token (TTFT) and inter-token latency (ITL) are under certain thresholds to meet our service level objectives (SLOs) or service level agreements (SLAs). For example, setting TTFT to 200ms and ITL 25ms for the sample data provided in the example above, we can see that even though the server is capable of handling up to 13 requests per second, we would only be able to meet our SLOs for 99% of users at a request rate of 3.5 requests per second. If we relax our constraints on ITL to 50 ms, then we can meet the TTFT SLA for 99% of users at a request rate of approximately 10 requests per second.
 
-For further details on determining the optimal request rate and SLOs, refer to the [SLOs documentation](https://raw.githubusercontent.com/neuralmagic/guidellm/main/docs/service_level_objectives.md).
+For further details on determining the optimal request rate and SLOs, refer to the [SLOs documentation](https://github.com/neuralmagic/guidellm/blob/main/docs/service_level_objectives.md).
 
 ### Configurations
 
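
Both README hunks make the same substitution: a documentation link that pointed at `raw.githubusercontent.com` now points at the rendered `github.com/.../blob/...` page, while the image `src` stays on the raw host because it has to serve the file bytes. The mapping between the two URL shapes can be sketched as below. The helper name `raw_to_blob` is hypothetical and not part of the commit, and the sketch assumes the ref (here `main`) contains no slashes.

```python
from urllib.parse import urlsplit


def raw_to_blob(url: str) -> str:
    """Map a raw.githubusercontent.com URL to its github.com blob URL.

    Hypothetical helper for illustration; assumes the ref has no '/'.
    URLs on other hosts, or with too few path segments, are returned as-is.
    """
    parts = urlsplit(url)
    if parts.netloc != "raw.githubusercontent.com":
        return url
    segments = parts.path.lstrip("/").split("/", 3)
    if len(segments) < 4:
        return url
    owner, repo, ref, path = segments
    return f"https://github.com/{owner}/{repo}/blob/{ref}/{path}"


if __name__ == "__main__":
    print(raw_to_blob(
        "https://raw.githubusercontent.com/neuralmagic/guidellm/main/docs/metrics.md"
    ))
```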

src/guidellm/utils/random.py

Lines changed: 1 addition & 1 deletion
@@ -37,7 +37,7 @@ def __iter__(self) -> Iterator[int]:
             if calc_min == calc_max:
                 yield calc_min
             elif not self.variance:
-                yield self.rng.randint(calc_min, calc_max + 1)
+                yield self.rng.randint(calc_min, calc_max)
             else:
                 rand = self.rng.gauss(self.average, self.variance)
                 yield round(max(calc_min, min(calc_max, rand)))
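
The change above is an off-by-one fix. Python's `random.Random.randint(a, b)` includes both endpoints, so the old call could return `calc_max + 1`, one past the intended upper bound. A minimal sketch of the difference follows; the seed, bounds, and the helper name `observed_range` are illustrative assumptions, not taken from the repository.

```python
import random


def observed_range(rng: random.Random, low: int, high: int, draws: int = 1000):
    """Return the (min, max) seen over `draws` calls to rng.randint(low, high)."""
    samples = [rng.randint(low, high) for _ in range(draws)]
    return min(samples), max(samples)


if __name__ == "__main__":
    calc_min, calc_max = 1, 3          # illustrative bounds
    rng = random.Random(42)            # illustrative seed

    # Old behaviour: the upper bound leaks to calc_max + 1.
    print("old:", observed_range(rng, calc_min, calc_max + 1))

    # New behaviour: samples stay within [calc_min, calc_max].
    print("new:", observed_range(rng, calc_min, calc_max))
```

The `else` branch already clamps its Gaussian sample to `[calc_min, calc_max]`, so after this fix all three branches respect the same bounds.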

tox.ini

Lines changed: 2 additions & 2 deletions
@@ -66,8 +66,8 @@ description = Run link checks for root and docs markdown files
 deps =
     .[dev]
 commands =
-    mkdocs-linkcheck ./
-    mkdocs-linkcheck docs/
+    mkdocs-linkcheck ./ --exclude 'https://github\.com/.*/blob/.*'
+    mkdocs-linkcheck docs/ --exclude 'https://github\.com/.*/blob/.*'
 
 
 [testenv:build]
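
The new `--exclude` pattern covers exactly the kind of `github.com/.../blob/...` links that the README hunks introduce, so the link checker skips them. The commit does not state the reason for the exclusion. The sketch below only shows which URLs the regular expression covers, using Python's `re` module. How `mkdocs-linkcheck` applies the pattern internally (search versus full match) is an assumption here, and `is_excluded` is a hypothetical helper.

```python
import re

# Pattern copied from the tox.ini change above.
EXCLUDE = re.compile(r"https://github\.com/.*/blob/.*")


def is_excluded(url: str) -> bool:
    """Return True if the URL matches the exclude pattern (assumed: search semantics)."""
    return EXCLUDE.search(url) is not None


if __name__ == "__main__":
    urls = [
        "https://github.com/neuralmagic/guidellm/blob/main/docs/metrics.md",
        "https://raw.githubusercontent.com/neuralmagic/guidellm/main/docs/assets/sample-output.png",
        "https://github.com/neuralmagic/guidellm",
    ]
    for url in urls:
        print(is_excluded(url), url)
```

Only the first URL is skipped. Raw-content links and plain repository links are still checked.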
