Merge branch 'main' into feat/max-error-rate

markurtz · web-flow · commit 2889dce417ac · 2025-05-27T10:36:11.000-04:00
diff --git a/.github/workflows/development.yml b/.github/workflows/development.yml
@@ -37,22 +37,6 @@ jobs:
       - name: Run quality checks
         run: tox -e types
 
-  link-checks:
-    runs-on: ubuntu-latest
-    strategy:
-      matrix:
-        python: ["3.9"]
-    steps:
-      - uses: actions/checkout@v4
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: ${{ matrix.python }}
-      - name: Install dependencies
-        run: pip install tox
-      - name: Run link checks
-        run: tox -e links
-
   precommit-checks:
     runs-on: ubuntu-latest
     strategy:
diff --git a/.github/workflows/main.yml b/.github/workflows/main.yml
@@ -38,22 +38,6 @@ jobs:
       - name: Run quality checks
         run: tox -e types
 
-  link-checks:
-    runs-on: ubuntu-latest
-    strategy:
-      matrix:
-        python: ["3.9"]
-    steps:
-      - uses: actions/checkout@v4
-      - name: Set up Python
-        uses: actions/setup-python@v5
-        with:
-          python-version: ${{ matrix.python }}
-      - name: Install dependencies
-        run: pip install tox
-      - name: Run link checks
-        run: tox -e links
-
   precommit-checks:
     runs-on: ubuntu-latest
     strategy:
diff --git a/CONTRIBUTING.md b/CONTRIBUTING.md
@@ -47,7 +47,7 @@ You can either clone the repository directly or fork it if you plan to contribut
    cd guidellm
    ```
 
-For detailed instructions on setting up your development environment, please refer to the [DEVELOPING.md](https://github.com/neuralmagic/speculators/blob/main/DEVELOPING.md) file. It includes step-by-step guidance on:
+For detailed instructions on setting up your development environment, please refer to the [DEVELOPING.md](https://github.com/neuralmagic/guidellm/blob/main/DEVELOPING.md) file. It includes step-by-step guidance on:
 
 - Installing dependencies
 - Running tests
@@ -114,8 +114,8 @@ If you encounter a bug or have a feature request, please open an issue on GitHub
 
 ## Community Standards
 
-We are committed to fostering a welcoming and inclusive community. Please read and adhere to our [Code of Conduct](https://github.com/neuralmagic/speculators/blob/main/CODE_OF_CONDUCT.md).
+We are committed to fostering a welcoming and inclusive community. Please read and adhere to our [Code of Conduct](https://github.com/neuralmagic/guidellm/blob/main/CODE_OF_CONDUCT.md).
 
 ## License
 
-By contributing to Speculators, you agree that your contributions will be licensed under the [Apache License 2.0](https://github.com/neuralmagic/speculators/blob/main/LICENSE).
+By contributing to GuideLLM, you agree that your contributions will be licensed under the [Apache License 2.0](https://github.com/neuralmagic/guidellm/blob/main/LICENSE).
diff --git a/DEVELOPING.md b/DEVELOPING.md
@@ -1,6 +1,6 @@
-# Developing for Speculators
+# Developing for GuideLLM
 
-Thank you for your interest in contributing to Speculators! This document provides detailed instructions for setting up your development environment, implementing changes, and adhering to the project's best practices. Your contributions help us grow and improve this project.
+Thank you for your interest in contributing to GuideLLM! This document provides detailed instructions for setting up your development environment, implementing changes, and adhering to the project's best practices. Your contributions help us grow and improve this project.
 
 ## Setting Up Your Development Environment
 
@@ -142,7 +142,7 @@ tox
 To ensure your changes are covered by tests, run:
 
 ```bash
-tox -e test-unit -- --cov=speculators --cov-report=html
+tox -e test-unit -- --cov=guidellm --cov-report=html
 ```
 
 Review the coverage report to confirm that your new code is adequately tested.
@@ -181,7 +181,7 @@ Review the coverage report to confirm that your new code is adequately tested.
 
 ## Additional Resources
 
-- [CONTRIBUTING.md](https://github.com/neuralmagic/speculators/blob/main/CONTRIBUTING.md): Guidelines for contributing to the project.
-- [CODE_OF_CONDUCT.md](https://github.com/neuralmagic/speculators/blob/main/CODE_OF_CONDUCT.md): Our expectations for community behavior.
-- [tox.ini](https://github.com/neuralmagic/speculators/blob/main/tox.ini): Configuration for Tox environments.
-- [.pre-commit-config.yaml](https://github.com/neuralmagic/speculators/blob/main/.pre-commit-config.yaml): Configuration for pre-commit hooks.
+- [CONTRIBUTING.md](https://github.com/neuralmagic/guidellm/blob/main/CONTRIBUTING.md): Guidelines for contributing to the project.
+- [CODE_OF_CONDUCT.md](https://github.com/neuralmagic/guidellm/blob/main/CODE_OF_CONDUCT.md): Our expectations for community behavior.
+- [tox.ini](https://github.com/neuralmagic/guidellm/blob/main/tox.ini): Configuration for Tox environments.
+- [.pre-commit-config.yaml](https://github.com/neuralmagic/guidellm/blob/main/.pre-commit-config.yaml): Configuration for pre-commit hooks.
diff --git a/README.md b/README.md
@@ -92,7 +92,7 @@ After the evaluation is completed, GuideLLM will summarize the results into thre
 
 The sections will look similar to the following: <img alt="Sample GuideLLM benchmark output" src="https://raw.githubusercontent.com/neuralmagic/guidellm/main/docs/assets/sample-output.png" />
 
-For more details about the metrics and definitions, please refer to the [Metrics documentation](https://raw.githubusercontent.com/neuralmagic/guidellm/main/docs/metrics.md).
+For more details about the metrics and definitions, please refer to the [Metrics documentation](https://github.com/neuralmagic/guidellm/blob/main/docs/metrics.md).
 
 #### 4. Explore the Results File
 
@@ -106,7 +106,7 @@ The results from GuideLLM are used to optimize your LLM deployment for performan
 
 For example, when deploying a chat application, we likely want to ensure that our time to first token (TTFT) and inter-token latency (ITL) are under certain thresholds to meet our service level objectives (SLOs) or service level agreements (SLAs). For example, setting TTFT to 200ms and ITL 25ms for the sample data provided in the example above, we can see that even though the server is capable of handling up to 13 requests per second, we would only be able to meet our SLOs for 99% of users at a request rate of 3.5 requests per second. If we relax our constraints on ITL to 50 ms, then we can meet the TTFT SLA for 99% of users at a request rate of approximately 10 requests per second.
 
-For further details on determining the optimal request rate and SLOs, refer to the [SLOs documentation](https://raw.githubusercontent.com/neuralmagic/guidellm/main/docs/service_level_objectives.md).
+For further details on determining the optimal request rate and SLOs, refer to the [SLOs documentation](https://github.com/neuralmagic/guidellm/blob/main/docs/service_level_objectives.md).
 
 ### Configurations
 
diff --git a/src/guidellm/utils/random.py b/src/guidellm/utils/random.py
@@ -37,7 +37,7 @@ def __iter__(self) -> Iterator[int]:
             if calc_min == calc_max:
                 yield calc_min
             elif not self.variance:
-                yield self.rng.randint(calc_min, calc_max + 1)
+                yield self.rng.randint(calc_min, calc_max)
             else:
                 rand = self.rng.gauss(self.average, self.variance)
                 yield round(max(calc_min, min(calc_max, rand)))
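
Note on the `random.py` hunk: the `gauss` call suggests that `self.rng` is a standard-library `random.Random`, though the diff does not show its construction, so treat that as an assumption. If it holds, `randint(a, b)` already includes both endpoints, and the old `calc_max + 1` argument could yield a value one above the intended maximum. The sketch below illustrates the difference; `sample_bounds` and the bounds `1..3` are hypothetical and not part of guidellm.

```python
import random


def sample_bounds(rng: random.Random, low: int, high: int, draws: int = 10_000):
    """Draw `draws` integers with rng.randint and return the (min, max) observed."""
    values = [rng.randint(low, high) for _ in range(draws)]
    return min(values), max(values)


# Hypothetical bounds standing in for calc_min / calc_max.
calc_min, calc_max = 1, 3

# Fixed behaviour: randint is inclusive on both ends, so no "+ 1" is needed.
fixed_min, fixed_max = sample_bounds(random.Random(42), calc_min, calc_max)

# Old behaviour: the extra "+ 1" lets values escape past calc_max.
old_min, old_max = sample_bounds(random.Random(42), calc_min, calc_max + 1)

print("fixed:", fixed_min, fixed_max)
print("old:  ", old_min, old_max)
```

With enough draws, the fixed call stays within `[calc_min, calc_max]`, while the old call also produces `calc_max + 1`. The off-by-one likely came from treating `randint` like `randrange` or NumPy's `randint`, both of which exclude the upper bound.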