Build: Trigger CI for new vllm_backend Triton releases #49
Changes from 96 commits
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| name: Welcome message | ||
| on: | ||
| pull_request_target: | ||
| types: [opened] | ||
| 
     | 
||
| jobs: | ||
| pr_reminder: | ||
| runs-on: ubuntu-latest | ||
| steps: | ||
| - name: Add first comment | ||
| uses: actions/github-script@v6 | ||
| with: | ||
| script: | | ||
| github.rest.issues.createComment({ | ||
| owner: context.repo.owner, | ||
| repo: context.repo.repo, | ||
| issue_number: context.issue.number, | ||
| body: '👋 Hi! \nThank you for contributing to the project.\n Just a reminder: PRs trigger a full CI run by default. We will add a verified label to the PR once the build and test steps are successful.\n🚀' | ||
| }) | ||
| env: | ||
| GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} | 
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| 
          
            
          
           | 
    @@ -36,4 +36,3 @@ jobs: | |
| - uses: actions/checkout@v3 | ||
| - uses: actions/setup-python@v3 | ||
| - uses: pre-commit/[email protected] | ||
| 
     | 
||
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -0,0 +1,45 @@ | ||
| name: Validate Triton Pull request by running our change on latest vLLM release | ||
| on: | ||
| pull_request: | ||
| jobs: | ||
| mirror_repo: | ||
| environment: GITLAB | ||
| runs-on: self-hosted | ||
| steps: | ||
| - name: Sync Mirror Repository | ||
| run: | | ||
| #!/bin/bash | ||
| curl --request POST --header "PRIVATE-TOKEN:${{ secrets.TOKEN }}" "${{ secrets.MIRROR_URL }}" | ||
| trigger-ci: | ||
| environment: GITLAB | ||
| needs: mirror_repo | ||
| runs-on: self-hosted | ||
| steps: | ||
| - name: Trigger Pipeline | ||
| run: | | ||
| #!/bin/bash | ||
| # Get the latest Triton released version from https://github.com/triton-inference-server/vllm_backend/releases | ||
| TAG=$(curl https://api.github.com/repos/triton-inference-server/vllm_backend/releases/latest | grep -i "tag_name" | awk -F '"' '{print $4}') | ||
| export TRITON_CONTAINER_VERSION=${TAG#v} # example: 24.08 | ||
| if [ -z "$TRITON_CONTAINER_VERSION" ] | ||
| then | ||
| echo "\$TRITON_CONTAINER_VERSION is NULL, setting it to 24.08" | ||
| TRITON_CONTAINER_VERSION=24.08 | ||
| else | ||
| echo "\$TRITON_CONTAINER_VERSION is NOT NULL" | ||
| fi | ||
| echo "TRITON_CONTAINER_VERSION = ${TRITON_CONTAINER_VERSION}" | ||
| 
     | 
||
| # Get latest VLLM RELEASED VERSION from https://github.com/vllm-project/vllm/releases | ||
| TAG=$(curl https://api.github.com/repos/vllm-project/vllm/releases/latest | grep -i "tag_name" | awk -F '"' '{print $4}') | ||
| export VLLM_VERSION=${TAG#v} # example: 0.5.5 | ||
| if [ -z "$VLLM_VERSION" ] | ||
| then | ||
| echo "\$VLLM_VERSION is NULL, setting it to 0.5.5" | ||
| VLLM_VERSION=0.5.5 | ||
| else | ||
| echo "\$VLLM_VERSION is NOT NULL" | ||
| fi | ||
| echo "VLLM_VERSION = ${VLLM_VERSION}" | ||
| 
     | 
||
| curl --fail --request POST --form token=${{ secrets.PIPELINE_TOKEN }} -F ref=${GITHUB_HEAD_REF} -F variables[BUILD_OPTION]="BUILD_SOURCE" -F variables[TRITON_CONTAINER_VERSION]="${TRITON_CONTAINER_VERSION}" -F variables[VLLM_VERSION]="${VLLM_VERSION}" -F variables[TEST_OPTION]="ALL_TESTS" "${{ secrets.PIPELINE_URL }}" | 
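The version lookup in the step above can be sketched as a small helper that reads the release JSON on stdin, so the parsing can be tried without a network call. The function name `extract_version` and the sample payload are assumptions made for illustration; only the `grep`/`awk` pipeline and the `${TAG#v}` stripping come from the workflow.

```bash
#!/bin/bash
# Hypothetical helper: reads a GitHub "releases/latest" JSON document on stdin
# and prints the tag name with a leading "v" removed.
extract_version() {
    local tag
    tag=$(grep -i '"tag_name"' | awk -F '"' '{print $4}' | head -n 1)
    echo "${tag#v}"
}

# Sample payload (made up for illustration), shaped like the GitHub API response.
sample='{
  "url": "https://api.github.com/repos/example/example/releases/1",
  "tag_name": "v0.5.5",
  "name": "v0.5.5"
}'

version=$(printf '%s\n' "$sample" | extract_version)
# Fall back to a default when the lookup yields nothing.
version=${version:-0.5.5}
echo "VLLM_VERSION = ${version}"
```

Reading from stdin keeps the parsing separate from the `curl` call, which makes the fallback branch easy to exercise with an empty document.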
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| name: Validate latest vLLM release from https://github.com/vllm-project/vllm/releases against latest Triton release https://github.com/triton-inference-server/vllm_backend/releases | ||
| on: | ||
| schedule: | ||
| - cron: "30 09 */3 * *" | ||
| jobs: | ||
| mirror_repo: | ||
| environment: GITLAB | ||
| runs-on: self-hosted | ||
| steps: | ||
| - name: Sync Mirror Repository | ||
| run: | | ||
| #!/bin/bash | ||
| curl --request POST --header "PRIVATE-TOKEN:${{ secrets.TOKEN }}" "${{ secrets.MIRROR_URL }}" | ||
| trigger-ci: | ||
| environment: GITLAB | ||
| needs: mirror_repo | ||
| runs-on: self-hosted | ||
| steps: | ||
| - name: Trigger Pipeline | ||
| run: | | ||
| #!/bin/bash | ||
| # Get the latest Triton released version from https://github.com/triton-inference-server/vllm_backend/releases | ||
| TAG=$(curl https://api.github.com/repos/triton-inference-server/vllm_backend/releases/latest | grep -i "tag_name" | awk -F '"' '{print $4}') | ||
| export TRITON_CONTAINER_VERSION=${TAG#v} # example: 24.08 | ||
| # Get latest VLLM RELEASED VERSION from https://github.com/vllm-project/vllm/releases | ||
| TAG=$(curl https://api.github.com/repos/vllm-project/vllm/releases/latest | grep -i "tag_name" | awk -F '"' '{print $4}') | ||
| export VLLM_VERSION=${TAG#v} # example: 0.5.5 | ||
| echo "VLLM_VERSION = ${VLLM_VERSION}" | ||
| if [ -z "$TRITON_CONTAINER_VERSION" ] || [ -z "$VLLM_VERSION" ] | ||
| then | ||
| echo "Can't find the latest Triton or vLLM version. Skipping CI run" | ||
| else | ||
| echo "TRITON_CONTAINER_VERSION = ${TRITON_CONTAINER_VERSION}" | ||
| echo "VLLM_VERSION = ${VLLM_VERSION}" | ||
| curl --fail --request POST --form token=${{ secrets.PIPELINE_TOKEN }} -F ref=${GITHUB_HEAD_REF} -F variables[BUILD_OPTION]="PULL_DOCKER" -F variables[TRITON_CONTAINER_VERSION]="${TRITON_CONTAINER_VERSION}" -F variables[VLLM_VERSION]="${VLLM_VERSION}" -F variables[TEST_OPTION]="ALL_TESTS" "${{ secrets.PIPELINE_URL }}" | ||
| fi | 
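A minimal sketch of the emptiness check that decides whether to skip the run, written as two separate `[ ]` tests joined by `||`. POSIX `[` does not accept `||` inside the brackets and needs a space before the closing `]`. The function name `versions_missing` is hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: succeeds (exit 0) when either version string is empty.
versions_missing() {
    local triton_version=$1
    local vllm_version=$2
    [ -z "$triton_version" ] || [ -z "$vllm_version" ]
}

if versions_missing "24.08" ""; then
    echo "Can't find the latest Triton or vLLM version. Skipping CI run"
else
    echo "Both versions found"
fi
```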
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| 
          
            
          
           | 
    @@ -27,6 +27,9 @@ | |
| --> | ||
| 
     | 
||
| [](https://opensource.org/licenses/BSD-3-Clause) | ||
|  | ||
|  | ||
|  | ||
| 
         Could you please clarify how these static badges work?
         Ideally this should be automated, but it is a manual process at the moment. Once the cron task is finished and the pipeline is green, I'd have to issue a PR to update the badges here.
         Do we want to hold off on adding badges until we have an automated workflow in place?  | 
||
| 
     | 
||
| # vLLM Backend | ||
| 
     | 
||
| 
          
            
          
           | 
    @@ -82,7 +85,17 @@ latest YY.MM (year.month) of [Triton release](https://github.com/triton-inferenc | |
| 
     | 
||
| ``` | ||
| # YY.MM is the version of Triton. | ||
| export TRITON_CONTAINER_VERSION=<YY.MM> | ||
| # Get the latest Triton released version from https://github.com/triton-inference-server/vllm_backend/releases | ||
| TAG=$(curl https://api.github.com/repos/triton-inference-server/vllm_backend/releases/latest | grep -i "tag_name" | awk -F '"' '{print $4}') | ||
| export TRITON_CONTAINER_VERSION=${TAG#v} # example: 24.06 | ||
| echo "TRITON_CONTAINER_VERSION = ${TRITON_CONTAINER_VERSION}" | ||
| 
     | 
||
| # Get latest VLLM RELEASED VERSION from https://github.com/vllm-project/vllm/releases | ||
| TAG=$(curl https://api.github.com/repos/vllm-project/vllm/releases/latest | grep -i "tag_name" | awk -F '"' '{print $4}') | ||
| export VLLM_VERSION=${TAG#v} # example: 0.5.3.post1 | ||
| echo "VLLM_VERSION = ${VLLM_VERSION}" | ||
| 
     | 
||
| git clone -b r${TRITON_CONTAINER_VERSION} https://github.com/triton-inference-server/server.git | ||
                
      
         | 
||
| ./build.py -v --enable-logging | ||
| --enable-stats | ||
| --enable-tracing | ||
| 
        
          
        
         | 
    @@ -100,6 +113,10 @@ export TRITON_CONTAINER_VERSION=<YY.MM> | |
| --upstream-container-version=${TRITON_CONTAINER_VERSION} | ||
| --backend=python:r${TRITON_CONTAINER_VERSION} | ||
| --backend=vllm:r${TRITON_CONTAINER_VERSION} | ||
| --vllm-version=${VLLM_VERSION} | ||
| # Build Triton Server | ||
| cd server/build | ||
                
      
         | 
||
| bash -x ./docker_build | ||
| ``` | ||
| 
     | 
||
| ### Option 3. Add the vLLM Backend to the Default Triton Container | ||
| 
          
            
          
           | 
    ||
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -0,0 +1,36 @@ | ||
| #!/bin/bash | ||
| # Copyright 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # | ||
| # Redistribution and use in source and binary forms, with or without | ||
| # modification, are permitted provided that the following conditions | ||
| # are met: | ||
| # * Redistributions of source code must retain the above copyright | ||
| # notice, this list of conditions and the following disclaimer. | ||
| # * Redistributions in binary form must reproduce the above copyright | ||
| # notice, this list of conditions and the following disclaimer in the | ||
| # documentation and/or other materials provided with the distribution. | ||
| # * Neither the name of NVIDIA CORPORATION nor the names of its | ||
| # contributors may be used to endorse or promote products derived | ||
| # from this software without specific prior written permission. | ||
| # | ||
| # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY | ||
| # EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | ||
| # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR | ||
| # PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR | ||
| # CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, | ||
| # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, | ||
| # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR | ||
| # PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY | ||
| # OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
| # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE | ||
| # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
| 
     | 
||
| while getopts t: flag | ||
| do | ||
| case "${flag}" in | ||
| t) PROD_CONTAINER=${OPTARG};; | ||
| esac | ||
| done | ||
| 
     | 
||
| echo "Pulling container image ${PROD_CONTAINER}" | ||
| docker pull ${PROD_CONTAINER} | 
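A minimal sketch of the option parsing, assuming the script is meant to be invoked as `-t <image>`. With `getopts`, the letter in the option string has to match a `case` label, otherwise the argument is dropped and the variable stays empty. The function name `parse_container` and the image name are hypothetical.

```bash
#!/bin/bash
# Hypothetical wrapper around getopts so the parsing can be tried in isolation.
parse_container() {
    local flag OPTIND=1 container=""
    while getopts "t:" flag; do
        case "${flag}" in
            t) container=${OPTARG} ;;   # label matches the "t:" option string
            *) return 1 ;;              # unknown option or missing argument
        esac
    done
    echo "${container}"
}

# Example image name, made up for illustration.
parse_container -t registry.example.com/tritonserver:24.08-vllm
```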
| Original file line number | Diff line number | Diff line change | 
|---|---|---|
| @@ -0,0 +1,62 @@ | ||
| #!/bin/bash | ||
| # Copyright 2024, NVIDIA CORPORATION & AFFILIATES. All rights reserved. | ||
| # | ||
| # Redistribution and use in source and binary forms, with or without | ||
| # modification, are permitted provided that the following conditions | ||
| # are met: | ||
| # * Redistributions of source code must retain the above copyright | ||
| # notice, this list of conditions and the following disclaimer. | ||
| # * Redistributions in binary form must reproduce the above copyright | ||
| # notice, this list of conditions and the following disclaimer in the | ||
| # documentation and/or other materials provided with the distribution. | ||
| # * Neither the name of NVIDIA CORPORATION nor the names of its | ||
| # contributors may be used to endorse or promote products derived | ||
| # from this software without specific prior written permission. | ||
| # | ||
| # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS ``AS IS'' AND ANY | ||
| # EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE | ||
| # IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR | ||
| # PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT OWNER OR | ||
| # CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, | ||
| # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, | ||
| # PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR | ||
| # PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY | ||
| # OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT | ||
| # (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE | ||
| # OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. | ||
| 
     | 
||
| while getopts t:v: flag | ||
| do | ||
| case "${flag}" in | ||
| t) TRITON_CONTAINER_VERSION=${OPTARG};; | ||
| v) VLLM_VERSION=${OPTARG};; | ||
| esac | ||
| done | ||
| 
     | 
||
| echo "Triton version is ${TRITON_CONTAINER_VERSION} and vllm version is ${VLLM_VERSION}" | ||
| #git clone -b r${TRITON_CONTAINER_VERSION} https://github.com/triton-inference-server/server.git | ||
                
      
         | 
||
| git clone -b mesharma-ci https://github.com/triton-inference-server/server.git | ||
| set -x && python3 server/build.py -v \ | ||
| --enable-logging \ | ||
| --enable-stats \ | ||
| --enable-tracing \ | ||
| --enable-metrics \ | ||
| --enable-gpu-metrics \ | ||
| --enable-cpu-metrics \ | ||
| --enable-gpu \ | ||
| --no-container-interactive \ | ||
| --container-prebuild-command="docker login -u gitlab-ci-token -p ${CI_JOB_TOKEN} ${CI_REGISTRY}" \ | ||
                
      
         | 
||
| --filesystem=gcs \ | ||
| --filesystem=s3 \ | ||
| --filesystem=azure_storage \ | ||
| --endpoint=http \ | ||
| --endpoint=grpc \ | ||
| --endpoint=sagemaker \ | ||
| --endpoint=vertex-ai \ | ||
| --upstream-container-version=${TRITON_CONTAINER_VERSION} \ | ||
| --backend=python:r${TRITON_CONTAINER_VERSION} \ | ||
| --backend=vllm:r${TRITON_CONTAINER_VERSION} \ | ||
| --vllm-version=${VLLM_VERSION} 2>&1 | ||
| # Build Triton Server | ||
| cd server/build | ||
| bash -x ./docker_build | ||
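Before starting a long build, the script could check that both versions were actually supplied. The sketch below assumes the intended flags are `-t` for the Triton container version and `-v` for the vLLM version, mirroring the `t:v:` option string; the function name `parse_versions` and the usage message are hypothetical.

```bash
#!/bin/bash
# Hypothetical helper: parses -t <triton version> and -v <vllm version>,
# then prints both values; fails if either one is missing.
parse_versions() {
    local flag OPTIND=1 triton="" vllm=""
    while getopts "t:v:" flag; do
        case "${flag}" in
            t) triton=${OPTARG} ;;
            v) vllm=${OPTARG} ;;
            *) return 1 ;;
        esac
    done
    if [ -z "${triton}" ] || [ -z "${vllm}" ]; then
        echo "usage: -t <triton version> -v <vllm version>" >&2
        return 1
    fi
    echo "Triton version is ${triton} and vllm version is ${vllm}"
}

parse_versions -t 24.08 -v 0.5.5
```

Failing early avoids cloning the server repository and invoking `build.py` with an empty `--vllm-version` value.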
could you please clarify why the version 0.5.5 was picked?
We have a passing pipeline with this version. Hence, I picked this as the default until a new version is tested and verified.
I think we should show the latest version we support, or wait until we have tests indicating that we can migrate to the latest. Since we don't support 0.5.5 and vLLM's latest version is now 0.6.1.post2, I can see this badge confusing users.
done
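The badge update discussed in this thread is a manual step. One way it might be automated is to generate the badge Markdown from the versions the pipeline verified. The sketch below assumes shields.io static badges (`https://img.shields.io/badge/<label>-<message>-<color>`), where literal dashes and underscores are doubled and spaces become underscores. The PR does not state which badge service it uses, and the function names, labels, and colors are hypothetical.

```bash
#!/bin/bash
# Escape text for a shields.io static badge path segment:
# "-" becomes "--", "_" becomes "__", and spaces become "_".
badge_escape() {
    local text=$1
    text=${text//-/--}
    text=${text//_/__}
    text=${text// /_}
    echo "${text}"
}

# Hypothetical helper: prints a Markdown image for a static badge.
badge_markdown() {
    local label=$1 message=$2 color=${3:-green}
    echo "![${label}](https://img.shields.io/badge/$(badge_escape "$label")-$(badge_escape "$message")-${color})"
}

badge_markdown "vLLM" "0.5.3.post1"
badge_markdown "Triton container" "24.08" "blue"
```

A scheduled job could write these lines into the README only after the pipeline passes, so the badges would show versions that have been tested.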