hf-vllm v0.11.0 #179
Open
fgbelidji wants to merge 21 commits into awslabs:main from fgbelidji:hf-vllm
Changes from 18 commits
Commits
e7471c8 Added hf-vllm container (fgbelidji)
7bc79b8 changed release_utils for new backend vllm (fgbelidji)
7a983cf changed releases.json (fgbelidji)
525d8da Adjusted target to sagemaker (fgbelidji)
0c56121 fix file copy (fgbelidji)
bb4849c merge commit
d423b85 Added hf-vllm 0.10.2
662905c updated releases.json for first hf-vllm image
6654dac Removed v0.10.0
66188dc Updated wheel size limit and missing packages
5f4c764 updated tests for HF-VLLM
8dc1a54 Added dockerfile with vllm 0.11.0 and its base image
4416840 Updated hf-vllm tests
9ff43b6 whitelisted ffmpeg
d87625b Merge branch 'main' into hf-vllm (fgbelidji)
6f418d9 Typo ffmpeg cve
14aa8af Merge branch 'hf-vllm' of github.com:fgbelidji/llm-hosting-container …
e988bfe Fix python verion pre-build"
c9d8d56 Removed version 0.10.2
af6a54a cleaning code
1104252 cleaning code
583 changes: 583 additions & 0 deletions
huggingface/pytorch/hf-vllm/docker/0.10.2/THIRD-PARTY-LICENSES
Large diffs are not rendered by default.
557 changes: 557 additions & 0 deletions
huggingface/pytorch/hf-vllm/docker/0.10.2/gpu/Dockerfile
Large diffs are not rendered by default.
Empty file.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,42 @@ | ||
| ARG FINAL_BASE_IMAGE=763104351884.dkr.ecr.us-west-2.amazonaws.com/vllm:0.11.0-gpu-py312-cu128-ubuntu22.04-sagemaker-v1.7 | ||
| FROM ${FINAL_BASE_IMAGE} AS vllm-base | ||
| | ||
| LABEL maintainer="Amazon AI" | ||
| LABEL dlc_major_version="1" | ||
| | ||
| ARG HUGGINGFACE_HUB_VERSION=0.36.0 | ||
| ARG HF_XET_VERSION=1.2.0 | ||
| | ||
| RUN apt-get update -y \ | ||
|     && apt-get install -y --no-install-recommends curl unzip \ | ||
|     && rm -rf /var/lib/apt/lists/* | ||
| | ||
| | ||
| RUN pip install --upgrade pip && \ | ||
|     pip install --no-cache-dir \ | ||
|     huggingface-hub==${HUGGINGFACE_HUB_VERSION} \ | ||
|     hf-xet==${HF_XET_VERSION} \ | ||
|     grpcio | ||
| | ||
| | ||
| FROM vllm-base AS sagemaker | ||
| ENV HF_HUB_ENABLE_HF_TRANSFER="1" \ | ||
|     HF_HUB_USER_AGENT_ORIGIN="aws:sagemaker:gpu-cuda:inference:hf-vllm" | ||
| | ||
| RUN set -eux; \ | ||
|     HOME_DIR=/root; \ | ||
|     uv pip install --system --upgrade pip requests PTable; \ | ||
|     curl -o ${HOME_DIR}/oss_compliance.zip https://aws-dlinfra-utilities.s3.amazonaws.com/oss_compliance.zip; \ | ||
|     unzip ${HOME_DIR}/oss_compliance.zip -d ${HOME_DIR}/; \ | ||
|     cp ${HOME_DIR}/oss_compliance/test/testOSSCompliance /usr/local/bin/testOSSCompliance; \ | ||
|     chmod +x /usr/local/bin/testOSSCompliance; \ | ||
|     chmod +x ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh; \ | ||
|     ${HOME_DIR}/oss_compliance/generate_oss_compliance.sh ${HOME_DIR} python3; \ | ||
|     rm -rf ${HOME_DIR}/oss_compliance* | ||
| | ||
| COPY /huggingface/pytorch/hf-vllm/docker/0.11.0/THIRD-PARTY-LICENSES /root/THIRD-PARTY-LICENSES | ||
| | ||
| ENTRYPOINT ["/usr/local/bin/sagemaker_entrypoint.sh"] | ||
| | ||
| | ||
| | ||
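The tag on `FINAL_BASE_IMAGE` carries the same fields as the HF-VLLM entry in `releases.json` in this PR (version, device, Python, CUDA, OS). As a rough sketch, a check along the following lines could flag a Dockerfile whose base image has drifted from its release entry. The tag layout is inferred from the single tag shown above, and `parse_base_image_tag` and `find_mismatches` are hypothetical helpers that are not part of `release_utils`.

```python
import re

# Tag layout inferred from the one tag in this Dockerfile (assumption);
# other image tags may follow a different scheme.
TAG_PATTERN = re.compile(
    r"^(?P<version>\d+\.\d+\.\d+)"
    r"-(?P<device>[a-z0-9]+)"
    r"-(?P<python_version>py\d+)"
    r"-(?P<cuda_version>cu\d+)"
    r"-(?P<os_version>ubuntu\d+\.\d+)"
    r"-sagemaker(?:-v[\d.]+)?$"
)


def parse_base_image_tag(image_uri: str) -> dict:
    """Split the tag of an image URI into release-style fields (hypothetical helper)."""
    _, separator, tag = image_uri.rpartition(":")
    if not separator:
        raise ValueError(f"No tag found in image URI: {image_uri}")
    match = TAG_PATTERN.match(tag)
    if match is None:
        raise ValueError(f"Unrecognized tag layout: {tag}")
    return match.groupdict()


def find_mismatches(image_uri: str, release: dict) -> dict:
    """Return {field: (tag_value, release_value)} for every field that disagrees."""
    parsed = parse_base_image_tag(image_uri)
    return {
        key: (value, release[key])
        for key, value in parsed.items()
        if key in release and release[key] != value
    }


# Values copied from the Dockerfile and the releases.json entry in this PR.
BASE_IMAGE = (
    "763104351884.dkr.ecr.us-west-2.amazonaws.com/"
    "vllm:0.11.0-gpu-py312-cu128-ubuntu22.04-sagemaker-v1.7"
)
RELEASE = {
    "framework": "HF-VLLM",
    "device": "gpu",
    "version": "0.11.0",
    "os_version": "ubuntu22.04",
    "python_version": "py312",
    "pytorch_version": "2.8.0",
    "cuda_version": "cu128",
}

print(find_mismatches(BASE_IMAGE, RELEASE))  # prints {}
```

The PyTorch version is not encoded in the tag, so this sketch cannot check it.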
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,51 @@ | ||
| | ||
| version: 0.2 | ||
| | ||
| env: | ||
|   shell: bash | ||
|   variables: | ||
|     FRAMEWORK_FOLDER: "huggingface/pytorch/hf-vllm/docker" | ||
|     PYTHONPATH: "/codebuild/output/src*/src/github.com/awslabs/llm-hosting-container" | ||
| | ||
| phases: | ||
|   install: | ||
|     runtime-versions: | ||
|       python: 3.12 | ||
|     commands: | ||
|       - echo "Installing Python version 3.12 ..." | ||
|       - pyenv global $PYTHON_312_VERSION | ||
| | ||
|   pre_build: | ||
|     commands: | ||
|       - echo Pre-build started on `date` | ||
|       - export PYTHONPATH=$(pwd):$PYTHONPATH | ||
| | ||
|       # Continue with regular pre-build steps if BUILD_REQUIRED=true | ||
|       - \| | ||
|         echo Setting up Docker buildx. | ||
|         docker buildx version | ||
|         docker buildx create --name builder --driver docker-container --buildkitd-flags '--allow-insecure-entitlement security.insecure --allow-insecure-entitlement network.host' --use | ||
|         docker buildx inspect --bootstrap --builder builder | ||
|         docker buildx install | ||
|         echo Preparing system dependencies for execution. | ||
|         docker --version | ||
|         docker login -u $DOCKER_USERNAME -p $DOCKER_PASSWORD | ||
|         curl -LO http://repo.continuum.io/miniconda/Miniconda3-latest-Linux-x86_64.sh | ||
|         bash Miniconda3-latest-Linux-x86_64.sh -bfp /miniconda3 | ||
|         export PATH=/miniconda3/bin:${PATH} | ||
|         conda install python=3.12 | ||
|         conda update -y conda | ||
|         echo Prepare HF_VLLM dependencies for execution. | ||
|         mkdir hf-vllm-artifacts | ||
|         python -m pip install -r $FRAMEWORK_FOLDER/hf-vllm-requirements.txt | ||
| | ||
|   build: | ||
|     commands: | ||
|       - \| | ||
|         echo "Current PYTHONPATH: $PYTHONPATH" | ||
|         python $FRAMEWORK_FOLDER/hf-vllm.py | ||
| | ||
|   post_build: | ||
|     commands: | ||
|       - \| | ||
|         echo Build completed on `date` | ||
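The buildspec and the `hf-vllm.py` driver it invokes rely on several environment variables that are supplied from outside the file: `DOCKER_USERNAME` and `DOCKER_PASSWORD` for `docker login`, `MODE` to select the entry point, and `TEST_ROLE_ARN` for the test step. A minimal preflight sketch, assuming those names are the complete set and that `TEST_ROLE_ARN` is only needed in the `PR` and `TEST` modes, might look like this. `missing_variables` is a hypothetical helper, not something this PR adds.

```python
# Variables read by the buildspec commands or by hf-vllm.py, per the diffs in this PR.
ALWAYS_REQUIRED = ("MODE", "DOCKER_USERNAME", "DOCKER_PASSWORD")

# Assumption: TEST_ROLE_ARN is only read by test(), which runs in PR and TEST modes.
REQUIRED_BY_MODE = {
    "PR": ("TEST_ROLE_ARN",),
    "TEST": ("TEST_ROLE_ARN",),
}


def missing_variables(environ) -> list:
    """Return the names of required variables that are unset or empty."""
    mode = environ.get("MODE", "")
    required = ALWAYS_REQUIRED + REQUIRED_BY_MODE.get(mode, ())
    return [name for name in required if not environ.get(name)]


# Example with a plain dict; in a real build this would be os.environ.
example_env = {"MODE": "PR", "DOCKER_USERNAME": "user", "DOCKER_PASSWORD": "secret"}
print(missing_variables(example_env))  # prints ['TEST_ROLE_ARN']
```

Running such a check at the start of `pre_build` would surface a missing variable before the Docker and Miniconda setup begins, rather than partway through the build.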
10 changes: 10 additions & 0 deletions
huggingface/pytorch/hf-vllm/docker/hf-vllm-requirements.txt
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,10 @@ | ||
| boto3 | ||
| dataclasses | ||
| docker | ||
| gitpython | ||
| sagemaker | ||
| | ||
| parameterized | ||
| pytest | ||
| pytest-mock | ||
| pytest-xdist | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,131 @@ | ||
| import git | ||
| import logging | ||
| import os | ||
| import shutil | ||
| import subprocess | ||
| import time | ||
| | ||
| from huggingface.pytorch.release_utils import ( | ||
|     GIT_REPO_DOCKERFILES_ROOT_DIRECTORY, | ||
|     GIT_REPO_PYTEST_PATH, | ||
|     LOG, | ||
|     Aws, | ||
|     DockerClient, | ||
|     EnvironmentVariable, | ||
|     Mode, | ||
|     ReleaseConfigs | ||
| ) | ||
| | ||
| GIT_REPO_HF_VLLM_LOCAL_FOLDER_NAME = "hf-vllm" | ||
| GIT_REPO_HF_VLLM_TAG_PATTERN = "v{version}" | ||
| GIT_REPO_HF_VLLM_URL = "https://github.com/vllm-project/vllm.git" | ||
| | ||
| def build(configs: ReleaseConfigs): | ||
|     """Builds the Docker image for the provided configs.""" | ||
|     aws = Aws() | ||
|     docker_client = DockerClient() | ||
|     for config in configs.releases: | ||
|         LOG.info(f"Going to build image for config: {config}.") | ||
|         image_uri = config.get_image_uri_for_staging() | ||
|         if aws.does_ecr_image_exist(image_uri): | ||
|             LOG.info(f"Skipping already built image '{image_uri}'. Config: {config}.") | ||
|             continue | ||
| | ||
|         LOG.info(f"Setting up build prerequisites for release config with version: {config.version}") | ||
|         build_path = GIT_REPO_HF_VLLM_LOCAL_FOLDER_NAME | ||
|         shutil.rmtree(GIT_REPO_HF_VLLM_LOCAL_FOLDER_NAME, ignore_errors=True) | ||
|         hf_vllm_repo = git.Repo.clone_from(GIT_REPO_HF_VLLM_URL, GIT_REPO_HF_VLLM_LOCAL_FOLDER_NAME, no_checkout=True) | ||
|         hf_vllm_repo_tag = GIT_REPO_HF_VLLM_TAG_PATTERN.format(version=config.version) | ||
|         hf_vllm_repo.git.checkout(hf_vllm_repo_tag) | ||
|         LOG.info(f"Checked out {hf_vllm_repo} with tag: {hf_vllm_repo_tag} to {GIT_REPO_HF_VLLM_LOCAL_FOLDER_NAME}.") | ||
|         shutil.copytree(GIT_REPO_DOCKERFILES_ROOT_DIRECTORY, | ||
|                         os.path.join(GIT_REPO_HF_VLLM_LOCAL_FOLDER_NAME, GIT_REPO_DOCKERFILES_ROOT_DIRECTORY)) | ||
|         LOG.info(f"Copied '{GIT_REPO_DOCKERFILES_ROOT_DIRECTORY}' directory to HF_VLLM directory for 'COPY' command.") | ||
| | ||
|         dockerfile_path = config.get_dockerfile_path() | ||
|         LOG.info(f"Building Dockerfile: '{dockerfile_path}'. This may take a while...") | ||
|         docker_client.build(image_uri=image_uri, dockerfile_path=dockerfile_path, build_path=build_path) | ||
| | ||
|         username, password = aws.get_ecr_credentials(image_uri) | ||
|         docker_client.login(username, password, image_uri) | ||
|         docker_client.push(image_uri) | ||
| | ||
| def test(configs: ReleaseConfigs): | ||
|     """Runs SageMaker tests for the Docker images associated with the provided configs and current git commit.""" | ||
|     aws = Aws() | ||
|     for config in configs.releases: | ||
|         LOG.info(f"Going to test built image for config: {config}.") | ||
|         test_role_arn = os.getenv(EnvironmentVariable.TEST_ROLE_ARN.name) | ||
|         test_session = aws.get_session_for_role(test_role_arn) | ||
|         test_credentials = test_session.get_credentials() | ||
|         environ = os.environ.copy() | ||
|         environ.update({ | ||
|             "DEVICE_TYPE": config.device.lower(), | ||
|             "AWS_ACCESS_KEY_ID": test_credentials.access_key, | ||
|             "AWS_SECRET_ACCESS_KEY": test_credentials.secret_key, | ||
|             "AWS_SESSION_TOKEN": test_credentials.token, | ||
|             "IMAGE_URI": config.get_image_uri_for_staging(), | ||
|             "TEST_ROLE_ARN": test_role_arn }) | ||
| | ||
|         command = ["pytest", "-m", config.device.lower(), "-n", "auto", "--log-cli-level", "info", GIT_REPO_PYTEST_PATH] | ||
|         LOG.info(f"Running test command: {command}.") | ||
|         process = subprocess.run(command, env=environ, encoding="utf-8", capture_output=True) | ||
|         LOG.info(process.stdout) | ||
|         assert process.returncode == 0, f"Failed with config: {config}.\nError: {process.stderr}." | ||
|         LOG.info(f"Finished testing image with config: {config}.") | ||
| | ||
| | ||
| def pr(configs: ReleaseConfigs): | ||
|     """Executes both build and test modes.""" | ||
|     build(configs) | ||
|     test(configs) | ||
| | ||
| def release(configs: ReleaseConfigs): | ||
|     """trigger SMFrameworks algo release pipeline""" | ||
|     aws = Aws() | ||
|     docker_client = DockerClient() | ||
|     for config in configs.releases: | ||
|         LOG.info(f"Releasing image associated for config: {config}.") | ||
|         released_image_uri = config.get_image_uri_for_released() | ||
|         if aws.does_ecr_image_exist(released_image_uri): | ||
|             LOG.info(f"Skipping already released image '{released_image_uri}'. Config: {config}.") | ||
|             continue | ||
| | ||
|         staged_image_uri = config.get_image_uri_for_staging() | ||
|         username, password = aws.get_ecr_credentials(staged_image_uri) | ||
|         docker_client.login(username, password, staged_image_uri) | ||
|         docker_client.prune_all() | ||
|         docker_client.pull(staged_image_uri) | ||
| | ||
|         docker_client.login(username, password, staged_image_uri) | ||
|         docker_client.tag(staged_image_uri, released_image_uri) | ||
|         docker_client.push(released_image_uri) | ||
| | ||
|         js_uris = config.get_image_uris_for_jumpstart() | ||
|         username, password = aws.get_ecr_credentials(js_uris[0]) | ||
|         docker_client.login(username, password, js_uris[0]) | ||
|         for js_uri in js_uris: | ||
|             docker_client.tag(staged_image_uri, js_uri) | ||
|             docker_client.push(js_uri) | ||
|         LOG.info(f"Release marked as complete for following config ({js_uris}): {config}") | ||
| | ||
| | ||
| if __name__ == "__main__": | ||
|     logging.basicConfig( | ||
|         level=logging.INFO, | ||
|         format="%(asctime)s %(levelname)-8s %(message)s", | ||
|         datefmt="%Y-%m-%d %H:%M:%S") | ||
|     configs = ReleaseConfigs() | ||
|     configs.validate() | ||
|     mode = os.getenv(EnvironmentVariable.MODE.name) | ||
|     LOG.info(f"Mode has been set to: {mode}.") | ||
|     if mode == Mode.PR.name: | ||
|         pr(configs) | ||
|     elif mode == Mode.BUILD.name: | ||
|         build(configs) | ||
|     elif mode == Mode.TEST.name: | ||
|         test(configs) | ||
|     elif mode == Mode.RELEASE.name: | ||
|         release(configs) | ||
|     else: | ||
|         raise ValueError(f"The mode '{mode}' is not recognized. Please set it correctly.'") | ||
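Both `build()` and `release()` skip a config when the target image already exists in ECR, which is what makes the script safe to re-run. The sketch below models the `release()` guard with an in-memory stand-in so the skip behaviour can be exercised without AWS or Docker. `FakeRegistry`, `release_once`, and the example URIs are all hypothetical; the real `Aws` and `DockerClient` helpers live in `release_utils` and are not shown in this diff.

```python
from dataclasses import dataclass, field


@dataclass
class FakeRegistry:
    """Hypothetical in-memory stand-in for ECR plus the Docker client."""

    images: set = field(default_factory=set)

    def exists(self, uri: str) -> bool:
        return uri in self.images

    def push(self, uri: str) -> None:
        self.images.add(uri)


def release_once(registry, staged_uri, released_uri, jumpstart_uris):
    """Promote one staged image and return the URIs pushed by this call."""
    if registry.exists(released_uri):
        # Mirrors the guard in release(): only the released URI is consulted.
        return []
    if not registry.exists(staged_uri):
        # Stands in for a failed pull of the staged image.
        raise LookupError(f"Staged image not found: {staged_uri}")
    pushed = []
    for target in [released_uri, *jumpstart_uris]:
        registry.push(target)
        pushed.append(target)
    return pushed


# Placeholder URIs for illustration only.
STAGED = "staging/hf-vllm:0.11.0"
RELEASED = "released/hf-vllm:0.11.0"
JUMPSTART = ["jumpstart/hf-vllm:0.11.0"]

registry = FakeRegistry(images={STAGED})
first_run = release_once(registry, STAGED, RELEASED, JUMPSTART)
second_run = release_once(registry, STAGED, RELEASED, JUMPSTART)
print(first_run)   # both targets are pushed on the first run
print(second_run)  # prints [] because the released image already exists
```

One property this makes visible: because the guard looks only at the released URI, a run that fails after pushing it but before pushing the JumpStart URIs would be skipped on retry. Whether that matters depends on how the release pipeline is retried in practice.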
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -124,30 +124,39 @@ | ||
| "python_version": "py310", | ||
| "pytorch_version": "2.0.1" | ||
| } | ||
|
|
||
fgbelidji marked this conversation as resolved.
| ], | ||
| "HF-VLLM": [ | ||
| { | ||
| "device": "gpu", | ||
| "min_version": "0.10.2", | ||
| "max_version": "0.11.0", | ||
| "os_version": "ubuntu22.04", | ||
| "cuda_version": "cu128", | ||
| "python_version": "py312", | ||
| "pytorch_version": "2.8.0" | ||
| } | ||
| ] | ||
| }, | ||
| "ignore_vulnerabilities": [ | ||
| "CVE-2024-42154 - linux", | ||
| "CVE-2025-32434 - torch", | ||
| "CVE-2024-48063 - torch" | ||
| "CVE-2024-48063 - torch", | ||
| "CVE-2024-35366 -- ffmpeg", | ||
| "CVE-2024-35367 -- ffmpeg", | ||
| "CVE-2024-35368 -- ffmpeg" | ||
fgbelidji marked this conversation as resolved.
| ], | ||
| "releases": [ | ||
| { | ||
| "framework": "TGI", | ||
| "framework": "HF-VLLM", | ||
| "device": "gpu", | ||
| "version": "3.3.6", | ||
| "os_version": "ubuntu22.04", | ||
| "cuda_version": "cu124", | ||
| "python_version": "py311", | ||
| "pytorch_version": "2.7.0" | ||
| }, | ||
| { | ||
| "framework": "TGI", | ||
| "device": "inf2", | ||
| "version": "3.3.6", | ||
| "version": "0.11.0", | ||
| "os_version": "ubuntu22.04", | ||
| "python_version": "py310", | ||
| "pytorch_version": "2.7.0" | ||
| "python_version": "py312", | ||
| "pytorch_version": "2.8.0", | ||
| "cuda_version": "cu128" | ||
| } | ||
|
|
||
fgbelidji marked this conversation as resolved.
| ] | ||
| } | ||
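The new HF-VLLM block declares a permitted range (`min_version` 0.10.2 to `max_version` 0.11.0) alongside the stack it applies to, and the release entry has to fall inside it. How `ReleaseConfigs.validate()` enforces this is not shown in the diff, so the following is only a sketch of one way such a check could work. `parse_version` and `is_permitted` are hypothetical names, and the comparison assumes purely numeric dotted versions.

```python
MATCH_KEYS = ("device", "os_version", "cuda_version", "python_version", "pytorch_version")


def parse_version(version: str) -> tuple:
    """Turn '0.11.0' into (0, 11, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def is_permitted(release: dict, combinations: list) -> bool:
    """Check a release against a list of permitted combinations (assumed shape)."""
    wanted = parse_version(release["version"])
    for combo in combinations:
        same_stack = all(release.get(key) == combo.get(key) for key in MATCH_KEYS)
        low = parse_version(combo["min_version"])
        high = parse_version(combo["max_version"])
        if same_stack and low <= wanted <= high:
            return True
    return False


# Values copied from the releases.json diff in this PR.
HF_VLLM_COMBINATIONS = [{
    "device": "gpu",
    "min_version": "0.10.2",
    "max_version": "0.11.0",
    "os_version": "ubuntu22.04",
    "cuda_version": "cu128",
    "python_version": "py312",
    "pytorch_version": "2.8.0",
}]
RELEASE = {
    "framework": "HF-VLLM",
    "device": "gpu",
    "version": "0.11.0",
    "os_version": "ubuntu22.04",
    "python_version": "py312",
    "pytorch_version": "2.8.0",
    "cuda_version": "cu128",
}

print(is_permitted(RELEASE, HF_VLLM_COMBINATIONS))  # prints True
```

Parsing into integer tuples matters here: compared as plain strings, `"0.9.0"` sorts after `"0.11.0"`, which would let an out-of-range version slip through.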
Can you also remove the file if it is empty? We want to keep the codebase clean.