update dockerfile #981
Summary of Changes
Hello @shihaobai, I'm Gemini Code Assist¹! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!
This pull request focuses on a significant upgrade and refinement of the project's Docker build environments and core dependencies. The changes aim to modernize the build process by updating base images, CUDA, Python, and Mamba versions, while also optimizing the integration of high-performance kernels like Flash Attention through a shift to vllm and LightKernel.
Highlights
- Docker Environment Modernization: Upgraded the base CUDA image, CUDA version (12.4 to 12.6.1), Python version (3.9 to 3.10), and Mamba version (23.1.0-1 to 24.7.1-0) across both `Dockerfile` and `Dockerfile.deepep`.
- Optimized Kernel Integration: Revised the installation strategy for optimized kernels, moving from a direct `flash-attn` wheel to installing `vllm` and building `LightKernel` from source within the Docker images.
- Core Dependency Updates: Updated key Python packages in `requirements.txt`, including `packaging`, `setuptools`, and `torch`, while streamlining dependencies by removing `torchvision`, `triton`, and `sympy`.
- DeepEP Build Process Refinement: For `Dockerfile.deepep`, the build process for DeepEP was significantly enhanced, incorporating a more robust `gdrcopy` installation, updated NVSHMEM build procedures, and pinning a specific DeepEP commit.
- Flash Attention Interface Adaptation: Modified `flashattention_nopad.py` to adapt to the new `sgl_kernel` interface for Flash Attention, although the new path is currently conditionally disabled.
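The last highlight describes an availability-gated dispatch between the new `sgl_kernel` path and the existing Triton fallback. A minimal sketch of that pattern follows; all names here are illustrative placeholders and do not mirror `flashattention_nopad.py` exactly:

```python
# Illustrative sketch of hardware/availability-gated kernel dispatch.
# All names are placeholders, not the project's actual symbols.

_flash_attn_v3_available = False  # in real code, set by a guarded import


def is_hopper() -> bool:
    """Stub: the real check would inspect the GPU compute capability (sm_90)."""
    return False


def _sgl_kernel_attention(q):
    return ("sgl_kernel", q)


def _triton_attention(q):
    return ("triton", q)


def attention_fwd(q):
    # Prefer the sgl_kernel FA3 path on Hopper when available,
    # otherwise fall back to the Triton implementation.
    if _flash_attn_v3_available and is_hopper():
        return _sgl_kernel_attention(q)
    return _triton_attention(q)
```

With the stubs above, the dispatch always resolves to the Triton fallback, which mirrors the state of the PR while the FA3 path is disabled.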
Using Gemini Code Assist
The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.
Invoking Gemini
You can request assistance from Gemini at any point in your pull request via creating an issue comment (i.e. comment on the pull request page) using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands.
| Feature | Command | Description |
|---|---|---|
| Code Review | `/gemini review` | Performs a code review for the current pull request in its current state. |
| Pull Request Summary | `/gemini summary` | Provides a summary of the current pull request in its current state. |
| Comment | `@gemini-code-assist` | Responds in comments when explicitly tagged, both in issue comments and review comments. |
| Help | `/gemini help` | Displays a list of available commands. |
Customization
To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.
Limitations & Feedback
Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counterproductive. You can react with 👍 and 👎 on @gemini-code-assist comments or fill out our survey to provide feedback.
You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.
Footnotes
1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution. ↩
Code Review
This pull request updates the Dockerfiles and dependencies. My review focuses on Dockerfile best practices for reducing image size and build time, and I've identified a critical issue in the flash attention code that appears to be a leftover debugging flag. Addressing these points will improve the quality and efficiency of the build process and ensure the code is production-ready.
```diff
         then use the interface in sgl_kernel; otherwise use the Triton version.
     """
-    if _flash_attn_v3_available and is_hopper():
+    if _flash_attn_v3_available and is_hopper() and False:
```
The condition `... and False` permanently disables this code path, forcing the function to always use the Triton kernel implementation. This looks like a temporary debugging change that should not be merged. If the `flash_attention_v3_fwd` path is not ready, it should be removed or properly guarded. If it is intended to be used, the `and False` must be removed.
```diff
-    if _flash_attn_v3_available and is_hopper() and False:
+    if _flash_attn_v3_available and is_hopper():
```
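If the FA3 path is meant to stay off while it is validated, an explicit opt-in flag makes that intent visible instead of a hardcoded `and False`. A possible sketch, assuming a hypothetical environment variable that the project does not currently define:

```python
import os

# Hypothetical opt-in switch: the variable name LIGHTLLM_ENABLE_FA3 is an
# assumption for illustration, not a setting the project defines.
_ENABLE_FA3 = os.environ.get("LIGHTLLM_ENABLE_FA3", "0") == "1"


def use_fa3_path(fa3_available: bool, on_hopper: bool) -> bool:
    # Off by default (like `and False`), but testers can enable the path
    # at runtime without editing source.
    return fa3_available and on_hopper and _ENABLE_FA3
```

This keeps the default behavior identical to the current diff while documenting why the path is disabled.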
```dockerfile
RUN apt-get update && apt-get install -y libnuma-dev wget devscripts debhelper dh-make build-essential dkms
RUN apt-get install -y ibverbs-providers infiniband-diags perftest rdma-core libibverbs-dev librdmacm-dev
```
These two `RUN` instructions for `apt-get` should be combined into a single layer to optimize the Docker build process and reduce image size. Also, it's a best practice to:
- Use `--no-install-recommends` to avoid installing unnecessary packages.
- Clean up the apt cache with `rm -rf /var/lib/apt/lists/*` in the same `RUN` command to prevent caching old package lists and to reduce image size.
```dockerfile
RUN apt-get update && apt-get install -y --no-install-recommends \
    libnuma-dev wget devscripts debhelper dh-make build-essential dkms \
    ibverbs-providers infiniband-diags perftest rdma-core libibverbs-dev librdmacm-dev \
    && rm -rf /var/lib/apt/lists/*
```
```dockerfile
RUN pip install --no-cache-dir vllm --pre --extra-index-url https://wheels.vllm.ai/nightly

RUN pip install --no-cache-dir https://github.com/ModelTC/flash-attn-3-build/releases/download/v2.7.4.post1/flash_attn-3.0.0b1-cp39-cp39-linux_x86_64.whl
RUN git clone https://github.com/ModelTC/LightKernel.git && cd LightKernel && pip install --no-deps -v .
```
The cloned `LightKernel` repository is not removed after installation. This increases the final Docker image size. It's a good practice to clean up build artifacts within the same `RUN` layer.
```dockerfile
RUN git clone https://github.com/ModelTC/LightKernel.git && cd LightKernel && pip install --no-deps -v . && cd .. && rm -rf LightKernel
```
```dockerfile
RUN git clone https://github.com/ModelTC/LightKernel.git && cd LightKernel && pip install --no-deps -v .

RUN pip install --no-cache-dir nvidia-nccl-cu12==2.25.1 # for allreduce hang issues in multinode H100
RUN apt-get update && apt-get install -y libnuma-dev # for sgl_kernel
```
To reduce the Docker image size, it's recommended to clean up the apt cache after installing packages. This should be done in the same `RUN` instruction to be effective. Using `--no-install-recommends` is also a good practice to avoid pulling in unnecessary packages.
```dockerfile
RUN apt-get update && apt-get install -y --no-install-recommends libnuma-dev && rm -rf /var/lib/apt/lists/* # for sgl_kernel
```
```dockerfile
RUN pip install --no-cache-dir vllm --pre --extra-index-url https://wheels.vllm.ai/nightly

RUN pip install --no-cache-dir nvidia-nccl-cu12==2.25.1 # for allreduce hang issues in multinode H100
RUN git clone https://github.com/ModelTC/LightKernel.git && cd LightKernel && pip install --no-deps -v .
```
The cloned `LightKernel` repository is not removed after installation. This increases the final Docker image size. It's a good practice to clean up build artifacts within the same `RUN` layer.
```dockerfile
RUN git clone https://github.com/ModelTC/LightKernel.git && cd LightKernel && pip install --no-deps -v . && cd .. && rm -rf LightKernel
```
```diff
-    && cd build \
-    && make install -j64
+    cmake -S . -B build/ -DCMAKE_INSTALL_PREFIX=/root/nvshmem/install -DCMAKE_CUDA_ARCHITECTURES=90 \
+    && cmake --build build --target install -j64
```
```dockerfile
    && cmake --build build --target install -j64

ARG DEEPEP_COMMIT=b6ce310bb0b75079682d09bc2ebc063a074fbd58
RUN git clone https://github.com/deepseek-ai/DeepEP.git && cd DeepEP && git checkout ${DEEPEP_COMMIT} && cd ..
```
No description provided.