Move model storage to the /mnt directory on both the host and the Kin… #792
Xunzhuo
left a comment
cool
yuluo-yx
left a comment
/lgtm
this error looks related to the change:
63c5b56 to 099a5b3
db5dde6 to bd75908
Signed-off-by: Liav Weiss <[email protected]>
bd75908 to eca47b1
You were right, it was related to the changes. The symlink works (you can see my validation):
But I still had to use
The code is ready for review.
thanks!
closed: #782

Move model storage to /mnt directory to prevent disk space issues
Summary
This PR addresses disk space issues in CI/CD workflows by moving model downloads from the root filesystem (~14GB available) to the /mnt directory (~75GB available). This prevents "no space left on device" errors when downloading large models during CI runs.
Problem
GitHub Actions runners were experiencing disk space exhaustion when downloading large models. The root filesystem (/) has only ~14GB available, which is insufficient for model downloads that can reach several GB. The previous workaround of deleting toolchains (~25GB) was:
Solution
This PR implements a comprehensive solution that works for both host-level workflows and Kind cluster-based tests:
Host-level workflows (test-and-build.yml, integration-test-docker.yml):
- /mnt/models directory
- models/ directory to /mnt/models if present
- models/ to /mnt/models/ for backward compatibility
- /mnt disk (~75GB)
Kind cluster workflows (integration-test-k8s.yml):
- /mnt into Kind nodes (control-plane and worker)
- local-path-provisioner ConfigMap to use /mnt/local-path-provisioner instead of /tmp
Changes
Files Modified
- .github/workflows/test-and-build.yml (+38 lines)
- .github/workflows/integration-test-docker.yml (+45 lines, -11 lines)
- .github/workflows/integration-test-k8s.yml (-11 lines)
- e2e/pkg/cluster/kind.go (+49 lines, -2 lines)
  - /mnt mount for control-plane and worker nodes
  - local-path-config ConfigMap to use /mnt/local-path-provisioner
  - local-path-provisioner deployment after patching
Technical Details
Host-Level Implementation
The symlink approach ensures backward compatibility - existing code continues to work without changes:
models/ -> /mnt/models/
Kind Cluster Implementation
- /mnt at container path /mnt
- local-path-provisioner is patched to use /mnt/local-path-provisioner as the base path
Testing
Validation Performed
Benefits
Related Issues
Addresses disk space issues mentioned in PR #623 (follow-up improvement requested by @rootfs)
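The host-level symlink approach and the Kind /mnt mount described under Technical Details can be sketched in shell roughly as follows. This is a minimal sketch, not the workflow code from this PR: the function names relocate_models and write_kind_config are hypothetical, copy-then-remove is one possible way to move the directory across filesystems, and the ConfigMap patch is left out because it needs a running cluster.

```shell
#!/usr/bin/env bash
set -euo pipefail

# relocate_models SRC DEST  (hypothetical helper, not from the PR)
# Moves the contents of SRC into DEST, then leaves a symlink at SRC that
# points to DEST so code that still reads SRC keeps working.
relocate_models() {
  local src="$1" dest="$2"
  mkdir -p "$dest"
  # Only move when SRC is a real directory; an existing symlink is left alone.
  if [ -d "$src" ] && [ ! -L "$src" ]; then
    # Copy (including dotfiles) and then delete, which also works when
    # SRC and DEST live on different filesystems.
    cp -a "$src"/. "$dest"/
    rm -rf "$src"
  fi
  # -n replaces an existing symlink instead of descending into its target.
  ln -sfn "$dest" "$src"
}

# write_kind_config OUT [HOST_DIR]  (hypothetical helper, not from the PR)
# Writes a Kind cluster config that mounts HOST_DIR (default /mnt) at
# container path /mnt on one control-plane node and one worker node.
write_kind_config() {
  local out="$1" host_dir="${2:-/mnt}"
  cat > "$out" <<EOF
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
  - role: control-plane
    extraMounts:
      - hostPath: ${host_dir}
        containerPath: /mnt
  - role: worker
    extraMounts:
      - hostPath: ${host_dir}
        containerPath: /mnt
EOF
}

# Possible use in a CI step (paths from the PR description; writing under
# /mnt may require elevated permissions on the runner):
#   relocate_models models /mnt/models
#   write_kind_config kind-config.yaml
#   kind create cluster --config kind-config.yaml
```

Running relocate_models a second time is harmless: the source is already a symlink, so nothing is copied and the link is simply recreated.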