Simplifying the Docker Stack #946
-
I'll add my points here:
@dxqb can provide more insights into the specifics of Runpod and Vast images. Links for reference:
-
Thanks for working on this.
This will probably limit how much you can unify the Dockerfiles, but it might still be possible to a degree?
-
Thanks for the explanation. I still see a possibility for unification if we can use a build argument for the base image, as long as the OneTrainer layer can be built on top of such an image. The first criteria that come to mind are that the base image is Debian-based and ships with Python. I’ve never worked on multi-target Docker setups, but I know OneTrainer isn’t the first project to run into this issue, so I could borrow a few practices from other projects. For now, I’ll focus on updating and cleaning up the local Docker setup, then on reducing duplicated configuration between the local, Vast, and RunPod environments. I’ll consider the feasibility of full unification as I go. I'll open a draft PR once my first objectives are complete.
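To make the build-argument idea concrete, here is a minimal sketch of a Dockerfile whose base image is chosen at build time. It is not taken from the repository: the default tag, the /opt/OneTrainer path, and the assumption that dependencies install from a requirements.txt at the repository root are all placeholders for illustration.

```dockerfile
# Hypothetical sketch: the default tag, install path, and requirements file
# are assumptions, not the project's actual layout.

# An ARG declared before FROM can be used to select the base image.
ARG BASE_IMAGE=python:3.12-slim
FROM ${BASE_IMAGE}

# Shared OneTrainer layer, built the same way on top of any base image
# that is Debian-based and already provides Python and pip.
WORKDIR /opt/OneTrainer
COPY . .
RUN python -m pip install --no-cache-dir -r requirements.txt
```

A provider-specific build would then only override the argument, for example `docker build --build-arg BASE_IMAGE=<provider base image> -t onetrainer .`, while the local build keeps the default.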
-
Just a quick note: it'd be even better if we could build the image and publish it to the GitHub Container Registry (GHCR) from a GitHub Actions workflow.
-
This is a follow-up to #926 and the suggestion I made there, since @dxqb asked me to open a PR for it.
I’d like to clarify the current status of the Dockerfiles and discuss whether it would be possible to unify them into a single Dockerfile. That way, my PR could improve all the existing use cases in one go.
I can provide a base image for OneTrainer with a leaner, better-documented Dockerfile, including file ownership synchronization and automatic volumes for user-generated data. However, I’m not sure what “Vast” (added in #894) and “RunPod” are, or how they should be integrated into the upcoming changeset. Also, I would suggest removing NVIDIA-UI.dockerfile: CUDA dependencies are already bundled with the PyTorch and related Linux wheels, so a dedicated image for that use case doesn’t seem necessary.
For context, I’m currently working on Comfy-Org/ComfyUI#9305 to implement Docker support for ComfyUI, which has lots of similarities with this project, so this is effectively a two-birds-one-stone effort.
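As a rough illustration of the file ownership synchronization and automatic volumes mentioned above, one common pattern is to create the container user with the host user's IDs and to declare the data directory as a volume. This is only a sketch under assumptions: the base tag, the user name `trainer`, the argument names `HOST_UID`/`HOST_GID`, and the /app/workspace path are all hypothetical, and the actual PR may take a different approach.

```dockerfile
# Hypothetical sketch: base tag, user name, argument names, and paths
# are assumptions made for illustration.
FROM python:3.12-slim

# Build-time IDs so that files written to bind mounts are owned by the host user.
ARG HOST_UID=1000
ARG HOST_GID=1000

# Create an unprivileged user whose UID/GID match the host user.
RUN groupadd --gid "${HOST_GID}" trainer \
 && useradd --uid "${HOST_UID}" --gid "${HOST_GID}" --create-home trainer

WORKDIR /app

# Create the data directory and hand it to that user before declaring the
# volume, so a freshly created volume inherits the same ownership.
RUN mkdir -p /app/workspace && chown -R trainer:trainer /app

# If nothing is mounted here at `docker run`, Docker creates an anonymous volume.
VOLUME ["/app/workspace"]

USER trainer
```

It could be built with `docker build --build-arg HOST_UID="$(id -u)" --build-arg HOST_GID="$(id -g)" .` so that the IDs follow whoever builds the image. The trade-off is that the IDs are baked in at build time, which suits local use more than a published one-size-fits-all image.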
Best case, we can provide a one-size-fits-all image to avoid duplicate maintenance work.
Thoughts? Can someone fill me in on RunPod and Vast?