-
-
Couldn't load subscription status.
- Fork 1.2k
go uv-first #3120
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
go uv-first #3120
Conversation
|
Important Review skippedAuto incremental reviews are disabled on this repository. Please check the settings in the CodeRabbit UI or the You can disable this status message by setting the 📝 WalkthroughWalkthroughMigration to uv-based package management across CI, Docker, scripts, and docs. Packaging moves from setup.py/requirements*.txt to explicit pyproject.toml with src layout. Multiple GitHub workflows and Dockerfiles updated to uv, base image tags adjusted, dynamic dependency tooling removed, and documentation updated accordingly. Changes
Estimated code review effort🎯 4 (Complex) | ⏱️ ~75 minutes Possibly related PRs
Suggested reviewers
✨ Finishing touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
|
📖 Documentation Preview: https://68e147d3d626edbbb2feb89e--resonant-treacle-0fd729.netlify.app Deployed on Netlify from commit ffb307a |
Codecov Report✅ All modified and coverable lines are covered by tests. 📢 Thoughts on this report? Let us know! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (9)
scripts/unsloth_install.py (2)
10-16: Handle CPU-only installs gracefully.torch.version.cuda can be None and cuda.get_device_capability() can raise on CPU; current code raises and breaks CI unnecessarily.
Apply:
+import sys @@ -cuda = str(torch.version.cuda) +cuda_raw = torch.version.cuda +if not torch.cuda.is_available() or cuda_raw is None: + print('echo "CUDA not available; skipping unsloth install"') + sys.exit(0) +cuda = str(cuda_raw) @@ -try: - is_ampere = torch.cuda.get_device_capability()[0] >= 8 -except RuntimeError: - is_ampere = False +try: + is_ampere = torch.cuda.get_device_capability()[0] >= 8 +except Exception: + is_ampere = False @@ -if cuda != "12.1" and cuda != "11.8" and cuda != "12.4": - raise RuntimeError(f"CUDA = {cuda} not supported!") +SUPPORTED_CUDA = {"11.8", "12.1", "12.4", "12.6", "12.8"} +if cuda not in SUPPORTED_CUDA: + print(f'echo "CUDA={cuda} not supported by unsloth; skipping"') + sys.exit(0)
17-33: Don’t hard-fail on newer Torch in CI—skip with a message.Your matrix includes 2.6–2.8; this script raises “too new!” and (combined with piping) may silently skip installs later.
Apply:
@@ -else: - raise RuntimeError(f"Torch = {v} too new!") +else: + print(f'echo "unsloth not available for torch {v}; skipping"') + sys.exit(0)docs/installation.qmd (1)
60-64: Remove “source $HOME/.local/bin/env” (non-standard).This file typically doesn’t exist; users will see “No such file or directory.”
Apply:
curl -LsSf https://astral.sh/uv/install.sh | sh -source $HOME/.local/bin/env +export PATH="$HOME/.local/bin:$PATH".github/workflows/multi-gpu-e2e.yml (1)
66-74: matrix.axolotl_args is undefined.actionlint flags this; it will fail at expression time.
Either add a default to each include or stop echoing it. Example (add empty default):
matrix: include: - cuda: 126 cuda_version: 12.6.3 python_version: "3.11" pytorch: 2.6.0 + axolotl_args: "" axolotl_extras: num_gpus: 2 nightly_build: "true" - cuda: 126 cuda_version: 12.6.3 python_version: "3.11" pytorch: 2.7.1 + axolotl_args: "" axolotl_extras: vllm num_gpus: 2 nightly_build: "true" - cuda: 128 cuda_version: 12.8.1 python_version: "3.11" pytorch: 2.8.0 + axolotl_args: "" axolotl_extras: num_gpus: 2 nightly_build: "true".github/workflows/tests.yml (3)
251-259: matrix.axolotl_args is undefined here as well.Same issue as multi-gpu-e2e; actionlint flags it.
Either define it in the matrix includes or remove the echo:
- echo "AXOLOTL_ARGS=${{ matrix.axolotl_args}}" >> $GITHUB_ENV
314-323: matrix.axolotl_args undefined (repeat).Clean it up as above.
Apply the same diff removing the AXOLOTL_ARGS echo or define the matrix field.
358-367: matrix.axolotl_args undefined (repeat).Clean it up as above.
Apply the same diff removing the AXOLOTL_ARGS echo or define the matrix field.
.github/workflows/tests-nightly.yml (2)
126-133: Matrix variable axolotl_args is undefined (actionlint error).
This will fail at expression evaluation.Fix by providing a default or by adding it to the matrix:
-echo "AXOLOTL_ARGS=${{ matrix.axolotl_args}}" >> $GITHUB_ENV +echo "AXOLOTL_ARGS=${{ matrix.axolotl_args || '' }}" >> $GITHUB_ENVOr define axolotl_args in the matrix include entries.
171-178: Repeat: axolotl_args undefined here too.
Same actionlint finding applies.-echo "AXOLOTL_ARGS=${{ matrix.axolotl_args}}" >> $GITHUB_ENV +echo "AXOLOTL_ARGS=${{ matrix.axolotl_args || '' }}" >> $GITHUB_ENV
🧹 Nitpick comments (29)
.github/CONTRIBUTING.md (1)
34-35: Prefer uv sync and document uv installation.
- Use uv’s resolver/lock with a managed venv for contributors; fall back to --system only when needed.
- Add a quick note/link on installing uv so newcomers aren’t blocked.
-uv pip install -e .[dev] - -pre-commit install -# test -pytest tests/ +## Option A (recommended): managed virtualenv +uv sync --all-extras --dev + +## Option B: install into system Python (CI/containers) +# uv pip install --system -e .[dev] + +pre-commit install +# test +pytest -q.github/workflows/preview-docs.yml (2)
43-47: Pin uv action input for stability.“latest” can introduce surprise CI breakages. Either rely on the pinned action major (v4) with a minimum uv version or pin a known-good.
-uses: astral-sh/setup-uv@v4 -with: - version: "latest" +uses: astral-sh/setup-uv@v4 +with: + version: ">=0.4.28"
50-51: Consider uvx for tool invocations to avoid system installs.Keeps the runner cleaner and uses the resolved versions.
-uv pip install --system jupyter quartodoc -uv pip install --system -e . +uv pip install --system -e . +uvx pip install jupyter quartodoc # or keep as-is; alternatively: +# uvx quartodoc build (see next steps)Follow-up (optional):
- run: quartodoc build + run: uvx quartodoc build.github/workflows/lint.yml (1)
8-13: Trigger lint on uv.lock changes too.If config/deps shift, lint hooks (e.g., import-order, pyproject-based tooling) can change; include uv.lock in paths.
- '**.py' - 'pyproject.toml' + - 'uv.lock' - '.github/workflows/*.yml' - "*.[q]md" - "examples/**/*.y[a]?ml" - ".pre-commit-config.yaml".github/workflows/precommit-autoupdate.yml (2)
21-25: Pin uv input for reproducible CI.-uses: astral-sh/setup-uv@v4 -with: - version: "latest" +uses: astral-sh/setup-uv@v4 +with: + version: ">=0.4.28"
29-29: Use uvx to run pre-commit without installing it system-wide.Simpler and avoids polluting the runner image.
-uv pip install --system pre-commit -pre-commit autoupdate +uvx pre-commit autoupdate.github/workflows/docs.yml (2)
23-26: Pin uv instead of “latest” for reproducible CIAvoid surprise breakages from upstream. Pin a specific uv version (and bump intentionally).
Example:
- with: - version: "latest" + with: + version: "0.X.Y" # pin and bump deliberately
29-30: Use uv sync and the lockfile; avoid ad-hoc installs for deterministic docs buildsDirect “uv pip install …” ignores uv.lock. Prefer syncing the project (optionally with a docs extra) and running quartodoc via the project env.
- uv pip install --system jupyter quartodoc - uv pip install --system -e . + uv sync --frozenAlso update the build step to use the project env:
- - name: Build autodoc - run: quartodoc build + - name: Build autodoc + run: uv run quartodoc build.runpod/Dockerfile (1)
4-5: Harden uv install and reduce image size
- Pin uv version in the installer (supply-chain hygiene).
- Prune uv cache after install to trim the image.
- If an “-uv” base image exists for this stack, prefer it to avoid curl|sh at build time.
-RUN curl -LsSf https://astral.sh/uv/install.sh | sh && \ - /root/.local/bin/uv pip install --system -r /requirements.txt +RUN curl -LsSf https://astral.sh/uv/install.sh | sh -s -- -v 0.X.Y && \ + /root/.local/bin/uv pip install --system -r /requirements.txt && \ + /root/.local/bin/uv cache prune -aIf you choose to pin via an ARG:
# add near top (outside the changed hunk) ARG UV_VERSION=0.X.YAnd then:
-curl -LsSf https://astral.sh/uv/install.sh | sh -s -- -v 0.X.Y +curl -LsSf https://astral.sh/uv/install.sh | sh -s -- -v ${UV_VERSION}scripts/cutcrossentropy_install.py (1)
24-29: Align messaging and flags with the repo’s uv-first posture
- The ImportError suggests “pip install torch”. Recommend updating to uv for consistency.
- Optional: include “--system” in the uninstall line for symmetry (install already uses it).
Proposed tweak (outside this hunk):
-except ImportError as exc: - raise ImportError("Install torch via `pip install torch`") from exc +except ImportError as exc: + raise ImportError("Install torch via `uv add torch` (project) or `uv pip install --system torch`") from excIf you want symmetry on uninstall (not strictly required):
- UNINSTALL_PREFIX = "uv pip uninstall -y cut-cross-entropy && " + UNINSTALL_PREFIX = "uv pip uninstall --system -y cut-cross-entropy && "src/axolotl/integrations/cut_cross_entropy/README.md (1)
20-23: Add “--system” to manual install to mirror scripts and avoid env ambiguityThis matches the printed command from scripts/cutcrossentropy_install.py and prevents accidental venv creation/mismatch.
-uv pip uninstall -y cut-cross-entropy && uv pip install "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@c6a32c5" +uv pip uninstall --system -y cut-cross-entropy && uv pip install --system "cut-cross-entropy[transformers] @ git+https://github.com/axolotl-ai-cloud/ml-cross-entropy.git@c6a32c5".github/workflows/pypi.yml (2)
41-45: Pin the uv installer action and/or uv version.Using “latest” can break releases if upstream changes.
Example:
- with: - version: "latest" + with: + version: "0.5.x"
46-50: Avoid installing dev extras on the publish job.They’re unnecessary for building; they slow the workflow and can pull test-only deps.
Lean build:
- - name: Install dependencies - run: | - uv pip install --system wheel packaging==23.2 - uv pip install --system --no-build-isolation -e ".[dev]" + - name: Install build tooling + run: | + uv pip install --system wheel packaging==23.2 build @@ - - name: Build package - run: | - uv pip install --system build - python -m build + - name: Build package + run: python -m buildAlso applies to: 55-59
scripts/unsloth_install.py (1)
6-6: Update the guidance to uv-first.The ImportError message still instructs “pip install torch”.
Use:
- raise ImportError("Install torch via `pip install torch`") from error + raise ImportError("Install PyTorch first (e.g., `uv pip install torch` or follow https://pytorch.org/get-started/locally/)") from errordocs/installation.qmd (3)
22-30: Clarify Torch install guidance now that uv can install it automatically.This callout conflicts with uv-first messaging.
Proposed text:
- “uv will install the appropriate PyTorch build automatically. To pin a specific CUDA build, set UV_TORCH_BACKEND (see Advanced uv Installation) or follow PyTorch’s official instructions.”
32-43: Add optional CUDA backend pin to quick start.Helpful for GPU users to get the right wheel on first try.
Example:
For a quick installation with uv: ```{.bash} # Install uv if not already installed curl -LsSf https://astral.sh/uv/install.sh | sh +export UV_TORCH_BACKEND=cu126 # or cu128 / cpu @@ uv pip install --no-build-isolation axolotl--- `74-81`: **Unnecessary bootstrap deps.** packaging/setuptools/wheel aren’t needed before installing torch with uv. You can drop the first line: ```diff -uv pip install packaging setuptools wheel uv pip install torch==2.6.0 uv pip install awscli pydantic.github/workflows/multi-gpu-e2e.yml (1)
57-61: Pin the uv installer action and/or uv version.Avoid “latest” drift in CI.
Same suggestion as in pypi.yml.
.github/workflows/tests.yml (2)
80-81: Pin torchvision to the matching torch series.Mixing major/minor can cause ABI/runtime issues.
Example:
- uv pip install --system torch==${{ matrix.pytorch_version }} torchvision + uv pip install --system torch==${{ matrix.pytorch_version }} "torchvision==${{ matrix.pytorch_version }}.*"
242-246: Pin the uv installer action and/or uv version.Avoid breakage from upstream “latest”.
Same suggestion as in other workflows.
Also applies to: 305-309, 350-354
.github/workflows/tests-nightly.yml (2)
55-62: sed-based pyproject.toml edits are brittle.
If the dependency line changes (different quoting, specifier, extras), sed may miss or overmatch.Use a small Python TOML edit to be robust:
- sed -i 's#"transformers==.*"#"transformers @ git+https://github.com/huggingface/transformers.git@main"#' pyproject.toml + python - <<'PY' + import tomllib, tomli_w, re + p="pyproject.toml" + data=tomllib.loads(open(p,"rb").read()) + deps=data["project"]["dependencies"] + def repl(name,url): + for i,s in enumerate(deps): + if re.fullmatch(fr'"?{name}==[^"]+"?', s) or s.startswith(f'{name}=='): + deps[i]=f'{name} @ {url}' + repl("transformers","git+https://github.com/huggingface/transformers.git@main") + repl("peft","git+https://github.com/huggingface/peft.git@main") + repl("accelerate","git+https://github.com/huggingface/accelerate.git@main") + repl("trl","git+https://github.com/huggingface/trl.git@main") + repl("datasets","git+https://github.com/huggingface/datasets.git@main") + open(p,"wb").write(tomli_w.dumps(data).encode()) + PY
181-181: Align modal invocation style with earlier step.
Use -m here as well if you intend module mode.-modal run cicd.multigpu +modal run -m cicd.multigpupyproject.toml (2)
108-147: Heavy defaults; consider slimming base deps.
Gradio, lm_eval, and other tools inflate base install. If not essential at runtime, move them into extras to reduce install surface and solver pressure.Happy to propose a split if desired.
162-188: Duplicate “dev” dependency declarations (project extras vs tool.uv).
Maintaining both risks drift.Pick one source of truth (recommend: keep [project.optional-dependencies].dev) and remove [tool.uv].dev-dependencies, then install with -e ".[dev]".
-[tool.uv] -dev-dependencies = [ - "pytest", - "pytest-cov", - "pytest-xdist", - "pre-commit", - "ruff", - "mypy", -] +[tool.uv] +# dev-dependencies intentionally omitted; use project extra: uv pip install -e ".[dev]"Also applies to: 255-263
docker/Dockerfile (1)
31-31: Pin pytest under uv too (system).
Keep install target consistent.-uv pip install pytest +uv pip install --system pytestcicd/Dockerfile.jinja (4)
1-1: Fix hadolint parser error (DL1000) on templated FROMHadolint can’t parse Jinja in FROM, causing “unexpected 'B'”. Either ignore DL1000 or switch to ARG+parameter expansion so the file remains valid Docker syntax.
Apply one of the following:
Option A (ignore):
+## hadolint ignore=DL1000 FROM axolotlai/axolotl-base-uv:{{ BASE_TAG }}Option B (Docker-parseable + templated default):
- FROM axolotlai/axolotl-base-uv:{{ BASE_TAG }} +ARG BASE_TAG="{{ BASE_TAG }}" +FROM axolotlai/axolotl-base-uv:${BASE_TAG}
13-15: Reduce image size: clean apt lists in same layerApt cache isn’t cleaned; this bloats the image.
RUN apt-get update && \ - apt-get install -y --allow-change-held-packages vim curl nano libnccl2 libnccl-dev ibverbs-providers ibverbs-utils infiniband-diags librdmacm-dev librdmacm1 rdmacm-utils slurm-wlm + apt-get install -y --allow-change-held-packages vim curl nano libnccl2 libnccl-dev ibverbs-providers ibverbs-utils infiniband-diags librdmacm-dev librdmacm1 rdmacm-utils slurm-wlm && \ + rm -rf /var/lib/apt/lists/*
45-45: Align dev install with lock and avoid duplicate editable installsThis performs a second editable install, re-resolving deps and bypassing uv.lock.
- RUN uv pip install -e .[dev] + # Deterministic dev deps; reuse lock + RUN uv sync --frozen -E devIf you keep pip form, add
--system:- RUN uv pip install -e .[dev] + RUN uv pip install --system -e ".[dev]"
34-34: Install with--systemto avoid stray venvs
In Docker, prefer--systemto install directly into the image’s Python and prevent accidental.venvcreation.-RUN uv pip install packaging==23.2 setuptools==75.8.0 +RUN uv pip install --system packaging==23.2 setuptools==75.8.0
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
⛔ Files ignored due to path filters (1)
uv.lockis excluded by!**/*.lock
📒 Files selected for processing (28)
.coveragerc(0 hunks).github/CONTRIBUTING.md(1 hunks).github/workflows/docs.yml(1 hunks).github/workflows/lint.yml(1 hunks).github/workflows/multi-gpu-e2e.yml(2 hunks).github/workflows/precommit-autoupdate.yml(1 hunks).github/workflows/preview-docs.yml(1 hunks).github/workflows/pypi.yml(1 hunks).github/workflows/tests-nightly.yml(4 hunks).github/workflows/tests.yml(6 hunks).runpod/Dockerfile(1 hunks)MANIFEST.in(1 hunks)README.md(1 hunks)cicd/Dockerfile-uv.jinja(0 hunks)cicd/Dockerfile.jinja(2 hunks)cicd/multigpu.py(1 hunks)cicd/single_gpu.py(1 hunks)docker/Dockerfile(2 hunks)docs/installation.qmd(3 hunks)pyproject.toml(3 hunks)requirements-dev.txt(0 hunks)requirements-tests.txt(0 hunks)requirements.txt(0 hunks)scripts/cutcrossentropy_install.py(1 hunks)scripts/unsloth_install.py(1 hunks)setup.py(0 hunks)src/axolotl/integrations/cut_cross_entropy/README.md(1 hunks)src/setuptools_axolotl_dynamic_dependencies.py(0 hunks)
💤 Files with no reviewable changes (7)
- .coveragerc
- requirements-dev.txt
- requirements-tests.txt
- cicd/Dockerfile-uv.jinja
- src/setuptools_axolotl_dynamic_dependencies.py
- setup.py
- requirements.txt
🧰 Additional context used
🪛 actionlint (1.7.7)
.github/workflows/pypi.yml
53-53: workflow command "set-output" was deprecated. use echo "{name}={value}" >> $GITHUB_OUTPUT instead: https://docs.github.com/en/actions/using-workflows/workflow-commands-for-github-actions
(deprecated-commands)
.github/workflows/tests.yml
250-250: property "axolotl_args" is not defined in object type {axolotl_extras: string; cuda: number; cuda_version: string; dockerfile: string; num_gpus: number; python_version: number; pytorch: string}
(expression)
313-313: property "axolotl_args" is not defined in object type {axolotl_extras: string; cuda: number; cuda_version: string; num_gpus: number; python_version: number; pytorch: string}
(expression)
358-358: property "axolotl_args" is not defined in object type {axolotl_extras: string; cuda: number; cuda_version: string; num_gpus: number; python_version: number; pytorch: string}
(expression)
.github/workflows/multi-gpu-e2e.yml
65-65: property "axolotl_args" is not defined in object type {axolotl_extras: string; cuda: number; cuda_version: string; nightly_build: bool; num_gpus: number; python_version: number; pytorch: string}
(expression)
.github/workflows/tests-nightly.yml
125-125: property "axolotl_args" is not defined in object type {axolotl_extras: string; cuda: number; cuda_version: string; nightly_build: bool; num_gpus: number; python_version: number; pytorch: string}
(expression)
170-170: property "axolotl_args" is not defined in object type {axolotl_extras: string; cuda: number; cuda_version: string; nightly_build: bool; num_gpus: number; python_version: number; pytorch: string}
(expression)
🪛 Hadolint (2.12.0)
cicd/Dockerfile.jinja
[error] 1-1: unexpected 'B'
expecting '#', '', ADD, ARG, CMD, COPY, ENTRYPOINT, ENV, EXPOSE, FROM, HEALTHCHECK, LABEL, MAINTAINER, ONBUILD, RUN, SHELL, STOPSIGNAL, USER, VOLUME, WORKDIR, a pragma, at least one space, or end of input
(DL1000)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (7)
- GitHub Check: PyTest from Source Dist (3.11, 2.7.1)
- GitHub Check: preview
- GitHub Check: PyTest (3.11, 2.8.0)
- GitHub Check: PyTest from Source Dist (3.11, 2.6.0)
- GitHub Check: PyTest from Source Dist (3.11, 2.8.0)
- GitHub Check: PyTest (3.11, 2.7.1)
- GitHub Check: PyTest (3.11, 2.6.0)
🔇 Additional comments (12)
scripts/unsloth_install.py (1)
35-35: Use a single line, but keep it safe if earlier guards exit.The combined command is fine given the new early exits; LGTM once guards are added.
.github/workflows/tests-nightly.yml (4)
46-49: Good move switching to uv in CI.
This consolidates tooling and should speed up env setup.
53-53: Pin torchvision to the torch matrix to avoid ABI/API mismatches.
torchvision often requires a torch-matched build; installing unpinned next to a pinned torch can break imports at runtime.Suggested tweak:
-uv pip install --system torch==${{ matrix.pytorch_version }} torchvision +uv pip install --system "torch==${{ matrix.pytorch_version }}" "torchvision==${{ matrix.pytorch_version }}"
65-67: Nice: uv install respects preinstalled torch and editable dev extra.
This aligns with the PR’s “uv-first” goal.
136-137: Use -m consistently for modal runs or drop it everywhere.
Single-GPU uses -m; multi-GPU below does not. Prefer consistency to avoid ambiguity between file vs module mode.-modal run -m cicd.e2e_tests +modal run -m cicd.e2e_tests(If you choose not to use -m, remove it in both places.)
pyproject.toml (2)
96-106: Verify availability of platform-marked deps.
Ensure flash-attn, deepspeed, deepspeed-kernels resolve from your configured indexes on CI runners; otherwise lock or provide extra find-links.I can script-check PyPI availability if useful.
144-144: Confirm correct optimizer package
The dependency[email protected]is a valid PyPI package; verify this is the intended library. If you meant to usetorch-optimizer, replace it withtorch-optimizer==0.3.0(or your desired version).cicd/multigpu.py (1)
26-26: Default BASE_TAG updated to -uv variant — good alignment.
Matches the uv-first Docker base images.cicd/single_gpu.py (1)
26-26: Default BASE_TAG updated to -uv variant — LGTM.
Keeps single-GPU path consistent with multi-GPU and Dockerfile.docker/Dockerfile (2)
1-2: Switch to -uv base — LGTM.
Consistent with the CI and Modal changes.
25-28: Install to system site-packages for clarity.
Without --system, uv may create/target a venv; installing to system avoids PATH surprises in downstream shells/entrypoints.-uv pip install --no-build-isolation -e .[ring-flash-attn,optimizers,ray,$AXOLOTL_EXTRAS] $AXOLOTL_ARGS +uv pip install --system --no-build-isolation -e .[ring-flash-attn,optimizers,ray,$AXOLOTL_EXTRAS] $AXOLOTL_ARGS @@ -uv pip install --no-build-isolation -e .[ring-flash-attn,optimizers,ray] $AXOLOTL_ARGS +uv pip install --system --no-build-isolation -e .[ring-flash-attn,optimizers,ray] $AXOLOTL_ARGSLikely an incorrect or invalid review comment.
cicd/Dockerfile.jinja (1)
3-3: Confirm CUDA arch list and PTX fallbackNew value adds 9.0+PTX and drops “+PTX” from 8.6. Verify target GPUs (e.g., H100, A100) and that PTX fallback choice is intentional to balance binary size vs compatibility.
17b1baf to
3a4295c
Compare
|
Opening early for CI. |
|
This is pretty much good to go, save for migrating all the docker images to build on the uv base image. The old base images should be dropped (I think?). Frankly, I'm a bit bad at docker so hopefully I didn't break anything 😅 |
Description
Title. Move to
uvoverpipcommands for installing from PyPI (uv add) and from source (uv sync), update Docker images and CI to useuveverywhere.torch is now installed as part of the regular installation process. However,
uvwill respect the pre-installed torch version in the current environment (so long as it's within the pyproject.toml bounds.I've also made flash-attn and deepspeed platform-dependent, non-optional dependencies, since it seems we always recommend users to install (and seems like we only did this for macos incompatibility reasons?), but of course this is up for debate.
Motivation and Context
uvis quite fast --> translates to users' install times, CI build timesuvhas nice features (better resolution -> fewer dependency conflicts, torch autoselect (technically no need to preinstall torch unless you want an older supported version), automatic wheel selection, deterministic resolution)setuptools_scmauto-versioning 😌auto-gptq->gptqmodel(caused uv locking issues)How has this been tested?
Tests CI comparison
pip:uv:Source install comparison
System info:
Warm start
To compare apples to apples, I started with no torch installed and benchmarked the from-source install runtimes for both main (
pip install -e . ...) and this branch (uv sync).pip: 2m 45s (script)uv: 2s (script)Thanks to pre-resolving dependencies (via
uv.lock),uvis super fast.The real-world runtime is not quite right: in both cases, the needed packages are already cached locally so there's a warm start.
Cold start
Note that this benchmark will depends on my download speed (peak ~500MiB/s).
pip: 4m 34s (script)uv: 1m 24s (script)pipis ~3.25x slower thanuv.In real-world use, I expect the average user will have some packages cached, but not all, so the runtimes should be somewhere between the warm and cold start cases. Anyways, we still save ~2-3m in either case.
Summary by CodeRabbit