feat(pm): internalize git dependency resolution into ruborist #2603

Open
killagu-claw wants to merge 1 commit into utooland:next from killagu-claw:feat/add-git

@killagu-claw

Summary

  • Move git clone & resolution logic into utoo-ruborist behind a git Cargo feature flag
  • Git specs (git+https://, github:, etc.) are now resolved on-the-fly during BFS traversal via resolve_non_registry_dep, so transitive git dependencies work correctly
  • PM no longer depends on gix directly — it enables utoo-ruborist/git instead
  • BuildDepsOptions<G, R> stays at 2 generics; no GitResolver trait in public API
  • Add shared git_cache_path helper to deduplicate SHA-prefix cache logic
  • Clean up dead is_local_spec alias, clarify worker.rs skip comment
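The feature gating described in the summary can be sketched roughly as below. This is an illustration under assumptions, not the code in this PR: only the function name `resolve_non_registry_dep`, the `git` feature name, and the existence of a `Git` error variant come from the PR. The signature, the `String` payloads, and the error message are hypothetical.

```rust
/// Hypothetical stand-in for ruborist's `ResolveError`. The PR adds a `Git`
/// variant; the `String` payload here is an assumption.
#[derive(Debug, PartialEq)]
pub enum ResolveError {
    Git(String),
}

/// Compiled only when the crate is built with `--features git`.
/// A real implementation would clone the repository and read its
/// package.json; this placeholder just echoes the spec back.
#[cfg(feature = "git")]
pub fn resolve_non_registry_dep(spec: &str) -> Result<String, ResolveError> {
    Ok(spec.to_string())
}

/// Compiled when the `git` feature is off. Callers still link against the
/// same symbol, but get an explicit error instead of a missing function.
#[cfg(not(feature = "git"))]
pub fn resolve_non_registry_dep(spec: &str) -> Result<String, ResolveError> {
    Err(ResolveError::Git(format!(
        "cannot resolve `{spec}`: built without the `git` feature"
    )))
}
```

With this shape the BFS code can call `resolve_non_registry_dep` unconditionally, and the choice between real resolution and an explicit error is made once at compile time.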

Test plan

  • cargo build -p utoo-ruborist (without git feature)
  • cargo build -p utoo-ruborist --features git
  • cargo build -p utoo-pm
  • cargo test -p utoo-ruborist — all 122 tests pass (1 pre-existing doctest failure on non-darwin)
  • cargo test -p utoo-pm — 190/191 pass (1 pre-existing failure in test_run_script_not_found)
  • Manual: ut add github:user/repo resolves inside BFS

🤖 Generated with Claude Code

@gemini-code-assist

Summary of Changes

Hello @killagu-claw, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly refactors how git dependencies are managed within the project's package management system. By moving the core git cloning and resolution capabilities into the utoo-ruborist library, it streamlines the dependency resolution process, particularly for non-registry packages. This change improves the consistency and efficiency of handling git-based dependencies, ensuring they are properly cached and resolved during the dependency graph construction, while also reducing the direct dependency burden on the utoo-pm crate.

Highlights

  • Internalized Git Dependency Resolution: Moved git cloning and resolution logic from utoo-pm into utoo-ruborist, enabling it via a new git Cargo feature flag. This centralizes git handling within the dependency resolver.
  • On-the-Fly Git Spec Resolution: Implemented resolution of git specifications (e.g., git+https://, github:) directly within utoo-ruborist's BFS traversal via resolve_non_registry_dep, ensuring transitive git dependencies are correctly handled.
  • Reduced utoo-pm Direct Git Dependency: utoo-pm no longer directly depends on the gix crate; instead, it enables the utoo-ruborist/git feature, delegating git operations to the core resolver.
  • Git Package Caching: Introduced a shared git_cache_path helper and logic to deduplicate git package cloning, caching resolved git repositories by their commit SHA prefix.
  • Enhanced Package Spec Parsing: Added a new PackageSpec enum and parse_cli_spec function in utoo-ruborist to robustly handle various dependency specifiers, including registry, git, GitHub shorthand, HTTP tarball, and file paths.
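To make the spec-parsing highlight concrete, here is a minimal sketch of what a typed spec and a CLI parser could look like. The variant names (`Registry`, `Git`, `GitHub`, `Http`, `FileDir`, `FileTarball`) and the function name `parse_cli_spec` are taken from this PR's summary; every field, the helper `split_committish`, and the exact matching rules are assumptions made for illustration.

```rust
/// Typed dependency source. Variant names follow the PR summary;
/// the fields are hypothetical.
#[derive(Debug, Clone, PartialEq, Eq)]
pub enum PackageSpec {
    Registry { name: String, range: Option<String> },
    Git { url: String, committish: Option<String> },
    GitHub { owner: String, repo: String, committish: Option<String> },
    Http { url: String },
    FileDir { path: String },
    FileTarball { path: String },
}

/// Hypothetical helper: split `target#rev` into the target and an optional rev.
fn split_committish(s: &str) -> (&str, Option<String>) {
    match s.split_once('#') {
        Some((head, rev)) if !rev.is_empty() => (head, Some(rev.to_string())),
        Some((head, _)) => (head, None),
        None => (s, None),
    }
}

/// Classify a CLI argument by prefix, falling back to a registry spec.
pub fn parse_cli_spec(input: &str) -> PackageSpec {
    if let Some(rest) = input.strip_prefix("github:") {
        let (path, committish) = split_committish(rest);
        if let Some((owner, repo)) = path.split_once('/') {
            return PackageSpec::GitHub {
                owner: owner.to_string(),
                repo: repo.to_string(),
                committish,
            };
        }
    }
    if input.starts_with("git+") || input.starts_with("git://") {
        let (url, committish) = split_committish(input);
        return PackageSpec::Git { url: url.to_string(), committish };
    }
    if input.starts_with("http://") || input.starts_with("https://") {
        return PackageSpec::Http { url: input.to_string() };
    }
    if let Some(path) = input.strip_prefix("file:") {
        let path = path.to_string();
        return if path.ends_with(".tgz") || path.ends_with(".tar.gz") {
            PackageSpec::FileTarball { path }
        } else {
            PackageSpec::FileDir { path }
        };
    }
    // Registry form: `name` or `name@range`. Skip the first character so the
    // leading `@` of a scoped name is not mistaken for the range separator.
    match input.char_indices().skip(1).find(|&(_, c)| c == '@') {
        Some((i, _)) => PackageSpec::Registry {
            name: input[..i].to_string(),
            range: Some(input[i + 1..].to_string()),
        },
        None => PackageSpec::Registry { name: input.to_string(), range: None },
    }
}
```

Matching on an enum like this lets later stages decide between registry lookup and git resolution without re-inspecting string prefixes.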

Changelog

  • Cargo.lock
    • Added new dependencies: regex-automata, clru, faster-hex, libz-rs-sys, gix and its numerous sub-crates, kstring, maybe-async, prodash, futures-channel, sha1-checked, unicode-bom, tempfile, and zlib-rs.
  • crates/pm/Cargo.toml
    • Enabled the git feature for the utoo-ruborist dependency.
  • crates/pm/src/helper/lock.rs
    • Removed the parse_package_spec import, as git spec parsing is now handled by utoo-ruborist.
    • Modified update_package_json to write resolved git URLs (with pinned commit SHAs) for git specs.
    • Refactored resolve_package_spec to use utoo_ruborist::spec::parse_cli_spec and delegate git/GitHub spec resolution to new git_resolver functions.
  • crates/pm/src/helper/ruborist_context.rs
    • Removed a blank line for minor formatting.
  • crates/pm/src/service/pipeline/worker.rs
    • Added logic to skip downloading git packages in the pipeline worker, as they are now pre-cached during BFS resolution.
  • crates/pm/src/util/downloader.rs
    • Implemented a check to return cached git package paths directly if the tarball_url indicates a git source, avoiding redundant downloads.
  • crates/pm/src/util/git_resolver.rs
    • Added a new module to provide thin wrappers around utoo-ruborist's git cloning functionality, injecting the PM cache directory.
    • Introduced resolve_git_spec and resolve_github_spec for resolving git and GitHub shorthand URLs.
    • Included is_git_resolved utility function.
  • crates/pm/src/util/mod.rs
    • Added git_resolver module to the utilities.
  • crates/ruborist/Cargo.toml
    • Added gix and tempfile as optional dependencies.
    • Introduced a new git feature flag to enable git-related functionality.
  • crates/ruborist/src/lib.rs
    • Exported new spec and git modules for package specification types and git resolution helpers, respectively.
  • crates/ruborist/src/model/mod.rs
    • Added spec module to the model definitions.
  • crates/ruborist/src/model/spec.rs
    • Added a new module defining PackageSpec enum for typed representation of various dependency sources (Registry, Git, GitHub, Http, FileDir, FileTarball).
    • Implemented is_http_tarball_spec and parse_cli_spec for parsing CLI arguments into PackageSpec.
  • crates/ruborist/src/resolver/builder.rs
    • Added cache_dir option to BuildDepsConfig for git clone caching.
    • Updated gather_preload_deps to use is_non_registry_spec for filtering, including git and other non-registry types.
    • Modified process_dependency to use the new BuildDepsConfig and route non-registry specs through resolve_non_registry_dep.
    • Adjusted run_bfs_phase to pass the BuildDepsConfig for dependency processing.
  • crates/ruborist/src/resolver/git.rs
    • Added a new module containing the gix-powered git clone backend.
    • Implemented git_cache_path for determining cache locations based on commit SHA prefixes.
    • Provided clone_repo for cloning git repositories, handling authentication, shallow fetches, and extracting tree contents to a cache directory.
    • Included helper functions for GitHub authentication and tree extraction.
  • crates/ruborist/src/resolver/mod.rs
    • Added the git module to the resolver, conditionally compiled with the git feature.
  • crates/ruborist/src/resolver/preload.rs
    • Replaced is_local_spec with is_non_registry_spec to include git, GitHub, and HTTP tarball specs in the non-registry category.
    • Updated collect_deps to filter out these new non-registry specs during preloading.
  • crates/ruborist/src/resolver/registry.rs
    • Added a new Git error variant to ResolveError for git resolution failures.
  • crates/ruborist/src/service/api.rs
    • Removed the BuildDepsOptions::new constructor.
    • Passed the cache_dir from BuildDepsOptions to the BuildDepsConfig for use in git dependency resolution.
  • crates/ruborist/src/traits/git.rs
    • Added a new module defining traits and structures for git package resolution.
    • Introduced GitCloneResult to encapsulate metadata from a git clone operation.
    • Provided resolve_non_registry_dep as a high-level function to resolve non-registry specs, clone repositories, and build VersionManifest from cached package.json files.
  • crates/ruborist/src/traits/mod.rs
    • Added the git module to the traits.
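The builder changes listed above route non-registry edges through a separate resolver while the BFS is running, which is what lets transitive git dependencies be discovered. The following is a simplified, synchronous sketch of that idea and not the PR's implementation: `Manifest`, `bfs_resolve`, the closure-based resolvers, and the prefix list inside `is_non_registry_spec` are all assumptions. Only the name `is_non_registry_spec` and the routing idea come from the changelog.

```rust
use std::collections::{HashSet, VecDeque};

/// Hypothetical resolved package: an identifier plus its outgoing
/// dependency edges as (name, spec) pairs.
#[derive(Debug, Clone)]
pub struct Manifest {
    pub id: String,
    pub deps: Vec<(String, String)>,
}

/// Assumed classification by prefix; the real rules may differ.
pub fn is_non_registry_spec(spec: &str) -> bool {
    ["git+", "git://", "github:", "file:", "http://", "https://"]
        .iter()
        .any(|prefix| spec.starts_with(prefix))
}

/// Breadth-first traversal that picks a resolver per edge. Because a git
/// package's own dependencies are enqueued like any other, git specs that
/// appear deeper in the graph are resolved too.
pub fn bfs_resolve<R, N>(
    root: Manifest,
    mut registry: R,
    mut non_registry: N,
) -> Result<Vec<String>, String>
where
    R: FnMut(&str, &str) -> Result<Manifest, String>,
    N: FnMut(&str, &str) -> Result<Manifest, String>,
{
    let mut order = Vec::new();
    let mut seen = HashSet::new();
    let mut queue = VecDeque::new();

    seen.insert(root.id.clone());
    queue.push_back(root);

    while let Some(pkg) = queue.pop_front() {
        order.push(pkg.id.clone());
        for (name, spec) in &pkg.deps {
            let child = if is_non_registry_spec(spec) {
                non_registry(name, spec)?
            } else {
                registry(name, spec)?
            };
            // Visit each resolved package once.
            if seen.insert(child.id.clone()) {
                queue.push_back(child);
            }
        }
    }
    Ok(order)
}
```

In the sketch, a git-resolved package whose manifest lists another git spec causes a second call to the non-registry resolver one level deeper, which is the transitive case the PR description calls out.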

Activity

  • The author, killagu-claw, created this pull request to internalize git dependency resolution into utoo-ruborist.
  • The author provided a detailed summary of the changes and a test plan, including cargo build and cargo test commands for both utoo-ruborist (with and without the git feature) and utoo-pm, as well as a manual test for ut add github:user/repo.

@gemini-code-assist (bot) left a comment

Code Review

This pull request introduces a significant feature by internalizing git dependency resolution into ruborist. The changes are well-structured, moving git-related logic behind a feature flag and providing a clear separation between registry and git dependency handling. The implementation uses gix for git operations and includes caching for cloned repositories. The parsing of different package specifications is now centralized and more robust.

I've found one potential issue regarding cache collisions for git repositories and a minor point of code clarity. Overall, this is a great enhancement to the package manager's capabilities.

// The resolver still emits a PackageResolved event with a git tarball URL,
// so we intentionally skip the download here to avoid a redundant fetch.
if tarball_url.starts_with("git+") || tarball_url.starts_with("git://") {
    tracing::debug!(
In pipeline mode, the worker should also perform the clone up front.


[features]
default = []
git = ["gix", "tempfile"]
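Naming the feature `git` is not very meaningful. Could git/github dependencies fall back to an HTTP API instead?

The wasm scenario should be supported as well, or else it should fail with an explicit error via cfg.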

Let's leave this as it is for now. I'll open a separate PR to make gix work on wasm too. This PR is already getting too large.

Move git clone & resolution logic into `utoo-ruborist` behind a
`native-git` Cargo feature flag.  Git specs (git+https://, github:,
etc.) are now resolved on-the-fly during BFS traversal via
`resolve_non_registry_dep`, so transitive git dependencies work.

Key changes
-----------
- Add gix/tempfile as optional deps under `native-git` feature
- Add `resolve_non_registry_dep` with dual cfg (enabled / noop error)
- Route `is_non_registry_spec` edges through git resolver in BFS
- Add `PackageSpec` enum and `parse_cli_spec` for typed spec parsing
- Add shared `git_cache_path` helper with URL-hash namespacing
  to prevent cross-repo SHA prefix collisions
  (layout: `_git/<url_hash>/<sha_prefix>/package/`)
- Extract `format_save_spec` for clean version-to-write logic
- Extract `resolve_cache_path` / `is_git_url` / `git_cache_lookup`
  so download_to_cache only handles registry tarballs
- PM enables `utoo-ruborist/native-git` instead of depending on gix
- `BuildDepsOptions<G, R>` stays at 2 generics; no git types leak to PM

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
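
The cache layout named in the commit message (`_git/<url_hash>/<sha_prefix>/package/`) can be sketched as below. Only the function name `git_cache_path` and the directory layout come from the commit message. The signature, the choice of FNV-1a as the URL hash, and the 12-character SHA prefix length are assumptions made for this illustration.

```rust
use std::path::{Path, PathBuf};

/// FNV-1a 64-bit hash, used here only because it is short and deterministic.
/// The hash actually used by the PR is not stated.
fn fnv1a64(bytes: &[u8]) -> u64 {
    let mut hash: u64 = 0xcbf2_9ce4_8422_2325; // FNV offset basis
    for &b in bytes {
        hash ^= u64::from(b);
        hash = hash.wrapping_mul(0x0000_0100_0000_01b3); // FNV prime
    }
    hash
}

/// Assumed prefix length; the PR does not state the real value.
const SHA_PREFIX_LEN: usize = 12;

/// Build `<cache_dir>/_git/<url_hash>/<sha_prefix>/package`.
/// Namespacing by URL hash keeps two repositories whose commits share a
/// SHA prefix from landing in the same directory.
pub fn git_cache_path(cache_dir: &Path, url: &str, commit_sha: &str) -> PathBuf {
    let url_hash = format!("{:016x}", fnv1a64(url.as_bytes()));
    let sha_prefix: String = commit_sha.chars().take(SHA_PREFIX_LEN).collect();
    cache_dir
        .join("_git")
        .join(url_hash)
        .join(sha_prefix)
        .join("package")
}
```

Because the path depends only on the URL and the commit SHA, a repeated resolution of the same pinned commit maps to the same directory, so an existing checkout can be reused instead of cloned again.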