Merged
32 commits
9c57249
Playing with parser implementation
tjtelan Aug 12, 2025
159661e
end to end parsing wip
tjtelan Aug 14, 2025
103f529
More tests passing
tjtelan Aug 14, 2025
6d8301b
Moving around code
tjtelan Aug 14, 2025
3fc6f34
Moved into derive_builder
tjtelan Aug 16, 2025
87ca5ff
Fixed most of the tests
tjtelan Aug 16, 2025
a1d3163
Tuning getters and setters
tjtelan Aug 19, 2025
d96eca7
Add provider
tjtelan Aug 23, 2025
b8744b5
passing tests for generic provider
tjtelan Aug 24, 2025
6ba1c8f
Default and custom provider parsing and test
tjtelan Aug 27, 2025
b5d5842
Remaining provider tests stubbed out
tjtelan Aug 29, 2025
7778674
fmt and some clippy
tjtelan Aug 29, 2025
75054f9
Checkpoint
tjtelan Sep 4, 2025
30f4eb1
Saving slices works
tjtelan Sep 4, 2025
b19abf2
Move where char is parsed
tjtelan Sep 4, 2025
c4e898e
All fields slices into struct
tjtelan Sep 5, 2025
1175a8d
More cleanup before moving into crate
tjtelan Sep 5, 2025
44118b6
Swapping out structs
tjtelan Sep 5, 2025
b7d6e0e
Everything connects again
tjtelan Sep 5, 2025
a95e34f
Parse tests passing
tjtelan Sep 6, 2025
f20bfed
Added verify steps to parsing
tjtelan Sep 6, 2025
1cd9d4f
Starting cleanup for provider parsers
tjtelan Sep 6, 2025
d847239
Provider parsing and tests
tjtelan Sep 7, 2025
a77de4e
Move raw url spec parsing into module
tjtelan Sep 8, 2025
e78d74c
Updating GitUrlParseError
tjtelan Sep 8, 2025
6d4cbcf
Update logging and docs for release
tjtelan Sep 12, 2025
6653794
Update git cliff config
tjtelan Sep 12, 2025
780d614
Update readme and msrv
tjtelan Sep 13, 2025
03ff539
Fix msrv
tjtelan Sep 13, 2025
4cffa1a
Fix build error in feature
tjtelan Sep 13, 2025
792273c
Update dev-dependencies with version numbers
tjtelan Sep 13, 2025
1d3c6b8
Cleanup in Cargo.toml
tjtelan Sep 13, 2025
32 changes: 20 additions & 12 deletions Cargo.toml
@@ -1,26 +1,34 @@
 [package]
 authors = ["T.J. Telan <[email protected]>"]
-categories = ["parser-implementations", "encoding"]
-description = "A parser for git repo urls based on url crate"
+categories = ["parser-implementations"]
+description = "A parser for urls used by git"
 documentation = "https://docs.rs/git-url-parse"
-edition = "2021"
-keywords = ["git", "url", "parsing", "normalize"]
+edition = "2024"
+keywords = ["git", "url", "parser"]
 license = "MIT"
 name = "git-url-parse"
 readme = "README.md"
 repository = "https://github.com/tjtelan/git-url-parse-rs"
 version = "0.4.6"
-rust-version = "1.82"
+rust-version = "1.85"

 [features]
-default = []
-tracing = ["dep:tracing"]
+default = ["url"]
+# Enable Serialize/Deserialize on structs with `serde` crate
+serde = ["dep:serde"]
+# Enable debugging logging with `log` crate
+log = ["dep:log"]
+# Enable url parsing validation with `url` crate
+url = ["dep:url"]

 [dependencies]
-tracing = { version = "0.1", optional = true }
-url = { version = "2.2" }
-strum = { version = "^0.27", features = ["derive"] }
-thiserror = "^2.0"
+nom = "8"
+getset = "0.1"
+thiserror = "2"
+serde = { version = "1", features = ["derive"], optional = true }
+log = { version = "0.4", optional = true }
+url = { version = "2.5", optional = true }

 [dev-dependencies]
-env_logger = "^0.11"
+env_logger = "0.11"
+log = "0.4"
168 changes: 95 additions & 73 deletions README.md
@@ -1,95 +1,117 @@
# git-url-parse

[![Crates.io](https://img.shields.io/crates/v/git-url-parse)](https://crates.io/crates/git-url-parse)
![Crates.io MSRV](https://img.shields.io/crates/msrv/git-url-parse?label=rust-version)
[![Crates.io Total Downloads](https://img.shields.io/crates/d/git-url-parse?label=Crates.io%20Downloads)](https://crates.io/crates/git-url-parse)
![Crates.io MSRV](https://img.shields.io/crates/msrv/git-url-parse?label=Min%20Supported%20Rust%20version)
[![Github actions CI status](https://github.com/tjtelan/git-url-parse-rs/actions/workflows/ci.yml/badge.svg)](https://github.com/tjtelan/git-url-parse-rs/actions/workflows/ci.yml)
[![docs.rs](https://docs.rs/git-url-parse/badge.svg)](https://docs.rs/git-url-parse/)
[![License](https://img.shields.io/github/license/tjtelan/git-url-parse-rs)](LICENSE)
![Maintenance](https://img.shields.io/maintenance/passively-maintained/2025)

Supports common protocols as specified by the [Pro Git book](https://git-scm.com/book/en/v2)
---

See: [4.1 Git on the Server - The Protocols](https://git-scm.com/book/en/v2/Git-on-the-Server-The-Protocols)
<!-- cargo-rdme start -->

Supports parsing SSH/HTTPS repo urls for:
* Github
* Bitbucket
* Azure Devops
# Git Url Parse

See [tests/parse.rs](tests/parse.rs) for expected output for a variety of inputs.
Parses url used by git (e.g. `git clone <url>`)

---
## Features

- 🔍 Parses `git clone` compatible urls into [`GitUrl`](https://docs.rs/git-url-parse/latest/git_url_parse/types/struct.GitUrl.html)
  - Supports multiple Git URL schemes (SSH, HTTP, HTTPS, File)
  - Inspired by [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986) with adaptations to support Git urls

- 🏗️ Host provider info extraction
  - Easy to implement trait [`GitProvider`](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/trait.GitProvider.html) for custom provider parsing
  - Built-in support for multiple Git hosting providers
    * [Generic](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/struct.GenericProvider.html) (`git@host:owner/repo.git` style urls)
    * [GitLab](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/struct.GitLabProvider.html)
    * [Azure DevOps](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/struct.AzureDevOpsProvider.html)

## Quick Example

```rust
use git_url_parse::{GitUrl, GitUrlParseError};
use git_url_parse::types::provider::GitProvider;
use git_url_parse::types::provider::GenericProvider;

fn main() -> Result<(), git_url_parse::GitUrlParseError> {
    let http_url = GitUrl::parse("https://github.com/tjtelan/git-url-parse-rs.git")?;

    // Extract basic URL components
    assert_eq!(http_url.host(), Some("github.com"));
    assert_eq!(http_url.path(), "/tjtelan/git-url-parse-rs.git");

    // Support ssh-based urls as well
    let ssh_url = GitUrl::parse("git@github.com:tjtelan/git-url-parse-rs.git")?;

    assert_eq!(ssh_url.scheme(), Some("ssh"));
    assert_eq!(ssh_url.host(), Some("github.com"));
    assert_eq!(ssh_url.path(), "tjtelan/git-url-parse-rs.git");

    // Extract provider-specific information
    // Built-in support for Github (Generic), Gitlab, Azure Devops style urls
    let provider: GenericProvider = ssh_url.provider_info()?;
    assert_eq!(provider.owner(), "tjtelan");
    assert_eq!(provider.repo(), "git-url-parse-rs");

    // Implement your own provider
    #[derive(Debug, Clone, PartialEq, Eq)]
    struct CustomProvider;

    impl GitProvider<GitUrl<'_>, GitUrlParseError> for CustomProvider {
        fn from_git_url(_url: &GitUrl) -> Result<Self, GitUrlParseError> {
            // Your custom provider parsing here
            Ok(Self)
        }
    }

    let custom_provider: CustomProvider = ssh_url.provider_info()?;
    let expected = CustomProvider;
    assert_eq!(custom_provider, expected);

    Ok(())
}
```

URLs that use the `ssh://` protocol (implicitly or explicitly) undergo a small normalization process in order to be parsed.
## Limitations

Internally uses `Url::parse()` from the [Url](https://crates.io/crates/url) crate after normalization.
Intended only for git repo urls. Url spec [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986) is not fully implemented.

## Examples
- No support for:
  - Query parameters
  - Fragment identifiers
  - Percent-encoding
  - Complex IP address formats

### Run example with debug output
## Install

```shell
$ RUST_LOG=git_url_parse cargo run --example multi
$ RUST_LOG=git_url_parse cargo run --example trim_auth
cargo add git-url-parse
```

### Simple usage and output
### Cargo Features

```bash
$ cargo run --example readme
```
#### `log`
Enable for internal `debug!` output from [log](https://docs.rs/log/latest)
#### `serde`
Enable for [serde](https://docs.rs/serde/latest/) `Serialize`/`Deserialize` on [`GitUrl`](https://docs.rs/git-url-parse/latest/git_url_parse/types/struct.GitUrl.html)
#### `url`
(**enabled by default**)

```rust
use git_url_parse::GitUrl;
Uses [url](https://docs.rs/url/latest/) during parsing for full url validation

fn main() {
    println!("SSH: {:#?}", GitUrl::parse("git@github.com:tjtelan/git-url-parse-rs.git"));
    println!("HTTPS: {:#?}", GitUrl::parse("https://github.com/tjtelan/git-url-parse-rs"));
}
```
<!-- cargo-rdme end -->

## Migration from 0.4.x and earlier

This crate was one of my first serious projects in Rust. Because I was still learning, it had some maintenance problems and was a bit awkward to use. In version 0.5, I rewrote most of it to fix those issues.

The [`GitUrl`](https://docs.rs/git-url-parse/latest/git_url_parse/types/struct.GitUrl.html) struct is only meant to handle parsing urls used by `git`, which the [url](https://docs.rs/url/latest/url) crate doesn't handle. The recent updates make it so the input string is parsed and internally stored into a simple string slice (`&str`). And, instead of exposing all the internal fields of the struct, those details are hidden, and we use methods to interact with it.
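
To make the borrowed-slice idea concrete, here is a minimal std-only sketch. `UrlParts` and `from_scp_like` are hypothetical names made up for this illustration; they are not the crate's types or its parser. The sketch only shows the general shape: fields kept as `&str` slices of the input, read through getter methods instead of public fields.

```rust
/// Hypothetical illustration type; NOT the crate's `GitUrl`.
/// Both fields borrow from the input string, so nothing is allocated.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct UrlParts<'a> {
    host: &'a str,
    path: &'a str,
}

impl<'a> UrlParts<'a> {
    /// Split an scp-style `user@host:path` string into borrowed slices.
    /// Returns `None` when there is no `:` or a part is empty.
    fn from_scp_like(input: &'a str) -> Option<Self> {
        let (authority, path) = input.split_once(':')?;
        // Drop an optional `user@` prefix from the authority.
        let host = match authority.rsplit_once('@') {
            Some((_user, host)) => host,
            None => authority,
        };
        if host.is_empty() || path.is_empty() {
            return None;
        }
        Some(Self { host, path })
    }

    /// Getters hide the fields; callers only see borrowed slices.
    fn host(&self) -> &'a str {
        self.host
    }

    fn path(&self) -> &'a str {
        self.path
    }
}

fn main() {
    let input = String::from("git@github.com:tjtelan/git-url-parse-rs.git");
    let parts = UrlParts::from_scp_like(&input).expect("scp-style url");
    assert_eq!(parts.host(), "github.com");
    assert_eq!(parts.path(), "tjtelan/git-url-parse-rs.git");
}
```

Because the struct borrows from `input`, it cannot outlive the string it was parsed from; that is the usual trade-off of a slice-based design.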

The [`GitProvider`](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/trait.GitProvider.html) trait helps extract common pieces of information that are often found in different url patterns using the [`GitUrl::provider_info`](https://docs.rs/git-url-parse/latest/git_url_parse/types/struct.GitUrl.html#method.provider_info) method. Several example provider parsers are included to show how this works. The result of [`GitUrl::parse`](https://docs.rs/git-url-parse/latest/git_url_parse/types/struct.GitUrl.html#method.parse) is more straightforward to use, but the internal details are hidden, and working with provider-specific information at the git host level is more specialized.

The most common pattern for git url paths, like `/owner/repo.git`, is handled by [`GenericProvider`](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/struct.GenericProvider.html).

There's also [`AzureDevOpsProvider`](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/struct.AzureDevOpsProvider.html), which is designed for Azure DevOps urls that follow the `org`, `project`, `repo` pattern.
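
As a rough illustration of the two path shapes mentioned above, the following std-only sketch splits a path into its parts. `owner_repo` and `org_project_repo` are hypothetical helpers, not functions from this crate, and skipping a `_git` segment is an assumption based on the Azure DevOps HTTPS layout `/org/project/_git/repo`. The built-in providers may handle more cases than this.

```rust
/// Hypothetical helper: split `/owner/repo.git` into `(owner, repo)`.
fn owner_repo(path: &str) -> Option<(&str, &str)> {
    let trimmed = path.trim_start_matches('/');
    // A trailing `.git` is optional.
    let trimmed = trimmed.strip_suffix(".git").unwrap_or(trimmed);
    let (owner, repo) = trimmed.split_once('/')?;
    // Reject empty parts and paths with more than two segments.
    if owner.is_empty() || repo.is_empty() || repo.contains('/') {
        return None;
    }
    Some((owner, repo))
}

/// Hypothetical helper: split an `org/project/repo` style path.
/// Assumption: a literal `_git` segment is routing noise and is skipped.
fn org_project_repo(path: &str) -> Option<(&str, &str, &str)> {
    let segments: Vec<&str> = path
        .split('/')
        .filter(|s| !s.is_empty() && *s != "_git")
        .collect();
    match segments.as_slice() {
        [org, project, repo] => Some((*org, *project, *repo)),
        _ => None,
    }
}

fn main() {
    assert_eq!(
        owner_repo("/tjtelan/git-url-parse-rs.git"),
        Some(("tjtelan", "git-url-parse-rs"))
    );
    assert_eq!(
        org_project_repo("/org/project/_git/repo"),
        Some(("org", "project", "repo"))
    );
}
```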

### Example Output
```bash
SSH: Ok(
    GitUrl {
        host: Some(
            "github.com",
        ),
        name: "git-url-parse-rs",
        owner: Some(
            "tjtelan",
        ),
        organization: None,
        fullname: "tjtelan/git-url-parse-rs",
        scheme: Ssh,
        user: Some(
            "git",
        ),
        token: None,
        port: None,
        path: "tjtelan/git-url-parse-rs.git",
        git_suffix: true,
        scheme_prefix: false,
    },
)
HTTPS: Ok(
    GitUrl {
        host: Some(
            "github.com",
        ),
        name: "git-url-parse-rs",
        owner: Some(
            "tjtelan",
        ),
        organization: None,
        fullname: "tjtelan/git-url-parse-rs",
        scheme: Https,
        user: None,
        token: None,
        port: None,
        path: "/tjtelan/git-url-parse-rs",
        git_suffix: false,
        scheme_prefix: true,
    },
)
```
Finally, there's a new supported provider called [`GitLabProvider`](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/struct.GitLabProvider.html), which is for GitLab urls. It supports the common `owner/repo` pattern shared with [`GenericProvider`](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/struct.GenericProvider.html), and also handles GitLab’s subgroups.
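
For a sense of what subgroup handling involves, here is a small std-only sketch that treats the first path segment as the owner, the last as the repo, and everything in between as subgroups. `owner_subgroups_repo` is a hypothetical helper written for this illustration; it does not reflect how `GitLabProvider` is implemented or which accessors it exposes.

```rust
/// Hypothetical helper: split `/owner/sub1/sub2/repo.git` into
/// `(owner, subgroups, repo)`. With no subgroups the middle list is empty.
fn owner_subgroups_repo(path: &str) -> Option<(&str, Vec<&str>, &str)> {
    let trimmed = path.trim_start_matches('/');
    // A trailing `.git` is optional.
    let trimmed = trimmed.strip_suffix(".git").unwrap_or(trimmed);
    let mut segments: Vec<&str> = trimmed.split('/').collect();
    // Need at least owner and repo, and no empty segments (e.g. `a//b`).
    if segments.len() < 2 || segments.iter().any(|s| s.is_empty()) {
        return None;
    }
    let repo = segments.pop()?;
    let owner = segments.remove(0);
    Some((owner, segments, repo))
}

fn main() {
    let (owner, subgroups, repo) =
        owner_subgroups_repo("/group/sub1/sub2/repo.git").expect("valid path");
    assert_eq!(owner, "group");
    assert_eq!(subgroups, vec!["sub1", "sub2"]);
    assert_eq!(repo, "repo");
}
```

The plain `owner/repo` case falls out of the same function with an empty subgroup list, which matches the overlap with the generic pattern described above.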