Commit 2001baa

feat: Reimplement GitUrl with nom (#61)
Reimplement `GitUrl` and introduce `GitProvider` trait

* 2024 edition
* MSRV 1.85
* Doctests
* More stable dependencies
* New cargo features

See migration docs for 0.4.x to the new patterns at bottom of README.md

Fixes #6
Fixes #7
Resolves #9
Resolves #10
Closes #19
1 parent 1f73df5 commit 2001baa

15 files changed, +2110 -1208 lines changed

Cargo.toml

Lines changed: 20 additions & 12 deletions
@@ -1,26 +1,34 @@
 [package]
 authors = ["T.J. Telan <[email protected]>"]
-categories = ["parser-implementations", "encoding"]
-description = "A parser for git repo urls based on url crate"
+categories = ["parser-implementations"]
+description = "A parser for urls used by git"
 documentation = "https://docs.rs/git-url-parse"
-edition = "2021"
-keywords = ["git", "url", "parsing", "normalize"]
+edition = "2024"
+keywords = ["git", "url", "parser"]
 license = "MIT"
 name = "git-url-parse"
 readme = "README.md"
 repository = "https://github.com/tjtelan/git-url-parse-rs"
 version = "0.4.6"
-rust-version = "1.82"
+rust-version = "1.85"
 
 [features]
-default = []
-tracing = ["dep:tracing"]
+default = ["url"]
+# Enable Serialize/Deserialize on structs with `serde` crate
+serde = ["dep:serde"]
+# Enable debugging logging with `log` crate
+log = ["dep:log"]
+# Enable url parsing validation with `url` crate
+url = ["dep:url"]
 
 [dependencies]
-tracing = { version = "0.1", optional = true }
-url = { version = "2.2" }
-strum = { version = "^0.27", features = ["derive"] }
-thiserror = "^2.0"
+nom = "8"
+getset = "0.1"
+thiserror = "2"
+serde = { version = "1", features = ["derive"], optional = true }
+log = { version = "0.4", optional = true }
+url = { version = "2.5", optional = true }
 
 [dev-dependencies]
-env_logger = "^0.11"
+env_logger = "0.11"
+log = "0.4"

README.md

Lines changed: 95 additions & 73 deletions
@@ -1,95 +1,117 @@
-# git-url-parse
-
 [![Crates.io](https://img.shields.io/crates/v/git-url-parse)](https://crates.io/crates/git-url-parse)
-![Crates.io MSRV](https://img.shields.io/crates/msrv/git-url-parse?label=rust-version)
+[![Crates.io Total Downloads](https://img.shields.io/crates/d/git-url-parse?label=Crates.io%20Downloads)](https://crates.io/crates/git-url-parse)
+![Crates.io MSRV](https://img.shields.io/crates/msrv/git-url-parse?label=Min%20Supported%20Rust%20version)
 [![Github actions CI status](https://github.com/tjtelan/git-url-parse-rs/actions/workflows/ci.yml/badge.svg)](https://github.com/tjtelan/git-url-parse-rs/actions/workflows/ci.yml)
 [![docs.rs](https://docs.rs/git-url-parse/badge.svg)](https://docs.rs/git-url-parse/)
 [![License](https://img.shields.io/github/license/tjtelan/git-url-parse-rs)](LICENSE)
 ![Maintenance](https://img.shields.io/maintenance/passively-maintained/2025)
 
-Supports common protocols as specified by the [Pro Git book](https://git-scm.com/book/en/v2)
+---
 
-See: [4.1 Git on the Server - The Protocols](https://git-scm.com/book/en/v2/Git-on-the-Server-The-Protocols)
+<!-- cargo-rdme start -->
 
-Supports parsing SSH/HTTPS repo urls for:
-* Github
-* Bitbucket
-* Azure Devops
+# Git Url Parse
 
-See [tests/parse.rs](tests/parse.rs) for expected output for a variety of inputs.
+Parses url used by git (e.g. `git clone <url>`)
 
----
+## Features
+
+- 🔍 Parses `git clone` compatible urls into [`GitUrl`](https://docs.rs/git-url-parse/latest/git_url_parse/types/struct.GitUrl.html)
+  - Supports multiple Git URL schemes (SSH, HTTP, HTTPS, File)
+  - Inspired by [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986) with adaptations to support Git urls
+
+- 🏗️ Host provider info extraction
+  - Easy to implement trait [`GitProvider`](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/trait.GitProvider.html) for custom provider parsing
+  - Built-in support for multiple Git hosting providers
+    * [Generic](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/struct.GenericProvider.html) (`git@host:owner/repo.git` style urls)
+    * [GitLab](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/struct.GitLabProvider.html)
+    * [Azure DevOps](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/struct.AzureDevOpsProvider.html)
+
+## Quick Example
+
+```rust
+use git_url_parse::{GitUrl, GitUrlParseError};
+use git_url_parse::types::provider::GitProvider;
+use git_url_parse::types::provider::GenericProvider;
+
+fn main() -> Result<(), git_url_parse::GitUrlParseError> {
+    let http_url = GitUrl::parse("https://github.com/tjtelan/git-url-parse-rs.git")?;
+
+    // Extract basic URL components
+    assert_eq!(http_url.host(), Some("github.com"));
+    assert_eq!(http_url.path(), "/tjtelan/git-url-parse-rs.git");
+
+    // Support ssh-based urls as well
+    let ssh_url = GitUrl::parse("git@github.com:tjtelan/git-url-parse-rs.git")?;
+
+    assert_eq!(ssh_url.scheme(), Some("ssh"));
+    assert_eq!(ssh_url.host(), Some("github.com"));
+    assert_eq!(ssh_url.path(), "tjtelan/git-url-parse-rs.git");
+
+    // Extract provider-specific information
+    // Built-in support for Github (Generic), Gitlab, Azure Devops style urls
+    let provider : GenericProvider = ssh_url.provider_info()?;
+    assert_eq!(provider.owner(), "tjtelan");
+    assert_eq!(provider.repo(), "git-url-parse-rs");
+
+    // Implement your own provider
+    #[derive(Debug, Clone, PartialEq, Eq)]
+    struct CustomProvider;
+
+    impl GitProvider<GitUrl<'_>, GitUrlParseError> for CustomProvider {
+        fn from_git_url(_url: &GitUrl) -> Result<Self, GitUrlParseError> {
+            // Your custom provider parsing here
+            Ok(Self)
+        }
+    }
+
+    let custom_provider: CustomProvider = ssh_url.provider_info()?;
+    let expected = CustomProvider;
+    assert_eq!(custom_provider, expected);
+
+    Ok(())
+}
+```
 
-URLs that use the `ssh://` protocol (implicitly or explicitly) undergo a small normalization process in order to be parsed.
+## Limitations
 
-Internally uses `Url::parse()` from the [Url](https://crates.io/crates/url) crate after normalization.
+Intended only for git repo urls. Url spec [RFC 3986](https://datatracker.ietf.org/doc/html/rfc3986) is not fully implemented.
 
-## Examples
+- No support for:
+  - Query parameters
+  - Fragment identifiers
+  - Percent-encoding
+  - Complex IP address formats
 
-### Run example with debug output
+## Install
 
 ```shell
-$ RUST_LOG=git_url_parse cargo run --example multi
-$ RUST_LOG=git_url_parse cargo run --example trim_auth
+cargo add git-url-parse
 ```
 
-### Simple usage and output
+### Cargo Features
 
-```bash
-$ cargo run --example readme
-```
+#### `log`
+Enable for internal `debug!` output from [log](https://docs.rs/log/latest)
+#### `serde`
+Enable for [serde](https://docs.rs/serde/latest/) `Serialize`/`Deserialize` on [`GitUrl`](https://docs.rs/git-url-parse/latest/git_url_parse/types/struct.GitUrl.html)
+#### `url`
+(**enabled by default**)
 
-```rust
-use git_url_parse::GitUrl;
+Uses [url](https://docs.rs/url/latest/) during parsing for full url validation
 
-fn main() {
-    println!("SSH: {:#?}", GitUrl::parse("git@github.com:tjtelan/git-url-parse-rs.git"));
-    println!("HTTPS: {:#?}", GitUrl::parse("https://github.com/tjtelan/git-url-parse-rs"));
-}
-```
+<!-- cargo-rdme end -->
+
+## Migration from 0.4.x and earlier
+
+This crate was one of my first serious projects in Rust. Because I was still learning, it had some maintenance problems and was a bit awkward to use. In version 0.5, I rewrote most of it to fix those issues.
+
+The [`GitUrl`](https://docs.rs/git-url-parse/latest/git_url_parse/types/struct.GitUrl.html) struct is only meant to handle parsing urls used by `git`, which the [url](https://docs.rs/url/latest/url) crate doesn't handle. The recent updates make it so the input string is parsed and internally stored into a simple string slice (`&str`). And, instead of exposing all the internal fields of the struct, those details are hidden, and we use methods to interact with it.
+
+The [`GitProvider`](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/trait.GitProvider.html) trait helps extract common pieces of information that are often found in different url patterns using the [`GitUrl::provider_info`](https://docs.rs/git-url-parse/latest/git_url_parse/types/struct.GitUrl.html#method.provider_info) method. Several example provider parsers are included to show how this works. The result of [`GitUrl::parse`](https://docs.rs/git-url-parse/latest/git_url_parse/types/struct.GitUrl.html#method.parse) is more straightforward to use, but the internal details are hidden, and working with provider-specific information at the git host level is more specialized.
+
+The most common pattern for git url paths, like `/owner/repo.git`, is handled by [`GenericProvider`](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/struct.GenericProvider.html).
+
+There's also [`AzureDevOpsProvider`](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/struct.AzureDevOpsProvider.html), which is designed for Azure DevOps urls that follow the `org`, `project`, `repo` pattern.
 
-### Example Output
-```bash
-SSH: Ok(
-    GitUrl {
-        host: Some(
-            "github.com",
-        ),
-        name: "git-url-parse-rs",
-        owner: Some(
-            "tjtelan",
-        ),
-        organization: None,
-        fullname: "tjtelan/git-url-parse-rs",
-        scheme: Ssh,
-        user: Some(
-            "git",
-        ),
-        token: None,
-        port: None,
-        path: "tjtelan/git-url-parse-rs.git",
-        git_suffix: true,
-        scheme_prefix: false,
-    },
-)
-HTTPS: Ok(
-    GitUrl {
-        host: Some(
-            "github.com",
-        ),
-        name: "git-url-parse-rs",
-        owner: Some(
-            "tjtelan",
-        ),
-        organization: None,
-        fullname: "tjtelan/git-url-parse-rs",
-        scheme: Https,
-        user: None,
-        token: None,
-        port: None,
-        path: "/tjtelan/git-url-parse-rs",
-        git_suffix: false,
-        scheme_prefix: true,
-    },
-)
-```
+Finally, there's a new supported provider called [`GitLabProvider`](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/struct.GitLabProvider.html), which is for GitLab urls. It supports the common `owner/repo` pattern shared with [`GenericProvider`](https://docs.rs/git-url-parse/latest/git_url_parse/types/provider/struct.GenericProvider.html), and also handles GitLab’s subgroups.
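
Reviewer note: the `owner/repo` pattern and the GitLab-style subgroups described in the migration text can be illustrated with a std-only sketch. This is not the crate's implementation. `RepoPath` and `split_repo_path` are hypothetical names introduced here for illustration, and the real `GenericProvider` and `GitLabProvider` may treat edge cases differently.

```rust
/// Hypothetical illustration type, not part of git-url-parse.
#[derive(Debug, PartialEq, Eq)]
struct RepoPath<'a> {
    owner: &'a str,
    /// Empty for the plain `owner/repo` pattern.
    subgroups: Vec<&'a str>,
    repo: &'a str,
}

/// Split a url path such as `/owner/repo.git` or `group/sub/repo.git`.
/// Returns `None` when there are fewer than two non-empty segments.
fn split_repo_path(path: &str) -> Option<RepoPath<'_>> {
    // Drop the leading `/` (http-style paths) and an optional `.git` suffix.
    let trimmed = path.trim_start_matches('/');
    let trimmed = trimmed.strip_suffix(".git").unwrap_or(trimmed);

    let mut segments: Vec<&str> = trimmed.split('/').collect();
    if segments.len() < 2 || segments.iter().any(|s| s.is_empty()) {
        return None;
    }

    // Last segment is the repo, first is the owner, the rest are subgroups.
    let repo = segments.pop()?;
    let owner = segments.remove(0);
    Some(RepoPath { owner, subgroups: segments, repo })
}

fn main() {
    let generic = split_repo_path("/tjtelan/git-url-parse-rs.git").unwrap();
    assert_eq!(generic.owner, "tjtelan");
    assert_eq!(generic.repo, "git-url-parse-rs");
    assert!(generic.subgroups.is_empty());

    let nested = split_repo_path("group/sub1/sub2/project.git").unwrap();
    assert_eq!(nested.owner, "group");
    assert_eq!(nested.subgroups, vec!["sub1", "sub2"]);
    assert_eq!(nested.repo, "project");
}
```

Borrowing `&str` slices from the input mirrors the commit's stated move to storing parsed pieces as string slices, so the sketch allocates only the small `Vec` of subgroup references.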
