Commit 7c87834

Support caching for repository clones and improve hook parsing logic
- Integrated `git2` crate for efficient repository cloning.
- Reworked hook parsing logic to rely solely on `.pre-commit-hooks.yaml` without repo-specific conditions.
- Updated cache directory management for storing cloned repositories efficiently.
- Enhanced test suite to cover cache functionality and hook validation extensively.
- Marked relevant tasks as completed in `plan.md`.
- Updated `.gitignore` to exclude cache directory.
1 parent 90f648e commit 7c87834

8 files changed, +267 -149 lines changed

.gitignore

Lines changed: 1 addition & 0 deletions
@@ -19,3 +19,4 @@ target/
 # and can be added to the global gitignore or merged into this file. For a more nuclear
 # option (not recommended) you can uncomment the following to ignore the entire idea folder.
 #.idea/
+.rustyhook/cache

Cargo.lock

Lines changed: 56 additions & 0 deletions

Cargo.toml

Lines changed: 1 addition & 0 deletions
@@ -23,6 +23,7 @@ flate2 = "1.0"
 tar = "0.4"
 zip = "0.6"
 zstd = "0.13"
+git2 = "0.18"

 [[bin]]
 name = "rh"

docs/build.md

Lines changed: 9 additions & 9 deletions
@@ -1,10 +1,10 @@
-Identify the first improvements based on the task list in `docs/plan.md` and plan on completing the task.
+Identify the first unchecked improvement based on the task list in `docs/plan.md` and plan on completing the task.
 Use the following workflow. Execute each in order:
-* Address the task in the plan by updating and creating code.
-* Create tests for the changes you made.
-* Run the relevant code to ensure it works as expected
-* run the application without any arguments and correct any errors and warnings you find iteratively until no errors or warnings are found..
-* run the application with the "compat" argument and correct any errors and warnings you find iteratively until it works without error.
-* Update the documentation if needed.
-* commit and push all files - craft a good commit message relevant to the task you are completing.
-* After completing each task, mark it as done by changing the checkbox from [ ] to [X]
+* Address the task in the plan by updating and creating code.
+* Create tests for the changes you made.
+* Run the relevant code to ensure it works as expected
+* run the application without any arguments and correct any errors and warnings you find iteratively until no errors or warnings are found..
+* run the application with the "compat" argument and correct any errors and warnings you find iteratively until it works without error.
+* Update the documentation if needed.
+* commit and push all files - craft a good commit message relevant to the task you are completing.
+* After completing each task, mark it as done by changing the checkbox from [ ] to [X]

docs/plan.md

Lines changed: 4 additions & 1 deletion
@@ -26,7 +26,10 @@ Tasks:
 - [x] Take the best ideas from the pre-commit and lefthook projects and integrate them into rustyhook.
 - [x] ensure all tasks can operate in parallel
 - [x] There is no need for any form of conditional checks when reading the compat repositories. You can instead look at the .pre-commit-hooks.yaml file in the root of each repository and parse it to tell you how to execute the hooks.
-- [ ] The entire concept of how the compat command is working is flawed. It should never need to know anything about any particular repository. It should be able to read the .pre-commit-hooks.yaml file and determine how to execute the hooks.
+- [x] The entire concept of how the compat command is working is flawed. It should never need to know anything about any particular repository. Like "if repo_url.contains("pre-commit/pre-commit-hooks") {" It should be able to read the .pre-commit-hooks.yaml file and determine how to execute the hooks.
+- [x] Make sure that when we are downloading the git repositories for the hooks that they go into our local cache directory instead of tmp.
+- [ ] As part of the doctor command, we should ensure that the cache directory is clean and has been added to the .gitignore file and .dockerignore file if they exist. If they do not suggest to the user that they might want to add them if a .git directory is found and conversely if a Dockerfile is found.
+- [ ] When performing a git clone, you can clone to a 1 level depth to avoid downloading the entire repository history.
 - [ ] We will need some sort of mutex system to ensure that the hooks are not running at the same time on the same file. Perhaps what might work better is to mark the hooks as readers or readers and writers to allow for all readers to execute first and in parallel but the reader/writers can only execute in parallel as long as their file globs do not overlap.
 - [ ] implement an "explain" command that can be used to explain the current configuration and any errors that may have occurred. Perhaps the existing doctor command can be used instead?
 - [ ] use uv to start all python hooks in a separate process
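
Review note: the reader/writer task in the list above could be sketched roughly as follows. This is only an illustration under simplifying assumptions: each hook carries an already-expanded set of file paths rather than globs, and `Access`, `PlannedHook`, and `plan_phases` are hypothetical names that do not appear in this commit (rustyhook's own `AccessMode` type may look different). All readers go into the first phase; writers are greedily batched so that no two hooks in a batch touch the same file.

```rust
use std::collections::HashSet;

/// Hypothetical access mode; rustyhook's real `AccessMode` may differ.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum Access {
    Read,
    ReadWrite,
}

/// Hypothetical hook description. `files` holds expanded paths, not globs.
#[derive(Debug, Clone)]
struct PlannedHook {
    id: String,
    access: Access,
    files: HashSet<String>,
}

/// Groups hook ids into phases. Hooks inside one phase may run in parallel;
/// the phases themselves run one after another.
fn plan_phases(hooks: &[PlannedHook]) -> Vec<Vec<String>> {
    let mut phases: Vec<Vec<String>> = Vec::new();

    // First phase: every read-only hook, since readers never conflict.
    let readers: Vec<String> = hooks
        .iter()
        .filter(|h| h.access == Access::Read)
        .map(|h| h.id.clone())
        .collect();
    if !readers.is_empty() {
        phases.push(readers);
    }

    // Later phases: place each writer in the first batch whose claimed
    // files do not overlap with its own, or open a new batch.
    let mut batches: Vec<(HashSet<String>, Vec<String>)> = Vec::new();
    for hook in hooks.iter().filter(|h| h.access == Access::ReadWrite) {
        let slot = batches
            .iter()
            .position(|(claimed, _)| claimed.is_disjoint(&hook.files));
        match slot {
            Some(i) => {
                batches[i].0.extend(hook.files.iter().cloned());
                batches[i].1.push(hook.id.clone());
            }
            None => batches.push((hook.files.clone(), vec![hook.id.clone()])),
        }
    }
    phases.extend(batches.into_iter().map(|(_, ids)| ids));
    phases
}
```

Greedy batching is not guaranteed to produce the fewest phases, and a real implementation would still need to decide how overlapping globs are detected before files are expanded.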

src/config/compat.rs

Lines changed: 93 additions & 127 deletions
@@ -6,6 +6,7 @@ use serde::{Deserialize, Serialize};
 use std::collections::HashMap;
 use std::fs;
 use std::path::{Path, PathBuf};
+use git2;

 use super::parser::{Config, Hook, Repo, ConfigError, HookType, AccessMode};

@@ -42,154 +43,119 @@ pub struct PreCommitHookDefinition {
 }

 /// Represents a .pre-commit-hooks.yaml file
+///
+/// This is a wrapper around a vector of hook definitions to make it easier to work with.
 #[derive(Debug, Serialize, Deserialize)]
 pub struct PreCommitHooksFile {
     /// List of hooks in this repository
     pub hooks: Vec<PreCommitHookDefinition>,
 }

+impl From<Vec<PreCommitHookDefinition>> for PreCommitHooksFile {
+    fn from(hooks: Vec<PreCommitHookDefinition>) -> Self {
+        PreCommitHooksFile { hooks }
+    }
+}
+
 /// Parse a .pre-commit-hooks.yaml file
 pub fn parse_precommit_hooks_file<P: AsRef<Path>>(path: P) -> Result<PreCommitHooksFile, ConfigError> {
     let hooks_str = fs::read_to_string(path)?;
-    let hooks: PreCommitHooksFile = serde_yaml::from_str(&hooks_str)?;
-    Ok(hooks)
+
+    // Try to parse as a PreCommitHooksFile first
+    match serde_yaml::from_str::<PreCommitHooksFile>(&hooks_str) {
+        Ok(hooks_file) => Ok(hooks_file),
+        Err(_) => {
+            // If that fails, try to parse as a Vec<PreCommitHookDefinition>
+            let hooks: Vec<PreCommitHookDefinition> = serde_yaml::from_str(&hooks_str)?;
+            Ok(PreCommitHooksFile::from(hooks))
+        }
+    }
 }

 /// Find and parse the .pre-commit-hooks.yaml file for a repository
+///
+/// This function clones the repository to the local cache directory and looks for
+/// .pre-commit-hooks.yaml in the root of the repository.
+/// If found, it parses the file and returns the hooks defined in it.
+/// If the file can't be found or parsed, it returns None.
 pub fn find_precommit_hooks_for_repo(repo_url: &str) -> Option<PreCommitHooksFile> {
-    // In a real implementation, this would fetch the repository and parse its .pre-commit-hooks.yaml file
-    // For now, we'll simulate fetching and parsing the .pre-commit-hooks.yaml file
+    // Create a cache directory for repositories
+    let cache_dir = std::env::current_dir().unwrap_or_default().join(".rustyhook").join("cache").join("repos");

-    // This function should fetch the repository, look for a .pre-commit-hooks.yaml file,
-    // and parse it to determine the hooks available in the repository.
+    // Create a subdirectory for this specific repository
+    // Use a hash of the repo URL to create a unique directory name
+    use std::collections::hash_map::DefaultHasher;
+    use std::hash::{Hash, Hasher};

-    // For the purpose of this implementation, we'll create a mock function that returns
-    // a simulated .pre-commit-hooks.yaml file for well-known repositories.
-    // In a production environment, this would be replaced with actual fetching and parsing logic.
+    let mut hasher = DefaultHasher::new();
+    repo_url.hash(&mut hasher);
+    let repo_hash = hasher.finish();

-    // Extract the repository name from the URL for logging purposes
-    let repo_parts: Vec<&str> = repo_url.split('/').collect();
-    if repo_parts.len() < 2 {
-        return None;
-    }
+    let repo_dir = cache_dir.join(format!("{}", repo_hash));

-    // Get the last part of the URL (repo name)
-    let _repo = repo_parts.last().unwrap_or(&"");
-
-    // In a real implementation, we would:
-    // 1. Clone or fetch the repository
-    // 2. Look for a .pre-commit-hooks.yaml file
-    // 3. Parse the file and return the hooks
-
-    // For now, we'll return a simulated set of hooks for well-known repositories
-    // This is just for demonstration purposes until the actual fetching logic is implemented
-
-    // Create a mock .pre-commit-hooks.yaml file based on the repository URL
-    // These are representative examples of what these files might contain
-
-    // For pre-commit-hooks repository
-    if repo_url.contains("pre-commit/pre-commit-hooks") {
-        let hooks = vec![
-            PreCommitHookDefinition {
-                id: "trailing-whitespace".to_string(),
-                name: "Trim Trailing Whitespace".to_string(),
-                description: "Trims trailing whitespace".to_string(),
-                entry: "trailing-whitespace".to_string(),
-                language: "python".to_string(),
-                files: "".to_string(),
-                args: vec![],
-                stages: vec!["commit".to_string()],
-            },
-            PreCommitHookDefinition {
-                id: "end-of-file-fixer".to_string(),
-                name: "Fix End of Files".to_string(),
-                description: "Ensures that a file is either empty, or ends with one newline".to_string(),
-                entry: "end-of-file-fixer".to_string(),
-                language: "python".to_string(),
-                files: "".to_string(),
-                args: vec![],
-                stages: vec!["commit".to_string()],
-            },
-            PreCommitHookDefinition {
-                id: "check-yaml".to_string(),
-                name: "Check Yaml".to_string(),
-                description: "Checks yaml files for parseable syntax".to_string(),
-                entry: "check-yaml".to_string(),
-                language: "python".to_string(),
-                files: "".to_string(),
-                args: vec![],
-                stages: vec!["commit".to_string()],
-            },
-            PreCommitHookDefinition {
-                id: "check-added-large-files".to_string(),
-                name: "Check for added large files".to_string(),
-                description: "Prevents giant files from being committed".to_string(),
-                entry: "check-added-large-files".to_string(),
-                language: "python".to_string(),
-                files: "".to_string(),
-                args: vec![],
-                stages: vec!["commit".to_string()],
-            },
-        ];
-        return Some(PreCommitHooksFile { hooks });
-    }
+    // Create the directory if it doesn't exist
+    if !repo_dir.exists() {
+        if let Err(err) = std::fs::create_dir_all(&repo_dir) {
+            log::warn!("Failed to create cache directory: {}", err);
+            return None;
+        }

-    // For ruff repository
-    else if repo_url.contains("astral-sh/ruff-pre-commit") {
-        let hooks = vec![
-            PreCommitHookDefinition {
-                id: "ruff".to_string(),
-                name: "Ruff".to_string(),
-                description: "Run Ruff to check Python code".to_string(),
-                entry: "ruff".to_string(),
-                language: "python".to_string(),
-                files: "".to_string(),
-                args: vec![],
-                stages: vec!["commit".to_string()],
-            },
-            PreCommitHookDefinition {
-                id: "ruff-format".to_string(),
-                name: "Ruff Format".to_string(),
-                description: "Run Ruff formatter on Python code".to_string(),
-                entry: "ruff format".to_string(),
-                language: "python".to_string(),
-                files: "".to_string(),
-                args: vec![],
-                stages: vec!["commit".to_string()],
-            },
-        ];
-        return Some(PreCommitHooksFile { hooks });
+        log::debug!("Cloning repository {} into {}", repo_url, repo_dir.display());
+
+        // Clone the repository
+        match git2::Repository::clone(repo_url, &repo_dir) {
+            Ok(_repo) => {},
+            Err(err) => {
+                log::warn!("Failed to clone repository {}: {}", repo_url, err);
+                // Clean up the directory if the clone failed
+                let _ = std::fs::remove_dir_all(&repo_dir);
+                return None;
+            }
+        };
+    } else {
+        log::debug!("Using cached repository at {}", repo_dir.display());
     }

-    // For biome repository
-    else if repo_url.contains("biomejs/pre-commit") {
-        let hooks = vec![
-            PreCommitHookDefinition {
-                id: "biome-check".to_string(),
-                name: "Biome Check".to_string(),
-                description: "Run Biome check on JavaScript/TypeScript files".to_string(),
-                entry: "biome check".to_string(),
-                language: "node".to_string(),
-                files: "".to_string(),
-                args: vec![],
-                stages: vec!["commit".to_string()],
-            },
-            PreCommitHookDefinition {
-                id: "biome-format".to_string(),
-                name: "Biome Format".to_string(),
-                description: "Run Biome format on JavaScript/TypeScript files".to_string(),
-                entry: "biome format".to_string(),
-                language: "node".to_string(),
-                files: "".to_string(),
-                args: vec![],
-                stages: vec!["commit".to_string()],
-            },
-        ];
-        return Some(PreCommitHooksFile { hooks });
-    }
+    // Look for .pre-commit-hooks.yaml in the repository
+    let path = repo_dir.join(".pre-commit-hooks.yaml");
+
+    // Try to find and parse the file
+    log::debug!("Looking for .pre-commit-hooks.yaml at: {}", path.display());
+
+    if path.exists() {
+        log::debug!("Found .pre-commit-hooks.yaml at: {}", path.display());
+
+        // Read the file
+        match fs::read_to_string(&path) {
+            Ok(content) => {
+                // Parse the YAML content
+                // Try to parse as a PreCommitHooksFile first
+                match serde_yaml::from_str::<PreCommitHooksFile>(&content) {
+                    Ok(hooks_file) => {
+                        log::info!("Successfully parsed .pre-commit-hooks.yaml from {} as a struct", path.display());
+                        return Some(hooks_file);
+                    }
+                    Err(_) => {
+                        // If that fails, try to parse as a Vec<PreCommitHookDefinition>
+                        match serde_yaml::from_str::<Vec<PreCommitHookDefinition>>(&content) {
+                            Ok(hooks) => {
+                                log::info!("Successfully parsed .pre-commit-hooks.yaml from {} as a sequence", path.display());
+                                return Some(PreCommitHooksFile::from(hooks));
+                            }
+                            Err(err) => {
+                                log::warn!("Failed to parse .pre-commit-hooks.yaml from {}: {}", path.display(), err);
+                            }
+                        }
+                    }
+                }
+            }
+            Err(err) => {
+                log::warn!("Failed to read .pre-commit-hooks.yaml from {}: {}", path.display(), err);
+            }
+        }
+    }

-    // For other repositories, we would need to fetch and parse their .pre-commit-hooks.yaml file
-    // For now, we'll return None to indicate that we couldn't find a hooks file
+    log::warn!("Could not fetch .pre-commit-hooks.yaml for {}", repo_url);
     None
 }
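
Review note: the cache directory naming used in `find_precommit_hooks_for_repo` can be isolated into a small helper, sketched below. `cache_dir_for` is a hypothetical name, not a function from this commit; it mirrors the commit's approach of hashing the repository URL with `DefaultHasher` and using the decimal hash as the directory name under `repos`.

```rust
use std::collections::hash_map::DefaultHasher;
use std::hash::{Hash, Hasher};
use std::path::{Path, PathBuf};

/// Maps a repository URL to a directory under `<cache_root>/repos`.
/// Hypothetical helper that mirrors the hashing done in the diff above.
fn cache_dir_for(cache_root: &Path, repo_url: &str) -> PathBuf {
    let mut hasher = DefaultHasher::new();
    repo_url.hash(&mut hasher);
    // The same URL yields the same directory name within one build of the
    // program, so a second lookup can reuse the earlier clone.
    cache_root.join("repos").join(hasher.finish().to_string())
}
```

One caveat worth keeping in mind: the standard library documents `DefaultHasher`'s algorithm as unspecified and subject to change between Rust releases, so directory names produced this way may not survive a toolchain upgrade. An explicitly versioned hash, or a sanitized form of the URL, would presumably give more stable cache keys.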
