Skip to content

Commit 5169ae7

Browse files
committed
Merge branch 'code_review_28Feb'
2 parents f8e71ad + 1b1eef6 commit 5169ae7

File tree

10 files changed

+380
-95
lines changed

10 files changed

+380
-95
lines changed

Cargo.toml

Lines changed: 1 addition & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@ edition = "2024"
66
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
77

88
[profile.release]
9-
# panic = 'abort'
9+
panic = 'abort'
1010
codegen-units = 1
1111
lto = 'fat'
1212
strip = true
@@ -21,7 +21,6 @@ hex = "0.4"
2121
data-encoding = "2.10"
2222
pico-args = "0.5"
2323
rayon = "1.11"
24-
# readonly = "0.2"
2524
generic-array = "1.3"
2625
byteorder = { default-features = false, version = "1.5" }
2726
md-5 = { default-features = false, version = "0.10" }
Lines changed: 230 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,230 @@
1+
# HashRust Code Review
2+
3+
**Date:** 2026-02-28
4+
**Reviewer:** Claude Code (claude-sonnet-4-6)
5+
**Commit:** f8e71ad (main)
6+
**Scope:** Full codebase review — current state vs. prior review (2026-02-09)
7+
**Lines reviewed:** ~1,100 Rust
8+
9+
---
10+
11+
## Prior Review Status
12+
13+
The February 9 Opus review identified 22 findings. The current codebase has addressed the majority:
14+
15+
| Prior Finding | Status |
16+
|---|---|
17+
| C1: Silent partial hash on I/O error | **Fixed** — loop now uses `?` |
18+
| C2: TOCTOU race in file_size | **Fixed**`file_size()` calls `metadata()` directly |
19+
| H1: Interleaved stdout in parallel mode | **Fixed** — results collected then printed serially |
20+
| H2: Loose dependency versions | **Fixed** — standard semver constraints |
21+
| M1: Duplicate CRC32/U32 validation | **Fixed** — validation lives only in CLI layer |
22+
| M2: `OutputEncoding::Unspecified` | **Fixed** — removed; `Option<OutputEncoding>` used during parse |
23+
| M3: Boolean parameter explosion | **Fixed** — struct literal construction, no constructor |
24+
| M4: Integration test temp file races | **Fixed**`tempfile` crate used throughout |
25+
| M5: No unit tests for core hashing | **Fixed**`hash_tests` module covers all algorithms |
26+
| M6: No multi-threaded path coverage | **Fixed**`test_multi_file_parallel_hashing` added |
27+
| L1: Heap buffer per file | **Fixed**`[u8; BUFFER_SIZE]` stack allocation |
28+
| L2: Unnecessary `.to_string()` | **Fixed** |
29+
| L3: `hash_file_whole` verbosity | **Fixed** — uses `D::digest(&data)` |
30+
| L4: `Crc32::from_state` dead code | **Fixed** — removed |
31+
| L5: Unused `_debug_mode` params | **Fixed** — removed |
32+
| L6: `readonly` setter bypass | **Fixed**`readonly` removed entirely |
33+
| L7: Redundant `file_exists` | **Fixed** — removed |
34+
| I1: Clippy pedantic violations | **Fixed** |
35+
| I3: Heavyweight progress system | **Fixed** — uses `indicatif::MultiProgress` natively |
36+
| I4: `BasicHash` encapsulation | **Fixed** — inner field is private, public API surface unchanged |
37+
38+
---
39+
40+
## Current Findings
41+
42+
### High
43+
44+
#### H1: Per-file hashing errors do not fail the process
45+
**Files:** `src/core/worker.rs:81`, `src/core/worker.rs:140`
46+
47+
```rust
48+
Err(e) => eprintln!("File error for '{pathstr}': {e}"),
49+
```
50+
51+
Both `file_hashes_st` and `file_hashes_mt` swallow per-file errors: the error is printed to stderr but the functions return `Ok(())`. `worker_func` propagates that `Ok`, so the process exits with code 0.
52+
53+
This is a correctness defect for any scripted consumer. If a caller pipes a list of files and one becomes unreadable mid-run, the process silently succeeds. Standard hashing tools (`sha256sum`, `b2sum`) exit non-zero when any file fails.
54+
55+
**Fix:** Track whether any error occurred (e.g., `let mut had_error = false;`) and return `Err` if set, or use an explicit non-zero exit code via `std::process::exit`.
56+
57+
---
58+
59+
#### H2: Help text printed to stdout on error path
60+
**File:** `src/main.rs:20`
61+
62+
```rust
63+
if let Err(e) = result {
64+
show_help(true); // uses println! → stdout
65+
println!();
66+
return Err(e); // anyhow prints this to stderr
67+
}
68+
```
69+
70+
When the CLI returns an error (invalid algorithm, file not found, etc.), help text goes to stdout while the error itself goes to stderr. This mixes streams, breaking piped usage (`./hash_rust bad-arg 2>/dev/null | wc -l` would receive spurious help text) and violating POSIX convention.
71+
72+
**Fix:** Replace `println!` in `show_help` with `eprintln!` when called from the error path, or add a `to_stderr: bool` parameter.
73+
74+
---
75+
76+
### Medium
77+
78+
#### M1: Non-UTF-8 paths from stdin cause a hard failure
79+
**File:** `src/io/files.rs:24`
80+
81+
```rust
82+
let lines = stdin.lock().lines().collect::<Result<Vec<String>, _>>()?;
83+
```
84+
85+
`BufRead::lines()` returns `Err` for any line containing non-UTF-8 bytes. Because the `?` here is applied to the outer `collect`, a single non-UTF-8 filename causes the *entire* stdin read to fail, discarding all subsequent paths.
86+
87+
On Linux, filenames are arbitrary byte sequences. A directory with even one file whose name is not valid UTF-8 will abort stdin mode entirely, even for all the valid paths that preceded it.
88+
89+
**Fix:** Use `stdin.lock().split(b'\n')` which yields raw `Vec<u8>`, then convert with `String::from_utf8_lossy`, or `OsString::from_vec` (Unix-only). At minimum, log-and-skip bad lines rather than aborting.
90+
91+
---
92+
93+
#### M2: Dead `Write` impl on `Crc32`
94+
**File:** `src/hash/crc32.rs:40-51`
95+
96+
```rust
97+
impl Write for Crc32 {
98+
fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> { ... }
99+
fn flush(&mut self) -> std::io::Result<()> { Ok(()) }
100+
}
101+
```
102+
103+
`Crc32` is used exclusively through the `Digest` trait (`Update` + `FixedOutput`). The `Write` impl is never called anywhere in the codebase. It adds noise, imports `std::io::Write` into scope with a `pub use`, and could mislead a future maintainer into thinking `Crc32` has `Write`-based consumers.
104+
105+
**Fix:** Remove the `impl Write for Crc32` block and the `use std::io::Write` import.
106+
107+
---
108+
109+
#### M3: Integration test path-with-spaces fragility
110+
**File:** `tests/integration_tests.rs:301`
111+
112+
```rust
113+
assert!(
114+
line.split_whitespace().count() == 2,
115+
"Line {i} should have exactly 2 space-separated parts (hash and path)"
116+
);
117+
```
118+
119+
The output format is `<hash> <path>`, but this assertion breaks the moment any file path contains whitespace — which is common on macOS where the system temp dir can be `/private/var/folders/...` (though in practice this specific path doesn't contain spaces, user-created temp dirs can). The test will also fail for any future encoding that produces output containing spaces (e.g., Base64 with padding considerations, though that is not currently the case).
120+
121+
**Fix:** Split on the first whitespace only — `line.splitn(2, char::is_whitespace).count() == 2` — or verify the hash prefix length (64 hex chars for SHA3-256) directly.
122+
123+
---
124+
125+
### Low
126+
127+
#### L1: `#[allow(dead_code)]` on public methods is misleading
128+
**File:** `src/core/types.rs:71-80`
129+
130+
```rust
131+
#[allow(dead_code)]
132+
pub fn as_str(&self) -> &str { ... }
133+
134+
#[allow(dead_code)]
135+
pub fn into_inner(self) -> String { ... }
136+
```
137+
138+
`dead_code` lint does not fire on `pub` items in a binary crate — only on private/`pub(crate)` items that are not reachable from the public API. These suppressions have no effect and imply the methods are dead when they are actually part of the intended public interface. They're probably leftovers from when these methods were `pub(crate)`.
139+
140+
**Fix:** Remove the `#[allow(dead_code)]` attributes from both methods.
141+
142+
---
143+
144+
#### L2: `test_empty_supplied_paths` tests nothing
145+
**File:** `src/unit_tests.rs:603-619`
146+
147+
```rust
148+
#[test]
149+
fn test_empty_supplied_paths() {
150+
let config = ConfigSettings { ..., supplied_paths: Vec::new() };
151+
assert!(config.supplied_paths.is_empty());
152+
}
153+
```
154+
155+
This constructs a struct and asserts a property of the literal value used to construct it. No code is exercised. The test comment says "This would normally read from stdin, but we're testing the config behavior" — but it doesn't test config behavior either, only struct construction.
156+
157+
**Fix:** Either delete the test, or replace it with an actual behavioral test (e.g., that `get_required_filenames` with empty `supplied_paths` attempts to read stdin and returns an empty vec when stdin is empty).
158+
159+
---
160+
161+
#### L3: `threshold_millis()` wraps a module-level constant
162+
**File:** `src/progress.rs:72-74`
163+
164+
```rust
165+
pub fn threshold_millis() -> u64 {
166+
PROGRESS_THRESHOLD_MILLIS
167+
}
168+
```
169+
170+
This is a public function that returns a private constant. The only call site is `worker.rs:164`. The indirection adds no encapsulation (the constant's value is fully determined at compile time) and serves no purpose a `pub const` would not serve equally.
171+
172+
**Fix:** Make `PROGRESS_THRESHOLD_MILLIS` `pub(crate)` and reference it directly, or keep the function but note it's an intentional abstraction boundary.
173+
174+
---
175+
176+
#### L4: CLAUDE.md documents `panic = 'abort'` but it is commented out
177+
**File:** `Cargo.toml:6`
178+
179+
```toml
180+
[profile.release]
181+
# panic = 'abort'
182+
```
183+
184+
`CLAUDE.md` states: *"Release builds use LTO and panic=abort for size/speed"*. The setting is commented out. The discrepancy means either the documentation or the config is wrong. Without `panic=abort`, the release binary includes unwinding infrastructure, increasing binary size and disabling the size/speed claim.
185+
186+
**Fix:** Either restore `panic = 'abort'` to the release profile, or remove the claim from CLAUDE.md.
187+
188+
---
189+
190+
#### L5: Commented-out `readonly` dependency is dead config noise
191+
**File:** `Cargo.toml:last dependency block`
192+
193+
```toml
194+
# readonly = "0.2"
195+
```
196+
197+
This was removed as part of fixing L6 from the prior review, but the commented-out line was left behind. It provides no value and could confuse someone scanning the dependency list.
198+
199+
**Fix:** Delete the commented-out line.
200+
201+
---
202+
203+
## Summary
204+
205+
| Severity | Count | Items |
206+
|---|---|---|
207+
| High | 2 | H1 (exit code on errors), H2 (help to stdout) |
208+
| Medium | 3 | M1 (non-UTF-8 stdin), M2 (dead Write impl), M3 (test fragility) |
209+
| Low | 5 | L1–L5 above |
210+
| **Total** | **10** | |
211+
212+
---
213+
214+
## Remediation Sequence
215+
216+
### Immediately (correctness)
217+
1. **H1** — Return non-zero exit code when any file fails to hash
218+
2. **H2** — Route help text to stderr on error path
219+
220+
### Near-term (correctness + test reliability)
221+
3. **M1** — Handle non-UTF-8 stdin paths gracefully
222+
4. **M3** — Fix fragile `split_whitespace` assertion in integration test
223+
224+
### Cleanup (code hygiene)
225+
5. **M2** — Remove `impl Write for Crc32`
226+
6. **L1** — Remove `#[allow(dead_code)]` from public methods
227+
7. **L2** — Delete or replace the no-op `test_empty_supplied_paths`
228+
8. **L4** — Reconcile `panic = 'abort'` between CLAUDE.md and Cargo.toml
229+
9. **L5** — Delete commented-out `readonly` line
230+
10. **L3** — Make `PROGRESS_THRESHOLD_MILLIS` `pub(crate)` and remove the wrapper function

src/cli/args.rs

Lines changed: 6 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,5 @@
11
use std::ffi::OsString;
2+
use std::io::Write;
23
use std::str::FromStr;
34

45
use anyhow::{Result, anyhow};
@@ -80,16 +81,17 @@ pub fn parse_hash_encoding(
8081
}
8182
}
8283

83-
pub fn show_help(longform: bool) {
84-
println!(
84+
pub fn show_help(longform: bool, out: &mut dyn Write) {
85+
let _ = writeln!(
86+
out,
8587
"File hasher for various algorithms. Version {} ({})",
8688
VERSION.unwrap_or("?"),
8789
GIT_VERSION_SHORT
8890
);
8991
if longform {
90-
println!("{HELP}");
92+
let _ = writeln!(out, "{HELP}");
9193
}
92-
println!("Default algorithm is {DEFAULT_HASH:?}");
94+
let _ = writeln!(out, "Default algorithm is {DEFAULT_HASH:?}");
9395
}
9496

9597
fn args_finished(args: Arguments) -> Result<Vec<OsString>> {

src/core/worker.rs

Lines changed: 27 additions & 9 deletions
Original file line numberDiff line numberDiff line change
@@ -28,17 +28,21 @@ pub fn worker_func(config: &ConfigSettings) -> Result<()> {
2828
eprintln!("Files to hash: {paths:?}");
2929
}
3030

31-
if config.single_thread || paths.len() == 1 {
32-
file_hashes_st(config, &paths);
31+
let had_error = if config.single_thread || paths.len() == 1 {
32+
file_hashes_st(config, &paths)
3333
} else {
34-
file_hashes_mt(config, &paths);
34+
file_hashes_mt(config, &paths)
35+
};
36+
37+
if had_error {
38+
std::process::exit(1);
3539
}
3640

3741
Ok(())
3842
}
3943

4044
fn show_initial_info(config: &ConfigSettings) {
41-
crate::cli::args::show_help(false);
45+
crate::cli::args::show_help(false, &mut std::io::stderr());
4246
eprintln!();
4347
eprintln!("Config: {config:?}");
4448
if config.supplied_paths.is_empty() {
@@ -51,7 +55,7 @@ fn show_initial_info(config: &ConfigSettings) {
5155
}
5256
}
5357

54-
fn file_hashes_st<S>(config: &ConfigSettings, paths: &[S])
58+
fn file_hashes_st<S>(config: &ConfigSettings, paths: &[S]) -> bool
5559
where
5660
S: AsRef<str> + Display + Send + Sync,
5761
{
@@ -66,6 +70,8 @@ where
6670
Some(ProgressCoordinator::new())
6771
};
6872

73+
let mut had_error = false;
74+
6975
for pathstr in paths {
7076
let file_hash =
7177
hash_with_progress(config, AsRef::<str>::as_ref(pathstr), coordinator.as_ref());
@@ -78,12 +84,17 @@ where
7884
println!("{basic_hash} {pathstr}");
7985
}
8086
}
81-
Err(e) => eprintln!("File error for '{pathstr}': {e}"),
87+
Err(e) => {
88+
eprintln!("File error for '{pathstr}': {e}");
89+
had_error = true;
90+
}
8291
}
8392
}
93+
94+
had_error
8495
}
8596

86-
fn file_hashes_mt<S>(config: &ConfigSettings, paths: &[S])
97+
fn file_hashes_mt<S>(config: &ConfigSettings, paths: &[S]) -> bool
8798
where
8899
S: AsRef<str> + Sync + Display,
89100
{
@@ -127,6 +138,8 @@ where
127138
pb.finish_with_message("Complete!");
128139
}
129140

141+
let mut had_error = false;
142+
130143
for (pathstr, file_hash) in results {
131144
match file_hash {
132145
Ok(basic_hash) => {
@@ -137,9 +150,14 @@ where
137150
}
138151
}
139152

140-
Err(e) => eprintln!("File error for '{pathstr}': {e}"),
153+
Err(e) => {
154+
eprintln!("File error for '{pathstr}': {e}");
155+
had_error = true;
156+
}
141157
}
142158
}
159+
160+
had_error
143161
}
144162

145163
fn hash_with_progress(
@@ -161,7 +179,7 @@ fn hash_with_progress(
161179
}
162180

163181
if config.debug_mode
164-
&& elapsed >= Duration::from_millis(ProgressCoordinator::threshold_millis())
182+
&& elapsed >= Duration::from_millis(crate::progress::PROGRESS_THRESHOLD_MILLIS)
165183
{
166184
eprintln!(
167185
"File '{}' took {:.2}s to hash",

src/hash/crc32.rs

Lines changed: 0 additions & 15 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,3 @@
1-
use std::io::Write;
2-
31
use crc32fast::Hasher;
42
pub use digest::Digest;
53
use digest::{FixedOutput, HashMarker, Output, OutputSizeUser, Reset, Update};
@@ -36,16 +34,3 @@ impl Reset for Crc32 {
3634
self.0.reset();
3735
}
3836
}
39-
40-
impl Write for Crc32 {
41-
#[inline]
42-
fn write(&mut self, buf: &[u8]) -> std::io::Result<usize> {
43-
self.0.update(buf);
44-
Ok(buf.len())
45-
}
46-
47-
#[inline]
48-
fn flush(&mut self) -> std::io::Result<()> {
49-
Ok(())
50-
}
51-
}

0 commit comments

Comments
 (0)