|
1 | | -# Investigation: optimize_scans divergence at low quality |
| 1 | +# Investigation: optimize_scans divergence at low quality — RESOLVED |
2 | 2 |
|
3 | | -## Problem |
| 3 | +## Root Cause |
4 | 4 |
|
5 | | -The `Max Compression` config (`optimize_scans: true`) shows increasing file size gap |
6 | | -between Rust and C at low quality levels. At Q40, it exceeds our 1% average / 3% |
7 | | -per-image thresholds: |
| 5 | +`encode_rust()` in `src/test_encoder.rs` was missing `.optimize_scans(config.optimize_scans)` |
| 6 | +in the `Encoder` builder chain. The Rust scan optimizer was **never called** — the encoder |
| 7 | +always used the fixed 9-scan script regardless of the `optimize_scans` config flag. |
8 | 8 |
|
9 | | -| Q | Avg Delta | Max Dev | Worst Image | |
10 | | -|----|-----------|---------|-------------| |
11 | | -| 40 | +1.11% | 3.37% | kodim23, kodim09 | |
12 | | -| 50 | +0.77% | 3.13% | kodim23 | |
13 | | -| 55 | +0.75% | 2.82% | kodim23 | |
14 | | -| 65 | +0.70% | 2.74% | | |
15 | | -| 75 | +0.59% | 2.12% | | |
16 | | -| 85 | +0.41% | 1.25% | | |
17 | | -| 90 | +0.28% | 0.59% | | |
18 | | -| 95 | +0.40% | 0.81% | | |
| 9 | +Meanwhile, the C encoder correctly passed `optimize_scans` via FFI, so C's scan search |
| 10 | +found simpler, more efficient scripts at low quality (4-5 scans without successive |
| 11 | +approximation), while Rust always used the default 9-scan SA script. |
19 | 12 |
|
20 | | -Without `optimize_scans`, all configs are within ±0.7% average even at Q40. |
21 | | -The gap is strictly in the scan optimization search. |
| 13 | +## Evidence |
22 | 14 |
|
23 | | -## Context |
| 15 | +Before fix (R=Rust optimize_scans, C=C optimize_scans): |
| 16 | +- R(optsc) == R(fixed) at ALL quality levels — Rust optimizer was never invoked |
| 17 | +- C correctly found smaller scripts at low Q (C saves 4.6% at Q10 vs fixed script) |
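
A cheap way to catch a flag that is silently ignored is to compare scan counts with and without `optimize_scans`. The sketch below counts SOS markers by walking the JPEG marker segments. It is an illustrative stand-in written for this note, not the project's own `count_scans()` helper, and it assumes a well-formed stream.

```rust
/// Count SOS (0xFFDA) segments in a JPEG byte stream by walking its markers.
/// Illustrative stand-in; not the helper from `tests/corpus_comparison.rs`.
fn count_scans(jpeg: &[u8]) -> usize {
    let mut scans = 0;
    let mut i = 0;
    while i + 1 < jpeg.len() {
        if jpeg[i] != 0xFF {
            i += 1; // entropy-coded data byte
            continue;
        }
        match jpeg[i + 1] {
            // Fill byte: the next 0xFF may start the real marker.
            0xFF => i += 1,
            // Stuffed 0xFF00, TEM, RST0..RST7, SOI, EOI: no length field.
            0x00 | 0x01 | 0xD0..=0xD9 => i += 2,
            // Every other marker carries a big-endian length that includes
            // the two length bytes themselves.
            marker => {
                if i + 3 >= jpeg.len() {
                    break; // truncated segment header
                }
                let len = u16::from_be_bytes([jpeg[i + 2], jpeg[i + 3]]) as usize;
                if marker == 0xDA {
                    scans += 1;
                }
                i += 2 + len; // skip marker and segment payload
            }
        }
    }
    scans
}
```

Before the fix, a check like this would have reported the same count (the fixed 9-scan script) for Rust output regardless of the `optimize_scans` setting.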
24 | 18 |
|
25 | | -`optimize_scans` tries multiple progressive scan configurations and picks the |
26 | | -smallest. Both Rust and C implement this, but their scan search heuristics may |
27 | | -differ. At low quality, more coefficients are quantized to zero, giving the |
28 | | -optimizer a larger search space where different heuristics produce different |
29 | | -local optima. |
| 19 | +After fix, the Rust scan optimizer runs and finds scripts similar to C's:
| 20 | +``` |
| 21 | + Q R(optsc) C(optsc) Δopt% |
| 22 | + 10 183834 184696 -0.47% (was +4.18%) |
| 23 | + 20 354621 357038 -0.68% (was +2.40%) |
| 24 | + 30 507224 509658 -0.48% (was +1.54%) |
| 25 | + 40 643993 645968 -0.31% (was +1.11%) |
| 26 | + 50 769538 771715 -0.28% (was +0.77%) |
| 27 | + 75 1263157 1259526 +0.29% (was +0.59%) |
| 28 | + 85 1766757 1760481 +0.36% (was +0.41%) |
| 29 | + 95 3218288 3209303 +0.28% (was +0.40%) |
| 30 | +``` |
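
The Δopt% column is consistent with the relative size difference `(R - C) / C`. A minimal sketch, assuming that definition (the note does not state the formula, but the rows above match it); the function names are illustrative:

```rust
/// Relative size difference of Rust output vs. C output, in percent.
/// Assumed definition: (rust - c) / c * 100. Negative means Rust is smaller.
fn delta_percent(rust_bytes: u64, c_bytes: u64) -> f64 {
    (rust_bytes as f64 - c_bytes as f64) / c_bytes as f64 * 100.0
}

/// Format the delta the way the table prints it: explicit sign, two decimals.
fn format_delta(rust_bytes: u64, c_bytes: u64) -> String {
    format!("{:+.2}%", delta_percent(rust_bytes, c_bytes))
}
```

For example, `format_delta(183834, 184696)` reproduces the Q10 row and `format_delta(1263157, 1259526)` reproduces the Q75 row.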
30 | 31 |
|
31 | | -## What to investigate |
| 32 | +From Q10 through Q50, Rust output is now **smaller** than C's by 0.28% to 0.68%; at Q75 and above it is 0.28% to 0.36% larger.
32 | 33 |
|
33 | | -1. **Map the full curve.** Run Max Compression at Q10, Q20, Q25, Q30, Q35, Q40, |
34 | | - Q45, Q50 on the Kodak corpus. Add a temporary `#[test]` or `#[ignore]` test |
35 | | - to `parity_benchmark.rs` that only runs Max Compression across these qualities |
36 | | - and prints per-image detail for each. Determine where the gap plateaus. |
| 34 | +## Fix |
37 | 35 |
|
38 | | -2. **Per-image scan counts.** For the worst images (kodim23, kodim09), compare |
39 | | - the number of scans chosen by Rust vs C at Q40. Use `count_scans()` (pattern |
40 | | - in `corpus_comparison.rs`). If scan counts differ, the search is finding |
41 | | - fundamentally different scan scripts. |
| 36 | +One-line fix in `src/test_encoder.rs:134`: |
| 37 | +```rust |
| 38 | +.optimize_scans(config.optimize_scans) |
| 39 | +``` |
42 | 40 |
|
43 | | -3. **Compare scan scripts directly.** Parse the SOS markers from both outputs |
44 | | - and print `(Ns, comps, Ss, Se, Ah, Al)` for each scan. Pattern is in |
45 | | - `corpus_comparison.rs::print_scan_details()`. Identify which scans differ. |
| 41 | +## Remaining Observations |
46 | 42 |
|
47 | | -4. **Trace the scan trial encoder.** The Rust implementation is in |
48 | | - `src/scan_trial.rs`. The C implementation calls `jpeg_search_progression()` |
49 | | - in `jcmaster.c`. Compare: |
50 | | - - How many candidate scans are evaluated |
51 | | - - The cost function (file size estimation) |
52 | | - - The greedy selection order |
53 | | - - Whether the trial encoder's Huffman table estimation matches C's |
54 | | - |
55 | | -5. **Check if C uses `trellis_freq_split` during scan search.** C mozjpeg has |
56 | | - `trellis_freq_split = 8` which splits AC trellis into low/high frequency |
57 | | - passes. If C's scan optimizer accounts for this split during trial encoding |
58 | | - but Rust doesn't, that could explain the gap at low quality where the split |
59 | | - matters more. |
60 | | - |
61 | | -6. **Kodim23 specifically.** This image consistently has the worst deviation. |
62 | | - It's a landscape with lots of sky gradient + sharp foreground detail. |
63 | | - Encode it standalone at Q40 with both, diff the scan scripts, and check |
64 | | - if one finds genuinely smaller output or if it's a Huffman table estimation |
65 | | - error in the trial encoder. |
66 | | - |
67 | | -## Key files |
68 | | - |
69 | | -- `src/scan_trial.rs` — Rust scan trial encoder |
70 | | -- `src/progressive.rs` — Rust progressive scan generation |
71 | | -- `tests/parity_benchmark.rs` — benchmark test (add exploration tests here) |
72 | | -- `tests/corpus_comparison.rs` — has `count_scans()` and `print_scan_details()` |
73 | | -- C: `jcmaster.c` → `jpeg_search_progression()` |
74 | | -- C: `jcphuff.c` → trial encoding for scan cost estimation |
75 | | - |
76 | | -## Acceptance criteria |
77 | | - |
78 | | -- Understand whether the gap is from different scan scripts or different |
79 | | - file sizes for the same scan script |
80 | | -- If different scripts: determine if Rust's choice is suboptimal or just different |
81 | | -- If same scripts: the gap is in entropy coding, not scan search — investigate |
82 | | - per-scan Huffman table differences |
83 | | -- Document findings, decide whether to fix or accept and adjust thresholds |
| 43 | +For some images, Rust still chooses different scripts than C (different Al levels
| 44 | +or frequency splits). This is expected: the scan search is a greedy heuristic and
| 45 | +can find different local optima. The per-image max deviation is ~1.6% at Q55, which
| 46 | +is within the 3% per-image threshold.
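
The parity thresholds from the original problem statement (1% average, 3% per image) can be checked over a corpus with a small helper. This is a sketch only: the type and function names are hypothetical, and it assumes the average is taken over signed deltas while the per-image limit applies to the absolute deviation.

```rust
/// Summary of Rust-vs-C size deltas across a corpus, in percent.
struct ParitySummary {
    avg_delta: f64,     // mean of signed per-image deltas (assumption)
    max_deviation: f64, // largest absolute per-image delta
}

/// `sizes` holds (rust_bytes, c_bytes) per image; returns None if empty.
fn summarize(sizes: &[(u64, u64)]) -> Option<ParitySummary> {
    if sizes.is_empty() {
        return None;
    }
    let deltas: Vec<f64> = sizes
        .iter()
        .map(|&(r, c)| (r as f64 - c as f64) / c as f64 * 100.0)
        .collect();
    let avg_delta = deltas.iter().sum::<f64>() / deltas.len() as f64;
    let max_deviation = deltas.iter().fold(0.0_f64, |m, d| m.max(d.abs()));
    Some(ParitySummary { avg_delta, max_deviation })
}

/// Thresholds taken from the original problem statement: 1% average, 3% per image.
fn within_thresholds(s: &ParitySummary) -> bool {
    s.avg_delta.abs() <= 1.0 && s.max_deviation <= 3.0
}
```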