Commit eba7b37

Author: CID Agent
cid(advance): Generalize video API to accept borrowed slices
Change soft_hash_video_v0 and gen_video_code_v0 from &[Vec<i32>] to generic <S: AsRef<[i32]> + Ord>, eliminating per-frame heap allocations in the FFI crate while remaining backward-compatible with all bindings.
1 parent af6eda4 commit eba7b37

4 files changed: 53 additions and 34 deletions

.claude/agent-memory/advance/MEMORY.md

Lines changed: 9 additions & 0 deletions
@@ -185,6 +185,15 @@ iterations.
   `drop(chunks)`, then mutate `self.buf`. Explicit `drop(chunks)` makes the borrow release visible
 - Benchmark: `DataHasher` streaming at 64 KiB chunks on 1 MB data achieves ~1.1 GiB/s throughput
 
+## API Generics
+
+- Video API (`soft_hash_video_v0`, `gen_video_code_v0`) uses `<S: AsRef<[i32]> + Ord>` instead of
+  concrete `&[Vec<i32>]`. This allows FFI to pass `&[&[i32]]` (zero-copy borrows) while other
+  bindings continue passing `&[Vec<i32>]` unchanged. `AsRef<[i32]>` gives slice access, `Ord`
+  enables `BTreeSet` deduplication. Body uses `.as_ref()` for element access
+- FFI video wrappers use `Vec<&[i32]>` (1 remaining `.to_vec()` in FFI crate is for
+  `alg_cdc_chunks`)
+
 ## Gotchas
 
 - `pop_local_frame` is `unsafe` in jni crate v0.21 (Rust 2024 edition) — must wrap in `unsafe {}`

.claude/context/handoff.md

Lines changed: 28 additions & 24 deletions
@@ -1,29 +1,33 @@
-## 2026-02-25 — Review of: Optimize DataHasher::update buffer allocation
+## 2026-02-25 — Generalize video API to accept borrowed slices
 
-**Verdict:** PASS
+**Done:** Changed `soft_hash_video_v0` and `gen_video_code_v0` from concrete `&[Vec<i32>]` to
+generic `<S: AsRef<[i32]> + Ord>` parameters. Updated both FFI wrappers to construct `Vec<&[i32]>`
+(borrowed slices) instead of `Vec<Vec<i32>>` (heap-allocated copies), eliminating per-frame
+allocations.
 
-**Summary:** Replaced per-call heap allocations in `DataHasher::update` with a persistent
-`buf: Vec<u8>` that is reused across calls. The `data.to_vec()`, `[tail, data].concat()`, and
-`prev_chunk.to_vec()` patterns are eliminated — new data is appended via `extend_from_slice`, and
-the tail is shifted to the front with `copy_within` + `truncate`. A Criterion streaming benchmark
-was added. All 261 tests pass, clippy clean, 14 pre-commit hooks pass, benchmark runs at ~1.0 GiB/s.
+**Files changed:**
 
-**Verification:**
-
-- [x] `cargo test -p iscc-lib` passes — 261 tests (208 + 31 + 22), 0 failures
-- [x] `cargo clippy -p iscc-lib -- -D warnings` clean
-- [x] `grep -c 'to_vec\|\.concat()' crates/iscc-lib/src/streaming.rs` returns 0 — no per-call
-  allocations remain
-- [x] `cargo bench -p iscc-lib -- DataHasher` runs successfully — ~1.0 GiB/s throughput
-
-**Issues found:**
+- `crates/iscc-lib/src/lib.rs`: Generalized `soft_hash_video_v0` and `gen_video_code_v0` signatures
+  to `<S: AsRef<[i32]> + Ord>`. Updated body to use `.as_ref()` for column count and iteration,
+  and `BTreeSet<&S>` for deduplication.
+- `crates/iscc-ffi/src/lib.rs`: Changed both `iscc_gen_video_code_v0` and `iscc_soft_hash_video_v0`
+  from `Vec<Vec<i32>>` with `.to_vec()` to `Vec<&[i32]>` with direct slice borrows.
 
-- (none)
-
-**Next:** The `[normal]` DataHasher issue is resolved. Consider the `[normal]` iscc-ffi video frame
-allocation issue (changing `iscc_lib` video API to accept `&[&[i32]]`), or one of the `[low]` issues
-(dct power-of-two validation, wtahash bounds check, iscc-py `__version__`, etc.).
+**Verification:**
 
-**Notes:** The borrow checker pattern (extract `tail_len` as `usize` before `drop(chunks)`) is clean
-and well-documented with inline comments. The `drop(chunks)` is explicit to make the borrow release
-obvious — good practice for code that manipulates borrowed data and the owning buffer in sequence.
+- `cargo test -p iscc-lib`: 261 tests passed (208 + 31 + 22), 0 failures
+- `cargo test -p iscc-ffi`: 62 tests passed, 0 failures
+- `cargo clippy --workspace --all-targets -- -D warnings`: clean (no warnings)
+- `.to_vec()` count in `crates/iscc-ffi/src/lib.rs`: 1 (only the unrelated `alg_cdc_chunks` on line
+  906 remains — down from 3)
+- `mise run check`: all 14 pre-commit hooks passed
+
+**Next:** The `[normal]` iscc-ffi video frame allocation issue is resolved. Consider the remaining
+`[low]` issues: `alg_dct` power-of-two validation, `alg_wtahash` bounds check, iscc-py
+`__version__`, iscc-py module docstring, iscc-wasm conformance feature gate, or iscc-wasm stale
+CLAUDE.md.
+
+**Notes:** The generic approach (`AsRef<[i32]> + Ord`) was chosen over concrete `&[&[i32]]` to
+maintain backward compatibility — all existing callers that pass `&[Vec<i32>]` compile unchanged
+because `Vec<i32>` implements both traits. Only the FFI crate (the actual beneficiary) was modified.
+No other binding crate (py, napi, wasm, jni) required changes.
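
The backward-compatibility argument in the notes above can be illustrated with a minimal sketch. The function `count_unique_frames` is a hypothetical stand-in, not part of `iscc-lib`; it only reuses the `<S: AsRef<[i32]> + Ord>` bound from the commit to show that one signature can take both owned and borrowed frames.

```rust
use std::collections::BTreeSet;

/// Hypothetical stand-in for the generic video API (not an iscc-lib function).
/// Counts distinct frames using the same trait bounds as the commit.
pub fn count_unique_frames<S: AsRef<[i32]> + Ord>(frame_sigs: &[S]) -> usize {
    // `Ord` is what lets `&S` be stored in a BTreeSet for deduplication.
    let unique: BTreeSet<&S> = frame_sigs.iter().collect();
    unique.len()
}

/// Sums the lengths of all frames; `AsRef<[i32]>` provides the slice view.
pub fn total_elements<S: AsRef<[i32]> + Ord>(frame_sigs: &[S]) -> usize {
    frame_sigs.iter().map(|sig| sig.as_ref().len()).sum()
}
```

Under these assumptions, a caller holding `Vec<Vec<i32>>` and a caller holding `Vec<&[i32]>` can both pass `&frames` to the same function, since `Vec<i32>` and `&[i32]` each implement `AsRef<[i32]>` and `Ord`.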

crates/iscc-ffi/src/lib.rs

Lines changed: 4 additions & 4 deletions
@@ -406,12 +406,12 @@ pub unsafe extern "C" fn iscc_gen_video_code_v0(
     let sig_ptrs = unsafe { std::slice::from_raw_parts(frame_sigs, num_frames) };
     let lens = unsafe { std::slice::from_raw_parts(frame_lens, num_frames) };
 
-    let frames: Vec<Vec<i32>> = sig_ptrs
+    let frames: Vec<&[i32]> = sig_ptrs
         .iter()
         .zip(lens.iter())
         .map(|(&ptr, &len)| {
             // SAFETY: caller guarantees each ptr is valid for its length
-            unsafe { std::slice::from_raw_parts(ptr, len) }.to_vec()
+            unsafe { std::slice::from_raw_parts(ptr, len) }
         })
         .collect();
 

@@ -956,12 +956,12 @@ pub unsafe extern "C" fn iscc_soft_hash_video_v0(
     let sig_ptrs = unsafe { std::slice::from_raw_parts(frame_sigs, num_frames) };
     let lens = unsafe { std::slice::from_raw_parts(frame_lens, num_frames) };
 
-    let frames: Vec<Vec<i32>> = sig_ptrs
+    let frames: Vec<&[i32]> = sig_ptrs
         .iter()
         .zip(lens.iter())
         .map(|(&ptr, &len)| {
             // SAFETY: caller guarantees each ptr is valid for its length
-            unsafe { std::slice::from_raw_parts(ptr, len) }.to_vec()
+            unsafe { std::slice::from_raw_parts(ptr, len) }
         })
         .collect();
 

crates/iscc-lib/src/lib.rs

Lines changed: 12 additions & 6 deletions
@@ -522,21 +522,24 @@ pub fn gen_audio_code_v0(cv: &[i32], bits: u32) -> IsccResult<AudioCodeResult> {
 ///
 /// Deduplicates frame signatures, computes column-wise sums across all
 /// unique frames, then applies WTA-Hash to produce a digest of `bits/8` bytes.
-pub fn soft_hash_video_v0(frame_sigs: &[Vec<i32>], bits: u32) -> IsccResult<Vec<u8>> {
+pub fn soft_hash_video_v0<S: AsRef<[i32]> + Ord>(
+    frame_sigs: &[S],
+    bits: u32,
+) -> IsccResult<Vec<u8>> {
     if frame_sigs.is_empty() {
         return Err(IsccError::InvalidInput(
             "frame_sigs must not be empty".into(),
         ));
     }
 
-    // Deduplicate using BTreeSet (Vec<i32> implements Ord)
-    let unique: std::collections::BTreeSet<&Vec<i32>> = frame_sigs.iter().collect();
+    // Deduplicate using BTreeSet (S: Ord)
+    let unique: std::collections::BTreeSet<&S> = frame_sigs.iter().collect();
 
     // Column-wise sum into i64 to avoid overflow
-    let cols = frame_sigs[0].len();
+    let cols = frame_sigs[0].as_ref().len();
     let mut vecsum = vec![0i64; cols];
     for sig in &unique {
-        for (c, &val) in sig.iter().enumerate() {
+        for (c, &val) in sig.as_ref().iter().enumerate() {
             vecsum[c] += val as i64;
         }
     }
@@ -548,7 +551,10 @@ pub fn soft_hash_video_v0(frame_sigs: &[Vec<i32>], bits: u32) -> IsccResult<Vec<
 ///
 /// Produces an ISCC Content-Code for video from a sequence of MPEG-7 frame
 /// signatures. Each frame signature is a 380-element integer vector.
-pub fn gen_video_code_v0(frame_sigs: &[Vec<i32>], bits: u32) -> IsccResult<VideoCodeResult> {
+pub fn gen_video_code_v0<S: AsRef<[i32]> + Ord>(
+    frame_sigs: &[S],
+    bits: u32,
+) -> IsccResult<VideoCodeResult> {
     let digest = soft_hash_video_v0(frame_sigs, bits)?;
     let component = codec::encode_component(
         codec::MainType::Content,
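
The deduplicate-then-sum step of `soft_hash_video_v0` shown in the diff can be sketched as a standalone function. The name `column_sums` and its `Option` return type are assumptions for illustration; the real function returns `IsccResult` and goes on to apply WTA-Hash, which is omitted here. Unlike the diff, this sketch pairs columns with `zip`, so a frame longer than the first one is truncated instead of indexing out of bounds.

```rust
use std::collections::BTreeSet;

/// Hypothetical sketch of the dedup + column-wise sum step (WTA-Hash omitted).
/// Returns `None` for empty input, where the real API returns an error.
pub fn column_sums<S: AsRef<[i32]> + Ord>(frame_sigs: &[S]) -> Option<Vec<i64>> {
    let first = frame_sigs.first()?;

    // Deduplicate frames; BTreeSet ordering comes from `S: Ord`.
    let unique: BTreeSet<&S> = frame_sigs.iter().collect();

    // Accumulate in i64 so that summing many i32 values cannot overflow.
    let mut sums = vec![0i64; first.as_ref().len()];
    for sig in &unique {
        let row: &[i32] = sig.as_ref();
        for (sum, &val) in sums.iter_mut().zip(row.iter()) {
            *sum += i64::from(val);
        }
    }
    Some(sums)
}
```

Because duplicates are removed before summing, a frame that appears many times contributes to each column only once.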
