Description
Describe the bug
The kornia-apriltag crate performs heap allocations inside its inner detection loops, which creates a performance bottleneck. Several "TODO: Avoid allocation" comments mark these hot paths in decoder.rs and quad.rs. Because functions such as fit_quads, compute_line_fit_prefix_sums, and decode_tags can run thousands of times per frame during tag extraction, repeatedly allocating and freeing Vec buffers adds allocator overhead, can fragment the heap, and reduces frames-per-second throughput.
Steps to Reproduce
1. Run the ApriltagDetector on any image.
2. Observe the internal execution flow: the program allocates and deallocates multiple vectors (like quads, lfps, and detections) repeatedly inside the geometry and decoding functions for every single candidate.
Minimal Code Example
N/A - This is a structural memory issue rather than a code crash. The problematic lines in the repository are:
- crates/kornia-apriltag/src/decoder.rs (line 354): let mut detections = Vec::new(); // TODO: Avoid allocations on every call
- crates/kornia-apriltag/src/quad.rs (line 126): let mut quads = Vec::new(); // TODO: Avoid this allocation every time
- crates/kornia-apriltag/src/quad.rs (line 414): let mut lfps = vec![LineFit::default(); gradient_infos.len()]; // TODO: Find a way to avoid allocation
Expected behavior
These vectors should be hoisted out of the low-level functions and stored as persistent, reusable buffers inside the ApriltagDetector state (e.g., by passing a &mut Vec buffer downward). On each detection frame, the pipeline should call .clear() and reuse the existing heap capacity, so that no new heap allocations occur once the buffers have grown to their working size.
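As a rough illustration of the "pass a &mut Vec downward" idea, here is a minimal sketch. The `Quad` type, the `fit_quads_into` name, and the toy corner rule are all hypothetical and are not taken from the crate; the only point is that `clear()` keeps the buffer's capacity across calls.

```rust
/// Hypothetical stand-in for a detected quad; not the crate's real type.
#[derive(Debug, Clone, PartialEq)]
struct Quad {
    corners: [(f32, f32); 4],
}

/// Hypothetical helper: writes results into a caller-owned buffer
/// instead of returning a freshly allocated Vec.
fn fit_quads_into(clusters: &[Vec<(f32, f32)>], quads: &mut Vec<Quad>) {
    // clear() drops the old elements but keeps the heap capacity.
    quads.clear();
    for cluster in clusters {
        // Toy rule for illustration: the first four points become the corners.
        if let [a, b, c, d, ..] = cluster.as_slice() {
            quads.push(Quad { corners: [*a, *b, *c, *d] });
        }
    }
}

fn main() {
    let square = vec![(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)];
    let frame_a: Vec<Vec<(f32, f32)>> = vec![square.clone(); 3];
    let frame_b: Vec<Vec<(f32, f32)>> = vec![square; 2];

    // The buffer lives outside the per-frame call and is reused.
    let mut quads = Vec::new();
    fit_quads_into(&frame_a, &mut quads);
    let capacity_after_first = quads.capacity();

    fit_quads_into(&frame_b, &mut quads);
    println!(
        "{} quads, capacity {} (after first frame: {})",
        quads.len(),
        quads.capacity(),
        capacity_after_first
    );
    println!("first quad corners: {:?}", quads[0].corners);
}
```

The second call only allocates if it needs more room than the buffer already has, so in steady state the per-frame cost is a length reset rather than a new allocation.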
Actual behavior
The code calls Vec::new() and vec![...] directly inside functions on the hot path, which adds up to a large cumulative volume of short-lived allocations over a sustained video stream.
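One way to make the difference measurable is to count allocations with a wrapper around the system allocator. The sketch below is an assumption about how one might check this, not something taken from the crate: `CountingAlloc`, `process_fresh`, and `process_reused` are hypothetical names, and the loops are toy stand-ins for per-candidate work.

```rust
use std::alloc::{GlobalAlloc, Layout, System};
use std::hint::black_box;
use std::sync::atomic::{AtomicUsize, Ordering};

/// Counts every allocation request, then forwards to the system allocator.
struct CountingAlloc;
static ALLOCS: AtomicUsize = AtomicUsize::new(0);

unsafe impl GlobalAlloc for CountingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        ALLOCS.fetch_add(1, Ordering::Relaxed);
        unsafe { System.alloc(layout) }
    }
    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        unsafe { System.dealloc(ptr, layout) }
    }
}

#[global_allocator]
static GLOBAL: CountingAlloc = CountingAlloc;

/// Toy hot loop that creates a new Vec for every candidate.
fn process_fresh(candidates: usize) -> usize {
    let mut total = 0;
    for i in 0..candidates {
        let mut scratch: Vec<u32> = Vec::new();
        scratch.push(i as u32); // first push triggers a heap allocation
        total += black_box(&scratch).len();
    }
    total
}

/// Same toy loop, but reusing a caller-owned buffer.
fn process_reused(candidates: usize, scratch: &mut Vec<u32>) -> usize {
    let mut total = 0;
    for i in 0..candidates {
        scratch.clear(); // keeps capacity, so push does not reallocate
        scratch.push(i as u32);
        total += black_box(&*scratch).len();
    }
    total
}

/// Runs `f` and returns its result plus the number of allocations it made.
fn count_allocs<T>(f: impl FnOnce() -> T) -> (T, usize) {
    let before = ALLOCS.load(Ordering::Relaxed);
    let out = f();
    (out, ALLOCS.load(Ordering::Relaxed) - before)
}

fn main() {
    let (_, fresh) = count_allocs(|| process_fresh(1000));
    let mut scratch = Vec::with_capacity(16);
    let (_, reused) = count_allocs(|| process_reused(1000, &mut scratch));
    println!("fresh: {fresh} allocations, reused: {reused} allocations");
}
```

The counter is process-wide, so it is best read as a relative comparison between the two variants rather than an exact figure.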
Environment
- kornia-rs version:
- Rust version (`rustc -V`): rustc 1.93.1
- Cargo version (`cargo -V`): cargo 1.93.1
- OS (e.g., Linux, macOS, Windows): macOS
- Target architecture (if cross-compiling): N/A
- Python version (if using Python bindings): Python 3.9.6
Additional context
This optimization targets only allocation overhead and single-threaded throughput, and it resolves the existing TODO markers left in the codebase by the original author.
The proposed fix is to move those temporary vectors into the AprilTagDecoder struct itself.
Instead of allocating new memory on every frame, the decoder would allocate its buffers once at startup and reuse them for its whole lifetime. In steady state this removes the repeated heap allocations and the resulting memory churn.
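A minimal sketch of what struct-owned scratch buffers could look like follows. It assumes a simplified layout: `ScratchBuffers`, `Detector`, `line_fits`, and the single-field `LineFit` are hypothetical and do not mirror the crate's real definitions. It also shows how a `vec![LineFit::default(); n]` call can become `clear()` plus `resize()`, which resets the contents without giving up capacity.

```rust
/// Hypothetical, simplified line-fit record; the real type has other fields.
#[derive(Debug, Clone, Default, PartialEq)]
struct LineFit {
    w: f32,
}

/// Hypothetical container for buffers that outlive a single frame.
#[derive(Default)]
struct ScratchBuffers {
    lfps: Vec<LineFit>,
}

impl ScratchBuffers {
    /// Returns `n` default-initialised line fits, reusing existing capacity.
    fn line_fits(&mut self, n: usize) -> &mut [LineFit] {
        self.lfps.clear(); // drop stale values, keep the allocation
        self.lfps.resize(n, LineFit::default()); // allocates only if n > capacity
        &mut self.lfps
    }
}

/// Hypothetical detector that owns its scratch space.
struct Detector {
    scratch: ScratchBuffers,
}

impl Detector {
    fn new() -> Self {
        Detector { scratch: ScratchBuffers::default() }
    }

    /// Toy per-candidate step: a running sum stored in the reused buffer.
    fn process_candidate(&mut self, gradients: &[f32]) -> f32 {
        let lfps = self.scratch.line_fits(gradients.len());
        let mut running = 0.0;
        for (fit, g) in lfps.iter_mut().zip(gradients) {
            running += *g;
            fit.w = running;
        }
        lfps.last().map_or(0.0, |fit| fit.w)
    }
}

fn main() {
    let mut detector = Detector::new();
    let first = detector.process_candidate(&[1.0, 2.0, 3.0]);
    let second = detector.process_candidate(&[4.0]);
    println!("first = {first}, second = {second}");
    println!("buffer capacity = {}", detector.scratch.lfps.capacity());
}
```

Keeping the buffers in a separate struct is one possible design; it lets the detector lend them to helper functions without exposing the rest of its state.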
Contribution Intent
- I plan to submit a PR to fix this bug
- I'm reporting this bug but not planning to fix it