
CMC/ECC fails when frames are passed through a stream instead of OpenCV. #2211

@R-C101

Description

Bug

I pass frames by writing them into shared memory; a Python script reads the bytes, reconstructs the frames, and feeds them first to object detection and then to boxmot.

When a normal OpenCV capture loop is used with the BotSort tracker, there are no issues; everything is solid.

When my streamed implementation is used with the same tracker, I get this warning:

MainProcess/MainThread | WARNING | /opt/conda/envs/tracking/lib/python3.9/site-packages/boxmot/motion/cmc/ecc.py:99 | apply - Affine matrix could not be generated: OpenCV(4.12.0) /io/opencv/modules/video/src/ecc.cpp:574: error: (-7:Iterations do not converge) NaN encountered. in function 'findTransformECC'

All other models work fine with my implementation, and tracking still runs, but the results have declined, presumably because motion estimation fails, and tracking time also drops by about 10 ms (likely because ECC bails out early).

Why is this happening? I saw the comment in the code noting that this error is common in practice, but what I don't understand is why it fails only for this specific implementation. What is different?
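One difference I can see between the two paths, which may or may not be the cause: the streamed frames are RGB, while an OpenCV capture loop yields BGR. ECC operates on a grayscale image, and grayscale computed from an RGB-ordered array through a BGR-assuming conversion differs from the correct one. A minimal sketch of that effect in pure NumPy (the conversion weights are OpenCV's documented BGR-to-gray coefficients; whether boxmot's ECC actually converts this way is an assumption I have not verified):

```python
import numpy as np

# OpenCV's BGR -> GRAY conversion: Y = 0.114*B + 0.587*G + 0.299*R
def bgr_to_gray(img):
    b, g, r = img[..., 0], img[..., 1], img[..., 2]
    return 0.114 * b + 0.587 * g + 0.299 * r

rng = np.random.default_rng(0)
frame_bgr = rng.integers(0, 256, size=(64, 64, 3)).astype(np.float64)
frame_rgb = frame_bgr[..., ::-1]  # same pixels, channel order reversed

gray_ok = bgr_to_gray(frame_bgr)       # what an OpenCV pipeline produces
gray_swapped = bgr_to_gray(frame_rgb)  # what ECC sees if the frame is RGB

print(np.allclose(gray_ok, gray_swapped))  # False
```

Different grayscale input alone would explain different ECC behavior between the two pipelines, though not necessarily the NaN itself.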

Environment

boxmot==13.0.17
torch==2.6.0
opencv==4.1

Minimal Reproducible Example

"""
Minimal Reproducible Example: Frame Bytes Conversion + BoxMOT Tracking


"""

import cv2
import numpy as np
from boxmot import BotSort
from ultralytics import YOLO  # or your detector

# ============ STEP 1: Read frame and convert to bytes ============
cap = cv2.VideoCapture("your_video.mp4")
ret, frame_bgr = cap.read()
assert ret, "Failed to read a frame from the video"
H, W = frame_bgr.shape[:2]
FRAME_SIZE = H * W * 3

# Convert BGR -> RGB (like streamer does)
frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)

# Convert frame to bytes (this is what gets written to shared memory)
frame_bytes = frame_rgb.tobytes()

# ============ STEP 2: Convert bytes back to frame ============
# Reconstruct frame from bytes (this is what player_service does)
# NOTE: np.frombuffer returns a read-only view of the bytes
frame_reconstructed = np.frombuffer(frame_bytes, dtype=np.uint8).reshape((H, W, 3))

# Verify frames are identical
assert np.array_equal(frame_rgb, frame_reconstructed), "Frame reconstruction failed!"

# ============ STEP 3: Object Detection (minimal) ============
# Using YOLO as an example - replace with your detector
model = YOLO("yolov8n.pt")
results = model(frame_reconstructed)

# Extract detections in BoxMOT format: (N, 6) [x1, y1, x2, y2, conf, class_id]
detections = []
for r in results:
    boxes = r.boxes
    for i in range(len(boxes)):
        x1, y1, x2, y2 = boxes.xyxy[i].cpu().numpy()
        conf = boxes.conf[i].cpu().numpy()
        cls = boxes.cls[i].cpu().numpy()
        detections.append([x1, y1, x2, y2, conf, cls])

detections = np.array(detections) if detections else np.empty((0, 6))

# ============ STEP 4: BoxMOT Tracking (standard syntax) ============
from pathlib import Path  # boxmot expects reid weights as a pathlib.Path

tracker = BotSort(
    reid_weights=Path("osnet_x0_25_msmt17.pt"),  # or your reid weights
    device=0,  # GPU device
    half=False,
)

# Update tracker with detections and frame
# NOTE: BoxMOT expects the frame as a numpy array (H, W, 3);
# here it is RGB, whereas a plain OpenCV pipeline would pass BGR
tracks = tracker.update(detections, frame_reconstructed)


print(f"Found {len(tracks)} tracks")
for t in tracks:
    x1, y1, x2, y2, track_id, conf, class_id = t[:7]
    print(f"  Track {int(track_id)}: bbox=({x1:.0f},{y1:.0f},{x2:.0f},{y2:.0f}), conf={conf:.2f}, class={int(class_id)}")

cap.release()
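Another concrete difference between this path and a plain cv2.VideoCapture loop, independent of the tracker: np.frombuffer returns a read-only view of the underlying bytes, whereas cap.read() yields a writeable array. Any downstream step that writes into the frame in place will fail (or behave differently) on the streamed version. A small check; the .copy() workaround is an assumption, not a confirmed fix for the ECC warning:

```python
import numpy as np

frame = np.zeros((4, 4, 3), dtype=np.uint8)
restored = np.frombuffer(frame.tobytes(), dtype=np.uint8).reshape(frame.shape)

print(restored.flags.writeable)  # False: in-place edits would raise ValueError

# A copy produces a normal writeable array, matching what cap.read() returns
restored = restored.copy()
print(restored.flags.writeable)  # True
```

If the read-only flag turns out to matter here, replacing the frombuffer line in the MRE with `.copy()` at the end would be the one-line change to test.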
