Description
Search before asking
- I have searched the BoxMOT issues and discussions and found no similar questions.
Bug
I pass frames by writing them into shared memory and letting a Python script read the bytes, reconstruct the frames, and pass them first to object detection and then to BoxMOT.
When a plain OpenCV script is used with the BoT-SORT tracker, there are no issues; everything is solid.
When my implementation is used with the BoT-SORT tracker, there is a warning:
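For context, the hand-off looks roughly like this (a minimal sketch using `multiprocessing.shared_memory`; the real services use their own buffer layout, and all names here are made up):

```python
import numpy as np
from multiprocessing import shared_memory

# Tiny stand-in for a real (H, W, 3) uint8 video frame
H, W = 4, 6
frame = np.arange(H * W * 3, dtype=np.uint8).reshape(H, W, 3)

# Writer side: copy the frame's raw bytes into a shared block
shm = shared_memory.SharedMemory(create=True, size=frame.nbytes)
shm.buf[:frame.nbytes] = frame.tobytes()

# Reader side: attach by name and rebuild the ndarray; .copy() detaches
# the result from the shared buffer so it stays valid after close()
shm_reader = shared_memory.SharedMemory(name=shm.name)
rebuilt = np.ndarray((H, W, 3), dtype=np.uint8, buffer=shm_reader.buf).copy()

assert np.array_equal(frame, rebuilt)

shm_reader.close()
shm.close()
shm.unlink()
```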
MainProcess/MainThread | WARNING | /opt/conda/envs/tracking/lib/python3.9/site-packages/boxmot/motion/cmc/ecc.py:99 | apply - Affine matrix could not be generated: OpenCV(4.12.0) /io/opencv/modules/video/src/ecc.cpp:574: error: (-7:Iterations do not converge) NaN encountered. in function 'findTransformECC'
All other models work fine with my implementation, and tracking still works, but the results have declined, evidently because of the failed motion estimation, and tracking time also drops by ~10 ms.
Why is this happening? I saw the comment in the code noting that this error is common in practice, but what I don't understand is why it fails for this specific implementation. What is different?
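One thing worth checking (my own guess, not something from the boxmot code): ECC maximizes a correlation coefficient, and a frame with zero intensity variance makes that coefficient undefined, which can surface exactly as the "NaN encountered" error above. A hypothetical pre-flight check on the reconstructed frame:

```python
import numpy as np

def check_frame(frame: np.ndarray) -> None:
    """Sanity checks on a frame before it reaches ECC (hypothetical helper)."""
    assert frame.dtype == np.uint8, f"unexpected dtype {frame.dtype}"
    assert frame.ndim == 3 and frame.shape[2] == 3, f"unexpected shape {frame.shape}"
    assert frame.flags["C_CONTIGUOUS"], "frame is not C-contiguous"
    # A constant frame has zero variance; the ECC correlation is then
    # undefined, which can produce the 'NaN encountered' failure.
    assert frame.std() > 0, "frame has zero intensity variance"

check_frame(np.random.randint(0, 255, (48, 64, 3), dtype=np.uint8))
```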
Environment
boxmot==13.0.17
torch==2.6.0
opencv==4.1
Minimal Reproducible Example
"""
Minimal Reproducible Example: Frame Bytes Conversion + BoxMOT Tracking
"""
import cv2
import numpy as np
from boxmot import BotSort
from ultralytics import YOLO # or your detector
# ============ STEP 1: Read frame and convert to bytes ============
cap = cv2.VideoCapture("your_video.mp4")
ret, frame_bgr = cap.read()
H, W = frame_bgr.shape[:2]
FRAME_SIZE = H * W * 3
# Convert BGR -> RGB (like streamer does)
frame_rgb = cv2.cvtColor(frame_bgr, cv2.COLOR_BGR2RGB)
# Convert frame to bytes (this is what gets written to shared memory)
frame_bytes = frame_rgb.tobytes()
# ============ STEP 2: Convert bytes back to frame ============
# Reconstruct frame from bytes (this is what player_service does)
frame_reconstructed = np.frombuffer(frame_bytes, dtype=np.uint8).reshape((H, W, 3))
# Verify frames are identical
assert np.array_equal(frame_rgb, frame_reconstructed), "Frame reconstruction failed!"
# ============ STEP 3: Object Detection (minimal) ============
# Using YOLO as an example - replace with your detector
model = YOLO("yolov8n.pt")
results = model(frame_reconstructed)
# Extract detections in BoxMOT format: (N, 6) [x1, y1, x2, y2, conf, class_id]
detections = []
for r in results:
    boxes = r.boxes
    for i in range(len(boxes)):
        x1, y1, x2, y2 = boxes.xyxy[i].cpu().numpy()
        conf = boxes.conf[i].cpu().numpy()
        cls = boxes.cls[i].cpu().numpy()
        detections.append([x1, y1, x2, y2, conf, cls])
detections = np.array(detections) if detections else np.empty((0, 6))
# ============ STEP 4: BoxMOT Tracking (standard syntax) ============
from pathlib import Path  # boxmot examples pass reid weights as a Path

tracker = BotSort(
    reid_weights=Path("osnet_x0_25_msmt17.pt"),  # or your reid weights
    device=0,  # GPU device
    half=False
)
# Update tracker with detections and frame
# NOTE: BoxMOT expects frame as numpy array (H, W, 3)
tracks = tracker.update(detections, frame_reconstructed)
print(f"Found {len(tracks)} tracks")
for t in tracks:
    x1, y1, x2, y2, track_id, conf, class_id = t[:7]
    print(f"  Track {int(track_id)}: bbox=({x1:.0f},{y1:.0f},{x2:.0f},{y2:.0f}), conf={conf:.2f}, class={int(class_id)}")
cap.release()
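One detail about Step 2 above that is easy to miss: `np.frombuffer` over an immutable `bytes` object returns a read-only view, and some OpenCV routines expect writable, C-contiguous input. A minimal illustration (pure NumPy, no OpenCV needed):

```python
import numpy as np

raw = bytes(range(12))  # stands in for the bytes read from shared memory
view = np.frombuffer(raw, dtype=np.uint8).reshape(2, 2, 3)
assert not view.flags.writeable  # read-only: backed by an immutable buffer

# .copy() yields a writable, C-contiguous array safe to hand to OpenCV
writable = view.copy()
assert writable.flags.writeable and writable.flags["C_CONTIGUOUS"]
```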
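As an aside, the per-box loop in Step 3 can be vectorized; sketched here with NumPy stand-ins for the `boxes.xyxy` / `boxes.conf` / `boxes.cls` tensors (real code would call `.cpu().numpy()` on each):

```python
import numpy as np

# Stand-ins for boxes.xyxy.cpu().numpy(), boxes.conf..., boxes.cls...
xyxy = np.array([[10.0, 20.0, 50.0, 80.0],
                 [ 5.0,  5.0, 30.0, 40.0]])  # (N, 4)
conf = np.array([0.9, 0.7])                   # (N,)
cls_ = np.array([0.0, 2.0])                   # (N,)

# Stack into BoxMOT's (N, 6) [x1, y1, x2, y2, conf, class_id] layout
detections = np.hstack([xyxy, conf[:, None], cls_[:, None]])
assert detections.shape == (2, 6)
```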