Tvenver
diff --git a/CHANGELOG.md b/CHANGELOG.md
Lines changed: 19 additions & 0 deletions
diff --git a/README.md b/README.md
Lines changed: 41 additions & 2 deletions
diff --git a/detection_config.yaml b/detection_config.yaml
Lines changed: 55 additions & 4 deletions
diff --git a/notebooks/full_pipeline.ipynb b/notebooks/full_pipeline.ipynb
Lines changed: 18 additions & 2 deletions
diff --git a/pyproject.toml b/pyproject.toml
Lines changed: 1 addition & 1 deletion
@@ -4,6 +4,25 @@ All notable changes to this project will be documented in this file.
 The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
+## [2.0.5] - 2025-02-04
+
+### Added
+- **JPEG support**: `prepare()` now fully supports `.jpeg` files in addition to `.jpg` and `.png`
+- **Full detection configuration**: All 24 detection parameters now exposed in `detection_config.yaml` with comprehensive documentation
+
+### Changed
+- **Streamlined inference API**: `species_list` is now optional and is loaded automatically from the model checkpoint (it can still be overridden if needed)
+- **Frame-based tracking**: Standardized on `max_lost_frames` (frame-based) instead of `lost_track_seconds` for consistent behavior across different FPS
+- **Refactored detection modules**: Moved all hardcoded values to `detection_config.yaml` for better configurability
+  - GMM parameters (`gmm_history`, `gmm_var_threshold`)
+  - Morphological filtering (`morph_kernel_size`)
+  - Cohesiveness filters (`min_motion_ratio`)
+  - Track consistency (`max_area_change_ratio`)
+  - Path topology (`revisit_radius`)
+
+### Fixed
+- **Indentation error** in the file-corruption detection loop of `prepare.py`
+
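Migration note on the `lost_track_seconds` to `max_lost_frames` change: a config that still carries the old time-based value needs a frame count instead. The sketch below shows one way to derive it, assuming the video FPS is known. The helper name `seconds_to_max_lost_frames` is hypothetical and not part of bplusplus; rounding to the nearest frame is also an assumption.

```python
def seconds_to_max_lost_frames(lost_track_seconds: float, fps: float) -> int:
    """Convert a legacy time-based timeout into a frame count.

    Hypothetical helper for illustration only; it is not a bplusplus function.
    """
    if fps <= 0:
        raise ValueError("fps must be positive")
    if lost_track_seconds < 0:
        raise ValueError("lost_track_seconds must be non-negative")
    # Round to the nearest whole frame, but never return fewer than one frame.
    return max(1, round(lost_track_seconds * fps))


# The old default of 1.5 s maps to different frame counts at different FPS,
# which is why a single frame-based value behaves differently per video.
for fps in (24, 30, 60):
    print(fps, seconds_to_max_lost_frames(1.5, fps))
```

At 30 fps the old default of 1.5 s gives 45 frames, which matches the new default of `max_lost_frames`.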
 ## [2.0.4] - 2025-02-02
 
 ### Added
 
@@ -134,6 +134,8 @@ results = bplusplus.validate(
 #### Step 5: Run Inference on Video
 Process a video file to detect, classify, and track insects using motion-based detection. The pipeline uses background subtraction (GMM) to detect moving insects, tracks them across frames, and classifies confirmed tracks.
 
+**Note:** The species list and taxonomy are automatically loaded from the model checkpoint, so you don't need to provide them again.
+
 **Output files generated in `output_dir`:**
 - `{video}_annotated.mp4` - Video showing confirmed tracks with classifications
 - `{video}_debug.mp4` - Debug video with motion mask and all detections
@@ -146,10 +148,10 @@ OUTPUT_DIR = Path("./output")
 HIERARCHICAL_MODEL_PATH = TRAINED_MODEL_DIR / "best_multitask.pt"
 
 results = bplusplus.inference(
-    species_list=names,
     hierarchical_model_path=HIERARCHICAL_MODEL_PATH,
     video_path=VIDEO_INPUT_PATH,
     output_dir=OUTPUT_DIR,
+    # species_list=names,   # Optional: override species from checkpoint
     fps=None,               # None = process all frames
     backbone="resnet50",    # Must match training
     save_video=True,        # Set to False to skip video rendering (only CSV output)
@@ -172,7 +174,44 @@ results = bplusplus.inference(
 )
 ```
 
-Download a template config from the [releases page](https://github.com/Tvenver/Bplusplus/releases). Parameters control cohesiveness filtering, shape filtering, tracking behavior, and path topology analysis for confirming insect-like movement.
+Download a template config from the [releases page](https://github.com/Tvenver/Bplusplus/releases).
+
+<details>
+<summary><b>Full Configuration Parameters</b> (click to expand)</summary>
+
+| Parameter | Default | Description |
+|-----------|---------|-------------|
+| **GMM Background Subtractor** | | *Motion detection model* |
+| `gmm_history` | 500 | Frames to build background model |
+| `gmm_var_threshold` | 16 | Variance threshold for foreground detection |
+| **Morphological Filtering** | | *Noise removal* |
+| `morph_kernel_size` | 3 | Morphological kernel size (NxN) |
+| **Cohesiveness** | | *Filters scattered motion (plants) vs compact motion (insects)* |
+| `min_largest_blob_ratio` | 0.80 | Min ratio of largest blob to total motion |
+| `max_num_blobs` | 5 | Max separate blobs allowed in detection |
+| `min_motion_ratio` | 0.15 | Min ratio of motion pixels to bbox area |
+| **Shape** | | *Filters by contour properties* |
+| `min_area` | 200 | Min detection area (px²) |
+| `max_area` | 40000 | Max detection area (px²) |
+| `min_density` | 3.0 | Min area/perimeter ratio |
+| `min_solidity` | 0.55 | Min convex hull fill ratio |
+| **Tracking** | | *Controls track behavior* |
+| `min_displacement` | 50 | Min net movement for confirmation (px) |
+| `min_path_points` | 10 | Min points before path analysis |
+| `max_frame_jump` | 100 | Max jump between frames (px) |
+| `max_lost_frames` | 45 | Frames before lost track deleted (e.g., 45 @ 30fps = 1.5s) |
+| `max_area_change_ratio` | 3.0 | Max area change ratio between frames |
+| **Tracker Matching** | | *Hungarian algorithm cost function* |
+| `tracker_w_dist` | 0.6 | Weight for distance cost (0-1) |
+| `tracker_w_area` | 0.4 | Weight for area cost (0-1) |
+| `tracker_cost_threshold` | 0.3 | Max cost for valid match (0-1) |
+| **Path Topology** | | *Confirms insect-like movement patterns* |
+| `max_revisit_ratio` | 0.30 | Max ratio of revisited positions |
+| `min_progression_ratio` | 0.70 | Min forward progression |
+| `max_directional_variance` | 0.90 | Max heading variance |
+| `revisit_radius` | 50 | Radius (px) for revisit detection |
+
+</details>
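To make the Tracker Matching rows more concrete, here is a minimal sketch of how a weighted cost in the 0-1 range could be built from `tracker_w_dist`, `tracker_w_area`, and `tracker_cost_threshold`. The normalisation used here (distance divided by `max_frame_jump`, area cost as one minus the smaller-to-larger area ratio) is an assumption for illustration; the diff does not show how bplusplus computes these terms. The function names are hypothetical.

```python
import math


def match_cost(track, detection, w_dist=0.6, w_area=0.4, max_frame_jump=100.0):
    """Weighted track-to-detection cost in [0, 1] (illustrative only).

    `track` and `detection` are (center_x, center_y, area) tuples.
    The normalisation below is an assumption, not the library's formula.
    """
    dist = math.hypot(detection[0] - track[0], detection[1] - track[1])
    dist_cost = min(dist / max_frame_jump, 1.0)  # 0 = same position, 1 = at or beyond max jump

    larger = max(track[2], detection[2])
    smaller = min(track[2], detection[2])
    area_cost = 0.0 if larger == 0 else 1.0 - smaller / larger  # 0 = same size

    return w_dist * dist_cost + w_area * area_cost


def is_valid_match(cost, cost_threshold=0.3):
    """A pairing above the threshold is rejected, so the detection starts a new track."""
    return cost <= cost_threshold


track = (100.0, 100.0, 400.0)
near = (110.0, 100.0, 420.0)   # small move, similar size
far = (300.0, 100.0, 400.0)    # jump larger than max_frame_jump

print(is_valid_match(match_cost(track, near)))  # True
print(is_valid_match(match_cost(track, far)))   # False
```

In a full tracker these costs would fill a tracks-by-detections matrix that is handed to an assignment solver (for example `scipy.optimize.linear_sum_assignment`), with pairs above the threshold discarded afterwards.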
 
 ### Customization
 
 
@@ -8,6 +8,28 @@
 #   results = inference(..., config="detection_config.yaml")
 # =============================================================================
 
+# -----------------------------------------------------------------------------
+# GMM BACKGROUND SUBTRACTOR PARAMETERS
+# -----------------------------------------------------------------------------
+# Controls the Gaussian Mixture Model for motion detection
+
+# Number of frames to build background model
+# Higher = more stable background, slower adaptation to lighting changes
+gmm_history: 500
+
+# Variance threshold for foreground detection
+# Higher = less sensitive, fewer false positives from noise
+gmm_var_threshold: 16
+
+# -----------------------------------------------------------------------------
+# MORPHOLOGICAL FILTERING
+# -----------------------------------------------------------------------------
+# Noise removal after motion detection
+
+# Size of the morphological kernel (NxN ellipse)
+# Larger = removes more noise but may lose small detections
+morph_kernel_size: 3
+
 # -----------------------------------------------------------------------------
 # COHESIVENESS PARAMETERS
 # -----------------------------------------------------------------------------
@@ -21,6 +43,10 @@ min_largest_blob_ratio: 0.80
 # Lower = stricter, rejects scattered motion
 max_num_blobs: 5
 
+# Min ratio of motion pixels to bounding box area
+# Higher = requires more filled bounding box
+min_motion_ratio: 0.15
+
 # -----------------------------------------------------------------------------
 # SHAPE PARAMETERS
 # -----------------------------------------------------------------------------
@@ -55,9 +81,30 @@ min_path_points: 10
 # Larger = more tolerant of fast movement, but may link separate objects
 max_frame_jump: 100
 
-# How long to remember a lost track (seconds)
-# Helps re-link tracks after brief occlusions
-lost_track_seconds: 1.5
+# How many frames to remember a lost track before deleting
+# Helps re-link tracks after brief occlusions (e.g., 45 frames @ 30fps = 1.5s)
+max_lost_frames: 45
+
+# Max allowed area change ratio between frames
+# Prevents linking very different sized detections
+max_area_change_ratio: 3.0
+
+# -----------------------------------------------------------------------------
+# TRACKER MATCHING PARAMETERS
+# -----------------------------------------------------------------------------
+# Hungarian algorithm cost function weights
+
+# Weight for distance cost (0-1)
+# Higher = position matters more for matching
+tracker_w_dist: 0.6
+
+# Weight for area cost (0-1)
+# Higher = size similarity matters more for matching
+tracker_w_area: 0.4
+
+# Maximum cost for valid track-detection match (0-1)
+# Lower = stricter matching, more new tracks created
+tracker_cost_threshold: 0.3
 
 # -----------------------------------------------------------------------------
 # PATH TOPOLOGY PARAMETERS
@@ -74,4 +121,8 @@ min_progression_ratio: 0.70
 
 # Max directional variance (insects maintain relatively consistent heading)
 # Lower = stricter, requires more consistent direction
-max_directional_variance: 0.90
+max_directional_variance: 0.90
+
+# Radius in pixels for determining if a position is "revisited"
+# Smaller = stricter definition of revisiting
+revisit_radius: 50
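For reviewers wondering how `revisit_radius` interacts with `max_revisit_ratio`, the sketch below shows one plausible reading: a path point counts as a revisit when it comes back within the radius of an earlier point after the path had moved away from that point. This definition and the function name `revisit_ratio` are assumptions for illustration; the diff does not include the actual path topology code.

```python
import math


def revisit_ratio(path, revisit_radius=50.0):
    """Fraction of path points that return to a previously left position.

    `path` is a list of (x, y) centers in pixels. Illustrative definition only.
    """
    if not path:
        return 0.0

    # has_left[j] becomes True once the path has moved beyond the radius of point j.
    has_left = [False] * len(path)
    revisits = 0

    for i, (x, y) in enumerate(path):
        is_revisit = False
        for j in range(i):
            dist = math.hypot(x - path[j][0], y - path[j][1])
            if dist > revisit_radius:
                has_left[j] = True
            elif has_left[j]:
                # Back inside the radius of a point the path had already left.
                is_revisit = True
        if is_revisit:
            revisits += 1

    return revisits / len(path)


straight = [(0, 0), (100, 0), (200, 0), (300, 0)]
oscillating = [(0, 0), (100, 0), (0, 0), (100, 0)]

print(revisit_ratio(straight))     # 0.0
print(revisit_ratio(oscillating))  # 0.5
```

Under this reading, the oscillating path exceeds the default `max_revisit_ratio` of 0.30 and would be rejected, while the straight path passes. A smaller `revisit_radius` makes it harder for a point to count as a revisit.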
@@ -281,6 +281,8 @@
     "\n",
     "Runs motion-based insect detection and hierarchical classification on video files. Detects moving insects using background subtraction (GMM), tracks them across frames, classifies each detection, and aggregates predictions per track.\n",
     "\n",
+    "**Note:** The species list and taxonomy are automatically loaded from the model checkpoint, so you don't need to provide them again.\n",
+    "\n",
     "**Output files generated:**\n",
     "- `{video}_annotated.mp4` - Video with detection boxes and track paths (if `save_video=True`)\n",
     "- `{video}_debug.mp4` - Side-by-side view with GMM motion mask (if `save_video=True`)\n",
@@ -297,10 +299,10 @@
    "outputs": [],
    "source": [
     "results = bplusplus.inference(\n",
-    "    species_list=names,\n",
     "    hierarchical_model_path=RESNET_MULTITASK_WEIGHTS,\n",
     "    video_path=\"./10.mp4\",\n",
     "    output_dir=\"./output\",\n",
+    "    # species_list=names,  # Optional: override species from checkpoint\n",
     "    fps=None,              # None = all frames\n",
     "    backbone=\"resnet50\",   # Must match training\n",
     "    save_video=True,       # Set to False to skip video rendering (only CSV output)\n",
@@ -327,11 +329,19 @@
     ")\n",
     "```\n",
     "\n",
+    "#### Full Configuration Parameters\n",
+    "\n",
     "| Parameter | Default | Description |\n",
     "|-----------|---------|-------------|\n",
+    "| **GMM Background Subtractor** | | *Motion detection model* |\n",
+    "| `gmm_history` | 500 | Frames to build background model |\n",
+    "| `gmm_var_threshold` | 16 | Variance threshold for foreground detection |\n",
+    "| **Morphological Filtering** | | *Noise removal* |\n",
+    "| `morph_kernel_size` | 3 | Morphological kernel size (NxN) |\n",
     "| **Cohesiveness** | | *Filters scattered motion (plants) vs compact motion (insects)* |\n",
     "| `min_largest_blob_ratio` | 0.80 | Min ratio of largest blob to total motion |\n",
     "| `max_num_blobs` | 5 | Max separate blobs allowed in detection |\n",
+    "| `min_motion_ratio` | 0.15 | Min ratio of motion pixels to bbox area |\n",
     "| **Shape** | | *Filters by contour properties* |\n",
     "| `min_area` | 200 | Min detection area (px²) |\n",
     "| `max_area` | 40000 | Max detection area (px²) |\n",
@@ -341,11 +351,17 @@
     "| `min_displacement` | 50 | Min net movement for confirmation (px) |\n",
     "| `min_path_points` | 10 | Min points before path analysis |\n",
     "| `max_frame_jump` | 100 | Max jump between frames (px) |\n",
-    "| `lost_track_seconds` | 1.5 | How long to remember lost tracks (s) |\n",
+    "| `max_lost_frames` | 45 | Frames before lost track deleted (e.g., 45 @ 30fps = 1.5s) |\n",
+    "| `max_area_change_ratio` | 3.0 | Max area change ratio between frames |\n",
+    "| **Tracker Matching** | | *Hungarian algorithm cost function* |\n",
+    "| `tracker_w_dist` | 0.6 | Weight for distance cost (0-1) |\n",
+    "| `tracker_w_area` | 0.4 | Weight for area cost (0-1) |\n",
+    "| `tracker_cost_threshold` | 0.3 | Max cost for valid match (0-1) |\n",
     "| **Path Topology** | | *Confirms insect-like movement patterns* |\n",
     "| `max_revisit_ratio` | 0.30 | Max ratio of revisited positions |\n",
     "| `min_progression_ratio` | 0.70 | Min forward progression |\n",
     "| `max_directional_variance` | 0.90 | Max heading variance |\n",
+    "| `revisit_radius` | 50 | Radius (px) for revisit detection |\n",
     "\n"
    ]
   }
 
@@ -1,6 +1,6 @@
 [tool.poetry]
 name = "bplusplus"
-version = "2.0.4"
+version = "2.0.5"
 description = "A simple method to create AI models for biodiversity, with collect and prepare pipeline"
 authors = ["Titus Venverloo <tvenver@mit.edu>", "Deniz Aydemir <deniz@aydemir.us>", "Orlando Closs <orlandocloss@pm.me>", "Ase Hatveit <aase@mit.edu>"]
 license = "MIT"