
Agri-Human Dataset → KITTI Conversion Toolkit

This repository provides a complete, reproducible pipeline for converting the Agri-Human dataset (ROS-bag–derived, multi-sensor recordings) into KITTI-style datasets.

Supported outputs:

  • RAW KITTI (images + LiDAR + calib + optional OXTS)
  • KITTI Object / Tracking (2D + optional LiDAR-derived 3D labels)
  • KITTI Depth Completion (RGB + depth + optional LiDAR)
  • Custom multi-camera export (RGB + fisheyes + labels)

The pipeline is anchor-based and session-safe, and can be driven either directly from the CLI or through YAML configs.


🚀 TL;DR (Quick Start)

A) CLI pipeline (most common)

# 1) Synchronise sensors (creates sync.json per session)
python sync_and_match.py --root dataset_test --anchor cam_zed_rgb

# 2) Build manifest + train/val/test splits
python build_manifest_and_splits.py --root dataset_test

# 3) Export KITTI Object dataset (2D + 3D from LiDAR if available)
python kitti_export_object.py \
  --root dataset_test/test \
  --out kitti_object \
  --anchor_camera cam_zed_rgb \
  --require_image --require_lidar \
  --use_lidar_3d --prefer_2d \
  --ann_source cam_zed_rgb \
  --calib_root dataset_test

B) YAML-driven (recommended for reproducibility)

python kitti_export_ctl.py --config configs/kitti_export.yaml

1. Original Dataset Format

Each recording session lives in a *_label/ folder.

dataset_root/
├── calibration/
│   ├── intrinsics.json
│   └── extrinsics.json
│
├── footpath1_..._label/
│   ├── sensor_data/
│   │   ├── cam_zed_rgb/        *.png
│   │   ├── cam_zed_depth/      *.npy
│   │   ├── cam_fish_front/     *.png
│   │   ├── cam_fish_left/      *.png
│   │   ├── cam_fish_right/     *.png
│   │   └── lidar/              *.pcd
│   │
│   ├── annotations/
│   │   ├── cam_zed_rgb_ann.json
│   │   ├── cam_fish_front_ann.json
│   │   ├── cam_fish_left_ann.json
│   │   ├── cam_fish_right_ann.json
│   │   └── lidar_ann.json        # 3D bounding boxes (in LiDAR frame)
│   │
│   ├── metadata/
│   │   ├── gps_fix.jsonl
│   │   ├── gps_odom.jsonl
│   │   ├── yaw.jsonl
│   │   ├── odom_global.jsonl
│   │   ├── odom_local.jsonl
│   │   └── tf/
│   │       ├── map__to__odom.jsonl
│   │       └── odom__to__base_link.jsonl
│   │
│   └── sync.json                 # generated by sync_and_match.py
└── out_straw_..._label/
    └── ...

Key properties

  • Filenames encode timestamps (sec_nanosec.ext)
  • 3D boxes are labelled in LiDAR frame
  • 2D boxes are labelled per-camera
  • Calibration is static and shared across sessions
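Because filenames double as timestamps, a sample's stamp can be recovered directly from its path. A minimal sketch of the sec_nanosec convention (the helper below is illustrative, not part of the toolkit):

from pathlib import Path

def filename_to_stamp(path: str) -> float:
    """Convert e.g. '1731409262_182444459.png' to 1731409262.182444459 seconds."""
    sec, nsec = Path(path).stem.split("_")
    return int(sec) + int(nsec) * 1e-9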

2. Pipeline Overview (ASCII Diagram)

Raw Dataset
   |
   v
+--------------------+
| sync_and_match.py  |
|  - choose anchor   |
|  - timestamp match |
+--------------------+
          |
          v
      sync.json
          |
          v
+------------------------------+
| build_manifest_and_splits.py |
|  - manifest_samples.tsv      |
|  - train/val/test splits     |
+------------------------------+
          |
          v
+----------------------------------------------+
| KITTI Exporters                              |
|                                              |
|  kitti_export_raw.py     → RAW KITTI         |
|  kitti_export_object.py  → Object / Tracking |
|  kitti_export_depth.py   → Depth Completion  |
|  kitti_export_custom.py  → Multi-camera      |
+----------------------------------------------+

3. Synchronisation (sync_and_match.py)

What it does

  • Select an anchor modality (e.g. cam_zed_rgb)
  • Match other modalities to the anchor by timestamp tolerance
  • Write one sync.json per session

Why it matters

All downstream steps (manifest, splits, exports) rely on sync.json as the canonical sample index.
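For intuition, nearest-neighbour matching within a tolerance can be sketched as follows (illustrative only; the actual logic in sync_and_match.py may differ):

import bisect

def match_to_anchor(anchor_ts, other_ts, tol=0.05):
    """Pair each anchor stamp with the nearest stamp in other_ts within tol seconds."""
    matches = {}
    for t in anchor_ts:                          # other_ts must be sorted
        i = bisect.bisect_left(other_ts, t)
        candidates = other_ts[max(i - 1, 0):i + 1]   # neighbours around insertion point
        if candidates:
            best = min(candidates, key=lambda c: abs(c - t))
            if abs(best - t) <= tol:
                matches[t] = best
    return matches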


4. Manifest & Splits (build_manifest_and_splits.py)

Outputs

manifest_samples.tsv
splits/
└── default/
    ├── train.txt
    ├── val.txt
    └── test.txt

What’s inside

  • manifest_samples.tsv: one row per synced sample (session + timestamp + file references)
  • splits/default/*.txt: lists of sample IDs (session+timestamp) for stable train/val/test

Notes

  • Splits are session-level (no temporal leakage across splits).
  • You can reuse the same manifest/splits for every export mode.
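A minimal sketch of selecting one split's rows from the manifest (the column name sample_id is an assumption about the TSV schema; check the manifest header for the real name):

import csv

def load_split_rows(manifest_tsv, split_txt):
    with open(split_txt) as f:
        keep = {line.strip() for line in f if line.strip()}
    with open(manifest_tsv, newline="") as f:
        reader = csv.DictReader(f, delimiter="\t")
        # 'sample_id' is assumed to match the IDs listed in splits/default/*.txt
        return [row for row in reader if row.get("sample_id") in keep]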

5. KITTI Export Modes (Outputs)

All exporters share a backend: kitti_export_common.py.

5.1 RAW KITTI (kitti_export_raw.py)

Produces:

kitti_out/
├── image_2/
├── velodyne/
├── calib/
├── oxts/            (optional)
└── timestamps.txt

5.2 KITTI Object / Tracking (kitti_export_object.py)

Produces:

kitti_out/
├── image_2/
├── velodyne/
├── label_2/
├── calib/
├── oxts/            (optional)
└── timestamps.txt

Labels

  • 2D labels come from the chosen --ann_source camera annotation JSON.
  • 3D labels come from annotations/lidar_ann.json if --use_lidar_3d is enabled.
  • If a camera is fisheye, projection uses a fisheye model (see --anchor_model).
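For reference, each line in label_2/*.txt follows the 15-field KITTI Object format. A sketch of formatting one line (the helper is illustrative; its placeholder defaults follow the KITTI DontCare convention, whereas this toolkit's "zero-3D" mode may write zeros instead):

def kitti_label_line(cls, bbox, dims=(-1, -1, -1),
                     loc=(-1000, -1000, -1000), ry=-10,
                     truncated=0.0, occluded=0, alpha=-10):
    left, top, right, bottom = bbox          # 2D box in image pixels
    h, w, l = dims                           # 3D size in metres (height, width, length)
    x, y, z = loc                            # 3D location in camera coordinates
    return (f"{cls} {truncated:.2f} {occluded} {alpha:.2f} "
            f"{left:.2f} {top:.2f} {right:.2f} {bottom:.2f} "
            f"{h:.2f} {w:.2f} {l:.2f} {x:.2f} {y:.2f} {z:.2f} {ry:.2f}")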

5.3 KITTI Depth Completion (kitti_export_depth.py)

Produces:

kitti_out/
├── image_2/
├── depth_2/         (.npy or png16)
├── calib/
└── velodyne/        (optional)
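KITTI depth completion conventionally stores depth as a 16-bit PNG holding metres × 256, with 0 marking invalid pixels. A sketch of the conversion that --depth_write_png16 implies (imageio is used here purely for illustration):

import numpy as np
import imageio.v2 as imageio

def npy_to_kitti_png16(npy_path, png_path):
    depth_m = np.load(npy_path)                    # float depth in metres
    depth_m = np.nan_to_num(depth_m, nan=0.0)      # invalid pixels -> 0
    png = np.clip(depth_m * 256.0, 0, 65535).astype(np.uint16)
    imageio.imwrite(png_path, png)                 # uint16 array -> 16-bit PNG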

5.4 Custom Multi-camera (kitti_export_custom.py)

Produces (example):

kitti_out/
├── image_2/                 (anchor RGB)
├── label_2/
├── image_fish_front/
├── label_fish_front/
├── image_fish_left/
├── label_fish_left/
├── image_fish_right/
├── label_fish_right/
├── velodyne/
└── calib/

This is not standard KITTI, but is a practical layout for multi-camera research.


6. CLI Parameters (User-facing reference)

Below are the most important flags for each exporter. Run --help on any script for the complete list.


6.1 kitti_export_object.py parameters

Required

Flag     Meaning
--root   Directory containing *_label/ sessions (or a split folder like dataset_test/test)
--out    Output folder for the KITTI dataset

Anchor & geometry

Flag                    Meaning
--anchor_camera         The camera whose frames define the exported frame index (image_2/)
--anchor_model          Projection model for the anchor camera: auto, pinhole, fisheye
--camera_optical_frame  Optical frame of the anchor camera (used for the LiDAR→camera transform)
--lidar_frame           LiDAR TF frame name used for 3D labels (e.g. front_lidar_link)
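--camera_optical_frame and --lidar_frame determine the LiDAR→camera transform (the Tr matrix written to the calib files). In the usual KITTI convention it is applied like this (a sketch; numpy used for illustration):

import numpy as np

def lidar_to_camera(points_lidar, Tr):
    """points_lidar: (N, 3) in the LiDAR frame; Tr: (3, 4) LiDAR->camera."""
    homo = np.hstack([points_lidar, np.ones((len(points_lidar), 1))])
    return homo @ Tr.T                             # (N, 3) in camera coordinates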

Requirements (drop incomplete samples)

Flag             Meaning
--require_image  Only export samples that have the anchor image
--require_lidar  Only export samples that have LiDAR

Labels & annotation sources

Flag               Meaning
--ann_source       Which camera’s annotation JSON to use for 2D labels (e.g. cam_zed_rgb)
--ann_key_mode     How to match frames to annotations: auto, stem, exact, timestamp
--use_lidar_3d     Include LiDAR 3D boxes from annotations/lidar_ann.json
--prefer_2d        If both exist, write 2D labels first, then append 3D
--prefer_lidar_3d  If both exist, write 3D labels first, then append 2D (zero-3D)

Calibration & metadata

Flag          Meaning
--calib_root  Folder containing calibration/intrinsics.json and calibration/extrinsics.json

Manifest / split selection

Flag            Meaning
--manifest_tsv  Use an existing manifest to select samples
--split_tag     One of: train, val, test (uses files under splits/default/)

6.2 kitti_export_depth.py parameters

Required

Flag    Meaning
--root  Directory containing *_label/ sessions
--out   Output folder

Depth sources

Flag                 Meaning
--depth_camera       Depth modality folder name (typically cam_zed_depth)
--depth_write_png16  Convert .npy depth to KITTI-style 16-bit PNG (otherwise keep .npy)

Optional

Flag             Meaning
--anchor_camera  If you want RGB exported too (usually cam_zed_rgb)
--require_image  Require RGB
--require_lidar  Also export LiDAR to velodyne/
--calib_root     Calibration root

6.3 kitti_export_custom.py parameters

Required

Flag    Meaning
--root  Directory containing *_label/ sessions
--out   Output folder

Multi-camera options

Flag             Meaning
--anchor_camera  Anchor RGB camera written to image_2/
--fisheyes       List of fisheye modalities to export (e.g. cam_fish_front cam_fish_left cam_fish_right)
--ann_source     List of modalities to read 2D annotations from (can include fisheyes)

3D labels + projection

Flag                                    Meaning
--use_lidar_3d                          Include 3D boxes from lidar_ann.json
--anchor_model                          auto / pinhole / fisheye for anchor projection
--lidar_frame / --camera_optical_frame  Frames used to compute LiDAR→camera transforms

7. Example Commands

Object export (2D + LiDAR 3D)

python kitti_export_object.py \
  --root dataset_test/test \
  --out kitti_object \
  --anchor_camera cam_zed_rgb \
  --camera_optical_frame front_left_camera_optical_frame \
  --lidar_frame front_lidar_link \
  --require_image --require_lidar \
  --use_lidar_3d --prefer_2d \
  --ann_source cam_zed_rgb \
  --calib_root dataset_test

Depth export (PNG16)

python kitti_export_depth.py \
  --root dataset_test/test \
  --out kitti_depth \
  --anchor_camera cam_zed_rgb --require_image \
  --depth_camera cam_zed_depth \
  --depth_write_png16 \
  --calib_root dataset_test

Custom multi-camera export

python kitti_export_custom.py \
  --root dataset_test/test \
  --out kitti_custom \
  --anchor_camera cam_zed_rgb \
  --fisheyes cam_fish_front cam_fish_left cam_fish_right \
  --ann_source cam_zed_rgb cam_fish_front cam_fish_left cam_fish_right \
  --use_lidar_3d --prefer_2d \
  --lidar_frame front_lidar_link \
  --camera_optical_frame front_left_camera_optical_frame \
  --calib_root dataset_test

8. YAML Configs (Keys aligned with CLI flags)

The control script kitti_export_ctl.py reads YAML jobs. Keys are designed to match CLI flags.

configs/kitti_export.yaml example

jobs:
  - mode: object
    root: dataset_test/test
    out: kitti_object
    anchor_camera: cam_zed_rgb
    anchor_model: auto
    require_image: true
    require_lidar: true
    ann_source: [cam_zed_rgb]
    ann_key_mode: stem
    use_lidar_3d: true
    prefer_2d: true
    lidar_frame: front_lidar_link
    camera_optical_frame: front_left_camera_optical_frame
    calib_root: dataset_test

  - mode: depth
    root: dataset_test/test
    out: kitti_depth
    anchor_camera: cam_zed_rgb
    require_image: true
    depth_camera: cam_zed_depth
    depth_write_png16: true
    calib_root: dataset_test

Run:

python kitti_export_ctl.py --config configs/kitti_export.yaml
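Since keys mirror CLI flags, a job dict maps onto an argument list almost mechanically. A rough sketch of that mapping (illustrative only; the real kitti_export_ctl.py may differ):

import yaml

def job_to_args(job):
    args = []
    for key, val in job.items():
        if key == "mode":                      # selects the exporter, not a flag
            continue
        flag = f"--{key}"
        if val is True:
            args.append(flag)                  # boolean flags take no value
        elif isinstance(val, list):
            args += [flag, *map(str, val)]     # e.g. ann_source: [cam_zed_rgb]
        elif val is not False:
            args += [flag, str(val)]
    return args

with open("configs/kitti_export.yaml") as f:
    cfg = yaml.safe_load(f)
print(job_to_args(cfg["jobs"][0]))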

9. Where to find train/val/test documents

After you run:

python build_manifest_and_splits.py --root dataset_test

you will have:

  • manifest_samples.tsv
  • splits/default/train.txt
  • splits/default/val.txt
  • splits/default/test.txt

You can pass --manifest_tsv and --split_tag train|val|test to exporters to reuse the same split.
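For example, to export only the training split (the output folder name is illustrative):

python kitti_export_object.py \
  --root dataset_test/test \
  --out kitti_object_train \
  --manifest_tsv manifest_samples.tsv \
  --split_tag train \
  --anchor_camera cam_zed_rgb \
  --calib_root dataset_test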


10. Common Mistakes & Fixes

“label_2/*.txt files are all empty”

Most common causes:

  • --ann_source points to the wrong modality (no matching annotation JSON)
  • --ann_key_mode mismatch (filenames vs stems vs timestamps)
  • your annotation JSON schema differs from what the parser expects

Fix:

  • run with --debug_labels
  • set --ann_key_mode stem if your annotation JSON keys frames by filename stem (e.g. 1731409262_182444459 for the file 1731409262_182444459.png)

“[warn] intrinsics missing or no K; writing identity.”

Cause:

  • intrinsics.json not found, wrong --calib_root, or unexpected schema

Fix:

  • ensure calibration/intrinsics.json exists under --calib_root
  • confirm K/camera_matrix fields exist for the selected camera
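A quick sanity check (this assumes intrinsics.json is keyed by camera name with the matrix under K or camera_matrix; adjust to your actual schema):

import json

with open("dataset_test/calibration/intrinsics.json") as f:
    intr = json.load(f)
cam = intr.get("cam_zed_rgb", {})
print("K present:", "K" in cam or "camera_matrix" in cam)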

“[warn] no TF path lidar_frame -> camera_optical_frame; identity Tr.”

Cause:

  • frame names mismatch OR TF graph in extrinsics.json missing the chain

Fix:

  • verify exact frame strings in calibration/extrinsics.json
  • use the true camera optical frame (e.g. front_left_camera_optical_frame)

Fisheye projections look wrong

Cause:

  • using pinhole projection on fisheye images, or wrong --anchor_model

Fix:

  • set --anchor_model fisheye (or keep auto if intrinsics specify equidistant)
  • for custom export, ensure fisheye intrinsics have distortion_model: equidistant
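To sanity-check equidistant projection yourself, OpenCV’s fisheye module can be used (K and D below are placeholders for the values in intrinsics.json):

import cv2
import numpy as np

pts_cam = np.array([[[1.0, 0.2, 5.0]]])    # (N, 1, 3) point in the camera frame
K = np.eye(3)                              # replace with real fisheye intrinsics
D = np.zeros(4)                            # equidistant coefficients k1..k4
rvec = tvec = np.zeros((3, 1))             # no extra rotation/translation
uv, _ = cv2.fisheye.projectPoints(pts_cam, rvec, tvec, K, D)
print(uv)                                  # projected pixel coordinates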

Export contains fewer frames than expected

Cause:

  • --require_image / --require_lidar filters out incomplete samples
  • sync thresholds too strict

Fix:

  • relax sync threshold in sync_and_match.py / YAML
  • disable a requirement if appropriate
