This repository provides a complete, reproducible pipeline for converting the Agri-Human dataset (ROS-bag–derived, multi-sensor recordings) into KITTI-style datasets.
Supported outputs:
- RAW KITTI (images + LiDAR + calib + optional OXTS)
- KITTI Object / Tracking (2D + optional LiDAR-derived 3D labels)
- KITTI Depth Completion (RGB + depth + optional LiDAR)
- Custom multi-camera export (RGB + fisheyes + labels)
The pipeline is anchor-based, session-safe, and can be driven either from the CLI or from a YAML config.
```bash
# 1) Synchronise sensors (creates sync.json per session)
python sync_and_match.py --root dataset_test --anchor cam_zed_rgb

# 2) Build manifest + train/val/test splits
python build_manifest_and_splits.py --root dataset_test

# 3) Export KITTI Object dataset (2D + 3D from LiDAR if available)
python kitti_export_object.py --root dataset_test/test --out kitti_object \
  --anchor_camera cam_zed_rgb --require_image --require_lidar \
  --use_lidar_3d --prefer_2d --ann_source cam_zed_rgb --calib_root dataset_test

# Or drive everything from YAML
python kitti_export_ctl.py --config configs/kitti_export.yaml
```

Each recording session lives in a `*_label/` folder.
```
dataset_root/
├── calibration/
│   ├── intrinsics.json
│   └── extrinsics.json
│
├── footpath1_..._label/
│   ├── sensor_data/
│   │   ├── cam_zed_rgb/      *.png
│   │   ├── cam_zed_depth/    *.npy
│   │   ├── cam_fish_front/   *.png
│   │   ├── cam_fish_left/    *.png
│   │   ├── cam_fish_right/   *.png
│   │   └── lidar/            *.pcd
│   │
│   ├── annotations/
│   │   ├── cam_zed_rgb_ann.json
│   │   ├── cam_fish_front_ann.json
│   │   ├── cam_fish_left_ann.json
│   │   ├── cam_fish_right_ann.json
│   │   └── lidar_ann.json        # 3D bounding boxes (in LiDAR frame)
│   │
│   ├── metadata/
│   │   ├── gps_fix.jsonl
│   │   ├── gps_odom.jsonl
│   │   ├── yaw.jsonl
│   │   ├── odom_global.jsonl
│   │   ├── odom_local.jsonl
│   │   └── tf/
│   │       ├── map__to__odom.jsonl
│   │       └── odom__to__base_link.jsonl
│   │
│   └── sync.json                 # generated by sync_and_match.py
│
├── out_straw_..._label/
│   └── ...
```
- Filenames encode timestamps (`sec_nanosec.ext`); a parsing sketch follows below
- 3D boxes are labelled in the LiDAR frame
- 2D boxes are labelled per camera
- Calibration is static and shared across sessions
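Filenames such as `1731409262_182444459.png` can be turned back into timestamps with a one-liner; a minimal sketch, assuming every file follows the `sec_nanosec.ext` pattern:

```python
from pathlib import Path

def stamp_from_filename(path: str) -> float:
    """Parse a 'sec_nanosec.ext' filename into seconds as a float."""
    sec, nsec = Path(path).stem.split("_")
    return int(sec) + int(nsec) * 1e-9

print(stamp_from_filename("1731409262_182444459.png"))  # ≈ 1731409262.182
```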
```
Raw Dataset
     |
     v
+--------------------+
| sync_and_match.py  |
|  - choose anchor   |
|  - timestamp match |
+--------------------+
     |
     v
  sync.json
     |
     v
+-----------------------------+
| build_manifest_and_splits.py|
|  - manifest_samples.tsv     |
|  - train/val/test splits    |
+-----------------------------+
     |
     v
+----------------------------------------------+
|               KITTI Exporters                |
|                                              |
| kitti_export_raw.py    → RAW KITTI           |
| kitti_export_object.py → Object / Tracking   |
| kitti_export_depth.py  → Depth Completion    |
| kitti_export_custom.py → Multi-camera        |
+----------------------------------------------+
```
- Select an anchor modality (e.g. `cam_zed_rgb`)
- Match other modalities to the anchor by timestamp tolerance
- Write one `sync.json` per session
All downstream steps (manifest, splits, exports) rely on sync.json as the canonical sample index.
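The exact schema of `sync.json` is defined by `sync_and_match.py`; purely as an illustration of how downstream code can treat it as the sample index, here is a hypothetical reader (the keys `samples`, `anchor_stamp`, and `files` are assumptions, not the real schema):

```python
import json
from pathlib import Path

def iter_synced_samples(session_dir: str):
    """Hypothetical sync.json reader; the real keys are whatever sync_and_match.py writes."""
    sync = json.loads((Path(session_dir) / "sync.json").read_text())
    # Assumed layout: {"samples": [{"anchor_stamp": ..., "files": {modality: path, ...}}]}
    for sample in sync.get("samples", []):
        yield sample["anchor_stamp"], sample["files"]

session_dir = "dataset_test/footpath1_..._label"   # placeholder path to one *_label/ session
for stamp, files in iter_synced_samples(session_dir):
    print(stamp, files.get("lidar"))
```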
```
manifest_samples.tsv
splits/
└── default/
    ├── train.txt
    ├── val.txt
    └── test.txt
```
- `manifest_samples.tsv`: one row per synced sample (session + timestamp + file references)
- `splits/default/*.txt`: lists of sample IDs (session + timestamp) for stable train/val/test
- Splits are session-level (no temporal leakage across splits).
- You can reuse the same manifest/splits for every export mode.
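If you need to consume the manifest and splits from your own code (rather than via the exporters' `--manifest_tsv` / `--split_tag` flags), a minimal sketch is below; the column name `sample_id` is an assumption, and the authoritative schema is whatever `build_manifest_and_splits.py` writes:

```python
import csv
from pathlib import Path

def load_split(manifest_tsv: str, split_file: str):
    """Return manifest rows whose (assumed) 'sample_id' column appears in the split file."""
    wanted = set(Path(split_file).read_text().split())
    with open(manifest_tsv, newline="") as f:
        rows = list(csv.DictReader(f, delimiter="\t"))
    return [r for r in rows if r.get("sample_id") in wanted]

train_rows = load_split("manifest_samples.tsv", "splits/default/train.txt")
print(f"{len(train_rows)} training samples")
```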
All exporters share a common backend: `kitti_export_common.py`.
Produces:
```
kitti_out/
├── image_2/
├── velodyne/
├── calib/
├── oxts/           (optional)
└── timestamps.txt
```
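Each file in `velodyne/` follows the standard KITTI convention of flat binary float32 `(x, y, z, intensity)` records; assuming the exporter keeps that convention, a scan can be inspected like this (the frame name is just an example):

```python
import numpy as np

def read_velodyne_bin(path: str) -> np.ndarray:
    """Read a KITTI velodyne scan as an (N, 4) array of x, y, z, intensity."""
    return np.fromfile(path, dtype=np.float32).reshape(-1, 4)

points = read_velodyne_bin("kitti_out/velodyne/000000.bin")
print(points.shape, points[:, :3].min(axis=0), points[:, :3].max(axis=0))
```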
Produces:
```
kitti_out/
├── image_2/
├── velodyne/
├── label_2/
├── calib/
├── oxts/           (optional)
└── timestamps.txt
```
Labels
- 2D labels come from the chosen `--ann_source` camera annotation JSON.
- 3D labels come from `annotations/lidar_ann.json` if `--use_lidar_3d` is enabled.
- If a camera is fisheye, projection uses a fisheye model (see `--anchor_model`).
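Each line in a standard KITTI `label_2/` file carries 15 whitespace-separated fields (type, truncation, occlusion, alpha, 2D bbox, 3D dimensions, location, rotation_y); assuming the exporter writes this standard layout, a tiny parser looks like:

```python
from dataclasses import dataclass

# Standard KITTI object label fields (assuming the exporter emits this layout).
@dataclass
class KittiLabel:
    type: str
    truncated: float
    occluded: int
    alpha: float
    bbox: tuple          # (left, top, right, bottom) in pixels
    dimensions: tuple    # (height, width, length) in metres
    location: tuple      # (x, y, z) in the camera frame
    rotation_y: float

def parse_label_line(line: str) -> KittiLabel:
    f = line.split()
    return KittiLabel(
        type=f[0], truncated=float(f[1]), occluded=int(float(f[2])), alpha=float(f[3]),
        bbox=tuple(map(float, f[4:8])),
        dimensions=tuple(map(float, f[8:11])),
        location=tuple(map(float, f[11:14])),
        rotation_y=float(f[14]),
    )
```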
Produces:
```
kitti_out/
├── image_2/
├── depth_2/        (.npy or png16)
├── calib/
└── velodyne/       (optional)
```
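The KITTI depth-completion convention stores depth as 16-bit PNGs where metres are scaled by 256 and 0 marks missing pixels; a sketch of that conversion, assuming the `.npy` files hold depth in metres and that `--depth_write_png16` follows the same convention:

```python
import numpy as np
import imageio.v2 as imageio

def npy_depth_to_png16(npy_path: str, png_path: str) -> None:
    """Convert a depth map in metres (.npy) to a KITTI-style 16-bit PNG (depth * 256)."""
    depth_m = np.load(npy_path).astype(np.float32)
    depth_m = np.nan_to_num(depth_m, nan=0.0, posinf=0.0, neginf=0.0)  # 0 = invalid pixel
    depth_png = np.clip(np.round(depth_m * 256.0), 0, 65535).astype(np.uint16)
    imageio.imwrite(png_path, depth_png)

npy_depth_to_png16("cam_zed_depth/1731409262_182444459.npy", "depth_2/000000.png")
```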
Produces (example):
```
kitti_out/
├── image_2/            (anchor RGB)
├── label_2/
├── image_fish_front/
├── label_fish_front/
├── image_fish_left/
├── label_fish_left/
├── image_fish_right/
├── label_fish_right/
├── velodyne/
└── calib/
```
This is not standard KITTI, but is a practical layout for multi-camera research.
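To consume this layout, it is enough to collect one frame id across the per-camera folders; a minimal sketch (zero-padded frame ids and `.png` / `.txt` extensions are assumptions based on the layout above):

```python
from pathlib import Path

def load_frame(root: str, frame_id: str,
               fisheyes=("fish_front", "fish_left", "fish_right")) -> dict:
    """Collect image and label paths for one frame across all exported cameras."""
    root = Path(root)
    sample = {
        "image_2": root / "image_2" / f"{frame_id}.png",
        "label_2": root / "label_2" / f"{frame_id}.txt",
    }
    for cam in fisheyes:
        sample[f"image_{cam}"] = root / f"image_{cam}" / f"{frame_id}.png"
        sample[f"label_{cam}"] = root / f"label_{cam}" / f"{frame_id}.txt"
    return sample

print(load_frame("kitti_custom", "000000"))
```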
Below are the most important flags for each exporter. Run `--help` on any script for the complete list.
| Flag | Meaning |
|---|---|
| `--root` | Directory containing `*_label/` sessions (or a split folder like `dataset_test/test`) |
| `--out` | Output folder for the KITTI dataset |
| Flag | Meaning |
|---|---|
| `--anchor_camera` | The camera whose frames define the exported frame index (`image_2/`) |
| `--anchor_model` | Projection model for the anchor camera: `auto`, `pinhole`, `fisheye` |
| `--camera_optical_frame` | Optical frame of the anchor camera (used for the LiDAR→camera transform) |
| `--lidar_frame` | LiDAR TF frame name used for 3D labels (e.g. `front_lidar_link`) |
| Flag | Meaning |
|---|---|
| `--require_image` | Only export samples that have the anchor image |
| `--require_lidar` | Only export samples that have LiDAR |
| Flag | Meaning |
|---|---|
| `--ann_source` | Which camera's annotation JSON to use for 2D labels (e.g. `cam_zed_rgb`) |
| `--ann_key_mode` | How to match frames to annotations: `auto`, `stem`, `exact`, `timestamp` |
| `--use_lidar_3d` | Include LiDAR 3D boxes from `annotations/lidar_ann.json` |
| `--prefer_2d` | If both exist, write 2D labels first, then append 3D |
| `--prefer_lidar_3d` | If both exist, write 3D labels first, then append 2D (zero-3D) |
| Flag | Meaning |
|---|---|
| `--calib_root` | Folder containing `calibration/intrinsics.json` and `calibration/extrinsics.json` |
| Flag | Meaning |
|---|---|
| `--manifest_tsv` | Use an existing manifest to select samples |
| `--split_tag` | One of `train`, `val`, `test` (uses files under `splits/default/`) |
| Flag | Meaning |
|---|---|
| `--root` | Directory containing `*_label/` sessions |
| `--out` | Output folder |
| Flag | Meaning |
|---|---|
| `--depth_camera` | Depth modality folder name (typically `cam_zed_depth`) |
| `--depth_write_png16` | Convert `.npy` depth to KITTI-style 16-bit PNG (otherwise keep `.npy`) |
| Flag | Meaning |
|---|---|
| `--anchor_camera` | Set this if you also want RGB exported (usually `cam_zed_rgb`) |
| `--require_image` | Require RGB |
| `--require_lidar` | Also export LiDAR to `velodyne/` |
| `--calib_root` | Calibration root |
| Flag | Meaning |
|---|---|
| `--root` | Directory containing `*_label/` sessions |
| `--out` | Output folder |
| Flag | Meaning |
|---|---|
| `--anchor_camera` | Anchor RGB camera written to `image_2/` |
| `--fisheyes` | List of fisheye modalities to export (e.g. `cam_fish_front cam_fish_left cam_fish_right`) |
| `--ann_source` | List of modalities to read 2D annotations from (can include fisheyes) |
| Flag | Meaning |
|---|---|
| `--use_lidar_3d` | Include 3D boxes from `lidar_ann.json` |
| `--anchor_model` | `auto` / `pinhole` / `fisheye` for anchor projection |
| `--lidar_frame` / `--camera_optical_frame` | Frames used to compute LiDAR→camera transforms |
```bash
python kitti_export_object.py --root dataset_test/test --out kitti_object \
  --anchor_camera cam_zed_rgb \
  --camera_optical_frame front_left_camera_optical_frame \
  --lidar_frame front_lidar_link \
  --require_image --require_lidar --use_lidar_3d --prefer_2d \
  --ann_source cam_zed_rgb --calib_root dataset_test
```

```bash
python kitti_export_depth.py --root dataset_test/test --out kitti_depth \
  --anchor_camera cam_zed_rgb --require_image \
  --depth_camera cam_zed_depth --depth_write_png16 \
  --calib_root dataset_test
```

```bash
python kitti_export_custom.py --root dataset_test/test --out kitti_custom \
  --anchor_camera cam_zed_rgb \
  --fisheyes cam_fish_front cam_fish_left cam_fish_right \
  --ann_source cam_zed_rgb cam_fish_front cam_fish_left cam_fish_right \
  --use_lidar_3d --prefer_2d \
  --lidar_frame front_lidar_link \
  --camera_optical_frame front_left_camera_optical_frame \
  --calib_root dataset_test
```

The control script `kitti_export_ctl.py` reads YAML jobs. Keys are designed to match CLI flags.
```yaml
jobs:
  - mode: object
    root: dataset_test/test
    out: kitti_object
    anchor_camera: cam_zed_rgb
    anchor_model: auto
    require_image: true
    require_lidar: true
    ann_source: [cam_zed_rgb]
    ann_key_mode: stem
    use_lidar_3d: true
    prefer_2d: true
    lidar_frame: front_lidar_link
    camera_optical_frame: front_left_camera_optical_frame
    calib_root: dataset_test

  - mode: depth
    root: dataset_test/test
    out: kitti_depth
    anchor_camera: cam_zed_rgb
    require_image: true
    depth_camera: cam_zed_depth
    depth_write_png16: true
    calib_root: dataset_test
```

Run:
```bash
python kitti_export_ctl.py --config configs/kitti_export.yaml
```
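The internals of `kitti_export_ctl.py` are not shown in this README; purely as an illustration of the YAML-to-CLI mapping (not the actual implementation, and the `raw`/`custom` mode names are assumptions), a driver of this shape would behave equivalently:

```python
import subprocess
import sys
import yaml

# Illustration only, not the real kitti_export_ctl.py: turn each YAML job into a CLI call.
MODE_TO_SCRIPT = {
    "object": "kitti_export_object.py",
    "depth": "kitti_export_depth.py",
    "raw": "kitti_export_raw.py",        # assumed mode name
    "custom": "kitti_export_custom.py",  # assumed mode name
}

def run_jobs(config_path: str) -> None:
    with open(config_path) as f:
        cfg = yaml.safe_load(f)
    for job in cfg["jobs"]:
        cmd = [sys.executable, MODE_TO_SCRIPT[job.pop("mode")]]
        for key, value in job.items():
            if value is True:
                cmd.append(f"--{key}")                    # boolean flag
            elif isinstance(value, list):
                cmd += [f"--{key}", *map(str, value)]     # list-valued flag
            else:
                cmd += [f"--{key}", str(value)]
        subprocess.run(cmd, check=True)

run_jobs("configs/kitti_export.yaml")
```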
After you run:

```bash
python build_manifest_and_splits.py --root dataset_test
```

you will have:
- `manifest_samples.tsv`
- `splits/default/train.txt`
- `splits/default/val.txt`
- `splits/default/test.txt`
You can pass `--manifest_tsv` and `--split_tag train|val|test` to the exporters to reuse the same split.
Most common causes:
- `--ann_source` points to the wrong modality (no matching annotation JSON)
- `--ann_key_mode` mismatch (filenames vs stems vs timestamps)
- your annotation JSON schema differs from what the parser expects
Fix:
- run with `--debug_labels`
- set `--ann_key_mode stem` if your JSON stores `1731409262_182444459.png` as stem keys (see the key-check sketch below)
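A quick way to confirm a key mismatch is to compare annotation keys against image stems; a sketch, assuming the annotation JSON is a mapping keyed by frame name or stem:

```python
import json
from pathlib import Path

def report_key_mismatch(ann_json: str, image_dir: str) -> None:
    """Count annotation keys vs image stems to spot --ann_key_mode mismatches."""
    ann_keys = set(json.loads(Path(ann_json).read_text()).keys())
    stems = {p.stem for p in Path(image_dir).glob("*.png")}
    print("annotation keys not matching any image stem:", len(ann_keys - stems))
    print("image stems without annotations:", len(stems - ann_keys))

report_key_mismatch("annotations/cam_zed_rgb_ann.json", "sensor_data/cam_zed_rgb")
```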
Cause:
- `intrinsics.json` not found, wrong `--calib_root`, or unexpected schema
Fix:
- ensure `calibration/intrinsics.json` exists under `--calib_root`
- confirm `K` / `camera_matrix` fields exist for the selected camera
Cause:
- frame names mismatch, or the TF graph in `extrinsics.json` is missing the chain
Fix:
- verify the exact frame strings in `calibration/extrinsics.json`
- use the true camera optical frame (e.g. `front_left_camera_optical_frame`)
Cause:
- using pinhole projection on fisheye images, or a wrong `--anchor_model`
Fix:
- set `--anchor_model fisheye` (or keep `auto` if the intrinsics specify equidistant); see the projection sketch below
- for custom export, ensure fisheye intrinsics have `distortion_model: equidistant`
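For intuition about why the wrong model misplaces boxes: the equidistant fisheye model projects a camera-frame point through the angle from the optical axis rather than the pinhole ratio x/z. A minimal sketch (distortion polynomial omitted, intrinsic values are illustrative only):

```python
import numpy as np

def project_equidistant(point_cam, fx: float, fy: float, cx: float, cy: float):
    """Project a 3D point in the camera frame with the (distortion-free) equidistant model."""
    x, y, z = point_cam
    r = np.hypot(x, y)
    theta = np.arctan2(r, z)              # angle from the optical axis
    scale = theta / r if r > 1e-9 else 0.0
    return fx * x * scale + cx, fy * y * scale + cy

u, v = project_equidistant((0.5, -0.2, 2.0), fx=300.0, fy=300.0, cx=640.0, cy=480.0)
print(u, v)
```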
Cause:
- `--require_image` / `--require_lidar` filter out incomplete samples
- sync thresholds too strict
Fix:
- relax the sync threshold in `sync_and_match.py` / the YAML config
- disable a requirement if appropriate