Commit 4026e97

committed
reworked draft
1 parent 96b0f8b commit 4026e97

2 files changed

+25
-13
lines changed

docs/paper.bib

Lines changed: 15 additions & 3 deletions
@@ -59,7 +59,6 @@ @inproceedings{KITTI2012
5959
@misc{PDAL2022,
6060
author = {PDAL Contributors},
6161
title = {PDAL Point Data Abstraction Library},
62-
month = aug,
6362
year = 2022,
6463
doi = {10.5281/zenodo.2616780},
6564
url = {https://doi.org/10.5281/zenodo.2616780}
@@ -89,6 +88,19 @@ @article{Schmid2025
8988
year = {2025}
9089
}
9190

91+
@article{Bazzano2025,
92+
author = {Bazzano, Cristina F. and Alves, Luiz F. G. and Telles, Guilherme P. and Trivella, Daniela B. B.},
93+
doi = {10.1038/s41597-025-06002-8},
94+
issn = {2052-4463},
95+
issue = {1},
96+
journal = {Scientific Data},
97+
month = {10},
98+
pages = {1726},
99+
title = {Labeled dataset of X-ray protein ligand images in 3D point cloud and validated deep learning models},
100+
volume = {12},
101+
year = {2025}
102+
}
103+
92104
@INPROCEEDINGS{Foote2013,
93105
author={Foote, Tully},
94106
booktitle={2013 IEEE Conference on Technologies for Practical Robot Applications (TePRA)},
@@ -98,5 +110,5 @@ @INPROCEEDINGS{Foote2013
98110
number={},
99111
pages={1-6},
100112
keywords={Irrigation;Accuracy},
101-
doi={10.1109/TePRA.2013.6556373}}
102-
113+
doi={10.1109/TePRA.2013.6556373}
114+
}

docs/paper.md

Lines changed: 10 additions & 10 deletions
@@ -26,33 +26,33 @@ bibliography: paper.bib
2626

2727
PointCloudCrafter is a C++ command-line interface (CLI) toolkit for the processing, manipulation, and analysis of three-dimensional point cloud data. The software provides a collection of compiled executables for integration into automated research pipelines, enabling common point cloud operations without requiring custom code.
2828

29-
The toolkit supports conversion and processing of data from multiple acquisition and storage formats, including ROS2 bag recordings, binary files [@nuScenes2019; @KITTI2012], and standard PLY and PCD files. A central design feature of PointCloudCrafter is its schema-agnostic handling of per-point attributes. Arbitrary scalar fields associated with individual points are preserved throughout conversion, transformation, filtering, and aggregation steps, ensuring that auxiliary metadata remains available for downstream analysis.
29+
The toolkit supports the conversion and processing of data from multiple acquisition and storage formats, including ROS2 bag recordings, binary files, plain-text files, and standard PLY and PCD files. A central design feature of PointCloudCrafter is its schema-agnostic handling of per-point attributes. Arbitrary scalar fields associated with individual points are preserved throughout conversion, transformation, filtering, and aggregation steps, ensuring that auxiliary metadata remains available for downstream analysis. The toolkit can also accumulate multiple point clouds based on their timestamps and transforms.
3030

3131
By combining support for robotic middleware data with batch-oriented file processing, PointCloudCrafter facilitates reproducible workflows across a range of scientific and engineering applications that rely on large-scale three-dimensional point cloud data.
3232

3333
# Statement of need
3434

35-
Three-dimensional point cloud data is widely used in robotics, autonomous driving, remote sensing, and scientific data analysis. In practice, such data is commonly encountered in two distinct representations. During acquisition and experimentation, point clouds are often recorded as time-stamped data streams using robotic middleware such as ROS2. For benchmarking, archival, and offline analysis, the same data is typically stored and distributed as static files in dataset-specific or standardized formats.
35+
Three-dimensional point cloud data is widely used in robotics, autonomous driving, remote sensing, and scientific data analysis. In practice, such data is commonly encountered in two distinct representations. During acquisition and experimentation, point clouds are often recorded as time-stamped data streams using robotic middleware such as ROS2. For benchmarking, archival, and offline analysis, the same data is typically stored and distributed as static files in dataset-specific or standardized formats. In particular, automated pipelines for neural network training often require the input data to be static files that follow a specific naming scheme.
3636

3737
Transitioning between these representations, or performing large-scale batch operations on point cloud collections, remains a recurring challenge.
3838

39-
The Point Cloud Library (PCL) [@Rusu2011] provides a comprehensive set of data structures and algorithms for three-dimensional processing and serves as the technical foundation of PointCloudCrafter, but its use generally requires direct integration into custom software projects. Graphical tools such as CloudCompare are well suited for interactive inspection but are not designed for automated, headless processing of large datasets. Python-based libraries such as Open3D [@Zhou2018] and Pyoints [@Lamprecht2019] offer accessible interfaces for point cloud manipulation, but performance and input-output overhead can become limiting factors when processing large directories of high-resolution scans.
40-
PDAL [@PDAL2022] is a mature framework for large-scale geospatial point cloud processing and provides a flexible pipeline abstraction for working with formats such as LAS and LAZ. While PDAL is well suited for geospatial analysis and national lidar datasets, it does not natively integrate with robotic middleware or support time-resolved transformations based on sensor pose information.
41-
Recent toolkits such as PointClouds.jl [@Schmid2025] provide efficient and flexible programmatic interfaces for point cloud processing within the Julia ecosystem, with a focus on interactive development and integration with geospatial data sources. While well suited for algorithm development and exploratory analysis, such approaches require users to implement custom processing logic and are not designed as standalone command-line tools for automated conversion of robotic middleware data. Libraries such as libpointmatcher [@Pomerleau2013] focus on modular implementations of point cloud registration algorithms and are typically embedded within state estimation or SLAM systems rather than used for dataset-level preprocessing and format conversion.
39+
The Point Cloud Library (PCL) [@Rusu2011] provides a comprehensive set of data structures and algorithms for three-dimensional processing and serves as the technical foundation of PointCloudCrafter, but its use generally requires direct integration into custom software projects. Graphical tools such as CloudCompare are well-suited for interactive inspection but are not designed for automated, headless processing of large datasets. Python-based libraries such as Open3D [@Zhou2018] and Pyoints [@Lamprecht2019] offer accessible interfaces for point cloud manipulation, but performance and input-output overhead can become limiting factors when processing large directories of high-resolution scans.
40+
PDAL [@PDAL2022] is a mature framework for large-scale geospatial point cloud processing and provides a flexible pipeline abstraction for working with formats such as LAS and LAZ. While PDAL is well-suited for geospatial analysis and national lidar datasets, it does not natively integrate with robotic middleware or support time-resolved transformations based on sensor pose information.
41+
Recent toolkits such as PointClouds.jl [@Schmid2025] provide efficient and flexible programmatic interfaces for point cloud processing within the Julia ecosystem, with a focus on interactive development and integration with geospatial data sources. While well-suited for algorithm development and exploratory analysis, such approaches require users to implement custom processing logic and are not designed as standalone command-line tools for automated conversion of robotic middleware data. Libraries such as libpointmatcher [@Pomerleau2013] focus on modular implementations of point cloud registration algorithms and are typically embedded within state estimation or SLAM systems rather than used for dataset-level preprocessing and format conversion.
4242

43-
A further limitation of many existing tools is the implicit assumption of fixed-point schemas. Conversion utilities and parsers often restrict data to spatial coordinates and optional color information, discarding non-standard scalar fields. However, many scientific applications depend on additional per-point attributes. Automotive LiDAR sensors provide channels such as intensity, timestamp, and ring index, which are required for calibration, motion correction, and mapping. In other domains, including structural biology, point clouds may encode physicochemical properties such as hydrophobicity or electrostatic potential [@Parigger2024]. Preserving these attributes across processing steps is essential for reproducibility and downstream analysis.
43+
A further limitation of many existing tools is the implicit assumption of fixed point schemas. Conversion utilities and parsers often restrict data to spatial coordinates and optional color information, discarding non-standard scalar fields. However, many scientific applications depend on additional per-point attributes. Automotive LiDAR sensors provide channels such as intensity, timestamp, and ring index, which are required for calibration, motion correction, and mapping. Similarly, imaging radar sensors export point clouds with per-point attributes such as Doppler velocity, radar cross-section, and signal-to-noise ratio. In other domains, including structural biology, point clouds may encode physicochemical properties such as hydrophobicity or electrostatic potential [@Parigger2024; @Bazzano2025]. Preserving these attributes across processing steps is essential for reproducibility and downstream analysis.
4444

4545
PointCloudCrafter addresses these needs by providing a lightweight CLI toolkit focused on batch-oriented point cloud conversion and processing. The primary software contribution is a schema-agnostic, command-line-driven framework that unifies middleware-based and file-based point cloud processing while preserving arbitrary per-point attributes. Arbitrary scalar fields refer to dynamically discovered per-point attributes whose names, data types, and memory layout are preserved without enforcing a predefined schema.
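To illustrate what schema-agnostic handling means in practice, the following minimal sketch treats a point cloud as a packed byte buffer plus a field table, loosely modeled on the `PointCloud2` layout. It is written in Python for brevity and is not taken from the PointCloudCrafter sources. The field table `FIELDS`, the record size `POINT_STEP`, and the helpers `translate_cloud` and `read_field` are hypothetical names introduced only for this example.

```python
import struct

# Hypothetical field table: (name, byte offset, struct format).
# In a real PointCloud2-style cloud this would be read from the header
# at runtime instead of being hard-coded.
FIELDS = [
    ("x", 0, "<f"),
    ("y", 4, "<f"),
    ("z", 8, "<f"),
    ("intensity", 12, "<f"),
    ("ring", 16, "<H"),
]
POINT_STEP = 18  # bytes per point record (assumed, matches FIELDS above)


def translate_cloud(data, fields, point_step, offset_xyz):
    """Shift x, y, z of every record and leave all other bytes untouched."""
    out = bytearray(data)
    layout = {name: (off, fmt) for name, off, fmt in fields}
    for base in range(0, len(out), point_step):
        for axis, delta in zip("xyz", offset_xyz):
            off, fmt = layout[axis]
            (value,) = struct.unpack_from(fmt, out, base + off)
            struct.pack_into(fmt, out, base + off, value + delta)
    return bytes(out)


def read_field(data, fields, point_step, name, index):
    """Read one named attribute of the point at position `index`."""
    off, fmt = {n: (o, f) for n, o, f in fields}[name]
    return struct.unpack_from(fmt, data, index * point_step + off)[0]


# Two example points: (x, y, z, intensity, ring).
cloud = b"".join(
    struct.pack("<ffffH", *p)
    for p in [(1.0, 2.0, 3.0, 0.5, 7), (4.0, 5.0, 6.0, 0.25, 12)]
)
moved = translate_cloud(cloud, FIELDS, POINT_STEP, (10.0, 0.0, -1.0))
```

Because only the bytes of `x`, `y`, and `z` are rewritten, any additional field listed in the table (or even bytes not listed at all) passes through the transformation unchanged, which is the property the paragraph above describes.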
4646

4747
# Functionality
4848

4949
PointCloudCrafter provides two primary execution modes, *rosbag* and *file*, for streaming and static data workflows, respectively.
5050

51-
The *rosbag* mode enables extraction of `sensor_msgs::msg::PointCloud2` topics from ROS2 bag recordings. This mode integrates with the `TF2` library to perform time-resolved pose lookups based on the recorded transform tree. Each point cloud can be transformed from the sensor coordinate frame into a user-specified fixed reference frame using the pose corresponding to the exact acquisition timestamp. This enables motion-consistent export of point cloud data from dynamic sensor platforms.
51+
The *rosbag* mode enables extraction of `sensor_msgs::msg::PointCloud2` topics from ROS2 bag recordings. This mode integrates with the `TF2` library [@Foote2013] to perform time-resolved pose lookups based on the recorded transform tree. Each point cloud can be transformed from the sensor coordinate frame into a user-specified fixed reference frame using the pose corresponding to the exact acquisition timestamp. This enables motion-consistent export of point cloud data from dynamic sensor platforms and timestamp-based fusion of multiple point cloud sensor topics.
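The idea of a time-resolved pose lookup can be sketched as follows. This is a simplified planar illustration in Python, not the `TF2` API and not the toolkit's implementation: `TF2` operates on full 3D transforms with quaternion rotations, whereas this sketch interpolates only `x`, `y`, and yaw. The trajectory `TRAJECTORY` and the functions `lookup_pose` and `sensor_to_fixed` are hypothetical.

```python
import bisect
import math

# Hypothetical recorded trajectory: (timestamp [s], x [m], y [m], yaw [rad]).
TRAJECTORY = [(0.0, 0.0, 0.0, 0.0), (1.0, 2.0, 0.0, math.pi / 2)]


def lookup_pose(trajectory, stamp):
    """Interpolate the planar pose at `stamp` between neighboring samples."""
    times = [sample[0] for sample in trajectory]
    if not times[0] <= stamp <= times[-1]:
        raise ValueError("timestamp outside recorded trajectory")
    hi = bisect.bisect_left(times, stamp)
    if times[hi] == stamp:
        return trajectory[hi][1:]
    t0, x0, y0, yaw0 = trajectory[hi - 1]
    t1, x1, y1, yaw1 = trajectory[hi]
    a = (stamp - t0) / (t1 - t0)
    # Interpolate yaw along the shortest angular distance.
    dyaw = math.atan2(math.sin(yaw1 - yaw0), math.cos(yaw1 - yaw0))
    return (x0 + a * (x1 - x0), y0 + a * (y1 - y0), yaw0 + a * dyaw)


def sensor_to_fixed(point_xy, pose):
    """Rotate and translate a sensor-frame point into the fixed frame."""
    px, py = point_xy
    x, y, yaw = pose
    c, s = math.cos(yaw), math.sin(yaw)
    return (x + c * px - s * py, y + s * px + c * py)


# Pose at the acquisition time of a scan recorded halfway along the path.
pose = lookup_pose(TRAJECTORY, 0.5)
fixed_point = sensor_to_fixed((1.0, 0.0), pose)
```

Looking up the pose per scan, rather than using a single static transform, is what keeps points from a moving platform consistent in the fixed frame.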
5252

53-
The *file* mode focuses on batch processing of static point cloud files. Supported formats include common text and binary formats such as PLY, PCD, OBJ, and TXT, as well as dataset-specific binary formats used in KITTI and nuScenes. In this mode, users can apply rigid-body transformations, merge multiple point clouds into a single representation, and perform filtering operations. Implemented filters include geometric cropping using box, sphere, or cylindrical regions, voxel grid downsampling, and statistical or radius-based outlier removal.
54-
55-
Both execution modes rely internally on the `PCLPointCloud2` data structure, enabling schema-agnostic handling of point attributes. All processing steps are fully parameterized via command-line options, enabling deterministic, reproducible execution across datasets.
53+
The *file* mode focuses on batch processing of static point cloud files. Supported formats include common text and binary formats such as PLY, PCD, OBJ, and TXT, as well as dataset-specific binary formats used in KITTI [@KITTI2012] and nuScenes [@nuScenes2019].
54+
Both execution modes rely internally on the `PCLPointCloud2` data structure, enabling schema-agnostic handling of point attributes. They allow users to apply rigid-body transformations, merge multiple point clouds into a single representation, and perform filtering operations. Implemented filters include geometric cropping using box, sphere, or cylindrical regions, voxel grid downsampling, and statistical or radius-based outlier removal. Because multiple data formats are implemented, processed point clouds can also be converted between formats.
55+
All processing steps are fully parameterized via command-line options, enabling deterministic, reproducible execution across datasets.
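Two of the filters named above, box cropping and voxel grid downsampling, can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the toolkit's implementation: points are plain dictionaries, and the voxel filter keeps the first point per voxel instead of computing a centroid, so that non-spatial attributes are never averaged. The function names `crop_box` and `voxel_downsample` and the sample data are hypothetical.

```python
import math


def crop_box(points, lo, hi):
    """Keep points whose x, y, z lie inside the axis-aligned box [lo, hi]."""
    return [
        p for p in points
        if all(l <= p[k] <= h for k, l, h in zip("xyz", lo, hi))
    ]


def voxel_downsample(points, leaf):
    """Keep the first point that falls into each cubic voxel of edge `leaf`."""
    kept = {}
    for p in points:
        key = tuple(math.floor(p[k] / leaf) for k in "xyz")
        kept.setdefault(key, p)  # later points in the same voxel are dropped
    return list(kept.values())


# Sample points carrying an extra per-point attribute ("ring").
points = [
    {"x": 0.1, "y": 0.1, "z": 0.0, "ring": 3},
    {"x": 0.2, "y": 0.3, "z": 0.0, "ring": 4},
    {"x": 1.5, "y": 0.2, "z": 0.0, "ring": 5},
    {"x": 9.0, "y": 0.0, "z": 0.0, "ring": 6},
]
cropped = crop_box(points, lo=(0.0, 0.0, -1.0), hi=(2.0, 2.0, 1.0))
downsampled = voxel_downsample(cropped, leaf=1.0)
```

Both functions only select whole points, so every attribute attached to a surviving point is carried through unchanged.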
5656

5757
A typical use case is the extraction of LiDAR scans from ROS2 bag recordings, transformation into a global reference frame using an externally estimated trajectory, and aggregation into a static map for offline evaluation. This workflow can be executed in a single batch process while preserving sensor-specific scalar fields required for subsequent analysis.
5858
