Commit aa1a74e

initial commit for rosbag article
sqlite more pros and cons restructure
1 parent d1f6424 commit aa1a74e

1 file changed

+185
-0
lines changed

articles/rosbags.md

Lines changed: 185 additions & 0 deletions
@@ -0,0 +1,185 @@
1+
---
2+
layout: default
3+
title: ROS 2.0 ROSbags
4+
abstract:
5+
Research and ideas on how to realize an efficient implementation for rosbags in the ROS2.0 ecosystem
6+
published: false
7+
author: '[Karsten Knese](https://github.com/karsten1987)'
8+
---
9+
10+
{:toc}
11+
12+
# {{ page.title }}
13+
14+
<div class="abstract" markdown="1">
15+
{{ page.abstract }}
16+
</div>
17+
18+
Original Author: {{ page.author }}
19+
20+
21+
## Motivation
22+
23+
ROSbags have proven to be a core and essential component of ROS1.
24+
The capability to record and replay robotic data of all types became crucial to data analysis and debugging.
25+
This functionality has to be available in ROS2.0 as well.
26+
The technical challenges shall be examined within this article.
27+
We shall discuss the requirements for such a tool, the technical challenges, and the changes to be made to the current ROS2.0 system.
28+
29+
30+
## Requirements for storage format
31+
32+
To be chosen from section "Relaxed or dismissed requirements" in the "Alternatives" section
33+
34+
35+
## Proposal for data storage format
36+
37+
To be chosen from section "Dismissed data storage formats" in the "Alternatives" section
38+
39+
40+
## Alternatives
41+
42+
### Relaxed or dismissed requirements
43+
We need a data storage format that can store and replay transmitted data with the least possible overhead.
44+
There are a few requirements for writing to and reading from such a data storage format:
45+
46+
#### Scalability
47+
Nowadays, robotic systems can comprise a large number of sensors publishing data in parallel.
48+
This can easily lead to a significantly large amount of data over time.
49+
The chosen format thus has to be able to scale up to a huge file size (> 1 TB).
50+
51+
#### Parallel I/O
52+
Processing time increases with slow file I/O.
53+
In order to provide efficient data processing, parallel reads and writes to the file from multiple processes should be possible.
54+
This would allow multiple processes (e.g. one per sensor) to write directly to a commonly shared bag file without having a single recording instance subscribing to all topics.
55+
56+
#### Compression
57+
When the file size becomes large or disk space is limited, it should be possible to compress the bag file.
58+
Compression can happen either at write time or in a post-processing step.
59+
60+
#### Random access
61+
It must be possible to grant random read access to the file and extract specific individual messages.
62+
Random access further means that it should be possible to extract the n-th message of one topic without having to scroll through all messages in the same chunk.
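One way to picture this requirement is a per-topic index that maps the message number to a byte offset in the bag file. The following is a minimal sketch under that assumption; `topic_index_t` and `topic_index_lookup` are hypothetical names and not part of any existing rosbag API. The offsets are 64 bit wide so that files beyond 1 TB remain addressable.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-topic index: one byte offset per recorded message. */
typedef struct topic_index_t
{
  const char * topic;
  const uint64_t * offsets;  /* offsets[n] is where the n-th message starts */
  size_t count;              /* number of messages recorded on this topic */
} topic_index_t;

/* Looks up the file offset of the n-th message (0-based) in constant time.
 * Returns false if the topic holds fewer than n + 1 messages. */
bool
topic_index_lookup(const topic_index_t * index, size_t n, uint64_t * offset)
{
  assert(index != NULL && offset != NULL);
  if (n >= index->count) {
    return false;
  }
  *offset = index->offsets[n];
  return true;
}
```

With such an index, a reader could seek directly to the returned offset instead of scanning the preceding messages of the chunk.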
63+
64+
#### Range access
65+
It should further be possible to replay/read only a section of the record, specified by a range of time.
66+
It should be possible to access a range of messages in terms of timestamps from `tx` to `tx+n`.
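As an illustration, range access can be served by a binary search if the storage format keeps message timestamps in sorted order. The sketch below assumes such a sorted timestamp array in nanoseconds; the function names are hypothetical and chosen for this example only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Index of the first timestamp that is >= key, or count if there is none. */
static size_t
first_not_less(const int64_t * stamps, size_t count, int64_t key)
{
  size_t lo = 0, hi = count;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (stamps[mid] < key) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

/* Index of the first timestamp that is > key, or count if there is none. */
static size_t
first_greater(const int64_t * stamps, size_t count, int64_t key)
{
  size_t lo = 0, hi = count;
  while (lo < hi) {
    size_t mid = lo + (hi - lo) / 2;
    if (stamps[mid] <= key) {
      lo = mid + 1;
    } else {
      hi = mid;
    }
  }
  return lo;
}

/* Computes the half-open index range [*first, *last) of all messages whose
 * timestamps lie in the closed interval [t_start, t_end].
 * `stamps` must be sorted in ascending order. */
void
find_time_range(
  const int64_t * stamps, size_t count, int64_t t_start, int64_t t_end,
  size_t * first, size_t * last)
{
  assert(first != NULL && last != NULL && t_start <= t_end);
  *first = first_not_less(stamps, count, t_start);
  *last = first_greater(stamps, count, t_end);
}
```

Both searches take logarithmic time, so replaying a short window from a very large bag would not require reading the messages before it.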
67+
68+
#### Variable chunk sizes
69+
The chunk sizes must be configurable to fit messages of widely varying sizes in order to guarantee the best performance.
70+
It should further be possible to configure the condition on when to write such a chunk permanently to disk (e.g. in a given time interval or when chunk size is reached).
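The two flush conditions mentioned above (a time interval or a size limit) could be combined in a small policy check. This is only a sketch; `chunk_policy_t`, `chunk_state_t` and `chunk_should_flush` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user configuration for writing chunks to disk. */
typedef struct chunk_policy_t
{
  size_t max_bytes;    /* flush once this many bytes are buffered */
  int64_t max_age_ns;  /* flush once the chunk has been open this long */
} chunk_policy_t;

/* Hypothetical state of the chunk currently being filled. */
typedef struct chunk_state_t
{
  size_t buffered_bytes;
  int64_t opened_at_ns;
} chunk_state_t;

/* Returns true if the current chunk should be written permanently to disk. */
bool
chunk_should_flush(const chunk_policy_t * policy, const chunk_state_t * chunk, int64_t now_ns)
{
  assert(policy != NULL && chunk != NULL);
  if (chunk->buffered_bytes == 0) {
    return false;  /* nothing to write, regardless of age */
  }
  if (chunk->buffered_bytes >= policy->max_bytes) {
    return true;   /* size limit reached */
  }
  return now_ns - chunk->opened_at_ns >= policy->max_age_ns;  /* time limit reached */
}
```

A recorder would call such a check after every appended message and additionally from a timer, so that slow topics are still flushed within the configured interval.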
71+
72+
#### Backwards compatible with ROS1
73+
A general requirement is to be backwards compatible with existing ROS1 bags.
74+
This compatibility can be provided either via a conversion script, which permanently converts ROS1 bags into ROS2 bags, or via a bridge API, which allows users to manually open existing ROS1 bags and publish them into the ROS2 system.
75+
76+
77+
### Dismissed data storage formats
78+
79+
In the following, we go through a couple of data formats which may be suitable for the underlying ROSbag implementation.
80+
We consider existing third-party software as well as the option of maintaining a self-made format.
81+
82+
83+
#### HDF5
84+
85+
One very popular framework for storing scientific data is [HDF5](https://support.hdfgroup.org/HDF5/).
86+
It basically meets all the requirements listed above, such as random access, parallelism and compression.
87+
It is further designed for large and highly complex data sets.
88+
HDF5 is open source; its source can be freely obtained from [bitbucket](https://bitbucket.hdfgroup.org/projects/HDFFV/repos/hdf5/browse) and is under a [permissive license](https://bitbucket.hdfgroup.org/projects/HDFFV/repos/hdf5/browse/COPYING) of the HDF Group.
89+
Multiple language wrappers or bindings are available, namely for C/C++, Python, Fortran and Java.
90+
91+
##### Pros
92+
- Open Source and standard specification for file format
93+
- Fulfills the requirements given
94+
- Multi language support
95+
- Large community of users
96+
97+
##### Cons
98+
- Depends on a slowly evolving standard
99+
- The table dimensions of each chunk have to be of fixed size, known at startup.
100+
101+
There is a popular [blog post](http://cyrille.rossant.net/moving-away-hdf5/) by Cyrille Rossant, which gives a short introduction, but also discusses some controversy with HDF5.
102+
103+
104+
#### Existing ROS1 format
105+
106+
Alternatively, the existing ROS1 format could continue to be used. The format description can be found [here](http://wiki.ros.org/Bags/Format/2.0).
107+
108+
##### Pros
109+
- Already ROS specific and evaluated for ROS messages, existing code could be reused
110+
111+
##### Cons
112+
- No random access
113+
114+
115+
#### SQLite
116+
117+
A third alternative which provides capabilities for data logging, and is used as such, is [SQLite](https://www.sqlite.org/about.html).
118+
Unlike other relational database systems, it doesn't require any external SQL server, but is self-contained.
119+
It is also open source, [extensively tested](https://www.sqlite.org/testing.html) and known to have a large [community](https://www.sqlite.org/support.html).
120+
The Gazebo team created a nicely composed [comparison](https://osrfoundation.atlassian.net/wiki/spaces/GAZ/pages/99844178/SQLite3+Proposal) between SQLite and the existing rosbag format.
121+
122+
##### Pros
123+
- Table dimensions do not have to be known at startup and can be flexibly extended.
124+
- Ability to query the tables with classical relational SQL syntax.
125+
126+
##### Cons
127+
- tbd
128+
129+
130+
## API requirements
131+
132+
TODO: Describe the idea of having a plugin-like API where the methods for (de-)serialization and the storage file format can be chosen dynamically. This allows an optimal rosbag configuration per use case. For example, it could be possible to use CDR and SQLite when data is written sequentially and thus writing speed matters. Another possibility could be using deserialized JSON and MongoDB when indexing and introspection matter more than writing speed.
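To make the plugin idea more concrete, the following sketch shows one possible shape of such an interface: a table of plugins, each pairing a serialization format with a storage backend, selected by name at runtime. All names here (`bag_plugin_t`, `bag_find_plugin`) are hypothetical and only serve as an illustration, not as a proposed API.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical plugin descriptor combining a serialization format
 * with a storage backend. */
typedef struct bag_plugin_t
{
  const char * serialization;  /* e.g. "cdr" or "json" */
  const char * storage;        /* e.g. "sqlite" or "mongodb" */
  /* Stores one serialized message; returns 0 on success. */
  int (* write)(const void * data, size_t length);
} bag_plugin_t;

/* Returns the first plugin matching both names, or NULL if none is registered. */
const bag_plugin_t *
bag_find_plugin(
  const bag_plugin_t * plugins, size_t count,
  const char * serialization, const char * storage)
{
  assert(plugins != NULL && serialization != NULL && storage != NULL);
  for (size_t i = 0; i < count; ++i) {
    if (strcmp(plugins[i].serialization, serialization) == 0 &&
      strcmp(plugins[i].storage, storage) == 0)
    {
      return &plugins[i];
    }
  }
  return NULL;
}
```

A recorder could then be configured with, for instance, the pair `"cdr"` and `"sqlite"` and call the selected plugin's `write` function for every message it takes.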
133+
134+
## API for recording and replaying
135+
136+
The most important requirement for rosbags is being able to record all available topics with minimal deserialization overhead.
137+
We shall therefore implement a set of functions in the rmw layer which allow users to take messages raw, meaning in a serialized form specific to the underlying middleware.
138+
In the case of DDS, such a raw message shall correspond to the CDR data being sent over the wire.
139+
Simultaneously, there shall be an API to convert a ROS2.0 message into its binary representation, such as CDR, in order to record data which is not being sent over the wire but created manually.
140+
141+
The same requirements are set for publishing stored data, where already serialized data shall be transmitted over the wire without the need to serialize it again.
142+
Analogous to taking a message in its raw format, we shall implement an rmw function which allows publishing a raw message on a topic.
143+
In order to read a serialized message from a rosbag, we shall have a function which converts a serialized binary representation into its corresponding ROS2.0 message.
144+
145+
Given the requirements above, we propose the following rmw API:
146+
147+
```c
/* Take the next message from a subscription in its serialized form. */
rmw_ret_t
rmw_take_raw(const rmw_subscription_t * subscription, rmw_message_raw_t * raw_message, bool * taken);

/* Serialize a ROS message into its raw representation. */
rmw_ret_t
rmw_to_raw_message(const void * ros_message, const rosidl_message_type_support_t * type_support, rmw_message_raw_t * raw_message);

/* Publish an already serialized message. */
rmw_ret_t
rmw_publish_raw(const rmw_publisher_t * publisher, const rmw_message_raw_t * raw_message);

/* Deserialize a raw message into its corresponding ROS message. */
rmw_ret_t
rmw_from_raw_message(const rmw_message_raw_t * raw_message, const rosidl_message_type_support_t * type_support, void * ros_message);
```
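To illustrate how a recorder might use the proposed `rmw_take_raw`, the following sketch drains a subscription and appends every raw message to a buffer without deserializing it. It is self-contained only because it ships stand-ins: the `rmw_ret_t` and `rmw_subscription_t` definitions and the body of `rmw_take_raw` are stubs made up for this example, the raw message struct mirrors the definition given in this article, and `record_all` as well as the record layout are assumptions. An in-memory buffer stands in for the bag file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Stand-in declarations so the sketch compiles on its own.
 * The real types would come from rmw; these stubs are assumptions. */
typedef int rmw_ret_t;
#define RMW_RET_OK 0

typedef struct rmw_subscription_t
{
  int * pending;  /* hypothetical: number of messages still queued */
} rmw_subscription_t;

typedef struct rmw_message_raw_t
{
  unsigned int encoding_identifier;
  unsigned int length;
  char * raw_data;
} rmw_message_raw_t;

/* Stub of the proposed function: yields a fixed 4-byte payload per queued message. */
rmw_ret_t
rmw_take_raw(const rmw_subscription_t * subscription, rmw_message_raw_t * raw_message, bool * taken)
{
  static char payload[] = {0x00, 0x01, 0x00, 0x00};
  *taken = *subscription->pending > 0;
  if (*taken) {
    --*subscription->pending;
    raw_message->encoding_identifier = 0;
    raw_message->length = (unsigned int)sizeof(payload);
    raw_message->raw_data = payload;
  }
  return RMW_RET_OK;
}

/* Drains the subscription and appends every message to `bag` as
 * [encoding_identifier][length][raw_data]. Returns the number of
 * messages stored, or -1 on an rmw error or when `bag` is full. */
int
record_all(const rmw_subscription_t * subscription, unsigned char * bag, size_t capacity, size_t * used)
{
  assert(subscription != NULL && bag != NULL && used != NULL && *used <= capacity);
  int stored = 0;
  for (;;) {
    rmw_message_raw_t msg = {0, 0, NULL};
    bool taken = false;
    if (rmw_take_raw(subscription, &msg, &taken) != RMW_RET_OK) {
      return -1;
    }
    if (!taken) {
      return stored;  /* nothing left to record */
    }
    size_t header = sizeof(msg.encoding_identifier) + sizeof(msg.length);
    if (capacity - *used < header + msg.length) {
      return -1;
    }
    memcpy(bag + *used, &msg.encoding_identifier, sizeof(msg.encoding_identifier));
    memcpy(bag + *used + sizeof(msg.encoding_identifier), &msg.length, sizeof(msg.length));
    memcpy(bag + *used + header, msg.raw_data, msg.length);
    *used += header + msg.length;
    ++stored;
  }
}
```

The point of the sketch is that the recording path only copies bytes; no type support is needed until a message is read back and converted into a ROS message.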
160+
161+
The raw message used in the proposed rmw API shall be defined as follows:
162+
163+
```c
typedef struct rmw_message_raw_t
{
  unsigned int encoding_identifier;
  unsigned int length;
  char * raw_data;
} rmw_message_raw_t;
```
171+
172+
The `encoding_identifier` here indicates which encoding is used in this raw message, e.g. CDR in the case of DDS.
173+
The `raw_data` field shall contain all message data needed to extract a ROS message given a respective type support, which contains all necessary information on how to convert the raw data into its corresponding ROS message type.
174+
An example for CDR data in case of DDS:
175+
176+
The ROS message string
177+
```cpp
std_msgs::msg::String msg;
msg.data = "hello world 42";
```
181+
translates into an `rmw_message_raw_t` with the following contents:
182+
```
length: 24
data (in hex): 0x00 0x01 0x00 0x00 0x0f 0x00 0x00 0x00 0x68 0x65 0x6c 0x6c 0x6f 0x20 0x77 0x6f 0x72 0x6c 0x64 0x20 0x34 0x32 0x00 0x00
```
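The byte layout above can be reproduced by hand, which may help when inspecting recorded data. The sketch below assumes little-endian CDR with a 4-byte encapsulation header (`0x00 0x01` plus two option bytes), a 32-bit length that counts the terminating NUL, and trailing padding to a 4-byte boundary as seen in the example. `cdr_encode_string` is a hypothetical helper and not part of rmw or any DDS implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: encodes a C string as a little-endian CDR payload.
 * Returns the number of bytes written, or 0 if `out` is too small. */
size_t
cdr_encode_string(const char * text, uint8_t * out, size_t capacity)
{
  assert(text != NULL && out != NULL);
  size_t text_len = strlen(text) + 1;          /* CDR strings include the trailing NUL */
  size_t unpadded = 4 + 4 + text_len;          /* header + length field + characters */
  size_t total = (unpadded + 3) & ~(size_t)3;  /* assumption: pad to a 4-byte boundary */
  if (total > capacity || text_len > UINT32_MAX) {
    return 0;
  }
  memset(out, 0, total);                       /* zeroes the option bytes and the padding */
  out[0] = 0x00;
  out[1] = 0x01;                               /* representation identifier: CDR, little endian */
  uint32_t len = (uint32_t)text_len;
  out[4] = (uint8_t)(len & 0xff);              /* string length, least significant byte first */
  out[5] = (uint8_t)((len >> 8) & 0xff);
  out[6] = (uint8_t)((len >> 16) & 0xff);
  out[7] = (uint8_t)((len >> 24) & 0xff);
  memcpy(out + 8, text, text_len);
  return total;
}
```

For `"hello world 42"` the string length field is 15 (`0x0f`), so header, length field and characters add up to 23 bytes, which the padding rounds up to the 24 bytes listed above.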
