
Further improvements to DepthAI integration #1458

Merged
matlabbe merged 5 commits into introlab:master from borongyuan:master
Mar 8, 2025

Conversation

@borongyuan
Contributor

Hi, we have some new improvements for DepthAI integration. This includes reorganizing parameters and code structure, performing advanced tuning on the SGBM pipeline, and adding support for more OAK models. These improvements will be committed and explained in the coming days.

@borongyuan
Contributor Author

borongyuan commented Mar 2, 2025

Luxonis' documentation does not accurately describe the usage restrictions of some params. Subpixel mode can actually be used together with extended disparity. Either extended disparity or disparity companding can increase the actual disparity search range, but they cannot be used at the same time. Extended disparity allocates more hardware resources to the StereoDepthNode, while disparity companding does not. What is not documented is that for DisparityWidth::DISPARITY_64, N=48, M=16, T=0, so the disparity search range is extended to 80. In depthai-core v2.29.0, since they updated the filtering implementation of StereoDepthNode, incorrect depth maps are generated when disparity companding is used. It is recommended not to enable this option for now, until they fix this part.
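The numbers above are consistent with a companding scheme that samples N disparities at full resolution, M at half resolution, and T at quarter resolution. A minimal sketch of that interpretation — the step sizes are my assumption, inferred only from the undocumented DISPARITY_64 case (N=48, M=16, T=0 giving a range of 80), not from Luxonis documentation:

```cpp
#include <cassert>

// Hypothetical sketch: disparity search range covered by a companded
// sampling scheme, assuming N steps of size 1, M steps of size 2 and
// T steps of size 4. Inferred from N=48, M=16, T=0 -> range 80.
int compandedSearchRange(int N, int M, int T)
{
    return N * 1 + M * 2 + T * 4;
}
```

Under this reading, companding trades disparity resolution at far range for a wider search window without extra hardware resources.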

The other major limitation is the disparity range supported by the median filter. I added the check logic in CameraDepthAI::setDisparityWidthAndFilter(). The checks for the other functional options have also been moved to the front to make the subsequent pipeline configuration clearer.

@borongyuan
Contributor Author

For the StereoDepthNode, I wrote a tool to fine-tune the SGBM parameters. We will not add the code of this tool, just the new params we found, including alpha, beta, the P1 and P2 costs, etc. There is also a new mask designed for the masked census transform, which is a hyperparameter. 0X5092A28C5152428 corresponds to the following pattern. It can be seen as a cover of a series of ellipses from small to large, which lets the algorithm better account for small targets while still considering larger blocks.
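To inspect the elliptical pattern, the mask bits can be rendered as a grid. A small sketch — the 8x8 window size and row-major bit ordering are my assumptions, only the mask value itself is from the comment above:

```cpp
#include <cstdint>
#include <string>

// Sketch: render a 64-bit census-transform mask as an 8x8 grid of
// '#' (bit set) and '.' (bit clear). Window size and bit ordering
// are assumptions for illustration, not the confirmed layout.
std::string maskToGrid(uint64_t mask)
{
    std::string grid;
    for (int row = 0; row < 8; ++row) {
        for (int col = 0; col < 8; ++col) {
            int bit = row * 8 + col;
            grid += ((mask >> bit) & 1) ? '#' : '.';
        }
        grid += '\n';
    }
    return grid;
}

// Number of sampled positions in the mask.
int maskBitCount(uint64_t mask)
{
    int n = 0;
    for (; mask; mask >>= 1) n += mask & 1;
    return n;
}
```

Printing maskToGrid(0x5092A28C5152428ULL) shows which of the 64 window positions the masked census transform actually samples.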

Luxonis is also developing an automatic param search solution. They should provide us with better params later.

@borongyuan
Contributor Author

dai::Device can actually be initialized without configuring the pipeline. This allows us to obtain the required hardware information before configuring the pipeline.

OAK China provided me with an OAK-D LR and an OAK-D SR for testing. To be compatible with different devices using a variety of CMOS sensors, we need a smarter configuration method. For all the devices I have tested, the sensors can provide 400P or 800P input to the StereoDepthNode, so there are actually only two depth resolutions we need to deal with. As for the sensor resolution itself, the image height has several possibilities, but for the image width we only handle cases that are multiples of 640; the image height and ISP scale can then be inferred automatically. We can only offer 640 and 1280 options: because the maximum depth map is only 800P, a 1920 option would require scaling and interpolating the depth map, and that configuration is meaningless for VSLAM. The three output modes are currently compatible with almost all models of OAK cameras.
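The ISP-scale inference described above can be sketched as reducing the target/sensor width ratio to lowest terms. This is an illustrative shape only — the function name and signature are not the actual CameraDepthAI code:

```cpp
#include <numeric> // std::gcd (C++17)
#include <utility>

// Sketch: given a sensor width that is a multiple of 640 and a target
// width of 640 or 1280, infer the ISP scale as a reduced fraction
// (numerator, denominator). Illustrative only, not the real API.
std::pair<int, int> inferIspScale(int sensorWidth, int targetWidth)
{
    int g = std::gcd(targetWidth, sensorWidth);
    return { targetWidth / g, sensorWidth / g };
}
```

For example, a 1920-wide sensor scaled to a 1280-wide output gives a 2/3 ISP scale, and the image height follows from the sensor's aspect ratio.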

@borongyuan
Contributor Author

We recommend no longer distinguishing between cameras connected via USB or network, and using JPEG-compressed transport for all images except the depth image. If subpixel is not used, the quality of the depth image is still relatively poor, so we would rather enable subpixel and use a lower resolution. We have tested this on the PoE model: although the depth image is not compressed, as long as the other images are compressed, the bandwidth is basically sufficient. Also, if a camera connected via USB falls back to USB 2.0 for some reason, it is less likely to run into bandwidth bottlenecks.
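A back-of-the-envelope check supports the claim that the uncompressed depth stream fits the link budget. The 30 FPS figure below is my assumption; the 1280x800 16-bit depth format follows from the 800P limit discussed earlier:

```cpp
// Rough estimate (assumed numbers, not measured): raw bandwidth of an
// uncompressed 16-bit depth stream, in bits per second.
long long depthBitsPerSecond(int width, int height, int bytesPerPixel, int fps)
{
    return 8LL * width * height * bytesPerPixel * fps;
}
```

An 800P 16-bit depth stream at 30 FPS comes to roughly 492 Mbps, which fits on a gigabit PoE link with headroom left for the JPEG-compressed image streams, but would not fit in USB 2.0's ~480 Mbps ceiling — hence compressing everything else.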

Is it possible to create SensorData directly from compressed data? I know this is possible in ROS. That way, data would be decoded only when needed, avoiding unnecessary encoding and decoding operations.

@borongyuan
Contributor Author

For the IMU, due to the chip shortage in previous years, a batch of products used the BMI270. We found that for PoE models using different IMU types, the IMU local transform is not consistent and needs to be rotated 180 degrees. This problem has not been found on other models.

In addition, the BMI270 can only be configured for RAW output. For the BNO085/086, if RAW data is used, the accelerometer output has an obvious zero-g offset (sometimes up to 0.1g), so it is necessary to configure it to use calibrated data, which enables internal compensation for the constant bias. For the gyroscope, internal compensation can also be enabled, but this is not recommended, because the compensation algorithm is fooled by rotation at a constant speed; the gyroscope zero-bias handling should therefore be left to VIO. The BNO085/086 also comes with a 9-axis fusion algorithm, so it can directly provide a rotation vector without needing IMU filtering on the host.
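The 180-degree correction to the IMU local transform mentioned above can be sketched as negating two columns of the rotation part. The axis (Z here) is my assumption — the comment does not state which axis differs between the BMI270 and BNO085/086 units:

```cpp
// Sketch: apply a 180-degree rotation about Z to the 3x3 rotation part
// of an IMU local transform, in place. Rz(180) = diag(-1, -1, 1), so
// right-multiplying by it negates the first two columns. The Z axis is
// an assumed example; the actual axis depends on the IMU mounting.
void rotate180AboutZ(double R[3][3])
{
    for (int i = 0; i < 3; ++i) {
        R[i][0] = -R[i][0]; // negate first column
        R[i][1] = -R[i][1]; // negate second column
    }
}
```

Applying the correction twice returns the original transform, which is a quick sanity check that it really is a 180-degree rotation.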

@borongyuan borongyuan marked this pull request as ready for review March 2, 2025 09:45
@matlabbe
Member

matlabbe commented Mar 4, 2025

Great, thanks for this PR and the detailed description. I'll give it a try in the coming days, but from a quick overview I don't see issues reading the changes.

For

Is it possible to create SensorData directly from compressed data. I know this is possible in ROS. In this way, data can be decoded only when they are needed, avoiding unnecessary encoding and decoding operations.

I need to double-check if the raw image is checked somewhere in the rtabmap updates. Normally it expects a raw image for feature extraction, but if features are also already provided, it may not need the raw image and can just forward the compressed image directly to the database (I could update the conditions to support this if it doesn't now). For SensorData, it is possible to initialize it just with compressed data. The only thing to check is that, for color images, it expects JPEG or PNG formats (standard opencv encode/decode), so theoretically we would only need to initialize a cv::Mat(1, {COMPRESSED_DATA_LENGTH_IN_BYTES}, CV_8UC1, {DATA_PTR}).clone(). The SensorData constructor automatically detects whether the color image is compressed by looking at the format and size of the matrix (expecting one row and uchar type):

if(rgb.rows == 1)
{
	UASSERT(rgb.type() == CV_8UC1); // Bytes
	_imageCompressed = rgb;
	if(clearData)
	{
		_imageRaw = cv::Mat();
	}
}

@matlabbe
Member

matlabbe commented Mar 8, 2025

Working on my OAK-D and OAK-D Pro Wide.

@matlabbe matlabbe merged commit d88353d into introlab:master Mar 8, 2025
6 checks passed
@matlabbe
Member

matlabbe commented Mar 9, 2025

To add to my previous comment, with this PR #1463, we can do:

diff --git a/corelib/src/camera/CameraDepthAI.cpp b/corelib/src/camera/CameraDepthAI.cpp
index c5bdf4ac..1ed1f55a 100644
--- a/corelib/src/camera/CameraDepthAI.cpp
+++ b/corelib/src/camera/CameraDepthAI.cpp
@@ -738,9 +738,10 @@ SensorData CameraDepthAI::captureImage(SensorCaptureInfo * info)
 
        double stamp = std::chrono::duration<double>(depthOrRight->getTimestampDevice(dai::CameraExposureOffset::MIDDLE).time_since_epoch()).count();
        if(outputMode_)
-               data = SensorData(cv::imdecode(rgbOrLeft->getData(), cv::IMREAD_ANYCOLOR), depthOrRight->getCvFrame(), stereoModel_.left(), this->getNextSeqID(), stamp);
+               data = SensorData(cv::Mat(1, rgbOrLeft->getData().size(), CV_8UC1, rgbOrLeft->getData().data()).clone(), depthOrRight->getCvFrame(), stereoModel_.left(), this->getNextSeqID(), stamp);
        else
-               data = SensorData(cv::imdecode(rgbOrLeft->getData(), cv::IMREAD_GRAYSCALE), cv::imdecode(depthOrRight->getData(), cv::IMREAD_GRAYSCALE), stereoModel_, this->getNextSeqID(), stamp);
+               data = SensorData(cv::Mat(1, rgbOrLeft->getData().size(), CV_8UC1, rgbOrLeft->getData().data()).clone(), cv::Mat(1, depthOrRight->getData().size(), CV_8UC1, depthOrRight->getData().data()).clone(), stereoModel_, this->getNextSeqID(), stamp);
 
        if(imuPublished_ && !publishInterIMU_)
        {

In the standalone library, we don't save that much time because Odometry will decompress the data anyway. However, if Odometry is running slower than the camera frame rate, then we save decompression time.
