Skip to content

Conversation

@berndpfrommer
Copy link
Contributor

@berndpfrommer berndpfrommer commented Jul 15, 2025

This PR covers several changes that aim at making the encoder and decoder more generally useful.

  • The encoder avoid explicit reference to VAAPI and cuda to avoid spurious error messages when e.g. cuda is not available
  • Improved flushing of encoder: only need to give frame_id, not full header (was not used before)
  • Added support for discovery of decoders

Copy link

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass. Mostly stylistic. Functionally looks correct.

I'll test it locally.

@berndpfrommer
Copy link
Contributor Author

Thanks for looking over the PR!
I have been making a number of other changes since then. I will piggyback your feedback onto the next commit.

@berndpfrommer berndpfrommer force-pushed the support_fallback_decoder branch 2 times, most recently from 928f92a to 782d60a Compare July 18, 2025 05:29
@berndpfrommer berndpfrommer force-pushed the support_fallback_decoder branch from 782d60a to 5eed638 Compare July 18, 2025 05:38
src/encoder.cpp Outdated
frames_ctx->format = utils::find_hw_config(&usesHardwareFrames_, hwDevType, codec);

if (usesHardwareFrames_) {
const auto fmts = utils::get_hwframe_transfer_formats(hw_frames_ref);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@berndpfrommer I finally got around to try this on a machine with an RTX 4090. It failed with "cannot find valid sw pixel format!" when trying to encode images using hevc_nvenc.

After some digging, I think the problem may be in how we look for supported pixel formats. utils::get_hwframe_transfer_formats() hits av_hwframe_transfer_get_formats(). When frames_ctx->format is cuda, there are no transfer formats available (see https://ffmpeg.org/doxygen/trunk/hwcontext__cuda_8c_source.html#l00216). IIUC what goes in goes out (see https://ffmpeg.org/pipermail/libav-user/2018-June/011179.html).

I believe we should be using av_hwdevice_get_hwframe_constraints() to fetch the set of supported pixel formats, but don't quote me on that.

Copy link
Contributor Author

@berndpfrommer berndpfrommer Jul 19, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am able to replicate the issue.

The old code only used hardware frames from vaapi, not for cuda, and it worked. Somehow although nvenc is hardware accelerated, it does not use hardware frames, or open a hardware device. Apparently, no call to av_hwdevice_ctx_create() is necessary for nvenc.

What I don't know is when to use hw frames and when not, and I want to avoid hardcoding "vaapi" as it was before. I'm looking for any flags or other indicators that would tell when to use hwframes.

@berndpfrommer
Copy link
Contributor Author

@hidmic I have found a way to tell hevc_nvenc apart from hevc_vaapi by looking at the the hardware formats to transfer TO (rather than FROM). In the case of nvenc one does not even have to open a hardware device. Apparently this is handled internally by the nvenc library.
I could encode/decode using hevc (nvenc/cuvid and vaapi), and av1 (librav1e/libaom-av1). I assume h264 should also work. Note that the ROS parameter names have changed now, creating backwards incompatibility. I did not want to carry forward the "tune" etc parameters. They are now all settable via the av_options parameter.
Your help in testing is very much appreciated. Please let me know if things are still broken for you.

@berndpfrommer
Copy link
Contributor Author

I have more changes coming. The subject of image formats is really tricky.
Here's the issue: no encoders support the bayer images natively. Few even support single-channel formats like mono8. This means bayer images need to first be converted to e.g. rbg8 color before encoding, which inflates the data, and can actually increase the bit rate when configuring for lossless encoding (I had this happen with hevc_nvenc).
Single-channel image encoding can be achieved by converting e.g. mono8 -> nv12 (and setting the color channels to zero), and then converting back to the single channel encoding after decoding. But on the decoder side there is no way to tell
what the original image format was. I will embed the image format in the encoding field of the message, separated with a slash: hevc/bayer_rggb8. This way the decoder knows that a decoded nv12 formatted image actually once was a bayer image, and can reformat accordingly.

@hidmic
Copy link

hidmic commented Jul 24, 2025

I have found a way to tell hevc_nvenc apart from hevc_vaapi by looking at the the hardware formats to transfer TO (rather than FROM).

Wonderful. I'll give it a shot.

Note that the ROS parameter names have changed now, creating backwards incompatibility. I did not want to carry forward the "tune" etc parameters. They are now all settable via the av_options parameter.

Understood, but won't this be a blocker for releasing this package against existing LTS distributions?

I will embed the image format in the encoding field of the message, separated with a slash: hevc/bayer_rggb8

That sounds reasonable, but do note that sensor_msgs/msg/CompressedImage has an format pattern already that should cover this use case. Packet != image and formats need not match, but it'd be nice to minimize the variance.

@berndpfrommer
Copy link
Contributor Author

berndpfrommer commented Jul 26, 2025

Note that the ROS parameter names have changed now, creating backwards incompatibility. I did not want to carry forward the "tune" etc parameters. They are now all settable via the av_options parameter.

Understood, but won't this be a blocker for releasing this package against existing LTS distributions?

I really don't know in how far the policies apply only to core ros packages (like rclcpp) or if they apply to all packages. For my packages I usually just bump the major version and release them. The user base is small, and the packages are not mature. If I stuck to the official core policy my humble/jazzy packages would be hopelessly outdated. So far nobody has screamed at me for modifying my APIs on LTS distros.
About message formats, that's another thing, so I'm glad you pointed that one out (see below). It's very tedious to change data that has been collected, so I definitely want to be get that right.

I will embed the image format in the encoding field of the message, separated with a slash: hevc/bayer_rggb8

That sounds reasonable, but do note that sensor_msgs/msg/CompressedImage has an format pattern already that should cover this use case. Packet != image and formats need not match, but it'd be nice to minimize the variance.
Thanks for pointing this out. I will change from / to semicolon and organize it similarly to the compressed_image_transport.

@hidmic
Copy link

hidmic commented Jul 28, 2025

I can confirm encoding using hevc_nvenc works as intended. Decoding on republishing isn't 🤔

On what distro are you testing this @berndpfrommer ?

@berndpfrommer
Copy link
Contributor Author

I'm developing on rolling.
Can you provide a console log that would shed more light on what's going wrong?
I have added an encoder/decoder test to the ffmpeg_image_transport repo. Does colcon test pass? It passes for me on humble/jazzy/rolling.
Thanks,
Bernd

@berndpfrommer
Copy link
Contributor Author

I have successfully tested this PR now under vaapi and nvenc. Merging it into master and will be releasing it on Rolling.

@berndpfrommer berndpfrommer merged commit 7c221cd into master Aug 6, 2025
3 checks passed
@hidmic
Copy link

hidmic commented Aug 6, 2025

Ah, you beat me to it @berndpfrommer. Working through a long backlog, didn't get back here in time. Sorry for that, and thank you for pushing!

What about the ffmpeg_image_transport change? I see the branch is up to date and ready to be PR'd too.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants