thecapn32

This is an attempt to support the H.264 compressed video format on native_sim.
The H.264 video is provided as a C array in a .h file; the sample currently reads video data from that file and sends it over TCP. Because the data is sent as fast as possible, the frame rate is wrong.

This work still needs to be done:

  • add a mechanism to pace H.264 frames so that playback maintains 30 FPS
  • a dynamic way to adjust the video format in video-sw-generator (the H.264 format is currently static)


github-actions bot commented Sep 3, 2025

Hello @thecapn32, and thank you very much for your first pull request to the Zephyr project!
Our Continuous Integration pipeline will execute a series of checks on your Pull Request commit messages and code, and you are expected to address any failures by updating the PR. Please take a look at our commit message guidelines to find out how to format your commit messages, and at our contribution workflow to understand how to update your Pull Request. If you haven't already, please make sure to review the project's Contributor Expectations and update (by amending and force-pushing the commits) your pull request if necessary.
If you are stuck or need help please join us on Discord and ask your question there. Additionally, you can escalate the review when applicable. 😊

@josuah josuah self-requested a review September 3, 2025 13:09
@josuah left a comment (Contributor)

Thanks for this improvement! This is timely, as pull request #92884 is doing the same thing with real hardware, and it will benefit every future video device.

Some quick feedback first; I will do a more in-depth review round later.

Comment on lines +510 to +513
.fmt.width = 640, \
.fmt.height = 320, \
.fmt.pitch = 0, \
.fmt.pixelformat = VIDEO_PIX_FMT_H264, \
Contributor

This was probably convenient for testing, but it is better to keep it to what it was, and instead use the video_set_format() API from the application to select H.264.

if (video_bits_per_pixel(fmt.pixelformat) > 0) {
	buffer_size = fmt.pitch * fmt.height;
} else {
	buffer_size = fmt.width * fmt.height / 10;
}
Contributor

This is probably going to need some other strategy, as the compression ratio is hard to tune.

This might as well be something the user decides, depending on what is expected. For instance, buffer_size = fmt.width * fmt.height / CONFIG_VIDEO_MIN_COMPRESSION_RATIO, or even buffer_size = CONFIG_VIDEO_COMPRESSED_BUFFER_SIZE.

Comment on lines +1566 to +1570
/**
* H264 without start code
*/
#define VIDEO_PIX_FMT_H264_MVC VIDEO_FOURCC('M', '2', '6', '4')

Contributor

Maybe we can keep 3D TV H.264 for another day and not introduce H264_MVC for now? https://en.wikipedia.org/wiki/Multiview_Video_Coding

Contributor

In order to move forward on PR #92884, I've cherry-picked this commit but without the M264 definition.

Comment on lines +227 to +229
/* Calculate copy size */
size_t remaining = h264_test_data_len - position;
copy_size = (remaining > buffer_size) ? buffer_size : remaining;
Contributor

This fills a buffer with H.264 data completely, which does sound like a valid approach.
Though this might not fit the description of V4L2_PIX_FMT_H264:

H264 Access Unit. The decoder expects one Access Unit per buffer. The encoder generates one Access Unit per buffer. If ioctl VIDIOC_ENUM_FMT reports V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM then the decoder has no requirements since it can parse all the information from the raw bytestream. -- https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/pixfmt-compressed.html#compressed-formats

I need to check https://en.wikipedia.org/wiki/Network_Abstraction_Layer to be sure of what that means in detail, but if you want to use H.264 this way, you might need one buffer per source image frame.

That is, transmitting each buffer is expected to update the frame immediately on the viewer's side.

This means that every buffer will have a different size.

Author

@josuah NAL units are separated by start codes; can sending each frame at a 33 ms interval work?

Contributor

It seems like I was mistaken: there are two valid approaches for V4L2_PIX_FMT_H264.

If I understand it right, what you propose is the stateful video encoder, where the data produced is immediately usable for being transferred (i.e. over the network).

A stateful video encoder takes raw video frames in display order and encodes them into a bytestream. It generates complete chunks of the bytestream, including all metadata, headers, etc. The resulting bytestream does not require any further post-processing by the client. - https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-encoder.html

And it seems like it matches the description of V4L2_PIX_FMT_H264 when V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM is enabled...

All good then! No need to change, but this opens the question of how to make the distinction between "stateful" and "stateless" encoders in Zephyr.

Contributor

Clicking view this file shows something like this:

unsigned char h264_test_data[] = {
  0x00, 0x00, 0x00, 0x01, 0x67, 0x42, 0xc0, 0x1e, 0x8c, 0x68, 0x0a, 0x03,
  0xdb, 0x01, 0x01, 0xe1, 0x10, 0x8d, 0x40, 0x00, 0x00, 0x00, 0x01, 0x68,
  0xce, 0x3c, 0x80, 0x00, 0x00, 0x00, 0x01, 0x65, 0xb8, 0x00, 0x04, 0x08,
  0xf8, 0x84, 0x44, 0x44, 0x18, 0x11, 0xd5, 0x93, 0x80, 0xba, 0x90, 0x00,
  0x75, 0x5f, 0xe0, 0x01, 0xcb, 0x33, 0x44, 0x51, 0xa0, 0xd4, 0xdd, 0xfe,
...
  0xb4, 0x32, 0x5b, 0x5f, 0xf3, 0xc6, 0x2d, 0xc6, 0x1e, 0xb1, 0xfe, 0x87,
  0xbc, 0x0d, 0xdb, 0x8a, 0x58, 0x4b, 0xf9, 0xe0, 0x9d, 0x44, 0xe7, 0xaf,
  0x08, 0x5b, 0x9e, 0xa0
};
unsigned int h264_test_data_len = 315304;

Contributor

Maybe it would be interesting to include the file as a binary. I think there are some places in Zephyr where this is done; I will try to give pointers on how to do it.

Contributor

Related #42580 (comment)

I think the best approach would be to remove the blob, and add instructions in the sample doc showing how to generate the header from a video file.

Contributor

This would also reduce the amount of data in the repo.
GStreamer and FFmpeg can generate such files.

@JarmouniA Would it make sense to add these commands to CMake so that the user does not have to manually copy-paste a command in order to build? Requiring a manual step would prevent using it in e.g. CI.

It does not hurt to also document it though.
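A sketch of what such a CMake rule could look like, assuming ffmpeg and xxd are available on the host; the input file name and paths are placeholders, and xxd derives the C symbol name from the input file name:

```cmake
# Sketch only: generate the H.264 blob header at build time instead of
# committing it to the repo.
find_program(FFMPEG ffmpeg REQUIRED)
find_program(XXD xxd REQUIRED)

add_custom_command(
  OUTPUT ${CMAKE_CURRENT_BINARY_DIR}/h264_test_data.h
  # Encode the test clip into a raw Annex-B H.264 elementary stream.
  COMMAND ${FFMPEG} -y -i ${CMAKE_CURRENT_SOURCE_DIR}/test_video.mp4
          -c:v libx264 -f h264 ${CMAKE_CURRENT_BINARY_DIR}/h264_test_data
  # Turn the raw stream into a C array header.
  COMMAND ${XXD} -i h264_test_data h264_test_data.h
  WORKING_DIRECTORY ${CMAKE_CURRENT_BINARY_DIR}
  DEPENDS ${CMAKE_CURRENT_SOURCE_DIR}/test_video.mp4
)
```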

@thecapn32 thecapn32 requested a review from josuah September 3, 2025 13:30

sonarqubecloud bot commented Sep 3, 2025
