video: H.264 video format support #95407
Conversation
Hello @thecapn32, and thank you very much for your first pull request to the Zephyr project!
Force-pushed from dcda85f to ef080c2
Thanks for this improvement! Timely with the #92884 pull request, which does the same thing but with real hardware, and this will benefit every future video device as well.
Some quick feedback first; I will have to give a more in-depth round later.
.fmt.width = 640, \
.fmt.height = 320, \
.fmt.pitch = 0, \
.fmt.pixelformat = VIDEO_PIX_FMT_H264, \
This was probably convenient for testing, but it is better to keep this as it was, and instead use the video_set_format() API from the application to select H.264.
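For instance, something like the following on the application side (a minimal sketch; the exact video_set_format() signature differs between Zephyr releases, some of which take an extra endpoint argument):

#include <zephyr/device.h>
#include <zephyr/drivers/video.h>

static int select_h264(const struct device *video_dev)
{
	struct video_format fmt = {
		.pixelformat = VIDEO_PIX_FMT_H264,
		.width = 640,
		.height = 320,
		.pitch = 0, /* no meaningful line pitch for a compressed format */
	};

	/* The application, not the driver default, selects H.264 */
	return video_set_format(video_dev, &fmt);
}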
if (video_bits_per_pixel(fmt.pixelformat) > 0) {
	buffer_size = fmt.pitch * fmt.height;
} else {
	buffer_size = fmt.width * fmt.height / 10;
}
This is probably going to need some other strategy, as it is hard to tune the compression ratio. This might as well be something the user decides, depending on what is expected: for instance, buffer_size = fmt.width * fmt.height / CONFIG_VIDEO_MIN_COMPRESSION_RATIO, or even buffer_size = CONFIG_VIDEO_COMPRESSED_BUFFER_SIZE.
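A minimal sketch of the first option, assuming a hypothetical CONFIG_VIDEO_MIN_COMPRESSION_RATIO Kconfig symbol (it does not exist in Zephyr today):

/* CONFIG_VIDEO_MIN_COMPRESSION_RATIO (hypothetical) is the worst-case
 * compression ratio the user expects; e.g. 10 keeps the current behaviour
 * of sizing buffers at one tenth of the pixel count. */
#ifdef CONFIG_VIDEO_MIN_COMPRESSION_RATIO
	buffer_size = fmt.width * fmt.height / CONFIG_VIDEO_MIN_COMPRESSION_RATIO;
#else
	buffer_size = fmt.width * fmt.height / 10;
#endif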
/**
 * H264 Multiview Video Coding
 */
#define VIDEO_PIX_FMT_H264_MVC VIDEO_FOURCC('M', '2', '6', '4')
Maybe we can keep 3D TV H.264 for another day and not introduce H264_MVC for now? https://en.wikipedia.org/wiki/Multiview_Video_Coding
In order to move forward on PR #92884, I've cherry-picked this commit but without the M264 definition.
/* Calculate copy size */
size_t remaining = h264_test_data_len - position;
copy_size = (remaining > buffer_size) ? buffer_size : remaining;
This fills a buffer completely with H.264 data, which does sound like a valid approach. Though this might not fit the description of V4L2_PIX_FMT_H264:

H264 Access Unit. The decoder expects one Access Unit per buffer. The encoder generates one Access Unit per buffer. If ioctl VIDIOC_ENUM_FMT reports V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM then the decoder has no requirements since it can parse all the information from the raw bytestream. -- https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/pixfmt-compressed.html#compressed-formats

I need to check https://en.wikipedia.org/wiki/Network_Abstraction_Layer to be sure of what that means in detail, but if you want to use H.264 that way, you might need one buffer per source image frame. That is, transmitting each buffer is expected to update the frame immediately on the viewer. This means that every buffer will have a different size.
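To illustrate the "one Access Unit per buffer" reading, a sketch of splitting the bytestream at start codes rather than filling each buffer completely (this assumes Annex-B start codes in the test data; the helper name is made up):

#include <stddef.h>
#include <stdint.h>

/* Find the next Annex-B start code (00 00 01 or 00 00 00 01) at or after
 * `from`, returning `len` when none is left. A real implementation would
 * still need to group NAL units into complete access units. */
static size_t next_start_code(const uint8_t *data, size_t len, size_t from)
{
	for (size_t i = from; i + 2 < len; i++) {
		if (data[i] == 0x00 && data[i + 1] == 0x00 &&
		    (data[i + 2] == 0x01 ||
		     (i + 3 < len && data[i + 2] == 0x00 && data[i + 3] == 0x01))) {
			return i;
		}
	}
	return len;
}

Each queued buffer would then span from the current position up to next_start_code(h264_test_data, h264_test_data_len, position + 1), giving variable-sized buffers.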
@josuah NAL units are separated by start codes; can sending each frame at a 33 ms interval work?
It seems like I was mistaken: there are two valid approaches for V4L2_PIX_FMT_H264.

If I understand it right, what you propose is the stateful video encoder, where the data produced is immediately usable for being transferred (i.e. over the network).

A stateful video encoder takes raw video frames in display order and encodes them into a bytestream. It generates complete chunks of the bytestream, including all metadata, headers, etc. The resulting bytestream does not require any further post-processing by the client. -- https://www.kernel.org/doc/html/latest/userspace-api/media/v4l/dev-encoder.html

And it seems like it matches the description of V4L2_PIX_FMT_H264 when V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM is enabled...

All good then! No need to change, but this opens the question of how to make the distinction between "stateful" and "stateless" encoders in Zephyr.
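One hypothetical option (no such flag exists in Zephyr today) would be to mirror V4L2 with a per-format capability flag that drivers can advertise:

#include <zephyr/sys/util_macro.h>

/* Hypothetical, modelled on V4L2_FMT_FLAG_CONTINUOUS_BYTESTREAM: when set in
 * a driver's format capabilities the encoder is stateful and emits a
 * self-contained bytestream; when clear, one access unit per buffer applies. */
#define VIDEO_FMT_FLAG_CONTINUOUS_BYTESTREAM BIT(0)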
Clicking "view this file" shows something like this:
unsigned char h264_test_data[] = {
0x00, 0x00, 0x00, 0x01, 0x67, 0x42, 0xc0, 0x1e, 0x8c, 0x68, 0x0a, 0x03,
0xdb, 0x01, 0x01, 0xe1, 0x10, 0x8d, 0x40, 0x00, 0x00, 0x00, 0x01, 0x68,
0xce, 0x3c, 0x80, 0x00, 0x00, 0x00, 0x01, 0x65, 0xb8, 0x00, 0x04, 0x08,
0xf8, 0x84, 0x44, 0x44, 0x18, 0x11, 0xd5, 0x93, 0x80, 0xba, 0x90, 0x00,
0x75, 0x5f, 0xe0, 0x01, 0xcb, 0x33, 0x44, 0x51, 0xa0, 0xd4, 0xdd, 0xfe,
...
0xb4, 0x32, 0x5b, 0x5f, 0xf3, 0xc6, 0x2d, 0xc6, 0x1e, 0xb1, 0xfe, 0x87,
0xbc, 0x0d, 0xdb, 0x8a, 0x58, 0x4b, 0xf9, 0xe0, 0x9d, 0x44, 0xe7, 0xaf,
0x08, 0x5b, 0x9e, 0xa0
};
unsigned int h264_test_data_len = 315304;
Maybe it would be interesting to include the file as a binary instead. I think there are some places in Zephyr where this is done; I will try to give pointers on how to do this.
Related #42580 (comment)
I think the best approach would be to remove the blob, and add instructions in the sample doc showing how to generate the header from a video file.
This would also reduce the amount of data in the repo.
GStreamer and FFmpeg can generate such files.
@JarmouniA Would it make sense to add these commands to CMake, so that the user does not have to manually copy-paste a command in order to build? A manual step would prevent using it in CI, for instance.
It does not hurt to also document it, though.
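A sketch of what that could look like in the sample's CMakeLists.txt, using Zephyr's existing generate_inc_file_for_target() helper (the video.h264 file name and the FFmpeg command are illustrative):

# Produce the raw Annex-B bytestream offline, e.g.:
#   ffmpeg -i input.mp4 -an -c:v libx264 -f h264 src/video.h264
set(gen_dir ${ZEPHYR_BINARY_DIR}/include/generated)

# Turn the binary into a C include at build time, so nothing has to be
# copy-pasted by hand and CI can build the sample.
generate_inc_file_for_target(app src/video.h264 ${gen_dir}/video.h264.inc)

The array can then be defined in C with the generated .inc file between braces, as other Zephyr samples do for certificates and similar blobs.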
This is an attempt to support the H.264 compressed video format on native_sim.
The H.264 video is provided as a C array in a .h file; what the sample does for now is read video data from that array and send it over TCP. Because of this, the frame rates are wrong.
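For illustration, a minimal sketch of that send loop (names other than h264_test_data/h264_test_data_len are hypothetical, and the fixed 33 ms pacing only approximates 30 fps):

#include <zephyr/kernel.h>
#include <zephyr/net/socket.h>

extern unsigned char h264_test_data[];
extern unsigned int h264_test_data_len;

/* Push the canned bytestream over an already-connected TCP socket in
 * fixed-size chunks, sleeping 33 ms per chunk to approximate 30 fps. */
static void stream_h264(int sock, size_t buffer_size)
{
	size_t position = 0;

	while (position < h264_test_data_len) {
		size_t remaining = h264_test_data_len - position;
		size_t copy_size = MIN(remaining, buffer_size);

		if (zsock_send(sock, &h264_test_data[position], copy_size, 0) < 0) {
			break;
		}
		position += copy_size;
		k_sleep(K_MSEC(33));
	}
}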
This work needs to be done: