steel-bucket
Contributor

In raising this pull request, I confirm the following (please check boxes):

  • I have read and understood the contributors guide.
  • I have checked that another pull request for this purpose does not exist.
  • I have considered, and confirmed that this submission will be valuable to others.
  • I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
  • I give this submission freely, and claim no ownership to its content.
  • I have mentioned this change in the changelog.

My familiarity with the project is as follows (check one):

  • I have never used CCExtractor.
  • I have used CCExtractor just a couple of times.
  • I absolutely love CCExtractor, but have not contributed previously.
  • I am an active contributor to CCExtractor.

This PR resolves one TODO and one FIXME in avc_functions.c.
The TODO was: TODO: Do something if newsize == -1 (broken NAL)
My change prints a more verbose error message in that case.
The FIXME was to check the set_fts call for CCX_NAL_TYPE_SEI. Until now set_fts was only called in slice_header, that is, for NAL types CCX_NAL_TYPE_CODED_SLICE_IDR_PICTURE and CCX_NAL_TYPE_CODED_SLICE_NON_IDR_PICTURE_1.
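
To illustrate the FIXME, here is a minimal sketch of a NAL dispatch that refreshes the frame timestamp for SEI units as well as for coded slices. It is not the CCExtractor code: `timing_sketch`, `refresh_fts`, `nal_refreshes_fts` and `on_nal` are hypothetical names, and the real set_fts does considerably more than a 90 kHz to millisecond conversion. Only the numeric nal_unit_type values (1, 5, 6, 7) come from the H.264 specification.

```c
#include <assert.h>
#include <stddef.h>

/* nal_unit_type values as defined by the H.264 specification. */
enum {
    NAL_CODED_SLICE_NON_IDR = 1,
    NAL_CODED_SLICE_IDR = 5,
    NAL_SEI = 6,
    NAL_SPS = 7
};

/* Hypothetical stand-in for the decoder's timing state. */
struct timing_sketch {
    long long current_pts; /* presentation timestamp in 90 kHz ticks */
    long long fts_now_ms;  /* derived frame timestamp in milliseconds */
    int refresh_count;     /* how many times the timestamp was refreshed */
};

/* Hypothetical stand-in for set_fts: derive milliseconds from the PTS. */
static void refresh_fts(struct timing_sketch *t)
{
    assert(t != NULL);
    t->fts_now_ms = t->current_pts / 90;
    t->refresh_count++;
}

/* Returns 1 for NAL types that should refresh the timestamp. */
static int nal_refreshes_fts(int nal_unit_type)
{
    switch (nal_unit_type) {
    case NAL_CODED_SLICE_NON_IDR:
    case NAL_CODED_SLICE_IDR:
    case NAL_SEI: /* the case this sketch adds alongside the slice types */
        return 1;
    default:
        return 0;
    }
}

/* Refresh the timestamp before any type-specific parsing would run. */
static void on_nal(struct timing_sketch *t, int nal_unit_type)
{
    if (nal_refreshes_fts(nal_unit_type))
        refresh_fts(t);
}
```

With this shape, captions carried in an SEI unit are stamped with the timing of the access unit they arrive in, rather than with whatever the last slice header left behind.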

There are more TODOs in the AVC functions library that need to be resolved before or during the port to Rust.
This PR will help in debugging and resolving the AVC issues currently open in CCExtractor (#1626, #1597, #1592).

@prateekmedia prateekmedia requested a review from Copilot July 15, 2025 19:53

@Copilot Copilot AI left a comment

Pull Request Overview

This PR addresses two specific items in the AVC functions library: a TODO comment about handling broken NAL units and a FIXME comment about calling set_fts for SEI NAL units. The changes improve error handling for corrupted AVC/H.264 streams and ensure proper timestamp handling for SEI units.

  • Enhanced error messaging for corrupted NAL units with detailed explanations
  • Enabled set_fts function call for SEI NAL units to fix timestamp handling
  • Updated changelog to document the fixes

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/lib_ccx/avc_functions.c | Improved error message for corrupted NAL units and enabled set_fts call for SEI units |
| docs/CHANGES.TXT | Added changelog entry documenting the TODO and FIXME fixes |

Comment on lines +100 to +103
mprint("\rWarning: Invalid prevention bytes detected in NAL unit type %u (0x%02X). "
"This NAL unit contains an illegal byte sequence (0x000000, 0x000001, or 0x000002) or "
"improper prevention byte (0x03). "
"This may indicate a corrupted AVC/H.264 stream. NAL unit skipped.\n",

Copilot AI Jul 15, 2025

The error message incorrectly describes the failure condition. The remove_03emu function can fail for reasons other than invalid prevention bytes, such as general parsing errors or memory issues. Consider using a more generic message like 'Failed to process NAL unit type %u (0x%02X) due to emulation prevention byte removal failure.'

Suggested change
- mprint("\rWarning: Invalid prevention bytes detected in NAL unit type %u (0x%02X). "
- "This NAL unit contains an illegal byte sequence (0x000000, 0x000001, or 0x000002) or "
- "improper prevention byte (0x03). "
- "This may indicate a corrupted AVC/H.264 stream. NAL unit skipped.\n",
+ mprint("\rWarning: Failed to process NAL unit type %u (0x%02X) due to emulation prevention byte removal failure. "
+ "This may indicate a corrupted AVC/H.264 stream or an internal processing error. NAL unit skipped.\n",

Copilot uses AI. Check for mistakes.

Comment on lines +101 to +102
"This NAL unit contains an illegal byte sequence (0x000000, 0x000001, or 0x000002) or "
"improper prevention byte (0x03). "

Copilot AI Jul 15, 2025

The error message contains technical inaccuracies. The sequences 0x000000, 0x000001, and 0x000002 are not inherently illegal in H.264 streams - 0x000001 is actually the standard start code prefix. The message should focus on the actual failure rather than speculating about specific byte patterns.

Suggested change
- "This NAL unit contains an illegal byte sequence (0x000000, 0x000001, or 0x000002) or "
- "improper prevention byte (0x03). "
+ "This NAL unit contains improper prevention bytes (0x03) or other anomalies. "

Contributor Author

@steel-bucket steel-bucket Jul 15, 2025

I think this suggestion is incorrect. The comment at line 274 states that 0x000000, 0x000001, or 0x000002 shall not occur at any byte-aligned position.
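
To make the point concrete, here is a minimal sketch of emulation prevention byte removal that rejects exactly those sequences. It is not CCExtractor's remove_03emu: `strip_emulation_bytes` is a hypothetical helper written for this comment, and it assumes the rules from the NAL unit semantics of the H.264 specification, namely that inside a NAL unit 0x000000, 0x000001 and 0x000002 must not appear at a byte-aligned position, and that 0x000003 may only be followed by 0x00 to 0x03 (or end the unit). The start code prefix 0x000001 is legal between NAL units, not inside one.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical helper (not CCExtractor's remove_03emu).
 * Copies `len` bytes of one NAL unit from `in` to `out`, dropping each
 * emulation prevention byte (the 0x03 in 0x000003).
 * `out` must have room for at least `len` bytes.
 * Returns the number of bytes written, or -1 for a broken NAL unit.
 */
static long strip_emulation_bytes(const unsigned char *in, size_t len,
                                  unsigned char *out)
{
    size_t zeros = 0; /* consecutive 0x00 bytes seen just before in[i] */
    size_t n = 0;     /* bytes written to out */

    assert(in != NULL && out != NULL);

    for (size_t i = 0; i < len; i++) {
        unsigned char b = in[i];

        if (zeros >= 2) {
            if (b <= 0x02)
                return -1; /* 0x000000, 0x000001 or 0x000002 inside the NAL */
            if (b == 0x03) {
                /* Prevention byte: may end the unit or precede 0x00..0x03. */
                if (i + 1 < len && in[i + 1] > 0x03)
                    return -1;
                zeros = 0;
                continue; /* drop the prevention byte itself */
            }
        }

        out[n++] = b;
        zeros = (b == 0x00) ? zeros + 1 : 0;
    }
    return (long)n;
}
```

In this sketch, the bytes 00 00 03 01 decode to 00 00 01, while a raw 00 00 01 inside the payload returns -1, which is the condition the warning in this PR reports.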

Comment on lines 105 to +110
return;
}

dvprint("BEGIN NAL unit type: %d length %d ref_idc: %d - Buffered captions before: %d\n",
nal_unit_type, NAL_stop - NAL_start - 1, dec_ctx->avc_ctx->nal_ref_idc, !dec_ctx->avc_ctx->cc_buffer_saved);


Copilot AI Jul 15, 2025

Moving the dvprint statement after the error check means debug information won't be printed for NAL units that fail emulation prevention byte removal. This could make debugging more difficult. Consider keeping the debug print before the error check or adding a separate debug message for failed NAL units.

Suggested change
return;
}
dvprint("BEGIN NAL unit type: %d length %d ref_idc: %d - Buffered captions before: %d\n",
nal_unit_type, NAL_stop - NAL_start - 1, dec_ctx->avc_ctx->nal_ref_idc, !dec_ctx->avc_ctx->cc_buffer_saved);
dvprint("Failed to process NAL unit type: %d - Emulation prevention byte removal failed.\n",
nal_unit_type);
return;
}
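
The ordering concern can be sketched in isolation. The example below is only an illustration under stated assumptions: `debug_line` and `process_nal` are hypothetical names, the log goes to an in-memory buffer instead of dvprint, and `stripped_size` stands for the result of the prevention byte removal, with a negative value meaning failure. It shows the BEGIN line being emitted before the error check, plus a separate line for the skipped unit.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* In-memory debug log, a stand-in for dvprint output in this sketch. */
static char debug_log[256];

/* Append one line to the log, truncating if the buffer is full. */
static void debug_line(const char *msg)
{
    size_t used = strlen(debug_log);
    snprintf(debug_log + used, sizeof debug_log - used, "%s\n", msg);
}

/*
 * Hypothetical NAL entry point.
 * Returns 0 when the unit would be processed, -1 when it is skipped.
 */
static int process_nal(int nal_unit_type, long stripped_size)
{
    char line[64];

    assert(nal_unit_type >= 0);

    /* Log first, so broken units still leave a trace in the debug output. */
    snprintf(line, sizeof line, "BEGIN NAL unit type: %d", nal_unit_type);
    debug_line(line);

    if (stripped_size < 0) {
        snprintf(line, sizeof line, "SKIP NAL unit type: %d", nal_unit_type);
        debug_line(line);
        return -1;
    }
    return 0;
}
```

Either variant from the comment above fits this shape: keep the BEGIN print ahead of the check, or keep it after and rely on the dedicated skip line.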

@steel-bucket
Contributor Author

@prateekmedia I have the AVC functions scheduled for the 11th week of my plan. Should we scrap this PR, since the code will be redundant anyway?

@ccextractor-bot
Collaborator

CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results, when compared to test for commit afde4d6...:
| Report Name | Tests Passed |
| --- | --- |
| Broken | 13/13 |
| CEA-708 | 13/14 |
| DVB | 7/7 |
| DVD | 0/3 |
| DVR-MS | 2/2 |
| General | 22/27 |
| Hauppage | 3/3 |
| MP4 | 3/3 |
| NoCC | 10/10 |
| Options | 83/86 |
| Teletext | 21/21 |
| WTV | 13/13 |
| XDS | 33/34 |

Your PR breaks these cases:

NOTE: The following tests have been failing on the master branch as well as the PR:

  • ccextractor --service 1 --out=txt f17524b53f..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 5ae2007a79..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 1e44efd810..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 add511677c..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 e9b9008fdf..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 27e46255f0..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 1974a299f0..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 132d7df7e9..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 99e5eaafdc..., Last passed: Never
  • ccextractor --unicode c83f765c66..., Last passed: Never
  • ccextractor --in=ps e9b9008fdf..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 0069dffd21..., Last passed: Never


It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.

@ccextractor-bot
Collaborator

CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results, when compared to test for commit afde4d6...:
| Report Name | Tests Passed |
| --- | --- |
| Broken | 13/13 |
| CEA-708 | 13/14 |
| DVB | 2/7 |
| DVD | 0/3 |
| DVR-MS | 2/2 |
| General | 22/27 |
| Hauppage | 3/3 |
| MP4 | 3/3 |
| NoCC | 10/10 |
| Options | 82/86 |
| Teletext | 7/21 |
| WTV | 10/13 |
| XDS | 33/34 |

Your PR breaks these cases:

NOTE: The following tests have been failing on the master branch as well as the PR:

  • ccextractor --service 1 --out=txt f17524b53f..., Last passed: Never
  • ccextractor --stdout --quiet --no-fontcolor 79a51f3500..., Last passed: Never
  • ccextractor --stdout --quiet --no-fontcolor 767b546f96..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 5ae2007a79..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 1e44efd810..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 add511677c..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 e9b9008fdf..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 27e46255f0..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 1974a299f0..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 132d7df7e9..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 99e5eaafdc..., Last passed: Never
  • ccextractor --unicode c83f765c66..., Last passed: Never
  • ccextractor --in=ps e9b9008fdf..., Last passed: Never
  • ccextractor --out=srt --latin1 f23a544ba8..., Last passed: Never
  • ccextractor --out=srt --latin1 10f0f77cf4..., Last passed: Test 5913
  • ccextractor --out=srt --latin1 df3b4d62d3..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 0069dffd21..., Last passed: Never


It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.
