
@trufio465-bot

[FIX]

In raising this pull request, I confirm the following (please check boxes):

  • I have read and understood the contributors guide.
  • I have checked that another pull request for this purpose does not exist.
  • I have considered, and confirmed that this submission will be valuable to others.
  • I accept that this submission may not be used, and the pull request closed at the will of the maintainer.
  • I give this submission freely, and claim no ownership to its content.
  • I have mentioned this change in the changelog.

My familiarity with the project is as follows (check one):

  • I have never used CCExtractor.
  • I have used CCExtractor just a couple of times.
  • I absolutely love CCExtractor, but have not contributed previously.
  • I am an active contributor to CCExtractor.

This PR fixes an issue where CCExtractor failed to extract EIA-608 captions from HEVC (H.265) transport streams.

Problem:
CCExtractor's transport stream parser did not recognize HEVC (stream type 0x24) as a valid video stream type capable of carrying embedded captions. This resulted in "No captions were found in input" for HEVC files, even when captions were present and playable in other media players.

Solution:
Comprehensive HEVC support has been added across the codebase:

  1. Defined CCX_STREAM_TYPE_VIDEO_HEVC (0x24) and CCX_HEVC buffer type.
  2. Updated the TS parser to recognize HEVC streams during PMT processing and caption detection.
  3. Integrated HEVC streams into the existing AVC (H.264) caption processing pipeline, as both use similar SEI mechanisms for caption embedding.
  4. Ensured HEVC streams are correctly handled in stream information, buffer management, and sequencing logic.
  5. Updated Rust bindings for consistency.

Impact:
CCExtractor can now successfully detect and extract EIA-608 captions from HEVC transport stream files.
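
To make the PMT recognition step concrete, here is a minimal sketch of scanning a PMT elementary-stream loop for a video stream that may carry embedded captions. The stream type values (0x02, 0x1B, 0x24) and the 5-byte entry layout come from the MPEG-2 Systems specification; the enum and function names are illustrative and are not taken from the CCExtractor source.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Stream type values as assigned by MPEG-2 Systems; names are illustrative. */
enum {
    STREAM_TYPE_VIDEO_MPEG2 = 0x02,
    STREAM_TYPE_VIDEO_H264  = 0x1B,
    STREAM_TYPE_VIDEO_HEVC  = 0x24
};

/* Returns 1 if the stream type is a video codec that can embed captions. */
int is_caption_video_type(uint8_t stream_type)
{
    return stream_type == STREAM_TYPE_VIDEO_MPEG2 ||
           stream_type == STREAM_TYPE_VIDEO_H264 ||
           stream_type == STREAM_TYPE_VIDEO_HEVC;
}

/*
 * Walks the elementary-stream loop of a PMT section (the bytes after the
 * program descriptors, CRC excluded) and returns the PID of the first
 * caption-capable video stream, or -1 if there is none.
 */
int first_video_pid(const uint8_t *es_loop, size_t len)
{
    size_t pos = 0;

    assert(es_loop != NULL || len == 0);
    while (pos + 5 <= len) {
        uint8_t stream_type = es_loop[pos];
        /* 13-bit elementary_PID */
        int pid = ((es_loop[pos + 1] & 0x1F) << 8) | es_loop[pos + 2];
        /* 12-bit ES_info_length: descriptors to skip after the entry */
        size_t es_info_len =
            ((size_t)(es_loop[pos + 3] & 0x0F) << 8) | es_loop[pos + 4];

        if (is_caption_video_type(stream_type))
            return pid;
        pos += 5 + es_info_len;
    }
    return -1;
}
```

With a loop that holds an AAC audio entry (type 0x0F) followed by an HEVC entry on PID 0x100, `first_video_pid` returns 0x100. Without the 0x24 case in the predicate, the same input would yield -1, which matches the "no captions found" symptom described above.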

@ccextractor-bot
Collaborator

CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results, when compared to test for commit a34ba0f...:
Report Name   Tests Passed
Broken        13/13
CEA-708       13/14
DVB           4/7
DVD           3/3
DVR-MS        2/2
General       27/27
Hauppage      3/3
MP4           3/3
NoCC          10/10
Options       86/86
Teletext      21/21
WTV           10/13
XDS           34/34

Your PR breaks these cases:

NOTE: The following tests have been failing on the master branch as well as the PR:

  • ccextractor --stdout --quiet --no-fontcolor 79a51f3500..., Last passed: Never
  • ccextractor --stdout --quiet --no-fontcolor 767b546f96..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2..., Last passed: Never
  • ccextractor --out=srt --latin1 f23a544ba8..., Last passed: Never
  • ccextractor --out=srt --latin1 10f0f77cf4..., Last passed: Test 5993
  • ccextractor --out=srt --latin1 df3b4d62d3..., Last passed: Never


It seems that not all tests passed. This indicates that the output of some files is not as expected (though it may match what you intended).

Check the result page for more info.

@cfsmp3
Contributor

cfsmp3 commented Dec 14, 2025

@trufio465-bot Do you have a sample for this that you can share?

Contributor

@cfsmp3 cfsmp3 left a comment


Thank you for working on HEVC support. I've done a deep review and found some issues:

Current Status

  • ❌ Has merge conflicts with current master
  • ❌ Breaks one CEA-708 test (sample 115) which is unrelated to HEVC
  • ⚠️ Master already has partial HEVC support (merged after this PR was created)

What Master Already Has

Since this PR was created, master has gained:

  • CCX_STREAM_TYPE_VIDEO_HEVC = 0x24 in the stream type enum
  • HEVC handling in get_buffer_type() (returns CCX_H264 to reuse the AVC path)
  • PMT parsing for HEVC streams in ts_tables.c
  • HEVC description in init_ts()

Valuable Changes That Master May Still Need

Your PR correctly identifies that several functions only check for MPEG2 when they should also check for H.264 and HEVC:

  1. ts_info.c:get_video_stream() - Currently only returns PIDs for MPEG2 streams, missing H.264 and HEVC
  2. ts_functions.c:copy_payload_to_capbuf() - The "don't ignore if analyzing video" check only exempts MPEG2
  3. ts_functions.c:ts_readstream() - Same issue

These could cause problems when --analyze_video_stream is used with H.264/HEVC content.
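
A hypothetical sketch of the kind of check being discussed. The actual condition in ts_functions.c is not quoted in this review, so the function names, parameters, and logic below are assumptions meant only to show why an MPEG2-only exemption would drop H.264 and HEVC payloads when video analysis is requested.

```c
#include <assert.h>
#include <stdint.h>

/* Stream type values from MPEG-2 Systems; names are illustrative. */
enum { TYPE_MPEG2 = 0x02, TYPE_H264 = 0x1B, TYPE_HEVC = 0x24 };

/* Returns 1 for video codecs whose payload should be kept for analysis. */
int is_analyzable_video(uint8_t stream_type)
{
    return stream_type == TYPE_MPEG2 ||
           stream_type == TYPE_H264 ||
           stream_type == TYPE_HEVC;
}

/*
 * Returns 1 when the payload should be dropped. A PID marked as ignored is
 * still kept if video analysis is enabled and the PID carries video.
 * (Assumed logic: an MPEG2-only version would compare against TYPE_MPEG2
 * alone and therefore skip H.264 and HEVC payloads.)
 */
int should_skip_payload(int pid_ignored, uint8_t stream_type, int analyze_video)
{
    if (!pid_ignored)
        return 0;
    return !(analyze_video && is_analyzable_video(stream_type));
}
```

Under these assumptions, an ignored HEVC PID is kept when analysis is on (`should_skip_payload(1, 0x24, 1)` is 0) and dropped when it is off.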

Suggested Approach

Rather than adding a new CCX_HEVC buffer type, I recommend keeping master's approach of routing HEVC through the CCX_H264 code path (since both use similar SEI mechanisms for caption embedding). The fixes needed are simpler:

  1. Update get_video_stream() to include H.264 and HEVC
  2. Update copy_payload_to_capbuf() check to include H.264 and HEVC
  3. Update ts_readstream() similarly

This avoids adding new enum values and separate handling in process_data() for what is essentially the same processing path.
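
As a rough illustration of that routing, the sketch below maps both H.264 and HEVC stream types to a single buffer kind. The enum and function names are placeholders, not the real get_buffer_type() signature or the real buffer type enum, which has more members.

```c
#include <assert.h>
#include <stdint.h>

/* Illustrative buffer kinds; stand-ins for the project's own enum. */
typedef enum { BUF_UNKNOWN = 0, BUF_MPEG2, BUF_H264 } buffer_kind;

/* Maps a PMT stream type to the processing path that should handle it. */
buffer_kind buffer_kind_for(uint8_t stream_type)
{
    switch (stream_type) {
    case 0x02: /* MPEG-2 video */
        return BUF_MPEG2;
    case 0x1B: /* H.264 */
    case 0x24: /* HEVC: reuse the H.264 path instead of adding a new kind */
        return BUF_H264;
    default:
        return BUF_UNKNOWN;
    }
}
```

The trade-off is that downstream code sees one buffer kind for both codecs, so any step that depends on codec-specific syntax has to learn which codec it is dealing with some other way.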

Questions

  1. Do you have an HEVC sample with captions you can share for testing? (Asked previously on Dec 14)
  2. Can you investigate why the CEA-708 MPEG-PS test (sample 115) fails with your changes? This file is not HEVC, so the regression suggests an unintended side effect.

Next Steps

If you'd like to continue with this PR:

  1. Rebase onto current master
  2. Remove the CCX_HEVC buffer type addition (master's CCX_H264 reuse is cleaner)
  3. Keep just the 3 function fixes mentioned above
  4. Investigate the test regression

Alternatively, I can create a simpler PR with just the needed fixes based on your analysis, and credit you for identifying the issue.

cfsmp3 added a commit that referenced this pull request Dec 19, 2025
Fixes #1690 - Captions fail to extract on HEVC video stream

HEVC video streams with embedded EIA-608/708 captions weren't being
extracted, even though VLC/MPV could display them.

Root causes fixed:
1. HEVC stream type (0x24) wasn't recognized for CC extraction
2. HEVC NAL parsing used H.264 format (1-byte) instead of HEVC (2-byte)
3. HEVC SEI types (39/40) weren't handled (only H.264 SEI type 6)
4. CC data accumulation across SEIs caused u8 overflow/garbled output

Changes:
- C code: Add HEVC stream detection, CCX_HEVC buffer type, is_hevc flag
- Rust code: HEVC NAL header parsing (2-byte, type=(byte[0]>>1)&0x3F),
  HEVC SEI handling (PREFIX_SEI=39, SUFFIX_SEI=40), immediate CC flush

Thanks to @trufio465-bot for the initial research in PR #1735.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <[email protected]>
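
For readers unfamiliar with the header difference named in root causes 2 and 3, here is a minimal C sketch. The bit layouts and the SEI type numbers (6, 39, 40) match the commit message and the H.264/HEVC specifications; the function names and the `is_hevc` parameter usage are illustrative, and the actual fix in the commit is in Rust.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

enum { H264_NAL_SEI = 6, HEVC_NAL_PREFIX_SEI = 39, HEVC_NAL_SUFFIX_SEI = 40 };

/* H.264: 1-byte NAL header, nal_unit_type is the low 5 bits. */
int h264_nal_type(const uint8_t *nal, size_t len)
{
    return len >= 1 ? (nal[0] & 0x1F) : -1;
}

/* HEVC: 2-byte NAL header, nal_unit_type is bits 1..6 of the first byte. */
int hevc_nal_type(const uint8_t *nal, size_t len)
{
    return len >= 2 ? ((nal[0] >> 1) & 0x3F) : -1;
}

/*
 * Returns the offset of the SEI payload (just past the NAL header),
 * or 0 if the NAL unit is not an SEI for the given codec.
 */
size_t sei_payload_offset(const uint8_t *nal, size_t len, int is_hevc)
{
    assert(nal != NULL);
    if (is_hevc) {
        int type = hevc_nal_type(nal, len);
        return (type == HEVC_NAL_PREFIX_SEI || type == HEVC_NAL_SUFFIX_SEI)
                   ? 2 : 0;
    }
    return h264_nal_type(nal, len) == H264_NAL_SEI ? 1 : 0;
}
```

An HEVC prefix SEI typically starts with the bytes 0x4E 0x01, and `hevc_nal_type` reads that as 39. Reading the same first byte with the H.264 mask gives 14, so an H.264-only parser never sees it as an SEI.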
@cfsmp3
Contributor

cfsmp3 commented Dec 19, 2025

Closing in favor of #1852 which provides a complete fix for issue #1690. Thank you @trufio465-bot for the initial research - it helped identify the key areas that needed modification.

@cfsmp3 cfsmp3 closed this Dec 19, 2025
cfsmp3 added a commit that referenced this pull request Dec 20, 2025
Fixes #1690 - Captions fail to extract on HEVC video stream
