fix(ts): Skip broken PES packets instead of terminating file processing #1858
Fixes #1690 - Captions fail to extract on HEVC video stream

HEVC video streams with embedded EIA-608/708 captions weren't being extracted, even though VLC/MPV could display them.

Root causes fixed:
1. HEVC stream type (0x24) wasn't recognized for CC extraction
2. HEVC NAL parsing used H.264 format (1-byte) instead of HEVC (2-byte)
3. HEVC SEI types (39/40) weren't handled (only H.264 SEI type 6)
4. CC data accumulation across SEIs caused u8 overflow/garbled output

Changes:
- C code: Add HEVC stream detection, CCX_HEVC buffer type, is_hevc flag
- Rust code: HEVC NAL header parsing (2-byte, type=(byte[0]>>1)&0x3F), HEVC SEI handling (PREFIX_SEI=39, SUFFIX_SEI=40), immediate CC flush

Thanks to @trufio465-bot for the initial research in PR #1735.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
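The NAL header difference named in this commit message can be illustrated with a minimal C sketch. It is not the PR's code (the PR does this parsing in Rust); the function and enum names below are hypothetical, and only the bit layout and the SEI type values 39/40 come from the commit message and the H.264/HEVC header formats.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* SEI NAL unit types quoted in the commit message (names are illustrative). */
enum { DEMO_HEVC_PREFIX_SEI = 39, DEMO_HEVC_SUFFIX_SEI = 40 };

/* H.264: 1-byte NAL header, nal_unit_type is the low 5 bits. */
static int demo_h264_nal_type(const uint8_t *nal, size_t len)
{
    assert(nal != NULL);
    if (len < 1)
        return -1;
    return nal[0] & 0x1F;
}

/* HEVC: 2-byte NAL header, nal_unit_type is bits 1..6 of the first byte. */
static int demo_hevc_nal_type(const uint8_t *nal, size_t len)
{
    assert(nal != NULL);
    if (len < 2)
        return -1; /* header is truncated */
    return (nal[0] >> 1) & 0x3F;
}

/* Returns 1 if the HEVC NAL unit is a prefix or suffix SEI, else 0. */
static int demo_hevc_nal_is_sei(const uint8_t *nal, size_t len)
{
    int type = demo_hevc_nal_type(nal, len);
    return type == DEMO_HEVC_PREFIX_SEI || type == DEMO_HEVC_SUFFIX_SEI;
}
```

Reading an HEVC prefix SEI header (0x4E 0x01) with the H.264 mask yields type 14 rather than 39, which is why the SEI payload was never reached with the 1-byte parser.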
HEVC uses B-frames extensively, causing CC data to arrive in decode order instead of presentation order. This was causing character pairs to be scrambled (e.g., "MEDIOCRE" became "MIOEDCRE").

Changes:
- Implement PTS-based sequence numbering for HEVC CC data (similar to H.264)
- Change flush logic to only trigger on IDR frames (not every VCL NAL)
- Add HEVC fallback detection for streams without PAT/PMT

Fixes #1639 (ATSC 3.0 HEVC caption extraction)

Tested with issue_1639_sample.ts and caption_test_1690.ts

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
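The reordering idea in this commit message can be sketched in C as follows. This is an illustration under stated assumptions, not the PR's implementation: the `demo_cc_entry` type and `demo_cc_flush_in_pts_order` function are hypothetical, and the sketch simply buffers caption byte pairs with the PTS of their frame and sorts them when a flush point (an IDR frame, per the commit message) is reached.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* One buffered caption byte pair (hypothetical layout). */
typedef struct {
    int64_t pts;     /* presentation timestamp of the carrying frame */
    size_t arrival;  /* decode-order index, used to break PTS ties */
    uint8_t cc[2];   /* the two caption bytes */
} demo_cc_entry;

/* Order by PTS first, then by arrival so equal-PTS pairs keep their order. */
static int demo_cc_entry_cmp(const void *a, const void *b)
{
    const demo_cc_entry *x = a;
    const demo_cc_entry *y = b;
    if (x->pts != y->pts)
        return x->pts < y->pts ? -1 : 1;
    if (x->arrival != y->arrival)
        return x->arrival < y->arrival ? -1 : 1;
    return 0;
}

/* Sort the buffered pairs into presentation order and copy their bytes
 * to out. Returns the number of bytes written (at most out_cap). */
static size_t demo_cc_flush_in_pts_order(demo_cc_entry *entries, size_t n,
                                         uint8_t *out, size_t out_cap)
{
    size_t written = 0;
    assert(entries != NULL && out != NULL);
    qsort(entries, n, sizeof *entries, demo_cc_entry_cmp);
    for (size_t i = 0; i < n && written + 2 <= out_cap; i++) {
        memcpy(out + written, entries[i].cc, 2);
        written += 2;
    }
    return written;
}
```

Flushing only at IDR frames matters here because no frame after an IDR can have an earlier PTS than the frames before it, so the buffered set is complete when it is sorted.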
The HEVC NAL type constants are defined for completeness and reference, but not all are currently used in the codebase.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
Fixes #1455

When read_video_pes_header() encounters a malformed or truncated PES packet (returns -1), copy_capbuf_demux_data() previously returned CCX_EOF which terminated the entire file processing. This was overly aggressive - a single broken PES packet should be skipped, not terminate the entire file.

UK Freeview DVB recordings from September 2022 onwards contain some malformed PES packets in the DVB subtitle stream that triggered this condition, causing ccextractor to stop at 0% with "Processing ended prematurely" error even though VLC could display the subtitles.

The fix changes the error handling to skip the broken packet and continue processing:
- Before: return CCX_EOF (terminates file)
- After: return CCX_OK (skips packet, continues)

Test results with UK Freeview sample:
- Before: 0% processed, 0 subtitles extracted
- After: 100% processed, 10 subtitles extracted correctly

🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <[email protected]>
CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results, when compared to test for commit b9aabcd...:
Your PR breaks these cases:
NOTE: The following tests have been failing on the master branch as well as the PR:
Congratulations: Merging this PR would fix the following tests:
It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you). Check the result page for more info.
CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results, when compared to test for commit b9aabcd...:
Your PR breaks these cases:
NOTE: The following tests have been failing on the master branch as well as the PR:
Congratulations: Merging this PR would fix the following tests:
It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you). Check the result page for more info.
Summary
Problem
UK Freeview DVB recordings from September 2022 onwards would fail at 0% with a "Processing ended prematurely" error.
VLC and other players could display the subtitles correctly, but ccextractor would terminate immediately.
Root Cause
When read_video_pes_header() encounters a malformed or truncated PES packet, it returns -1. The calling function copy_capbuf_demux_data() then returned CCX_EOF (-101), which terminated the entire file. This was overly aggressive - a single broken PES packet should be skipped, not terminate the file.
Fix
Changed error handling to skip broken packets and continue:
- Before: return CCX_EOF (terminates file)
- After: return CCX_OK (skips packet, continues)
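The effect of that change can be shown with a small, self-contained C sketch. It does not reproduce the actual copy_capbuf_demux_data() code: every `demo_` name is hypothetical, the header check is a simplified stand-in for read_video_pes_header(), and the value 0 for the "OK" status is an assumption (only -101 for CCX_EOF is stated in this PR). The `skip_broken` flag switches between the old and new behaviour.

```c
#include <assert.h>
#include <stddef.h>

/* Status codes modelled on CCX_OK / CCX_EOF; DEMO_OK = 0 is an assumption. */
enum { DEMO_OK = 0, DEMO_EOF = -101 };

typedef struct { const unsigned char *data; size_t len; } demo_packet;
typedef struct { size_t copied; size_t skipped; } demo_stats;

/* Simplified stand-in for read_video_pes_header(): returns the header
 * length, or -1 if the packet is truncated or lacks the 00 00 01 prefix. */
static int demo_read_pes_header(const demo_packet *p)
{
    size_t hdr;
    if (p->len < 9)
        return -1;
    if (p->data[0] != 0x00 || p->data[1] != 0x00 || p->data[2] != 0x01)
        return -1;
    hdr = 9 + (size_t)p->data[8]; /* fixed part + PES_header_data_length */
    if (hdr > p->len)
        return -1;
    return (int)hdr;
}

/* Handles one packet. With skip_broken == 0 a bad header ends processing
 * (old behaviour); with skip_broken != 0 it is counted and skipped. */
static int demo_copy_packet(const demo_packet *p, demo_stats *st, int skip_broken)
{
    assert(p != NULL && st != NULL);
    if (demo_read_pes_header(p) < 0) {
        if (!skip_broken)
            return DEMO_EOF;
        st->skipped++;
        return DEMO_OK;
    }
    st->copied++;
    return DEMO_OK;
}

/* Feeds packets until one returns a non-OK status; returns how many
 * packets were consumed before processing stopped. */
static size_t demo_process(const demo_packet *pkts, size_t n,
                           demo_stats *st, int skip_broken)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (demo_copy_packet(&pkts[i], st, skip_broken) != DEMO_OK)
            break;
    }
    return i;
}
```

With a broken packet at the start of the input, the old behaviour stops before consuming anything, which matches the "0% processed" symptom, while the new behaviour skips that one packet and continues through the rest.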
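Test Results
With the UK Freeview sample:
- Before: 0% processed, 0 subtitles extracted
- After: 100% processed, 10 subtitles extracted correctly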
Sample subtitle extracted: "of eight 400 ounce gold bars that were stolen in 1998."
All 299 Rust tests pass.
Test plan
🤖 Generated with Claude Code