Skip to content

Conversation

@Rahul-2k4
Copy link
Contributor

@Rahul-2k4 Rahul-2k4 commented Dec 17, 2025

Description

This PR adds support for extracting multiple DVB subtitle streams into separate output files via a new opt-in flag: --split-dvb-subs.

It resolves the long-standing request in #447, where users want CCExtractor to automatically handle transport streams containing multiple DVB subtitle tracks (e.g. multi-language broadcasts) without pre-selecting PIDs or running the tool multiple times.

Default behavior is unchanged unless the flag is explicitly enabled.

### Motivation

  • Many DVB transport streams contain more than one subtitle stream (often per language).
  • Currently, CCExtractor extracts only a single stream, which makes multi-language workflows cumbersome.
  • With --split-dvb-subs, CCExtractor:
  • Discovers all DVB subtitle streams via PMT parsing
  • Creates isolated per-PID decoder pipelines
  • Produces one output file per stream (PID or language-suffixed)
  • This makes CCExtractor significantly more convenient for real-world DVB content.

Summary of Changes

  • Added --split-dvb-subs CLI option (disabled by default)
  • Enhanced PMT parsing to discover multiple DVB subtitle streams
  • Refactored DVB subtitle decoder to be instance-safe (no static globals)
  • Introduced per-stream pipelines (decoder + timing + encoder + writer)
  • Implemented PID-based packet routing
  • Added strict option-conflict validation
  • Preserved legacy single-stream behavior

Testing

Built and tested with and without --split-dvb-subs
Verified extraction on DVB TS files containing multiple subtitle streams

Confirmed:

  • One output file per stream
  • Bit-exact legacy behavior when the flag is not used
  • Clean shutdown with proper resource cleanup

Scope / Notes

  • This PR intentionally limits scope to DVB subtitle streams only
  • Teletext multi-stream extraction is not included and can be addressed separately
  • The design is extensible for future language filtering or stream selection

Closing

I’ve tried to keep this change conservative, backward-compatible, and well-isolated from existing logic.
Feedback is very welcome I’m happy to revise or split the PR if needed.

Rahul-2k4 and others added 6 commits December 17, 2025 00:18
- Add missing fields to ccx_decoders_dvb_context: private_data, cfg, initialized_ocr
- Add dvb_decoder_ctx field to ccx_stream_metadata
- Add language field to cap_info structure
- Add split_dvb_subs field to lib_ccx_ctx
- Initialize split_dvb_subs from options in init_libraries
- Fix all references to use correct struct field names (lang vs language, stream_pid vs pid)
- Update Rust bindings to include new language field in cap_info
- Match dvb_init_decoder, dvb_free_decoder, and dvb_decode signatures between header and implementation

Co-authored-by: Rahul-2k4 <[email protected]>
Apply clang-format and rustfmt formatting fixes
@ccextractor-bot
Copy link
Collaborator

CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results, when compared to test for commit f6cb862...:
Report Name Tests Passed
Broken 13/13
CEA-708 14/14
DVB 5/7
DVD 3/3
DVR-MS 2/2
General 27/27
Hardsubx 0/1
Hauppage 3/3
MP4 3/3
NoCC 10/10
Options 75/86
Teletext 3/21
WTV 13/13
XDS 34/34

Your PR breaks these cases:

NOTE: The following tests have been failing on the master branch as well as the PR:

Congratulations: Merging this PR would fix the following tests:

  • ccextractor --out=srt --latin1 --autoprogram 73d9313d64..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 8e8229b88b..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 7236304cfc..., Last passed: Never
  • ccextractor --out=srt --latin1 06b3a9237d..., Last passed: Never
  • ccextractor --out=srt --latin1 83f8cceb74..., Last passed: Never
  • ccextractor --out=srt --latin1 611b4a9235..., Last passed: Never
  • ccextractor --out=srt --latin1 b46e9e8e3f..., Last passed: Never
  • ccextractor --out=srt --latin1 89e417e622..., Last passed: Never
  • ccextractor --out=srt --latin1 d59eadc4ed..., Last passed: Never
  • ccextractor --out=sami --latin1 --autoprogram --no-goptime 5b4e0a6034..., Last passed: Never
  • ccextractor --service 1 --out=ttxt da904de35d..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 5ae2007a79..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 1e44efd810..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 add511677c..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 9a496d3828..., Last passed: Never
  • ccextractor --out=srt --latin1 --autoprogram 56c9f34548..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 e9b9008fdf..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 c032183ef0..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 d037c7509e..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 1974a299f0..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 132d7df7e9..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 99e5eaafdc..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 b22260d065..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla 7aad20907e..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla c41f73056a..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla 5d3a29f9f8..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla 70000200c0..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla 6dc772d881..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla adce82fd39..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 01509e4d27..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla ab9cf8cfad..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --output-field 2 5d3a29f9f8..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --output-field 2 c41f73056a..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --sentencecap c032183ef0..., Last passed: Never
  • ccextractor --autoprogram --out=bin --latin1 c032183ef0..., Last passed: Never
  • ccextractor --hauppauge --autoprogram --out=srt --latin1 a03b5b2a56..., Last passed: Never
  • ccextractor --autoprogram --out=srt --hauppauge --latin1 553d78e755..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --hauppauge --ucla --latin1 553d78e755..., Last passed: Never
  • ccextractor --out=ttxt c83f765c66..., Last passed: Never
  • ccextractor --goptime c83f765c66..., Last passed: Never
  • ccextractor --in=es dc7169d7c4..., Last passed: Never
  • ccextractor --in=wtv b46e9e8e3f..., Last passed: Never
  • ccextractor --wtvmpeg2 10f0f77cf4..., Last passed: Never
  • ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --endcreditstext "CCextractor Ends crdit Testing" addf5e2fc9..., Last passed: Never
  • ccextractor --endcreditsforatleast 3 --endcreditstext "CCextractor Ends crdit Testing" addf5e2fc9..., Last passed: Never
  • ccextractor --endcreditsforatmost 2 --endcreditstext "CCextractor Ends crdit Testing" addf5e2fc9..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 c0d2fba8c0..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 297a44921a..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 8c1615c1a8..., Last passed: Never
  • ccextractor --out=srt --latin1 f23a544ba8..., Last passed: Never
  • ccextractor --out=srt --latin1 97cc394d87..., Last passed: Never
  • ccextractor --out=srt --latin1 10f0f77cf4..., Last passed: Never
  • ccextractor --out=srt --latin1 df3b4d62d3..., Last passed: Never
  • ccextractor --out=srt --latin1 d7e7dbdf68..., Last passed: Never
  • ccextractor --out=srt --latin1 76734ac4a7..., Last passed: Never
  • ccextractor --out=srt --latin1 c791382c94..., Last passed: Never
  • ccextractor --out=srt --latin1 f673b2f916..., Last passed: Never
  • ccextractor --out=srt --latin1 da75bdee47..., Last passed: Never
  • ccextractor --out=srt --latin1 bd6f33a669..., Last passed: Never
  • ccextractor --out=srt --latin1 0e5e6b26be..., Last passed: Never
  • ccextractor --out=srt --latin1 a226cc302d..., Last passed: Never
  • ccextractor --out=srt --latin1 ae6327683e..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 725a49f871..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --xds --ucla c813e713a0..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds b992e0cccb..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds d0291cdcf6..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds c8dc039a88..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --ucla 53339f3455..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 83b03036a2..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 7d3f25c32c..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --ucla 7d3f25c32c..., Last passed: Never

It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.

@ccextractor-bot
Copy link
Collaborator

CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results, when compared to test for commit f6cb862...:
Report Name Tests Passed
Broken 12/13
CEA-708 14/14
DVB 5/7
DVD 3/3
DVR-MS 2/2
General 27/27
Hardsubx 0/1
Hauppage 3/3
MP4 3/3
NoCC 10/10
Options 75/86
Teletext 0/21
WTV 13/13
XDS 34/34

Your PR breaks these cases:

NOTE: The following tests have been failing on the master branch as well as the PR:

Congratulations: Merging this PR would fix the following tests:

  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 8e8229b88b..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 7236304cfc..., Last passed: Never
  • ccextractor --out=srt --latin1 06b3a9237d..., Last passed: Never
  • ccextractor --out=srt --latin1 83f8cceb74..., Last passed: Never
  • ccextractor --out=srt --latin1 611b4a9235..., Last passed: Never
  • ccextractor --out=srt --latin1 b46e9e8e3f..., Last passed: Never
  • ccextractor --out=srt --latin1 89e417e622..., Last passed: Never
  • ccextractor --out=srt --latin1 d59eadc4ed..., Last passed: Never
  • ccextractor --out=sami --latin1 --autoprogram --no-goptime 5b4e0a6034..., Last passed: Never
  • ccextractor --service 1 --out=ttxt da904de35d..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 5ae2007a79..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 1e44efd810..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 add511677c..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 9a496d3828..., Last passed: Never
  • ccextractor --out=srt --latin1 --autoprogram 56c9f34548..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 e9b9008fdf..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 c032183ef0..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 d037c7509e..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 1974a299f0..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 132d7df7e9..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 99e5eaafdc..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 b22260d065..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla 7aad20907e..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla c41f73056a..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla 5d3a29f9f8..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla 70000200c0..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla 6dc772d881..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla adce82fd39..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 01509e4d27..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla ab9cf8cfad..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --output-field 2 5d3a29f9f8..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --output-field 2 c41f73056a..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --sentencecap c032183ef0..., Last passed: Never
  • ccextractor --autoprogram --out=bin --latin1 c032183ef0..., Last passed: Never
  • ccextractor --hauppauge --autoprogram --out=srt --latin1 a03b5b2a56..., Last passed: Never
  • ccextractor --autoprogram --out=srt --hauppauge --latin1 553d78e755..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --hauppauge --ucla --latin1 553d78e755..., Last passed: Never
  • ccextractor --out=ttxt c83f765c66..., Last passed: Never
  • ccextractor --goptime c83f765c66..., Last passed: Never
  • ccextractor --in=es dc7169d7c4..., Last passed: Never
  • ccextractor --in=wtv b46e9e8e3f..., Last passed: Never
  • ccextractor --wtvmpeg2 10f0f77cf4..., Last passed: Never
  • ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --endcreditstext "CCextractor Ends crdit Testing" addf5e2fc9..., Last passed: Never
  • ccextractor --endcreditsforatleast 3 --endcreditstext "CCextractor Ends crdit Testing" addf5e2fc9..., Last passed: Never
  • ccextractor --endcreditsforatmost 2 --endcreditstext "CCextractor Ends crdit Testing" addf5e2fc9..., Last passed: Never
  • ccextractor --out=srt --latin1 f23a544ba8..., Last passed: Never
  • ccextractor --out=srt --latin1 97cc394d87..., Last passed: Never
  • ccextractor --out=srt --latin1 10f0f77cf4..., Last passed: Never
  • ccextractor --out=srt --latin1 df3b4d62d3..., Last passed: Never
  • ccextractor --out=srt --latin1 d7e7dbdf68..., Last passed: Never
  • ccextractor --out=srt --latin1 76734ac4a7..., Last passed: Never
  • ccextractor --out=srt --latin1 c791382c94..., Last passed: Never
  • ccextractor --out=srt --latin1 f673b2f916..., Last passed: Never
  • ccextractor --out=srt --latin1 da75bdee47..., Last passed: Never
  • ccextractor --out=srt --latin1 bd6f33a669..., Last passed: Never
  • ccextractor --out=srt --latin1 0e5e6b26be..., Last passed: Never
  • ccextractor --out=srt --latin1 a226cc302d..., Last passed: Never
  • ccextractor --out=srt --latin1 ae6327683e..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 725a49f871..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --xds --ucla c813e713a0..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds b992e0cccb..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds d0291cdcf6..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds c8dc039a88..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --ucla 53339f3455..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 83b03036a2..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla --xds 7d3f25c32c..., Last passed: Never
  • ccextractor --autoprogram --out=srt --latin1 --ucla 7d3f25c32c..., Last passed: Never

It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.

@Rahul-2k4 Rahul-2k4 deleted the branch CCExtractor:master December 18, 2025 18:40
@Rahul-2k4 Rahul-2k4 closed this Dec 18, 2025
@Rahul-2k4 Rahul-2k4 deleted the master branch December 18, 2025 18:40
@Rahul-2k4 Rahul-2k4 restored the master branch December 18, 2025 18:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants