feat(rust): Add persistent DtvccRust context for CEA-708 decoder (Pha…#1782

Merged
cfsmp3 merged 10 commits into CCExtractor:master from cfsmp3:fix/1499-dtvcc-persistent-state
Dec 31, 2025

cfsmp3 (Contributor) commented Dec 8, 2025

Fix CEA-708 Decoder State Persistence (Issue #1499)

This PR fully implements the fix for issue #1499. The Rust CEA-708 decoder was creating a new Dtvcc struct on every call to ccxr_process_cc_data(), causing all state to be reset and breaking stateful caption processing.

Based on #1618, which doesn't merge cleanly and contains implementations that differ from current code.
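
The failure mode described above can be illustrated with a minimal sketch. `ToyDecoder` and both `process_*` functions are hypothetical stand-ins, not the CCExtractor types; the only point is that a decoder rebuilt on every call never sees data carried over from the previous call.

```rust
/// Hypothetical stand-in for a stateful caption decoder (not the real Dtvcc type).
#[derive(Default)]
struct ToyDecoder {
    /// Bytes of a packet that spans several calls.
    pending: Vec<u8>,
}

impl ToyDecoder {
    /// Buffers `chunk` and returns a packet once `packet_len` bytes are available.
    fn feed(&mut self, chunk: &[u8], packet_len: usize) -> Option<Vec<u8>> {
        self.pending.extend_from_slice(chunk);
        if self.pending.len() >= packet_len {
            Some(self.pending.drain(..packet_len).collect())
        } else {
            None
        }
    }
}

/// Buggy pattern: a fresh decoder per call, so partial packets are lost.
fn process_with_fresh_state(chunks: &[&[u8]], packet_len: usize) -> usize {
    let mut completed = 0;
    for chunk in chunks {
        let mut decoder = ToyDecoder::default(); // state is reset here
        if decoder.feed(chunk, packet_len).is_some() {
            completed += 1;
        }
    }
    completed
}

/// Fixed pattern: one decoder reused across all calls.
fn process_with_persistent_state(chunks: &[&[u8]], packet_len: usize) -> usize {
    let mut decoder = ToyDecoder::default();
    let mut completed = 0;
    for chunk in chunks {
        if decoder.feed(chunk, packet_len).is_some() {
            completed += 1;
        }
    }
    completed
}

fn main() {
    // One 4-byte packet delivered in two 2-byte calls.
    let chunks: [&[u8]; 2] = [&[1, 2], &[3, 4]];
    println!("fresh state: {} packets", process_with_fresh_state(&chunks, 4));
    println!("persistent state: {} packets", process_with_persistent_state(&chunks, 4));
}
```

With the packet split across two calls, only the persistent variant ever assembles it.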

Changes by Phase

Phase 1: Rust Core

  • Added DtvccRust struct in decoder/mod.rs that owns its decoder state
  • Added CCX_DTVCC_MAX_SERVICES constant (63)
  • Added FFI functions in lib.rs:
    • ccxr_dtvcc_init(): Create persistent context
    • ccxr_dtvcc_free(): Free context and all owned memory
    • ccxr_dtvcc_set_encoder(): Set encoder (not available at init)
    • ccxr_dtvcc_process_data(): Process CC data on persistent context
    • ccxr_flush_active_decoders(): Flush all active decoders
    • ccxr_dtvcc_is_active(): Check if context is active
  • Used heap allocation for large structs to avoid stack overflow
  • Added unit tests for DtvccRust
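
As a rough illustration of the init/process/free shape listed above, here is a sketch of an opaque-handle FFI built on `Box::into_raw` and `Box::from_raw`. The `ToyContext` type, the `toy_ctx_*` names and all signatures are assumptions made for this example; they are not the signatures of the `ccxr_dtvcc_*` functions. A real export would also carry a `no_mangle` attribute, which is left out here.

```rust
use std::os::raw::c_int;

/// Hypothetical context type; the real DtvccRust holds decoder state instead.
pub struct ToyContext {
    is_active: bool,
    /// Large buffer kept on the heap rather than on the stack.
    scratch: Box<[u8]>,
    bytes_seen: usize,
}

/// Creates a context and hands ownership to the caller as a raw pointer.
pub extern "C" fn toy_ctx_init(active: c_int) -> *mut ToyContext {
    let ctx = Box::new(ToyContext {
        is_active: active != 0,
        // Built through Vec so the 1 MiB buffer is allocated on the heap directly.
        scratch: vec![0u8; 1 << 20].into_boxed_slice(),
        bytes_seen: 0,
    });
    Box::into_raw(ctx)
}

/// Processes data on an existing context; its state survives between calls.
///
/// # Safety
/// `ctx` must be null or come from `toy_ctx_init`; `data` must be valid for `len` bytes.
pub unsafe extern "C" fn toy_ctx_process(ctx: *mut ToyContext, data: *const u8, len: usize) -> c_int {
    // SAFETY: the caller guarantees `ctx` is null or valid.
    let ctx = match unsafe { ctx.as_mut() } {
        Some(ctx) => ctx,
        None => return -1,
    };
    if data.is_null() {
        return -1;
    }
    // SAFETY: the caller guarantees `data` is valid for `len` bytes.
    let input = unsafe { std::slice::from_raw_parts(data, len) };
    let n = input.len().min(ctx.scratch.len());
    ctx.scratch[..n].copy_from_slice(&input[..n]);
    ctx.bytes_seen += input.len();
    0
}

/// Returns 1 if the context exists and is active, 0 otherwise.
///
/// # Safety
/// `ctx` must be null or come from `toy_ctx_init`.
pub unsafe extern "C" fn toy_ctx_is_active(ctx: *const ToyContext) -> c_int {
    // SAFETY: the caller guarantees `ctx` is null or valid.
    match unsafe { ctx.as_ref() } {
        Some(ctx) => ctx.is_active as c_int,
        None => 0,
    }
}

/// Reclaims the context and frees everything it owns.
///
/// # Safety
/// `ctx` must be null or come from `toy_ctx_init`, and must not be used afterwards.
pub unsafe extern "C" fn toy_ctx_free(ctx: *mut ToyContext) {
    if !ctx.is_null() {
        // SAFETY: the pointer was produced by Box::into_raw in toy_ctx_init.
        drop(unsafe { Box::from_raw(ctx) });
    }
}

fn main() {
    let ctx = toy_ctx_init(1);
    let data = [0xAAu8, 0xBB];
    unsafe {
        toy_ctx_process(ctx, data.as_ptr(), data.len());
        toy_ctx_process(ctx, data.as_ptr(), data.len());
        println!("bytes seen across calls: {}", (*ctx).bytes_seen);
        toy_ctx_free(ctx);
    }
}
```

The caller only ever holds an opaque pointer, which matches the `void *` field added on the C side in Phase 2.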

Phase 2: C Headers

  • Added void *dtvcc_rust field to lib_cc_decode struct in ccx_decoders_structs.h
  • Added extern declarations in ccx_dtvcc.h for init/free/process_data/is_active
  • Added extern declaration in lib_ccx.h for ccxr_dtvcc_set_encoder
  • Added extern declaration in ccx_decoders_common.h for ccxr_flush_active_decoders

Phase 3: C Implementation

  • Modified ccx_decoders_common.c:
    • init_cc_decode(): Use ccxr_dtvcc_init() when Rust enabled
    • dinit_cc_decode(): Use ccxr_dtvcc_free() when Rust enabled
    • flush_cc_decode(): Use ccxr_flush_active_decoders() when Rust enabled
  • Modified general_loop.c: Set encoder via ccxr_dtvcc_set_encoder() at 3 locations
  • Modified mp4.c: Use ccxr_dtvcc_set_encoder() and ccxr_dtvcc_process_data()
  • All changes guarded with #ifndef DISABLE_RUST
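
Because the encoder is not available when the context is created, it has to be attached afterwards. The sketch below shows one way such a late-bound pointer could look. `ToyEncoder`, `ToyContext` and the choice to count and drop output while no encoder is attached are assumptions for illustration; the PR does not state how the real code behaves in that window.

```rust
/// Hypothetical encoder; stands in for the encoder context owned by the C side.
struct ToyEncoder {
    lines: Vec<String>,
}

/// Hypothetical persistent context created before any encoder exists.
struct ToyContext {
    /// Null until `set_encoder` is called.
    encoder: *mut ToyEncoder,
    /// Captions produced while no encoder was attached (assumed to be dropped).
    dropped: usize,
}

impl ToyContext {
    fn new() -> Self {
        ToyContext { encoder: std::ptr::null_mut(), dropped: 0 }
    }

    /// Attaches the encoder once the caller has one.
    fn set_encoder(&mut self, encoder: *mut ToyEncoder) {
        self.encoder = encoder;
    }

    /// Writes a caption to the encoder if one is attached.
    ///
    /// # Safety
    /// The stored encoder pointer must be null or point to a live `ToyEncoder`.
    unsafe fn emit(&mut self, text: &str) {
        // SAFETY: guaranteed by the caller, see above.
        match unsafe { self.encoder.as_mut() } {
            Some(encoder) => encoder.lines.push(text.to_string()),
            None => self.dropped += 1,
        }
    }
}

fn main() {
    let mut encoder = ToyEncoder { lines: Vec::new() };
    let mut ctx = ToyContext::new();
    unsafe {
        ctx.emit("before encoder");
        ctx.set_encoder(&mut encoder);
        ctx.emit("after encoder");
    }
    println!("written: {:?}, dropped: {}", encoder.lines, ctx.dropped);
}
```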

Phase 4: Bug Fix & Testing

  • Fixed ccxr_process_cc_data() to use persistent DtvccRust from dec_ctx.dtvcc_rust instead of creating new Dtvcc from dec_ctx.dtvcc (which is NULL when Rust enabled)
  • Added do_cb_dtvcc_rust() function for processing with DtvccRust

Testing

Automated Tests

  • All 269 Rust unit tests pass
  • Cargo clippy passes with no errors
  • CI builds pass on Linux, Mac, Windows

Manual Testing

  • Tested with CEA-708 transport stream file (ANDE.ts):
    • ~10 minute file with 25,598 DTVCC packets
    • Successfully extracted 21KB of captions
    • No crashes, proper SRT output with timestamps
  • Tested with CEA-708 program stream file:
    • Processed without crashes
    • Verified state persistence across calls

Files Modified

| File | Changes |
| --- | --- |
| src/rust/src/decoder/mod.rs | Added DtvccRust struct and methods |
| src/rust/src/lib.rs | Added FFI functions, fixed ccxr_process_cc_data |
| src/lib_ccx/ccx_decoders_structs.h | Added dtvcc_rust field |
| src/lib_ccx/ccx_dtvcc.h | Added extern declarations |
| src/lib_ccx/lib_ccx.h | Added ccxr_dtvcc_set_encoder declaration |
| src/lib_ccx/ccx_decoders_common.h | Added ccxr_flush_active_decoders declaration |
| src/lib_ccx/ccx_decoders_common.c | Use Rust init/free/flush |
| src/lib_ccx/general_loop.c | Set encoder via Rust FFI |
| src/lib_ccx/mp4.c | Use Rust FFI for encoder and processing |

Fixes: #1499

const CCX_DTVCC_MAX_COLUMNS: u8 = 32 * 2;

/// Maximum number of CEA-708 services
pub const CCX_DTVCC_MAX_SERVICES: usize = 63;

Review comment (Member):

We already have this: pub const DTVCC_MAX_SERVICES: usize = 63;
CCExtractor/ccextractor/src/rust/lib_ccxr/src/common/constants.rs

cfsmp3 force-pushed the fix/1499-dtvcc-persistent-state branch from 18e2e32 to eb241d2 on December 12, 2025 16:39
cfsmp3 added a commit to cfsmp3/ccextractor that referenced this pull request Dec 12, 2025
The cb_708 counter was being incremented twice for each CEA-708 data block:
1. In do_cb_dtvcc_rust() in Rust (src/rust/src/lib.rs)
2. In do_cb() in C (src/lib_ccx/ccx_decoders_common.c)

Since FTS calculation uses cb_708 (fts = fts_now + fts_global + cb_708 * 1001 / 30),
the double-increment caused timestamps to advance ~2x as fast as expected,
resulting in incorrect milliseconds in start timestamps.

This fix removes the increment from the Rust code since the C code already
handles it in do_cb().

Fixes timestamp issues reported in PR CCExtractor#1782 tests where start times like
00:00:20,688 were incorrectly output as 00:00:20,737.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
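
The effect of the double increment can be checked with the formula quoted in the commit message. This is a sketch only: the `i64` type, the millisecond unit and the sample values are assumptions, and the numbers are made up rather than taken from the failing test.

```rust
/// FTS formula as quoted in the commit message.
/// Integer types and millisecond units are assumptions for this sketch.
fn fts(fts_now: i64, fts_global: i64, cb_708: i64) -> i64 {
    fts_now + fts_global + cb_708 * 1001 / 30
}

fn main() {
    let fts_now = 20_000; // made-up base time
    // Counter value after three data blocks, incremented once per block.
    let single = fts(fts_now, 0, 3);
    // Same three blocks with the counter incremented twice per block.
    let double = fts(fts_now, 0, 6);
    println!("single increment: {single}, double increment: {double}");
}
```

Each extra increment adds roughly 1001 / 30 (about 33) to the cb_708 term, so the offset contributed by the counter grows twice as fast as intended.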
cfsmp3 added a commit to cfsmp3/ccextractor that referenced this pull request Dec 13, 2025
cfsmp3 force-pushed the fix/1499-dtvcc-persistent-state branch from abcfafe to 8ed6cc3 on December 13, 2025 01:38
cfsmp3 added a commit to cfsmp3/ccextractor that referenced this pull request Dec 13, 2025
cfsmp3 force-pushed the fix/1499-dtvcc-persistent-state branch 2 times, most recently from 4ff33fc to cf74762, on December 14, 2025 10:21
cfsmp3 added a commit to cfsmp3/ccextractor that referenced this pull request Dec 14, 2025
cfsmp3 added a commit to cfsmp3/ccextractor that referenced this pull request Dec 29, 2025
cfsmp3 force-pushed the fix/1499-dtvcc-persistent-state branch from cf74762 to a90b00e on December 29, 2025 21:55
cfsmp3 and others added 9 commits December 31, 2025 14:15
…se 1)

This is Phase 1 of the fix for issue CCExtractor#1499. It adds the Rust-side
infrastructure for a persistent CEA-708 decoder context without
modifying any C code, ensuring backward compatibility.

Problem:
The current Rust CEA-708 decoder creates a new Dtvcc struct on every
call to ccxr_process_cc_data(), causing all state to be reset. This
breaks stateful caption processing.

Solution:
Add a new DtvccRust struct that:
- Owns its decoder state (rather than borrowing from C)
- Persists across processing calls
- Is managed via FFI functions callable from C

Changes:
- Add DtvccRust struct in decoder/mod.rs with owned decoders
- Add CCX_DTVCC_MAX_SERVICES constant (63)
- Add FFI functions in lib.rs:
  - ccxr_dtvcc_init(): Create persistent context
  - ccxr_dtvcc_free(): Free context and all owned memory
  - ccxr_dtvcc_set_encoder(): Set encoder (not available at init)
  - ccxr_dtvcc_process_data(): Process CC data
  - ccxr_flush_active_decoders(): Flush all active decoders
  - ccxr_dtvcc_is_active(): Check if context is active
- Add unit tests for DtvccRust
- Use heap allocation for large structs to avoid stack overflow

The existing Dtvcc struct and ccxr_process_cc_data() remain unchanged
for backward compatibility. Phase 2-3 will add C header declarations
and modify C code to use the new functions.

Fixes: CCExtractor#1499 (partial)

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove duplicate CCX_DTVCC_MAX_SERVICES constant from decoder/mod.rs
- Import existing DTVCC_MAX_SERVICES from lib_ccxr::common
- Fix clippy uninlined_format_args warnings in avc/core.rs and decoder/mod.rs

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add void *dtvcc_rust field to lib_cc_decode struct
- Declare ccxr_dtvcc_init, ccxr_dtvcc_free, ccxr_dtvcc_process_data in ccx_dtvcc.h
- Declare ccxr_dtvcc_set_encoder in lib_ccx.h
- Declare ccxr_flush_active_decoders in ccx_decoders_common.h
- All declarations guarded with #ifndef DISABLE_RUST
- Update implementation plan to mark Phase 2 complete

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- init_cc_decode(): Initialize dtvcc_rust via ccxr_dtvcc_init()
- dinit_cc_decode(): Free dtvcc_rust via ccxr_dtvcc_free()
- flush_cc_decode(): Flush via ccxr_flush_active_decoders()
- general_loop.c: Set encoder via ccxr_dtvcc_set_encoder() (3 locations)
- mp4.c: Use ccxr_dtvcc_set_encoder() and ccxr_dtvcc_process_data()
- Add ccxr_dtvcc_is_active() declaration to ccx_dtvcc.h
- Fix clippy warnings in tv_screen.rs (unused assignments)
- All changes guarded with #ifndef DISABLE_RUST
- Update implementation plan to mark Phase 3 complete

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove extra space before comment in ccx_decoders_common.c
- Fix comment indentation in mp4.c

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The ccxr_process_cc_data function was still accessing dec_ctx.dtvcc
(which is NULL when Rust is enabled), causing a null pointer panic.

Changed to use dec_ctx.dtvcc_rust (the persistent DtvccRust context)
instead, which fixes the crash when processing CEA-708 data.

Added do_cb_dtvcc_rust() function that works with DtvccRust instead
of the old Dtvcc struct.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Move PLAN_PR1618_REIMPLEMENTATION.md to local plans/ folder
- Add plans/ to .gitignore to keep plans local

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
The cb_708 counter was being incremented twice for each CEA-708 data block:
1. In do_cb_dtvcc_rust() in Rust (src/rust/src/lib.rs)
2. In do_cb() in C (src/lib_ccx/ccx_decoders_common.c)

Since FTS calculation uses cb_708 (fts = fts_now + fts_global + cb_708 * 1001 / 30),
the double-increment caused timestamps to advance ~2x as fast as expected,
resulting in incorrect milliseconds in start timestamps.

This fix removes the increment from the Rust code since the C code already
handles it in do_cb().

Fixes timestamp issues reported in PR CCExtractor#1782 tests where start times like
00:00:20,688 were incorrectly output as 00:00:20,737.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
cfsmp3 force-pushed the fix/1499-dtvcc-persistent-state branch from a90b00e to ead4cbb on December 31, 2025 13:19
When Rust CEA-708 decoder is enabled, dec_ctx.dtvcc is set to NULL
and dec_ctx.dtvcc_rust holds the actual DtvccRust context. The null
check was incorrectly checking dtvcc, causing the function to return
early and skip all CEA-708 data processing.

This fixes tests 21, 31, 32, 105, 137, 141-149 which were failing
with exit code 10 (EXIT_NO_CAPTIONS) because no captions were being
extracted from CEA-708 streams.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
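
The null-check mistake described in this commit message can be sketched as follows. `ToyDecCtx` and the two guard functions are hypothetical; they only mirror the two field names mentioned above and are not the actual code in lib.rs.

```rust
use std::ffi::c_void;

/// Hypothetical mirror of the two decoder fields named in the commit message.
struct ToyDecCtx {
    /// Legacy C decoder context; null when the Rust decoder is enabled.
    dtvcc: *mut c_void,
    /// Persistent Rust decoder context.
    dtvcc_rust: *mut c_void,
}

/// Guard as described for the bug: tests the legacy pointer.
fn has_decoder_buggy(ctx: &ToyDecCtx) -> bool {
    !ctx.dtvcc.is_null()
}

/// Guard as described for the fix: tests the pointer that is actually populated.
fn has_decoder_fixed(ctx: &ToyDecCtx) -> bool {
    !ctx.dtvcc_rust.is_null()
}

fn main() {
    let mut rust_state = 0u32; // placeholder for the real context
    let ctx = ToyDecCtx {
        dtvcc: std::ptr::null_mut(),
        dtvcc_rust: &mut rust_state as *mut u32 as *mut c_void,
    };
    println!("buggy guard processes data: {}", has_decoder_buggy(&ctx));
    println!("fixed guard processes data: {}", has_decoder_fixed(&ctx));
}
```

With the legacy pointer always null, the buggy guard returns early for every block, which is consistent with the "no captions" exit code reported above.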
ccextractor-bot (Collaborator) commented:

CCExtractor CI platform finished running the test files on linux. Below is a summary of the test results, when compared to test for commit f2aeef1...:
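| Report Name | Tests Passed |
| --- | --- |
| Broken | 13/13 |
| CEA-708 | 14/14 |
| DVB | 7/7 |
| DVD | 3/3 |
| DVR-MS | 2/2 |
| General | 25/27 |
| Hardsubx | 1/1 |
| Hauppage | 3/3 |
| MP4 | 3/3 |
| NoCC | 10/10 |
| Options | 81/86 |
| Teletext | 21/21 |
| WTV | 13/13 |
| XDS | 34/34 |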

Your PR breaks these cases:

  • ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65...
  • ccextractor --out=srt --latin1 --autoprogram 29e5ffd34b...
  • ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...
  • ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9...

Congratulations: Merging this PR would fix the following tests:

  • ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2..., Last passed: Never

It seems that not all tests were passed completely. This is an indication that the output of some files is not as expected (but might be according to you).

Check the result page for more info.

cfsmp3 added a commit that referenced this pull request Dec 31, 2025
ccextractor-bot (Collaborator) commented:

CCExtractor CI platform finished running the test files on windows. Below is a summary of the test results, when compared to test for commit f2aeef1...:
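| Report Name | Tests Passed |
| --- | --- |
| Broken | 13/13 |
| CEA-708 | 14/14 |
| DVB | 7/7 |
| DVD | 3/3 |
| DVR-MS | 2/2 |
| General | 27/27 |
| Hardsubx | 1/1 |
| Hauppage | 3/3 |
| MP4 | 3/3 |
| NoCC | 10/10 |
| Options | 86/86 |
| Teletext | 21/21 |
| WTV | 13/13 |
| XDS | 34/34 |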

Congratulations: Merging this PR would fix the following tests:

  • ccextractor --autoprogram --out=srt --latin1 --quant 0 85271be4d2..., Last passed: Never
  • ccextractor --autoprogram --out=ttxt --latin1 --ucla dab1c1bd65..., Last passed: Never
  • ccextractor --out=srt --latin1 --autoprogram 29e5ffd34b..., Last passed: Never
  • ccextractor --out=spupng c83f765c66..., Last passed: Never
  • ccextractor --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsnotbefore 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsnotafter 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsforatleast 1 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never
  • ccextractor --startcreditsforatmost 2 --startcreditstext "CCextractor Start crdit Testing" c4dd893cb9..., Last passed: Never

All tests passed completely.

Check the result page for more info.

@cfsmp3 cfsmp3 merged commit b23866f into CCExtractor:master Dec 31, 2025
24 of 25 checks passed
@cfsmp3 cfsmp3 deleted the fix/1499-dtvcc-persistent-state branch December 31, 2025 23:21

Development

Successfully merging this pull request may close these issues.

Initialize data structures correctly for the rust CEA-708 decoder
