Av1#7
Open
zsiec wants to merge 14 commits into
Implement AV1 bitstream parsing in demux/av1.go with four exported functions:
- ParseAV1SequenceHeader: extracts profile, level, tier, bit depth, resolution, and chroma subsampling from sequence header OBUs
- CodecString: generates RFC 6381 codec strings (e.g. "av01.0.08M.08")
- IsAV1Keyframe: scans temporal units for KEY_FRAME in OBU_FRAME/FRAME_HEADER
- FindSequenceHeaderOBU: locates sequence header OBU within a temporal unit

Internal helpers include OBU header parsing, LEB128 decoding, and an error-tracking MSB-first bit reader with UVLC support.
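For reviewers unfamiliar with the OBU size encoding, here is a minimal sketch of unsigned LEB128 decoding as AV1 uses it (at most 8 bytes, 7 payload bits per byte, least significant group first). The name `readLEB128` and its signature are assumptions for illustration; the helper in demux/av1.go may look different.

```go
package main

import "errors"

// readLEB128 decodes an unsigned LEB128 value from the start of buf.
// It returns the value and the number of bytes consumed.
// Hypothetical helper: not necessarily the signature used in demux/av1.go.
func readLEB128(buf []byte) (uint64, int, error) {
	var value uint64
	// AV1 limits leb128() to 8 bytes.
	for i := 0; i < 8; i++ {
		if i >= len(buf) {
			return 0, 0, errors.New("leb128: truncated input")
		}
		b := buf[i]
		// The low 7 bits carry payload; groups arrive least significant first.
		value |= uint64(b&0x7f) << (7 * uint(i))
		// A clear high bit marks the final byte.
		if b&0x80 == 0 {
			return value, i + 1, nil
		}
	}
	return 0, 0, errors.New("leb128: value longer than 8 bytes")
}
```

A caller would typically read the OBU header first, then pass the remaining bytes to this function to learn the payload size and how many bytes the size field itself occupied.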
When decoder_model_present_for_this_op is 1, the AV1 spec (section 5.5.1) requires reading decoder_buffer_delay, encoder_buffer_delay, and low_delay_mode_flag. The parser was reading the flag but not consuming these fields, causing the bit reader to be misaligned for all subsequent fields in the sequence header.
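A rough sketch of the fix described above, assuming an MSB-first bit reader. As I read the spec, both delay fields are `buffer_delay_length_minus_1 + 1` bits wide and the flag is a single bit. The `bitReader` type and `readOperatingParams` are hypothetical stand-ins, not the code in this PR.

```go
package main

// bitReader is a hypothetical MSB-first reader that records overruns
// in err instead of returning an error from every call.
type bitReader struct {
	data []byte
	pos  int // current position in bits
	err  bool
}

// readBits reads n bits (n <= 32), most significant bit first.
func (r *bitReader) readBits(n int) uint32 {
	var v uint32
	for i := 0; i < n; i++ {
		idx := r.pos >> 3
		if idx >= len(r.data) {
			r.err = true
			return 0
		}
		bit := (r.data[idx] >> (7 - uint(r.pos&7))) & 1
		v = v<<1 | uint32(bit)
		r.pos++
	}
	return v
}

// operatingParams holds the three fields that must be consumed when
// decoder_model_present_for_this_op is 1.
type operatingParams struct {
	DecoderBufferDelay uint32
	EncoderBufferDelay uint32
	LowDelayMode       bool
}

// readOperatingParams consumes the fields in bitstream order so that
// later sequence header fields stay aligned.
func readOperatingParams(r *bitReader, bufferDelayLengthMinus1 uint32) operatingParams {
	n := int(bufferDelayLengthMinus1) + 1
	var p operatingParams
	p.DecoderBufferDelay = r.readBits(n)
	p.EncoderBufferDelay = r.readBits(n)
	p.LowDelayMode = r.readBits(1) == 1
	return p
}
```

Skipping any one of these reads shifts every later field by that many bits, which is the misalignment this commit corrects.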
…ue receiver in av1.go
- Remove OBUMetadata, OBUTileGroup, OBURedundantFrameHeader, OBUTileList, OBUPadding (unused)
- Remove redundant len(payload) < 1 checks in IsAV1Keyframe (already guarded by payloadSize check)
- Change CodecString() to value receiver for consistency with SPSInfo/HEVCSPSInfo
Build the 4-byte AV1 codec configuration header from a parsed sequence header OBU per the AV1-ISOBMFF spec, appending the raw OBU as configOBUs. Tests cover 8-bit 1080p, 10-bit 480p, and nil/invalid inputs.
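To make the record layout concrete, here is a sketch of the 4-byte header as I understand it from the AV1-ISOBMFF binding: marker and version, then profile and level, then tier plus bit depth and chroma flags, then the presentation delay byte. The `seqHeaderInfo` struct and `buildAV1C` are hypothetical names, and this sketch always writes the presentation delay byte as zero.

```go
package main

// seqHeaderInfo is a hypothetical subset of the parsed sequence header.
type seqHeaderInfo struct {
	Profile              uint8 // seq_profile (3 bits)
	Level                uint8 // seq_level_idx_0 (5 bits)
	Tier                 bool  // seq_tier_0
	HighBitdepth         bool
	TwelveBit            bool
	Monochrome           bool
	SubsamplingX         bool
	SubsamplingY         bool
	ChromaSamplePosition uint8 // 2 bits
}

func bit(b bool) byte {
	if b {
		return 1
	}
	return 0
}

// buildAV1C returns the 4-byte header followed by the raw sequence
// header OBU as configOBUs.
func buildAV1C(info seqHeaderInfo, seqHeaderOBU []byte) []byte {
	out := make([]byte, 0, 4+len(seqHeaderOBU))
	out = append(out,
		0x81, // marker = 1, version = 1
		(info.Profile&0x07)<<5|info.Level&0x1f,
		bit(info.Tier)<<7|bit(info.HighBitdepth)<<6|bit(info.TwelveBit)<<5|
			bit(info.Monochrome)<<4|bit(info.SubsamplingX)<<3|
			bit(info.SubsamplingY)<<2|info.ChromaSamplePosition&0x03,
		0x00, // reserved bits and initial_presentation_delay_present = 0
	)
	return append(out, seqHeaderOBU...)
}
```

For an 8-bit 4:2:0 main-profile stream at level index 8, this layout yields the header bytes 81 08 0C 00.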
Add AV1 as a recognized codec in WriteVideoFrame() and buildVideoInfo(), alongside existing H.264/H.265 handling. The MoQ writer now emits AV1CodecConfigurationRecord on keyframes, and the pipeline extracts resolution and codec string from AV1 sequence header OBUs. AV1 frames use raw OBU pass-through (no AnnexB conversion needed).
Add ingest/dash/ package with MPD XML parsing using go-mpd library, representation selection (by ID or highest bandwidth), segment template URL resolution, and live segment number computation. Includes tests for dynamic/static MPDs, template substitution, and edge cases.
Parse fMP4 init segments to extract AV1/AAC codec configuration, track metadata, and trex boxes. Parse media segments to extract individual samples with microsecond-precision PTS/DTS timestamps and keyframe flags using the mp4ff library.
Introduces dashPipeline which converts mediaSample values from the fMP4 segment parser into media.VideoFrame and media.AudioFrame for broadcast via the Relay. Keyframes start new groups and carry the AV1 sequence header OBU as SPS for late-joining decoder initialization. Implements distribution.StatsProvider for the stats overlay.
Implements the Puller type that manages active DASH pull connections. Pull() fetches and validates the MPD manifest synchronously, then starts a background goroutine that continuously downloads segments, parses them via parseInitSegment/parseMediaSegment, and feeds samples through the dashPipeline to the distribution Relay. Stop() cancels an active pull, and ActivePulls() returns a snapshot of all pulls. Interfaces (DistributionServer, StreamManager) decouple from concrete types. The fetchURL helper uses http.NewRequestWithContext for proper cancellation. Tests cover validation, duplicate detection, lifecycle management, HTTP fetching, and end-to-end pull/stop with httptest.
Add /api/dash-pull endpoints (GET, POST, DELETE, OPTIONS) mirroring the existing SRT pull API pattern. Wire the DASH puller into cmd/prism/main.go with DASHPull/DASHStop/DASHList callbacks, a streamManagerAdapter to bridge the interface mismatch, and DASH_SOURCE env var for startup pulls.
- Support contentType attr on AdaptationSet (ffmpeg puts mimeType on Representation, not AdaptationSet)
- Support $Number%05d$ printf-style zero-padded segment numbers in SegmentTemplate media patterns
- Fall back to codec-prefix heuristic when neither mimeType nor contentType is available
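The zero-padded pattern support could look roughly like the sketch below, which expands `$RepresentationID$`, `$Number$`, and `$Number%0Nd$`. `expandTemplate` is a hypothetical name; other identifiers such as `$Time$` and `$Bandwidth$` are left out for brevity.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// templateRe matches $RepresentationID$, $Number$, and $Number%05d$-style tags.
var templateRe = regexp.MustCompile(`\$(RepresentationID|Number)(?:%0(\d+)d)?\$`)

// expandTemplate substitutes the supported identifiers in a
// SegmentTemplate media or initialization pattern.
// Hypothetical helper for illustration.
func expandTemplate(pattern, repID string, number uint64) string {
	return templateRe.ReplaceAllStringFunc(pattern, func(m string) string {
		sub := templateRe.FindStringSubmatch(m)
		if sub[1] == "RepresentationID" {
			return repID
		}
		// Default width 1 means no padding, matching plain $Number$.
		width := 1
		if sub[2] != "" {
			if w, err := strconv.Atoi(sub[2]); err == nil {
				width = w
			}
		}
		return fmt.Sprintf("%0*d", width, number)
	})
}
```

For example, ffmpeg's default media pattern `chunk-stream$RepresentationID$-$Number%05d$.m4s` with representation "0" and segment 42 expands to `chunk-stream0-00042.m4s`.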
SetVideoInfo was passing "av1" instead of the RFC 6381 codec string (e.g., "av01.0.05M.08") and omitting the AV1CodecConfigurationRecord. This caused the browser's WebCodecs VideoDecoder to reject the config with "Unknown or ambiguous codec name". Now parses the sequence header OBU from the init segment to generate the correct codec string and decoder config for the MoQ catalog.
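For reference, the short form of the codec string is `av01.<profile>.<level><tier>.<bitDepth>`, with the level index and bit depth zero-padded to two digits and the tier written as M or H. A minimal sketch follows; `av1CodecString` is a hypothetical free function, whereas the PR exposes this as a CodecString method.

```go
package main

import "fmt"

// av1CodecString formats the short RFC 6381 codec string for AV1.
// tier 0 maps to 'M' (main), anything else to 'H' (high).
// Hypothetical helper for illustration.
func av1CodecString(profile, levelIdx, tier, bitDepth int) string {
	t := 'M'
	if tier != 0 {
		t = 'H'
	}
	return fmt.Sprintf("av01.%d.%02d%c.%02d", profile, levelIdx, t, bitDepth)
}
```

Passing profile 0, level index 5, main tier, and 8-bit depth produces "av01.0.05M.08", the form WebCodecs accepts in place of the bare "av1".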
DASH segments arrive as bursts (2s of frames at once). Without pacing, the player sees burst-pause-burst-pause jitter. Now processVideoSamples and processAudioSamples sleep between frames based on PTS deltas, and run concurrently so video and audio interleave naturally.
Summary
Brief description of the change.
Test Plan
make check passes