-
Notifications
You must be signed in to change notification settings - Fork 14
Audio utils #170
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Audio utils #170
Conversation
…() no-op if the samples are already that way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR refactors the PcmData class from a NamedTuple to a regular class, adds new audio processing methods, and introduces the AudioFormat enum for better type safety. Key changes include:
- Converted
PcmDatafromNamedTupleto a mutable class withcopy(),clear(), and modifiedappend()for in-place operations - Added
AudioFormatenum (S16, F32) with validation for audio format consistency - Implemented audio manipulation methods:
chunks(),sliding_window(),head(), andtail() - Enhanced resampling to support float32 format
- Added telemetry span attributes for SDP in peer connection handling
- Modified test fixtures to be opt-in rather than auto-used
Reviewed Changes
Copilot reviewed 6 out of 6 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| getstream/video/rtc/track_util.py | Core refactoring: converted PcmData to class, added AudioFormat enum, implemented new methods (copy, clear, chunks, sliding_window, head, tail), enhanced resample for f32 support |
| tests/rtc/test_track_util.py | Added comprehensive tests for new PcmData methods; changed monkeypatch fixture to opt-in; updated format parameter usage |
| tests/rtc/test_pcm_data.py | Added tests for copy(), clear(), and append() behavior with the new mutable class design |
| getstream/video/rtc/peer_connection.py | Added telemetry span attributes to track SDP in offer creation and local description setting |
| getstream/video/rtc/pc.py | Wrapped setRemoteDescription with telemetry span including SDP attribute |
| getstream/video/rtc/connection_manager.py | Added span variable binding for telemetry context (no functional change) |
Comments suppressed due to low confidence (3)
getstream/video/rtc/track_util.py:259
- Parameter order is inconsistent with the new constructor signature. The constructor expects
(sample_rate, format, samples=None, ...), but this call usessamples=first. While functional with keyword arguments, this inconsistency could cause confusion. Recommend reordering to match the constructor signature:cls(sample_rate=sample_rate, format=format, samples=np.array([], dtype=dtype), channels=channels).
return cls(
samples=np.array([], dtype=dtype),
sample_rate=sample_rate,
format=format,
channels=channels,
)
getstream/video/rtc/track_util.py:287
- Parameter order is inconsistent with the new constructor signature. The constructor expects
(sample_rate, format, samples=None, ...), but this call usessamples=first. While functional with keyword arguments, this inconsistency could cause confusion. Recommend reordering to match the constructor signature:cls(sample_rate=sample_rate, format=format, samples=arr, channels=channels).
return cls(
samples=arr, sample_rate=sample_rate, format=format, channels=channels
)
tests/rtc/test_pcm_data.py:356
- Parameter order is inconsistent with the new constructor signature. The constructor expects
(sample_rate, format, samples=None, ...), but this call usessamples=first. While functional with keyword arguments, this inconsistency could cause confusion. Recommend reordering to match the constructor signature throughout the test file:PcmData(sample_rate=48000, format=\"s16\", samples=other, channels=2).
pcm_other = PcmData(samples=other, sample_rate=48000, format="s16", channels=2)
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Note Other AI code review bot(s) detectedCodeRabbit has detected other AI code review bot(s) in this pull request and will avoid duplicating their findings in the review comments. This may lead to a less comprehensive review. WalkthroughAdds telemetry spans around multiple RTC negotiation points and replaces the PcmData NamedTuple with a full PcmData class plus an AudioFormat enum and many audio helpers; extensive unit tests for the new PcmData behavior were added. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Caller
participant PeerConn as PeerConnection
participant Tracer as Telemetry
participant RTC as RTCPeer
Note over Caller,PeerConn: Publish offer flow with tracing
Caller->>PeerConn: add_tracks(...)
PeerConn->>Tracer: start span "publishOffer"
PeerConn->>RTC: createOffer()
RTC-->>PeerConn: offer
PeerConn->>Tracer: record attribute sdp
PeerConn->>Tracer: start span "publishOffer.set_local_description"
PeerConn->>RTC: setLocalDescription(offer)
RTC-->>PeerConn: ack
Tracer-->>PeerConn: end spans
PeerConn-->>Caller: offer applied
sequenceDiagram
autonumber
participant Caller
participant ConnMgr as ConnectionManager
participant Tracer as Telemetry
participant RTCsub as SubscriberRTC
participant RTCpub as PublisherRTC
Note over Caller,ConnMgr: Subscriber offer and publisher answer handling
Caller->>ConnMgr: on_subscriber_offer(sdp)
ConnMgr->>Tracer: start span "subscriber.set_local_description"
ConnMgr->>RTCsub: setLocalDescription(sdp)
RTCsub-->>ConnMgr: ack
Tracer-->>ConnMgr: end span
ConnMgr->>Tracer: start span "publisher.pc.handle_answer"
ConnMgr->>RTCpub: setRemoteDescription(answer)
RTCpub-->>ConnMgr: ack
Tracer-->>ConnMgr: end span
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
✨ Finishing touches
🧪 Generate unit tests (beta)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro Disabled knowledge base sources:
📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (2)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
Co-authored-by: Copilot <[email protected]>
Co-authored-by: Copilot <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
♻️ Duplicate comments (2)
tests/rtc/test_track_util.py (1)
355-357: Consider consistent parameter ordering for clarity.While keyword arguments make the order flexible, inconsistent ordering with the constructor signature could cause confusion. The constructor expects
(sample_rate, format, samples=None, ...), but this call and others throughout the file vary in their parameter order.For consistency, consider ordering parameters to match the constructor signature:
pcm = PcmData( - sample_rate=16000, format=AudioFormat.S16, samples=samples, channels=1 + sample_rate=16000, format=AudioFormat.S16, samples=samples, channels=1 )Note: Line 386 and other locations use
samples=first, which differs from this ordering.getstream/video/rtc/track_util.py (1)
102-132: Fix default dtype to respect the format parameter.The default samples array always uses
np.int16regardless of theformatparameter. This creates a dtype mismatch whenformat="f32".Apply this diff to fix the default dtype:
def __init__( self, sample_rate: int, format: AudioFormatType, samples: NDArray = None, pts: Optional[int] = None, dts: Optional[int] = None, time_base: Optional[float] = None, channels: int = 1, ): """ Initialize PcmData. Args: sample_rate: The sample rate of the audio data format: The format of the audio data (use AudioFormat.S16 or AudioFormat.F32) samples: The audio samples as a numpy array (default: empty int16 array) pts: The presentation timestamp of the audio data dts: The decode timestamp of the audio data time_base: The time base for converting timestamps to seconds channels: Number of audio channels (1=mono, 2=stereo) """ self.format: AudioFormatType = format self.sample_rate: int = sample_rate - self.samples: NDArray = ( - np.array([], dtype=np.int16) if samples is None else samples - ) + if samples is None: + dtype = np.float32 if format in ("f32", "float32", AudioFormat.F32) else np.int16 + self.samples = np.array([], dtype=dtype) + else: + self.samples = samples self.pts: Optional[int] = pts self.dts: Optional[int] = dts self.time_base: Optional[float] = time_base self.channels: int = channels
🧹 Nitpick comments (1)
getstream/video/rtc/track_util.py (1)
1079-1081: Verify timestamp calculation precision.The chunk timestamp calculation divides by sample_rate and time_base:
chunk_pts = self.pts + int(i / self.sample_rate / self.time_base)This could lose precision if
time_baseis very small. Consider reordering the operations:- chunk_pts = self.pts + int(i / self.sample_rate / self.time_base) + chunk_pts = self.pts + int(i / (self.sample_rate * self.time_base))This groups the divisors together, potentially improving numerical stability.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
getstream/video/rtc/track_util.py(19 hunks)tests/rtc/test_track_util.py(13 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/rtc/test_track_util.py (1)
getstream/video/rtc/track_util.py (9)
PcmData(88-1375)AudioFormat(30-80)chunks(1013-1133)sliding_window(1135-1159)resample(364-528)to_float32(613-676)tail(1161-1267)duration(139-195)head(1269-1375)
🔇 Additional comments (13)
tests/rtc/test_track_util.py (4)
7-11: LGTM!The imports correctly include the new
AudioFormatenum.
75-106: LGTM!The fixture is now opt-in as intended. The comments clearly explain why some tests need the real PyAV AudioFrame.
347-499: LGTM!Comprehensive test coverage for the new chunking and sliding window functionality, including edge cases like padding and stereo audio.
501-948: LGTM!Excellent test coverage for no-op optimizations and head/tail methods. The real-world use case test at line 899 validates the intended behavior against the original implementation.
getstream/video/rtc/track_util.py (9)
30-81: LGTM!The
AudioFormatenum is well-designed with string inheritance for backward compatibility and a helpfulvalidate()method.
364-528: LGTM!The resample method correctly handles both S16 and F32 formats, detecting input format from dtype and maintaining format consistency in the output.
613-676: LGTM!The
to_float32()method correctly implements the no-op optimization when already in F32 format and properly converts S16 to float32.
678-826: LGTM!The
append()method correctly implements in-place modification and returnsselffor chaining, as documented. Edge cases are well handled.
827-853: LGTM!The
copy()method correctly creates a deep copy, usingndarray.copy()for numpy arrays andcopy.deepcopy()for other types.
855-880: LGTM!The
clear()method correctly determines the appropriate dtype based on format and follows Python collection conventions by returning None implicitly.
1135-1159: LGTM!The
sliding_window()method correctly converts millisecond parameters to sample counts and delegates tochunks().
1161-1375: LGTM!The
head()andtail()methods are well-implemented with proper parameter validation, support for both mono and multi-channel audio, and flexible padding options.
1683-1719: LGTM!The
_normalize_audio_format()helper correctly validates format and performs both sample rate and format conversions as needed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (4)
getstream/video/rtc/track_util.py (4)
102-141: Consider adding format validation in the constructor.The factory methods (
from_bytes,from_data,from_response) validate the format parameter usingAudioFormat.validate(), but the constructor doesn't. This inconsistency allows invalid formats to be created via direct instantiation.Apply this diff to add validation:
def __init__( self, sample_rate: int, format: AudioFormatType, samples: NDArray = None, pts: Optional[int] = None, dts: Optional[int] = None, time_base: Optional[float] = None, channels: int = 1, ): """ Initialize PcmData. Args: sample_rate: The sample rate of the audio data format: The format of the audio data (use AudioFormat.S16 or AudioFormat.F32) samples: The audio samples as a numpy array (default: empty array with dtype matching format) pts: The presentation timestamp of the audio data dts: The decode timestamp of the audio data time_base: The time base for converting timestamps to seconds channels: Number of audio channels (1=mono, 2=stereo) """ + # Validate format + AudioFormat.validate(format) self.format: AudioFormatType = format self.sample_rate: int = sample_rate
397-405: Consider adding a comment to clarify format detection logic.The condition on lines 399-401 uses OR logic to determine if float format should be used, checking both the actual dtype and the format attribute. This permissive approach handles cases where dtype and format might be mismatched, but could benefit from a comment explaining the intent.
# Determine the format based on the input dtype if isinstance(cmaj, np.ndarray): + # Use float format if either dtype is float32 OR format attribute indicates float + # This handles both correctly-typed data and data that needs format conversion if cmaj.dtype == np.float32 or ( self.format and self.format.lower() in ("f32", "float32") ): input_format = "fltp" # planar float32 else: input_format = "s16p" # planar int16
577-580: Improve error handling in the fallback case.The final fallback
return bytes(arr)at line 580 could raise an unclearTypeErrorfor unsupported types. Consider adding explicit handling or logging to make debugging easier.# Fallback if isinstance(arr, (bytes, bytearray)): return bytes(arr) - return bytes(arr) + # Attempt conversion with better error message + try: + return bytes(arr) + except TypeError as e: + raise TypeError( + f"Cannot convert samples of type {type(arr)} to bytes. " + f"Expected numpy array, bytes, or bytearray." + ) from e
1694-1730: LGTM! Useful normalization helper with proper clipping.The function correctly handles format validation, resampling, and format conversion. Notably, it includes proper clipping when converting f32 to s16 (lines 1722). Consider using this helper more broadly in the module to ensure consistent format handling.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
getstream/video/rtc/track_util.py(20 hunks)tests/rtc/test_pcm_data.py(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- tests/rtc/test_pcm_data.py
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (4)
- GitHub Check: Tests (3.10)
- GitHub Check: Tests (3.13)
- GitHub Check: Tests (3.11)
- GitHub Check: Tests (3.12)
🔇 Additional comments (7)
getstream/video/rtc/track_util.py (7)
30-81: LGTM! Well-designed AudioFormat enum.The enum with string inheritance provides good backward compatibility while adding type safety. The
validate()method is a helpful addition for input validation.
102-141: LGTM! Constructor properly handles default dtype based on format.The default samples array now correctly determines dtype based on the format parameter, addressing the previous review concern. The logic matches the pattern used in
clear()for consistency.
689-836: LGTM! Well-implemented in-place append with proper format handling.The method correctly modifies the object in-place and returns
selffor chaining. The format conversion logic is thorough, including proper clipping when converting float32 to int16 (lines 739-742).
838-891: LGTM! Standard copy and clear implementations.Both methods follow Python conventions:
copy()performs deep copy with proper handling of numpy arrays vs other typesclear()returnsNoneto match standard collection behavior and correctly determines dtype based on format
1024-1144: LGTM! Comprehensive chunking implementation with proper timestamp handling.The method correctly handles both mono and multi-channel audio, with accurate timestamp calculations for each chunk (lines 1092, 1132). The overlap and padding options provide flexibility for windowing use cases.
1146-1170: LGTM! Clean time-based windowing abstraction.The method provides a convenient time-based interface (milliseconds) that correctly delegates to the sample-based
chunks()method.
1172-1386: LGTM! Symmetric head and tail implementations with proper validation.Both methods provide clean interfaces for extracting time-based segments with optional padding. The
pad_atparameter validation (lines 1206-1207, 1314-1315) prevents invalid values, and both handle mono and multi-channel audio correctly.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
Disabled knowledge base sources:
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (2)
getstream/video/rtc/track_util.py(20 hunks)tests/rtc/test_pcm_data.py(2 hunks)
🧰 Additional context used
🧬 Code graph analysis (1)
tests/rtc/test_pcm_data.py (1)
getstream/video/rtc/track_util.py (8)
PcmData(88-1414)copy(866-892)id(1537-1538)append(717-864)clear(894-919)resample(401-571)AudioFormat(30-80)from_data(329-399)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: Tests (3.11)
Adds several methods to PcmData util to simplify PCM management
Summary by CodeRabbit
New Features
Tests
Refactor