Skip to content

Add presentationTimestamp to RTCEncodedVideoFrameMetadata #173

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 3 commits into from
Apr 27, 2023
Merged
Changes from 1 commit
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
14 changes: 13 additions & 1 deletion index.bs
Original file line number Diff line number Diff line change
Expand Up @@ -330,7 +330,8 @@ dictionary RTCEncodedVideoFrameMetadata {
unsigned long temporalIndex;
unsigned long synchronizationSource;
octet payloadType;
sequence<unsigned long> contributingSources;
sequence<unsigned long> contributingSources;
long long presentationTimestamp; // microseconds
};
</pre>

Expand Down Expand Up @@ -365,6 +366,17 @@ dictionary RTCEncodedVideoFrameMetadata {
The list of contribution sources (csrc list) as defined in [[RFC3550]].
</p>
</dd>
<dt>
<dfn>presentationTimestamp</dfn> of type <span class=
"idlMemberType">long long</span>
</dt>
<dd>
<p>
The media presentation timestamp (PTS) in microseconds of raw frame, matching the
{{VideoFrame/timestamp}} for raw frames which correspond to this frame and the
{{VideoFrameCallbackMetadata/mediaTime}} given if this frame is decoded and rendererd.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VideoFrameCallbackMetadata/mediaTime is a double in seconds, maybe we should remove the reference to this one.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

VideoFrameCallbackMetadata/mediaTime is a double in seconds, maybe we should remove the reference to this one.

Do you mean milliseconds? I also found w3c/webcodecs#122 on why WebCodecs uses microseconds for timestamps.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member

@jan-ivar jan-ivar Apr 6, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, it uses seconds to match mediaElement.currentTime. 🤦

While I think it makes sense for a method on HTMLVideoElement to stay consistent with its attributes, this here is arguably a lower-level API.

To make matters complicated, W3C design principle § 8.3. Use milliseconds for time measurement says... that.

And in w3c/webcodecs#122 they use microseconds, out of concern for audio AV drift.

I see the original PR was for captureTimestamp and then it was changed to presentationTimestamp... What's the use case for this value? We may need to consider this to figure out which unit to stay consistent with.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our model is VideoFrame -> encoder -> encoded chunk -> encoded transform.
Seems best to be consistent with VideoFrame here, and remove the reference to rvfc

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see the OP was edited and originally used microseconds. Can someone clarify what changed? Sorry if I missed it, I find the github thread here a bit hard to follow.

Normally, an issue is opened first for discussion, which tends to leave a thread that is more easy to follow, then a PR is opened later once discussion solidifies (github's PR workflow tends to hide things once resolved, which suits a review process more than a discussion imho).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the discussion and links!
Consistency with VideoFrame is indeed the motivating usecase here as Youenn said - so that an app is able to associate a raw frame before encoding with an encoded frame afterwards, when applying transforms on both sides. I'm happy to just remove the reference to rVFC here - would definitely simplify the text.

Apologies for editing the in-flight PR. The intention was always to have a field here which matched VideoFrame.timestamp, but the previous PR #137 that I was asked to adopt seemed to mistake which timestamp this corresponded to. Given noone but Harald had commented here yet, I thought it easier to just modify in place before kicking off the review more widely.

</p>
</dd>
</dl>


Expand Down