From bee604f30b115ddc227f4566b7d400c1a960fd42 Mon Sep 17 00:00:00 2001
From: Guido Urdaneta
Date: Fri, 7 Feb 2025 16:08:52 +0100
Subject: [PATCH 1/5] Add timestamp explainer

---
 timestamps.md | 302 ++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 302 insertions(+)
 create mode 100644 timestamps.md

diff --git a/timestamps.md b/timestamps.md
new file mode 100644
index 0000000..20d3ca9
--- /dev/null
+++ b/timestamps.md
@@ -0,0 +1,302 @@
+# Extra Timestamps for encoded RTC media frames
+
+## Authors:
+
+- Guido Urdaneta (Google)
+
+## Participate
+- https://github.com/w3c/webrtc-encoded-transform
+
+
+## Introduction
+
+The [WebRTC Encoded Transform](https://w3c.github.io/webrtc-encoded-transform/)
+API allows applications to access encoded media flowing through a WebRTC
+[RTCPeerConnection](https://w3c.github.io/webrtc-pc/#dom-rtcpeerconnection).
+Video data is exposed as
+[RTCEncodedVideoFrame](https://w3c.github.io/webrtc-encoded-transform/#rtcencodedvideoframe)s
+and audio data is exposed as
+[RTCEncodedAudioFrame](https://w3c.github.io/webrtc-encoded-transform/#rtcencodedaudioframe)s.
+Both types of frames have a getMetadata() method that returns a number of
+metadata fields containing more information about the frames.
+
+This proposal consists of adding a number of additional metadata fields
+containing timestamps, in line with recent additions to
+[VideoFrameMetadata](https://w3c.github.io/webcodecs/video_frame_metadata_registry.html#videoframemetadata-members)
+in [WebCodecs](https://w3c.github.io/webcodecs/) and
+[requestVideoFrameCallback](https://wicg.github.io/video-rvfc/#video-frame-callback-metadata-attributes).
+
+For the purposes of this proposal, we use the following definitions:
+* The *capturer system* is a system that originally captures a media frame,
+  typically from a local camera, microphone or screen-share session. This frame
+  can be relayed through multiple systems before it reaches its final
+  destination.
+* The *receiver system* is the final destination of the captured frames. It
+  receives the data via an
+  [RTCPeerConnection](https://w3c.github.io/webrtc-pc/#dom-rtcpeerconnection)
+  and it uses the WebRTC Encoded
+  Transform API with the changes included in this proposal.
+* The *sender system* is the system that communicates directly with the
+  *receiver system*. It may be the same as the capturer system, but not
+  necessarily. It is the last hop before the captured frames reach the receiver
+  system.
+
+The proposed new metadata fields are:
+* `receiveTime`: The time when the frame was received from the sender system.
+* `captureTime`: The time when the frame was captured by the capturer system.
+  This timestamp is set by the capturer system.
+* `senderCaptureTimeOffset`: An estimate of the offset between the capturer
+  system clock and the sender system clock. The receiver system can
+  compute the clock offset between the receiver system and the sender system,
+  and these two offsets can be used to adjust the `captureTime` to the
+  receiver system clock.
+
+`captureTime` and `senderCaptureTimeOffset` are provided in WebRTC by the
+["Absolute Capture Time" header extension](https://webrtc.googlesource.com/src/+/refs/heads/main/docs/native-code/rtp-hdrext/abs-capture-time).
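+
+As a rough sketch (not part of the proposed API surface), the receiver can map
+`captureTime` into its own clock by chaining the two offsets. The helper name
+below is hypothetical; the computation of `senderReceiverClockOffset` is shown
+in the example later in this document:
+
+```js
+// Hypothetical helper: map a frame's captureTime (capturer system clock) to
+// the receiver system clock. senderReceiverClockOffset is estimated by the
+// application (e.g., from WebRTC stats); metadata.senderCaptureTimeOffset
+// comes from the Absolute Capture Time header extension.
+function captureTimeInReceiverClock(metadata, senderReceiverClockOffset) {
+  return metadata.captureTime +
+      metadata.senderCaptureTimeOffset +
+      senderReceiverClockOffset;
+}
+```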
+
+Note that the [RTCRtpContributingSource](https://www.w3.org/TR/webrtc/#dom-rtcrtpcontributingsource)
+interface also exposes these timestamps
+(see also its [extensions](https://w3c.github.io/webrtc-extensions/#rtcrtpcontributingsource-extensions)),
+but in a way that is not suitable for applications using the WebRTC Encoded
+Transform API. The reason is that encoded transforms operate per frame, while
+the values in [RTCRtpContributingSource](https://www.w3.org/TR/webrtc/#dom-rtcrtpcontributingsource)
+are the most recent seen by the UA,
+which makes it impossible to know if the values provided by
+[RTCRtpContributingSource](https://www.w3.org/TR/webrtc/#dom-rtcrtpcontributingsource)
+actually correspond to the frames being processed
+by the application.
+
+
+## User-Facing Problem
+
+This API supports applications where measuring the delay between the reception
+of a media frame and its original capture is useful.
+
+Some example use cases are:
+1. Audio/video synchronization measurements
+2. Performance measurements
+3. Delay measurements
+
+In all of these cases, the application can log the measurements for offline
+analysis or A/B testing, but also adjust application parameters in real time.
+
+
+### Goals [or Motivating Use Cases, or Scenarios]
+
+- Provide Web applications using WebRTC Encoded Transform access to receive and
+  capture timestamps in addition to existing metadata already provided.
+- Align encoded frame metadata with
+  [metadata provided for raw frames](https://w3c.github.io/webcodecs/video_frame_metadata_registry.html#videoframemetadata-members).
+
+### Non-goals
+
+- Provide mechanisms to improve WebRTC communication based on the
+information provided by these new metadata fields.
+
+
+### Example
+
+This shows an example of an application that:
+1. Computes the delay between audio and video.
+2. Computes the end-to-end delay and the processing time, and logs and/or
+updates remote parameters based on these measurements.
+
+```js
+// code in a DedicatedWorker
+let lastVideoCaptureTime;
+let lastAudioCaptureTime;
+let lastVideoReceiveTime;
+let lastVideoSenderCaptureTimeOffset;
+let lastVideoProcessingTime;
+let senderReceiverClockOffset = null;
+
+function updateAVSync() {
+  const avSyncDifference = lastVideoCaptureTime - lastAudioCaptureTime;
+  doSomethingWithAVSync(avSyncDifference);
+}
+
+// Measures delay from original capture until reception by this system.
+// Other forms of delay are also possible.
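+// adjustedCaptureTime below maps captureTime (capturer system clock) into
+// this system's clock by adding the two clock offsets, as described in the
+// introduction.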
+function updateEndToEndVideoDelay() {
+  if (senderReceiverClockOffset == null) {
+    return;
+  }
+
+  const adjustedCaptureTime =
+      senderReceiverClockOffset + lastVideoSenderCaptureTimeOffset + lastVideoCaptureTime;
+  const endToEndDelay = lastVideoReceiveTime - adjustedCaptureTime;
+  doSomethingWithEndToEndDelay(endToEndDelay);
+}
+
+function updateVideoProcessingTime() {
+  const processingTime = lastVideoProcessingTime - lastVideoReceiveTime;
+  doSomethingWithProcessingTime(processingTime);
+}
+
+function createReceiverAudioTransform() {
+  return new TransformStream({
+    start() {},
+    flush() {},
+    async transform(encodedFrame, controller) {
+      const metadata = encodedFrame.getMetadata();
+      lastAudioCaptureTime = metadata.captureTime;
+      updateAVSync();
+      controller.enqueue(encodedFrame);
+    }
+  });
+}
+
+function createReceiverVideoTransform() {
+  return new TransformStream({
+    start() {},
+    flush() {},
+    async transform(encodedFrame, controller) {
+      const metadata = encodedFrame.getMetadata();
+      lastVideoCaptureTime = metadata.captureTime;
+      updateAVSync();
+      lastVideoReceiveTime = metadata.receiveTime;
+      lastVideoSenderCaptureTimeOffset = metadata.senderCaptureTimeOffset;
+      updateEndToEndVideoDelay();
+      doSomeEncodedVideoProcessing(encodedFrame.data);
+      lastVideoProcessingTime = performance.now();
+      updateVideoProcessingTime();
+      controller.enqueue(encodedFrame);
+    }
+  });
+}
+
+// Code to instantiate transforms and attach them to sender/receiver pipelines.
+onrtctransform = (event) => {
+  let transform;
+  if (event.transformer.options.name == "receiverAudioTransform")
+    transform = createReceiverAudioTransform();
+  else if (event.transformer.options.name == "receiverVideoTransform")
+    transform = createReceiverVideoTransform();
+  else
+    return;
+  event.transformer.readable
+      .pipeThrough(transform)
+      .pipeTo(event.transformer.writable);
+};
+
+onmessage = (event) => {
+  senderReceiverClockOffset = event.data;
+};
+
+
+// Code running on Window (in a module script, so top-level await is available)
+const worker = new Worker('worker.js');
+const pc = new RTCPeerConnection();
+
+// Do ICE and offer/answer exchange. Removed from this example for clarity.
+
+// Configure transforms in the worker
+pc.ontrack = e => {
+  if (e.track.kind == "video")
+    e.receiver.transform = new RTCRtpScriptTransform(worker, { name: "receiverVideoTransform" });
+  else // audio
+    e.receiver.transform = new RTCRtpScriptTransform(worker, { name: "receiverAudioTransform" });
+};
+
+// Compute the clock offset between the sender and this system.
+const stats = await pc.getStats();
+const remoteOutboundRtpStats = getRequiredStats(stats, "remote-outbound-rtp");
+const remoteInboundRtpStats = getRequiredStats(stats, "remote-inbound-rtp");
+const senderReceiverTimeOffset =
+    remoteOutboundRtpStats.timestamp -
+    (remoteOutboundRtpStats.remoteTimestamp +
+     remoteInboundRtpStats.roundTripTime / 2);
+
+worker.postMessage(senderReceiverTimeOffset);
+```
+
+
+## Alternatives considered
+
+### [Alternative 1]
+
+Use the values already exposed in `RTCRtpContributingSource`.
+
+`RTCRtpContributingSource` already exposes the same timestamps as in this proposal.
+The problem with using those timestamps is that it is impossible to reliably
+associate them with a specific encoded frame exposed by the WebRTC Encoded
+Transform API.
+
+This makes any of the computations in this proposal unreliable.
+
+### [Alternative 2]
+
+Expose only `captureTime` and `receiveTime`.
+
+`senderCaptureTimeOffset` is a value that is provided by the
+[Absolute Capture Time](https://webrtc.googlesource.com/src/+/refs/heads/main/docs/native-code/rtp-hdrext/abs-capture-time#absolute-capture-time)
+WebRTC header extension, but that extension updates the value only periodically
+since there is little value in computing the estimate for every packet, so it is
+strictly speaking not a per-frame value. Arguably, an application could use
+the `senderCaptureTimeOffset` already exposed in `RTCRtpContributingSource`.
+
+However, given that this value is coupled with `captureTime` in the header
+extension, it looks appropriate and more ergonomic to expose the pair in the
+frame as well. While clock offsets do not usually change significantly
+in a very short time, there is some extra accuracy in having the estimated
+offset between the capturer system and the sender for that particular frame.
+This could be more visible, for example, if the set of relays that frames
+go through from the capturer system to the sender system changes.
+
+Exposing `senderCaptureTimeOffset` also makes it clearer that the `captureTime`
+comes from the original capturer system, so it needs to be adjusted using the
+corresponding clock offset.
+
+
+### [Alternative 3]
+
+Expose a `captureTime` already adjusted to the receiver system's clock.
+
+The problem with this option is that clock offsets are estimates. Using
+estimates makes computing A/V sync more difficult and less accurate.
+
+For example, if the UA uses a single estimate during the whole session,
+the A/V sync computation will be accurate, but the capture times themselves will
+be inaccurate as the clock offset estimate is never updated. Any other
+computation made with the `captureTime` and other local timestamps will be
+inaccurate.
+
+### [Alternative 4]
+
+Expose a `localClockOffset` instead of a `senderClockOffset`.
+
+This would certainly support the use cases presented here, but it would have the
+following downsides:
+* It would introduce an inconsistency with the values exposed in `RTCRtpContributingSource`.
+  This can lead to confusion, as the `senderClockOffset` is always paired together
+  with the `captureTime` in the header extension and developers expect this association.
+* Applications can compute their own estimate of the offset between sender
+  and receiver using WebRTC Stats and can control how often to update it.
+* Some applications might be interested in computing delays using the sender
+  as reference.
+
+In short, while this would be useful, the additional value is limited compared
+with the clarity, consistency and extra possibilities offered by exposing the
+`senderClockOffset`.
+
+
+
+## Accessibility, Privacy, and Security Considerations
+
+These timestamps are already available in a form less suitable for applications
+using WebRTC Encoded Transform as part of the RTCRtpContributingSource API.
+
+* The `captureTime` field is available via the
+[RTCRtpContributingSource.captureTimestamp](https://w3c.github.io/webrtc-extensions/#dom-rtcrtpcontributingsource-capturetimestamp) field.
+
+* The `senderCaptureTimeOffset` field is available via the
+[RTCRtpContributingSource.senderCaptureTimeOffset](https://w3c.github.io/webrtc-extensions/#dom-rtcrtpcontributingsource-sendercapturetimeoffset) field.
+
+* The `receiveTime` field is available via the
+[RTCRtpContributingSource.timestamp](https://w3c.github.io/webrtc-pc/#dom-rtcrtpcontributingsource-timestamp) field.
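+
+For reference, a minimal sketch of how an application can read these
+per-source values today (assuming a connected `RTCRtpReceiver` named
+`receiver` and a UA that implements the two webrtc-extensions fields;
+`doSomethingWithPerSourceTimestamps` is a placeholder):
+
+```js
+// Per-source (not per-frame) timestamps from RTCRtpContributingSource.
+// These reflect the most recent packets seen by the UA, so they cannot be
+// reliably matched to a specific encoded frame.
+const [source] = receiver.getSynchronizationSources();
+if (source) {
+  const receiveTime = source.timestamp;          // last packet receive time
+  const captureTime = source.captureTimestamp;   // webrtc-extensions
+  const offset = source.senderCaptureTimeOffset; // webrtc-extensions
+  doSomethingWithPerSourceTimestamps(receiveTime, captureTime, offset);
+}
+```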
+
+While these fields are not 100% equivalent to the fields in this proposal,
+they have the same privacy characteristics. Therefore, we consider that the
+privacy delta of this proposal is zero.
+
+## References & acknowledgements
+
+Many thanks for valuable feedback and advice from:
+- Florent Castelli
+- Harald Alvestrand
+- Henrik Boström

From 80c78306928e71041a4d7294d1e2a4e5d85b4b8b Mon Sep 17 00:00:00 2001
From: Guido Urdaneta
Date: Fri, 7 Feb 2025 16:12:29 +0100
Subject: [PATCH 2/5] Minor fix to timestamp explainer

---
 timestamps.md | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/timestamps.md b/timestamps.md
index 20d3ca9..ae96888 100644
--- a/timestamps.md
+++ b/timestamps.md
@@ -77,12 +77,13 @@ In all of these cases, the application can log the measurements for offline
 analysis or A/B testing, but also adjust application parameters in real time.
 
 
-### Goals [or Motivating Use Cases, or Scenarios]
+### Goals
 
 - Provide Web applications using WebRTC Encoded Transform access to receive and
   capture timestamps in addition to existing metadata already provided.
 - Align encoded frame metadata with
   [metadata provided for raw frames](https://w3c.github.io/webcodecs/video_frame_metadata_registry.html#videoframemetadata-members).
 
+
 ### Non-goals
 
 - Provide mechanisms to improve WebRTC communication based on the

From 89cb317955eb61fa917b4f25959e088dcd767c52 Mon Sep 17 00:00:00 2001
From: Guido Urdaneta
Date: Fri, 7 Feb 2025 17:22:38 +0100
Subject: [PATCH 3/5] Add privacy/security questionnaire for timestamps.

---
 timestamp_sp_questionnaire.md | 76 +++++++++++++++++++++++++++++++++++
 1 file changed, 76 insertions(+)
 create mode 100644 timestamp_sp_questionnaire.md

diff --git a/timestamp_sp_questionnaire.md b/timestamp_sp_questionnaire.md
new file mode 100644
index 0000000..d8ab9eb
--- /dev/null
+++ b/timestamp_sp_questionnaire.md
@@ -0,0 +1,76 @@
+# Security and Privacy questionnaire
+
+### 2.1. What information does this feature expose, and for what purposes?
+
+This feature exposes three timestamps associated with encoded audio and video
+frames:
+* Receive Timestamp: time when a media frame was received locally.
+* Capture Timestamp: time when a media frame was originally captured, set by
+the system that captured the frame.
+* Capture Timestamp Sender Offset: clock offset between the system that
+captured the frame and the system that sent the frame to the local system.
+
+### 2.2. Do features in your specification expose the minimum amount of information necessary to implement the intended functionality?
+Yes.
+
+### 2.3. Do the features in your specification expose personal information, personally-identifiable information (PII), or information derived from either?
+No.
+
+### 2.4. How do the features in your specification deal with sensitive information?
+This feature does not deal with sensitive information.
+
+### 2.5. Does data exposed by your specification carry related but distinct information that may not be obvious to users?
+No.
+
+### 2.6. Do the features in your specification introduce state that persists across browsing sessions?
+No.
+
+### 2.7. Do the features in your specification expose information about the underlying platform to origins?
+No.
+
+### 2.8. Does this specification allow an origin to send data to the underlying platform?
+No.
+
+### 2.9. Do features in this specification enable access to device sensors?
+No.
+
+### 2.10. Do features in this specification enable new script execution/loading mechanisms?
+No.
+
+### 2.11. Do features in this specification allow an origin to access other devices?
+No.
+
+### 2.12. Do features in this specification allow an origin some measure of control over a user agent’s native UI?
+No.
+
+### 2.13. What temporary identifiers do the features in this specification create or expose to the web?
+None. It exposes timestamps, but they do not seem very useful as identifiers.
+
+### 2.14. How does this specification distinguish between behavior in first-party and third-party contexts?
+No distinction.
+
+### 2.15. How do the features in this specification work in the context of a browser’s Private Browsing or Incognito mode?
+No distinction.
+
+### 2.16. Does this specification have both "Security Considerations" and "Privacy Considerations" sections?
+This is a minor addition to an existing specification. The existing
+specification has a "Privacy and security considerations" section.
+
+### 2.17. Do features in your specification enable origins to downgrade default security protections?
+Do features in your spec enable an origin to opt-out of security settings in order to accomplish something? If so, in what situations do these features allow such downgrading, and why?
+No.
+
+### 2.18. What happens when a document that uses your feature is kept alive in BFCache (instead of getting destroyed) after navigation, and potentially gets reused on future navigations back to the document?
+In this case, peer connections are closed, and the feature becomes inaccessible.
+
+### 2.19. What happens when a document that uses your feature gets disconnected?
+In this case, peer connections are closed, and the feature becomes inaccessible.
+
+
+### 2.20. Does your spec define when and how new kinds of errors should be raised?
+This feature does not produce new kinds of errors.
+
+### 2.21. Does your feature allow sites to learn about the user’s use of assistive technology?
+No.
+
+### 2.22. What should this questionnaire have asked?
+The questions seem appropriate.

From 444d627aa9e7f6b926c97e12c0f29a7963dbc5dc Mon Sep 17 00:00:00 2001
From: Guido Urdaneta
Date: Fri, 7 Feb 2025 17:24:44 +0100
Subject: [PATCH 4/5] Some changes to timestamps explainer

---
 timestamps.md | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

diff --git a/timestamps.md b/timestamps.md
index ae96888..0e4fd06 100644
--- a/timestamps.md
+++ b/timestamps.md
@@ -20,20 +20,20 @@ and audio data is exposed as
 Both types of frames have a getMetadata() method that returns a number of
 metadata fields containing more information about the frames.
 
-This proposal consists of adding a number of additional metadata fields
+This feature consists of adding a number of additional metadata fields
 containing timestamps, in line with recent additions to
 [VideoFrameMetadata](https://w3c.github.io/webcodecs/video_frame_metadata_registry.html#videoframemetadata-members)
 in [WebCodecs](https://w3c.github.io/webcodecs/) and
 [requestVideoFrameCallback](https://wicg.github.io/video-rvfc/#video-frame-callback-metadata-attributes).
 
-For the purposes of this proposal, we use the following definitions:
+For the purposes of this feature, we use the following definitions:
 * The *capturer system* is a system that originally captures a media frame,
   typically from a local camera, microphone or screen-share session. This frame
   can be relayed through multiple systems before it reaches its final
   destination.
 * The *receiver system* is the final destination of the captured frames. It
   receives the data via an
   [RTCPeerConnection](https://w3c.github.io/webrtc-pc/#dom-rtcpeerconnection)
   and it uses the WebRTC Encoded
-  Transform API with the changes included in this proposal.
+  Transform API with the changes proposed by this feature.
 * The *sender system* is the system that communicates directly with the
   *receiver system*. It may be the same as the capturer system, but not
   necessarily. It is the last hop before the captured frames reach the receiver
   system.
@@ -212,12 +212,12 @@ worker.postMessage(senderReceiverTimeOffset);
 
 Use the values already exposed in `RTCRtpContributingSource`.
 
-`RTCRtpContributingSource` already exposes the same timestamps as in this proposal.
+`RTCRtpContributingSource` already exposes the same timestamps as this feature.
 The problem with using those timestamps is that it is impossible to reliably
 associate them with a specific encoded frame exposed by the WebRTC Encoded
 Transform API.
 
-This makes any of the computations in this proposal unreliable.
+This makes any of the computations in this feature unreliable.
 
 ### [Alternative 2]
 
@@ -291,9 +291,9 @@ using WebRTC Encoded Transform as part of the RTCRtpContributingSource API.
 
 * The `receiveTime` field is available via the
 [RTCRtpContributingSource.timestamp](https://w3c.github.io/webrtc-pc/#dom-rtcrtpcontributingsource-timestamp) field.
 
-While these fields are not 100% equivalent to the fields in this proposal,
+While these fields are not 100% equivalent to the fields in this feature,
 they have the same privacy characteristics. Therefore, we consider that the
-privacy delta of this proposal is zero.
+privacy delta of this feature is zero.
 
 ## References & acknowledgements

From 61bf6ef563283786c83b88fcd7a8a628382d6691 Mon Sep 17 00:00:00 2001
From: Guido Urdaneta
Date: Mon, 3 Feb 2025 10:42:26 +0100
Subject: [PATCH 5/5] Add receiveTime field to RTCEncodedVideoFrameMetadata
 and RTCEncodedAudioFrameMetadata

Drive-by: Fix bugs preventing proper translation of the spec.
---
 index.bs                      |  25 +++
 timestamp_sp_questionnaire.md |  76 ---------
 timestamps.md                 | 303 ----------------------------------
 3 files changed, 25 insertions(+), 379 deletions(-)
 delete mode 100644 timestamp_sp_questionnaire.md
 delete mode 100644 timestamps.md

diff --git a/index.bs b/index.bs
index cdd4a0c..64504d5 100644
--- a/index.bs
+++ b/index.bs
@@ -358,6 +358,7 @@ dictionary RTCEncodedVideoFrameMetadata {
     sequence<unsigned long> contributingSources;
     long long timestamp;    // microseconds
     unsigned long rtpTimestamp;
+    DOMHighResTimeStamp receiveTime;
     DOMString mimeType;
 };
 
@@ -431,6 +432,18 @@ dictionary RTCEncodedVideoFrameMetadata {
       that reflects the sampling instant of the first octet in the RTP
       data packet.

+    <dt>
+      receiveTime <span class="idlMemberType">DOMHighResTimeStamp</span>
+    </dt>
+    <dd>
+      <p>
+        For frames coming from an RTCRtpReceiver, represents the timestamp
+        of the last received packet used to produce this video frame. This
+        timestamp is relative to {{Performance}}.{{Performance/timeOrigin}}.
+        Only exists for incoming video frames.
+      </p>
+    </dd>
mimeType DOMString
@@ -614,6 +627,7 @@ dictionary RTCEncodedAudioFrameMetadata { sequence<unsigned long> contributingSources; short sequenceNumber; unsigned long rtpTimestamp; + DOMHighResTimeStamp receiveTime; DOMString mimeType; }; @@ -667,6 +681,17 @@ dictionary RTCEncodedAudioFrameMetadata { that reflects the sampling instant of the first octet in the RTP data packet.

+    <dt>
+      receiveTime <span class="idlMemberType">DOMHighResTimeStamp</span>
+    </dt>
+    <dd>
+      <p>
+        For frames coming from an RTCRtpReceiver, represents the timestamp
+        of the last received packet used to produce this audio frame. This
+        timestamp is relative to {{Performance}}.{{Performance/timeOrigin}}.
+        Only exists for incoming audio frames.
+      </p>
+    </dd>

mimeType DOMString
diff --git a/timestamp_sp_questionnaire.md b/timestamp_sp_questionnaire.md
deleted file mode 100644
index d8ab9eb..0000000
--- a/timestamp_sp_questionnaire.md
+++ /dev/null
@@ -1,76 +0,0 @@
-# Security and Privacy questionnaire
-
-### 2.1. What information does this feature expose, and for what purposes?
-
-This feature exposes three timestamps associated with encoded audio and video
-frames:
-* Receive Timestamp: time when a media frame was received locally.
-* Capture Timestamp: time when a media frame was originally captured, set by
-the system that captured the frame.
-* Capture Timestamp Sender Offset: clock offset between the system that
-captured the frame and the system that sent the frame to the local system.
-
-### 2.2. Do features in your specification expose the minimum amount of information necessary to implement the intended functionality?
-Yes.
-
-### 2.3. Do the features in your specification expose personal information, personally-identifiable information (PII), or information derived from either?
-No.
-
-### 2.4. How do the features in your specification deal with sensitive information?
-This feature does not deal with sensitive information.
-
-### 2.5. Does data exposed by your specification carry related but distinct information that may not be obvious to users?
-No.
-
-### 2.6. Do the features in your specification introduce state that persists across browsing sessions?
-No.
-
-### 2.7. Do the features in your specification expose information about the underlying platform to origins?
-No.
-
-### 2.8. Does this specification allow an origin to send data to the underlying platform?
-No.
-
-### 2.9. Do features in this specification enable access to device sensors?
-No.
-
-### 2.10. Do features in this specification enable new script execution/loading mechanisms?
-No.
-
-### 2.11. Do features in this specification allow an origin to access other devices?
-No.
-
-### 2.12. Do features in this specification allow an origin some measure of control over a user agent’s native UI?
-No.
-
-### 2.13. What temporary identifiers do the features in this specification create or expose to the web?
-None. It exposes timestamps, but they do not seem very useful as identifiers.
-
-### 2.14. How does this specification distinguish between behavior in first-party and third-party contexts?
-No distinction.
-
-### 2.15. How do the features in this specification work in the context of a browser’s Private Browsing or Incognito mode?
-No distinction.
-
-### 2.16. Does this specification have both "Security Considerations" and "Privacy Considerations" sections?
-This is a minor addition to an existing specification. The existing
-specification has a "Privacy and security considerations" section.
-
-### 2.17. Do features in your specification enable origins to downgrade default security protections?
-Do features in your spec enable an origin to opt-out of security settings in order to accomplish something? If so, in what situations do these features allow such downgrading, and why?
-No.
-
-### 2.18. What happens when a document that uses your feature is kept alive in BFCache (instead of getting destroyed) after navigation, and potentially gets reused on future navigations back to the document?
-In this case, peer connections are closed, and the feature becomes inaccessible.
-
-### 2.19. What happens when a document that uses your feature gets disconnected?
-In this case, peer connections are closed, and the feature becomes inaccessible.
-
-
-### 2.20. 
Does your spec define when and how new kinds of errors should be raised?
-This feature does not produce new kinds of errors.
-
-### 2.21. Does your feature allow sites to learn about the user’s use of assistive technology?
-No.
-
-### 2.22. What should this questionnaire have asked?
-The questions seem appropriate.

diff --git a/timestamps.md b/timestamps.md
deleted file mode 100644
index 0e4fd06..0000000
--- a/timestamps.md
+++ /dev/null
@@ -1,303 +0,0 @@
-# Extra Timestamps for encoded RTC media frames
-
-## Authors:
-
-- Guido Urdaneta (Google)
-
-## Participate
-- https://github.com/w3c/webrtc-encoded-transform
-
-
-## Introduction
-
-The [WebRTC Encoded Transform](https://w3c.github.io/webrtc-encoded-transform/)
-API allows applications to access encoded media flowing through a WebRTC
-[RTCPeerConnection](https://w3c.github.io/webrtc-pc/#dom-rtcpeerconnection).
-Video data is exposed as
-[RTCEncodedVideoFrame](https://w3c.github.io/webrtc-encoded-transform/#rtcencodedvideoframe)s
-and audio data is exposed as
-[RTCEncodedAudioFrame](https://w3c.github.io/webrtc-encoded-transform/#rtcencodedaudioframe)s.
-Both types of frames have a getMetadata() method that returns a number of
-metadata fields containing more information about the frames.
-
-This feature consists of adding a number of additional metadata fields
-containing timestamps, in line with recent additions to
-[VideoFrameMetadata](https://w3c.github.io/webcodecs/video_frame_metadata_registry.html#videoframemetadata-members)
-in [WebCodecs](https://w3c.github.io/webcodecs/) and
-[requestVideoFrameCallback](https://wicg.github.io/video-rvfc/#video-frame-callback-metadata-attributes).
-
-For the purposes of this feature, we use the following definitions:
-* The *capturer system* is a system that originally captures a media frame,
-  typically from a local camera, microphone or screen-share session. This frame
-  can be relayed through multiple systems before it reaches its final
-  destination.
-* The *receiver system* is the final destination of the captured frames. It
-  receives the data via an
-  [RTCPeerConnection](https://w3c.github.io/webrtc-pc/#dom-rtcpeerconnection)
-  and it uses the WebRTC Encoded
-  Transform API with the changes proposed by this feature.
-* The *sender system* is the system that communicates directly with the
-  *receiver system*. It may be the same as the capturer system, but not
-  necessarily. It is the last hop before the captured frames reach the receiver
-  system.
-
-The proposed new metadata fields are:
-* `receiveTime`: The time when the frame was received from the sender system.
-* `captureTime`: The time when the frame was captured by the capturer system.
-  This timestamp is set by the capturer system.
-* `senderCaptureTimeOffset`: An estimate of the offset between the capturer
-  system clock and the sender system clock. The receiver system can
-  compute the clock offset between the receiver system and the sender system,
-  and these two offsets can be used to adjust the `captureTime` to the
-  receiver system clock.
-
-`captureTime` and `senderCaptureTimeOffset` are provided in WebRTC by the
-["Absolute Capture Time" header extension](https://webrtc.googlesource.com/src/+/refs/heads/main/docs/native-code/rtp-hdrext/abs-capture-time).
-
-Note that the [RTCRtpContributingSource](https://www.w3.org/TR/webrtc/#dom-rtcrtpcontributingsource)
-interface also exposes these timestamps
-(see also its [extensions](https://w3c.github.io/webrtc-extensions/#rtcrtpcontributingsource-extensions)),
-but in a way that is not suitable for applications using the WebRTC Encoded
-Transform API. The reason is that encoded transforms operate per frame, while
-the values in [RTCRtpContributingSource](https://www.w3.org/TR/webrtc/#dom-rtcrtpcontributingsource)
-are the most recent seen by the UA,
-which makes it impossible to know if the values provided by
-[RTCRtpContributingSource](https://www.w3.org/TR/webrtc/#dom-rtcrtpcontributingsource)
-actually correspond to the frames being processed
-by the application.
-
-
-## User-Facing Problem
-
-This API supports applications where measuring the delay between the reception
-of a media frame and its original capture is useful.
-
-Some example use cases are:
-1. Audio/video synchronization measurements
-2. Performance measurements
-3. Delay measurements
-
-In all of these cases, the application can log the measurements for offline
-analysis or A/B testing, but also adjust application parameters in real time.
-
-
-### Goals
-
-- Provide Web applications using WebRTC Encoded Transform access to receive and
-  capture timestamps in addition to existing metadata already provided.
-- Align encoded frame metadata with
-  [metadata provided for raw frames](https://w3c.github.io/webcodecs/video_frame_metadata_registry.html#videoframemetadata-members).
-
-
-### Non-goals
-
-- Provide mechanisms to improve WebRTC communication based on the
-information provided by these new metadata fields.
-
-
-### Example
-
-This shows an example of an application that:
-1. Computes the delay between audio and video.
-2. Computes the end-to-end delay and the processing time, and logs and/or
-updates remote parameters based on these measurements.
-
-```js
-// code in a DedicatedWorker
-let lastVideoCaptureTime;
-let lastAudioCaptureTime;
-let lastVideoReceiveTime;
-let lastVideoSenderCaptureTimeOffset;
-let lastVideoProcessingTime;
-let senderReceiverClockOffset = null;
-
-function updateAVSync() {
-  const avSyncDifference = lastVideoCaptureTime - lastAudioCaptureTime;
-  doSomethingWithAVSync(avSyncDifference);
-}
-
-// Measures delay from original capture until reception by this system.
-// Other forms of delay are also possible.
-function updateEndToEndVideoDelay() {
-  if (senderReceiverClockOffset == null) {
-    return;
-  }
-
-  const adjustedCaptureTime =
-      senderReceiverClockOffset + lastVideoSenderCaptureTimeOffset + lastVideoCaptureTime;
-  const endToEndDelay = lastVideoReceiveTime - adjustedCaptureTime;
-  doSomethingWithEndToEndDelay(endToEndDelay);
-}
-
-function updateVideoProcessingTime() {
-  const processingTime = lastVideoProcessingTime - lastVideoReceiveTime;
-  doSomethingWithProcessingTime(processingTime);
-}
-
-function createReceiverAudioTransform() {
-  return new TransformStream({
-    start() {},
-    flush() {},
-    async transform(encodedFrame, controller) {
-      const metadata = encodedFrame.getMetadata();
-      lastAudioCaptureTime = metadata.captureTime;
-      updateAVSync();
-      controller.enqueue(encodedFrame);
-    }
-  });
-}
-
-function createReceiverVideoTransform() {
-  return new TransformStream({
-    start() {},
-    flush() {},
-    async transform(encodedFrame, controller) {
-      const metadata = encodedFrame.getMetadata();
-      lastVideoCaptureTime = metadata.captureTime;
-      updateAVSync();
-      lastVideoReceiveTime = metadata.receiveTime;
-      lastVideoSenderCaptureTimeOffset = metadata.senderCaptureTimeOffset;
-      updateEndToEndVideoDelay();
-      doSomeEncodedVideoProcessing(encodedFrame.data);
-      lastVideoProcessingTime = performance.now();
-      updateVideoProcessingTime();
-      controller.enqueue(encodedFrame);
-    }
-  });
-}
-
-// Code to instantiate transforms and attach them to sender/receiver pipelines.
-onrtctransform = (event) => {
-  let transform;
-  if (event.transformer.options.name == "receiverAudioTransform")
-    transform = createReceiverAudioTransform();
-  else if (event.transformer.options.name == "receiverVideoTransform")
-    transform = createReceiverVideoTransform();
-  else
-    return;
-  event.transformer.readable
-      .pipeThrough(transform)
-      .pipeTo(event.transformer.writable);
-};
-
-onmessage = (event) => {
-  senderReceiverClockOffset = event.data;
-};
-
-
-// Code running on Window (in a module script, so top-level await is available)
-const worker = new Worker('worker.js');
-const pc = new RTCPeerConnection();
-
-// Do ICE and offer/answer exchange. Removed from this example for clarity.
-
-// Configure transforms in the worker
-pc.ontrack = e => {
-  if (e.track.kind == "video")
-    e.receiver.transform = new RTCRtpScriptTransform(worker, { name: "receiverVideoTransform" });
-  else // audio
-    e.receiver.transform = new RTCRtpScriptTransform(worker, { name: "receiverAudioTransform" });
-};
-
-// Compute the clock offset between the sender and this system.
-const stats = await pc.getStats();
-const remoteOutboundRtpStats = getRequiredStats(stats, "remote-outbound-rtp");
-const remoteInboundRtpStats = getRequiredStats(stats, "remote-inbound-rtp");
-const senderReceiverTimeOffset =
-    remoteOutboundRtpStats.timestamp -
-    (remoteOutboundRtpStats.remoteTimestamp +
-     remoteInboundRtpStats.roundTripTime / 2);
-
-worker.postMessage(senderReceiverTimeOffset);
-```
-
-
-## Alternatives considered
-
-### [Alternative 1]
-
-Use the values already exposed in `RTCRtpContributingSource`.
-
-`RTCRtpContributingSource` already exposes the same timestamps as this feature.
-The problem with using those timestamps is that it is impossible to reliably
-associate them with a specific encoded frame exposed by the WebRTC Encoded
-Transform API.
-
-This makes any of the computations in this feature unreliable.
-
-### [Alternative 2]
-
-Expose only `captureTime` and `receiveTime`.
-
-`senderCaptureTimeOffset` is a value that is provided by the
-[Absolute Capture Time](https://webrtc.googlesource.com/src/+/refs/heads/main/docs/native-code/rtp-hdrext/abs-capture-time#absolute-capture-time)
-WebRTC header extension, but that extension updates the value only periodically
-since there is little value in computing the estimate for every packet, so it is
-strictly speaking not a per-frame value. Arguably, an application could use
-the `senderCaptureTimeOffset` already exposed in `RTCRtpContributingSource`.
-
-However, given that this value is coupled with `captureTime` in the header
-extension, it looks appropriate and more ergonomic to expose the pair in the
-frame as well. While clock offsets do not usually change significantly
-in a very short time, there is some extra accuracy in having the estimated
-offset between the capturer system and the sender for that particular frame.
-This could be more visible, for example, if the set of relays that frames
-go through from the capturer system to the sender system changes.
-
-Exposing `senderCaptureTimeOffset` also makes it clearer that the `captureTime`
-comes from the original capturer system, so it needs to be adjusted using the
-corresponding clock offset.
-
-
-### [Alternative 3]
-
-Expose a `captureTime` already adjusted to the receiver system's clock.
-
-The problem with this option is that clock offsets are estimates. Using
-estimates makes computing A/V sync more difficult and less accurate.
-
-For example, if the UA uses a single estimate during the whole session,
-the A/V sync computation will be accurate, but the capture times themselves will
-be inaccurate as the clock offset estimate is never updated. Any other
-computation made with the `captureTime` and other local timestamps will be
-inaccurate.
-
-### [Alternative 4]
-
-Expose a `localClockOffset` instead of a `senderClockOffset`.
-
-This would certainly support the use cases presented here, but it would have the
-following downsides:
-* It would introduce an inconsistency with the values exposed in `RTCRtpContributingSource`.
-  This can lead to confusion, as the `senderClockOffset` is always paired together
-  with the `captureTime` in the header extension and developers expect this association.
-* Applications can compute their own estimate of the offset between sender
-  and receiver using WebRTC Stats and can control how often to update it.
-* Some applications might be interested in computing delays using the sender
-  as reference.
-
-In short, while this would be useful, the additional value is limited compared
-with the clarity, consistency and extra possibilities offered by exposing the
-`senderClockOffset`.
-
-
-
-## Accessibility, Privacy, and Security Considerations
-
-These timestamps are already available in a form less suitable for applications
-using WebRTC Encoded Transform as part of the RTCRtpContributingSource API.
-
-* The `captureTime` field is available via the
-[RTCRtpContributingSource.captureTimestamp](https://w3c.github.io/webrtc-extensions/#dom-rtcrtpcontributingsource-capturetimestamp) field.
-
-* The `senderCaptureTimeOffset` field is available via the
-[RTCRtpContributingSource.senderCaptureTimeOffset](https://w3c.github.io/webrtc-extensions/#dom-rtcrtpcontributingsource-sendercapturetimeoffset) field.
-
-* The `receiveTime` field is available via the
-[RTCRtpContributingSource.timestamp](https://w3c.github.io/webrtc-pc/#dom-rtcrtpcontributingsource-timestamp) field.
-
-While these fields are not 100% equivalent to the fields in this feature,
-they have the same privacy characteristics. Therefore, we consider that the
-privacy delta of this feature is zero.
-
-## References & acknowledgements
-
-Many thanks for valuable feedback and advice from:
-- Florent Castelli
-- Harald Alvestrand
-- Henrik Boström