| 1 | -# Explainer
| 1 | +# Explainer - WebRTC Insertable Stream Processing of Media |
2 | 2 |
| 3 | -TBD
| 3 | +## Problem to be solved |
| 4 | + |
| 5 | +We need an API for processing media that: |
| 6 | +* Allows the processing to be specified by the user, not the browser |
| 7 | +* Allows the processed data to be handled by the browser as if it came through |
| 8 | + the normal pipeline |
| 9 | +* Allows the use of techniques like WASM to achieve effective processing |
| 10 | +* Allows the use of techniques like Workers to avoid blocking the main thread
| 11 | +* Does not negatively impact security or privacy of current communications |
| 12 | + |
| 13 | +## Approach |
| 14 | + |
| 15 | +This document builds on WebCodecs and tries to unify its concepts with those of the legacy PeerConnection API, in order to build an API that is:
| 16 | + |
| 17 | +* Familiar to existing PeerConnection users |
| 18 | +* Able to support user-defined component wrapping and replacement
| 19 | +* Able to support high-performance user-specified transformations
| 20 | + |
| 21 | +The central concept of the API (inherited from WebCodecs) is that each component's main role is to act as a TransformStream (defined in the WHATWG Streams spec).
| 22 | + |
| 23 | +A PeerConnection in this model is a set of TransformStreams, connected together into a network that provides the functions expected of it. In particular:
| 24 | + |
| 25 | +* MediaStreamTrack contains a TransformStream (input & output: Media samples) |
| 26 | +* RTPSender contains a TransformStream (input: Media samples, output: RTP packets) |
| 27 | +* RTPReceiver contains a TransformStream (input: RTP packets, output: Media samples) |
| 28 | + |
| 29 | + |
| 30 | +RTPSender and RTPReceiver are composable objects: a sender has an encoder and an
| 31 | +RTP packetizer, which pipe into each other; a receiver has an RTP depacketizer
| 32 | +and a decoder.
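The composition of a sender can be sketched with plain WHATWG streams. This is a toy illustration under stated assumptions, not the proposed API: `makeToyEncoder`, `makeToyPacketizer`, and `makeToySender` are hypothetical names, the "encoder" only re-wraps the frame, and the "packetizer" simply splits the payload into fixed-size chunks.

```javascript
// Toy stand-ins for the encoder and the RTP packetizer (hypothetical names).
function makeToyEncoder() {
  return new TransformStream({
    transform(frame, controller) {
      // Pretend to encode: re-wrap the raw frame's bytes as an "encoded frame".
      controller.enqueue({ timestamp: frame.timestamp, data: frame.data });
    }
  });
}

function makeToyPacketizer(maxPayload = 4) {
  return new TransformStream({
    transform(encoded, controller) {
      // Split one encoded frame into packets of at most maxPayload bytes.
      for (let off = 0; off < encoded.data.length; off += maxPayload) {
        controller.enqueue({
          timestamp: encoded.timestamp,
          payload: encoded.data.subarray(off, off + maxPayload)
        });
      }
    }
  });
}

// A sender exposes the encoder's writable side and the packetizer's readable side.
function makeToySender() {
  const encoder = makeToyEncoder();
  const packetizer = makeToyPacketizer();
  const packets = encoder.readable.pipeThrough(packetizer);
  return { writable: encoder.writable, readable: packets };
}
```

Because the sender is itself just a `{ writable, readable }` pair, any stage inside it could be wrapped or replaced without the rest of the pipeline noticing.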
| 33 | + |
| 34 | + |
| 35 | +The encoder is an object that takes a Stream(raw frames) and emits a Stream(encoded frames). It will also expose control (non-data) interfaces, such as requesting a keyframe or setting the normal keyframe interval, target bitrate, and so on.
| 36 | + |
| 37 | +## Code examples |
| 38 | + |
| 39 | +We can pass a factory function to the PeerConnection that builds the wrapped pipeline whenever an encoder is needed (and similarly for the decoder):
| 40 | + |
| 41 | +<pre> |
| 42 | +pc = new PeerConnection({
| 43 | +  encoderFactory: (encoder) => {
| 44 | +    const munger = new TransformStream({
| 45 | +      transform: munge  // user-supplied (chunk, controller) callback
| 46 | +    });
| 47 | +    const wrappedEncoder = { readable: munger.readable,
| 48 | +                             writable: encoder.writable };
| 49 | +    encoder.readable.pipeTo(munger.writable);
| 50 | +    return wrappedEncoder;
| 51 | +  }
| 52 | +});
| 53 | +</pre> |
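The example leaves `munge` undefined. As a minimal sketch, assuming it is used as the stream's `transform` callback and that each encoded frame is an object with a `Uint8Array` field named `data` (both assumptions, since the frame type is not specified here), it might look like this:

```javascript
// Hypothetical transform callback: XOR every payload byte with a fixed key.
// Assumes each chunk is an object carrying a Uint8Array in `data`.
const KEY = 0x5a;

function munge(chunk, controller) {
  const out = new Uint8Array(chunk.data.length);
  for (let i = 0; i < out.length; i++) {
    out[i] = chunk.data[i] ^ KEY;
  }
  // Forward a copy of the chunk with the transformed payload.
  controller.enqueue({ ...chunk, data: out });
}
```

Applying the same XOR on the receiving side restores the original bytes, so a matching decoder factory could reuse the same function.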
| 54 | + |
| 55 | +The PC will then connect the returned object’s “writable” to the media input, and the returned object’s “readable” to the RTP packetizer’s input. |
| 56 | + |
| 57 | +When the processing is to be done in a worker, we let the factory method pass the pipes to the worker: |
| 58 | +<pre> |
| 59 | +pc = new PeerConnection({
| 60 | +  encoderFactory: (encoder) => {
| 61 | +    const output = new TransformStream();  // identity pipe; the worker writes munged frames into it
| 62 | +    worker.postMessage(['munge this', encoder.readable, output.writable],
| 63 | +                       [encoder.readable, output.writable]);
| 64 | +    return { readable: output.readable, writable: encoder.writable };
| 65 | +  }
| 66 | +});
| 67 | +</pre> |
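The worker side is not shown. A minimal sketch, assuming the message has the shape `[command, readable, writable]` with both streams transferred to the worker (an assumption about the message format, which this explainer does not fix), and using a placeholder `munge` that passes chunks through unchanged:

```javascript
// Worker-side sketch. Assumed message shape: [command, readable, writable].
function munge(chunk, controller) {
  // Placeholder transformation: pass the chunk through unchanged.
  controller.enqueue(chunk);
}

function handleMessage(data) {
  const [command, readable, writable] = data;
  if (command !== 'munge this') {
    return null;  // not a request this worker understands
  }
  // Run the transform entirely off the main thread; the returned promise
  // settles when the pipe finishes or fails.
  return readable
    .pipeThrough(new TransformStream({ transform: munge }))
    .pipeTo(writable);
}

// Only register the handler when actually running inside a worker.
if (typeof self !== 'undefined' && typeof importScripts === 'function') {
  self.onmessage = (event) => handleMessage(event.data);
}
```

This relies on streams being transferable via `postMessage`, which depends on browser support.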