332 changes: 332 additions & 0 deletions index.bs

@@ -663,6 +663,9 @@ The interfaces defined are:
{{AudioNode}} which applies a non-linear waveshaping
effect for distortion and other more subtle warming effects.

* An {{AudioPlayoutStats}} interface, which provides statistics about the audio
played from the {{AudioContext}}.

There are also several features that have been deprecated from the
Web Audio API but not yet removed, pending implementation experience
of their replacements:
@@ -1488,6 +1491,7 @@ interface AudioContext : BaseAudioContext {
[SecureContext] readonly attribute (DOMString or AudioSinkInfo) sinkId;
attribute EventHandler onsinkchange;
attribute EventHandler onerror;
[SameObject] readonly attribute AudioPlayoutStats playoutStats;
AudioTimestamp getOutputTimestamp ();
Promise<undefined> resume ();
Promise<undefined> suspend ();
@@ -1533,6 +1537,11 @@ and to allow it only when the {{AudioContext}}'s [=relevant global object=] has
::
An ordered list to store pending {{Promise}}s created by
{{AudioContext/resume()}}. It is initially empty.

: <dfn>[[playout stats]]</dfn>
::
A slot where an instance of {{AudioPlayoutStats}} can be stored. It is
initially null.
</dl>

<h4 id="AudioContext-constructors">
@@ -1769,6 +1778,22 @@ Attributes</h4>
the context is {{AudioContextState/running}}.
* When the operating system reports an audio device malfunction.

: <dfn>playoutStats</dfn>
::
An instance of {{AudioPlayoutStats}} for this {{AudioContext}}.

<div algorithm="access playoutStats">
<span class="synchronous">When accessing this attribute, run the
following steps:</span>

1. If the {{[[playout stats]]}} slot is null, construct a new
{{AudioPlayoutStats}} object with [=this=] as the argument, and
store it in {{[[playout stats]]}}.

1. Return the value of the {{[[playout stats]]}} internal slot.
</div>


</dl>

<h4 id="AudioContext-methods">
@@ -11536,6 +11561,313 @@ context.audioWorklet.addModule('vumeter-processor.js').then(() => {
});
</xmp>

<h3 interface lt="AudioPlayoutStats" id="AudioPlayoutStats">
The {{AudioPlayoutStats}} Interface</h3>

The {{AudioPlayoutStats}} interface provides audio underrun and latency
statistics for audio played through the {{AudioContext}}.

Audio underruns occur when audio cannot be delivered to the output device
on time. Because rendering in an {{AudioContext}} is synchronous, a missed
deadline causes a discontinuity in the played audio, which often manifests
as an audible "click" (commonly called a glitch). Underruns are bad for the
user experience, so if any occur it can be useful for the application to
detect them and possibly take some action to improve the playout.

{{AudioPlayoutStats}} is a dedicated object for audio statistics reporting;
it reports audio underrun and playout latency statistics for the
{{AudioContext}}'s playout path via the {{AudioDestinationNode}} and its
associated audio output device. This allows applications to measure
underruns caused by an audio graph that is too expensive to render in the
available time for the current configuration, as well as underruns and
latency caused by external factors (such as overall system overload,
thermal throttling, or other applications using the audio device) in the
playout path between the {{AudioDestinationNode}} and the output device.
Underruns are defined in terms of [=underrun frames=] and [=underrun events=]:
- An <dfn>underrun frame</dfn> is an audio frame played by the output device
    that was not provided by the {{AudioContext}}. This happens when the
    {{AudioContext}} fails to provide audio frames to the output device on
    time, in which case the output device still has to play something.

    NOTE: Underrun frames are typically silence.

    This typically only happens when the rendering graph is underperforming,
    whether because the graph itself is too expensive to render in the
    available time or because of factors unrelated to the graph, such as
    overall system overload.
- An <dfn>underrun event</dfn> is the point in time at which an
    [=underrun frame=] is played immediately after a non-underrun frame.
    That is, a run of consecutive [=underrun frames=] counts as a single
    [=underrun event=].

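The definitions above can be sketched in script. This is an illustrative,
non-normative sketch; `countUnderruns` is a hypothetical helper, not part of
the API:

```javascript
// Count underrun frames and underrun events from a per-frame log,
// following the definitions above: a run of consecutive underrun
// frames counts as a single underrun event.
function countUnderruns(frames) {
  // frames: array of booleans, true when that frame was an underrun frame.
  let underrunFrames = 0;
  let underrunEvents = 0;
  let previousWasUnderrun = false;
  for (const isUnderrun of frames) {
    if (isUnderrun) {
      underrunFrames++;
      // Only the first underrun frame after a non-underrun frame
      // starts a new underrun event.
      if (!previousWasUnderrun) underrunEvents++;
    }
    previousWasUnderrun = isUnderrun;
  }
  return { underrunFrames, underrunEvents };
}

// Two separate underruns totalling five underrun frames:
const log = [false, true, true, false, false, true, true, true, false];
console.log(countUnderruns(log)); // { underrunFrames: 5, underrunEvents: 2 }
```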
<pre class="idl">
[Exposed=Window, SecureContext]
interface AudioPlayoutStats {
constructor (AudioContext context);
readonly attribute double underrunDuration;
readonly attribute unsigned long underrunEvents;
readonly attribute double totalDuration;
readonly attribute double averageLatency;
readonly attribute double minimumLatency;
readonly attribute double maximumLatency;
undefined resetLatency();
[Default] object toJSON();
};
</pre>

{{AudioPlayoutStats}} has the following internal slots:

<dl dfn-type=attribute dfn-for="AudioPlayoutStats">
: <dfn>[[audio context]]</dfn>
::
The {{AudioContext}} that this instance of {{AudioPlayoutStats}} is
associated with.

: <dfn>[[underrun duration]]</dfn>
::
The total duration in seconds of [=underrun frames=] that
{{[[audio context]]}} has played as of the last stat update, a double.
Initialized to 0.

: <dfn>[[underrun events]]</dfn>
::
The total number of [=underrun events=] that have occurred in playout by
{{[[audio context]]}} as of the last stat update, an int. Initialized
to 0.

: <dfn>[[total duration]]</dfn>
::
The total duration in seconds of all frames
(including [=underrun frames=]) that {{[[audio context]]}} has played
as of the last stat update, a double. Initialized to 0.


: <dfn>[[average latency]]</dfn>
::
The average playout latency in seconds of frames played by
{{[[audio context]]}} over the currently tracked interval, a double.
Initialized to 0.

: <dfn>[[minimum latency]]</dfn>
::
The minimum playout latency in seconds of frames played by
{{[[audio context]]}} over the currently tracked interval, a double.
Initialized to 0.

: <dfn>[[maximum latency]]</dfn>
::
The maximum playout latency in seconds of frames played by
{{[[audio context]]}} over the currently tracked interval, a double.
Initialized to 0.

: <dfn>[[latency reset time]]</dfn>
::
The time, in the clock domain of {{BaseAudioContext/currentTime}}, when
the latency statistics were last reset, a double.

</dl>

<h4 id="AudioPlayoutStats-constructors">
Constructors</h4>

<dl dfn-type="constructor" dfn-for="AudioPlayoutStats" id="dom-audioplayoutstats-constructor-audioplayoutstats">
: <dfn>AudioPlayoutStats(context)</dfn>
::
Run the following steps:
1. Set {{[[audio context]]}} to <code>context</code>.
1. Set {{[[latency reset time]]}} to 0.

<pre class=argumentdef for="AudioPlayoutStats/constructor()">
context: The {{AudioContext}} this new {{AudioPlayoutStats}} will
be associated with.
</pre>
</dl>

<h4 id="AudioPlayoutStats-attributes">
Attributes</h4>

Note: These attributes update only once per second and under specific
conditions. See the [[#update-audio-stats|update audio stats]] algorithm
and the [[#AudioPlayoutStats-mitigations|privacy mitigations]] for details.

<dl dfn-type=attribute dfn-for="AudioPlayoutStats">
: <dfn>underrunDuration</dfn>
::
    The total duration, in seconds, of [=underrun frames=] played by the
    output device of the {{AudioContext}}.

    NOTE: This metric can be used together with {{totalDuration}} to
    calculate the percentage of played-out media that was not provided by
    the {{AudioContext}}.

    Returns the value of the {{[[underrun duration]]}} internal slot.

: <dfn>underrunEvents</dfn>
::
Measures the number of [=underrun events=] that have occurred during
playout by the {{AudioContext}}.

Returns the value of the {{[[underrun events]]}} internal slot.

: <dfn>totalDuration</dfn>
::
Measures the total duration of all audio played by the {{AudioContext}},
in seconds.

Returns the value of the {{[[total duration]]}} internal slot.

: <dfn>averageLatency</dfn>
::
The average playout latency, in seconds, for audio played since the
last call to {{resetLatency()}}, or since the creation of the
{{AudioContext}} if
{{resetLatency()}} has not been called.

Returns the value of the {{[[average latency]]}} internal slot.

: <dfn>minimumLatency</dfn>
::
The minimum playout latency, in seconds, for audio played since the
last call to {{resetLatency()}}, or since the creation of the
{{AudioContext}} if
{{resetLatency()}} has not been called.

Returns the value of the {{[[minimum latency]]}} internal slot.

: <dfn>maximumLatency</dfn>
::
The maximum playout latency, in seconds, for audio played since the
last call to {{resetLatency()}}, or since the creation of the
{{AudioContext}} if
{{resetLatency()}} has not been called.

Returns the value of the {{[[maximum latency]]}} internal slot.
</dl>

<h4 id="AudioPlayoutStats-methods">
Methods</h4>

<dl dfn-type=method dfn-for="AudioPlayoutStats">
: <dfn>resetLatency()</dfn>
::
Sets the start of the interval over which latency statistics are tracked
to the current time.
When {{resetLatency()}} is called, run the following steps:

1. Set {{[[latency reset time]]}} to {{BaseAudioContext/currentTime}}.
1. Let <var>currentLatency</var> be the playout latency of the last
frame played by {{[[audio context]]}}, or 0 if no frames have been
played out yet.
1. Set {{[[average latency]]}} to <var>currentLatency</var>.
1. Set {{[[minimum latency]]}} to <var>currentLatency</var>.
1. Set {{[[maximum latency]]}} to <var>currentLatency</var>.
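A non-normative sketch of these reset semantics, using a hypothetical
`LatencyStats` helper (not part of the API): after a reset, the average,
minimum, and maximum all collapse to the latency of the last played frame.

```javascript
// Hypothetical helper illustrating the resetLatency() steps: on reset,
// average, minimum and maximum all collapse to the latency of the last
// played frame (or 0 if none has been played yet).
class LatencyStats {
  constructor() {
    this.lastLatency = 0;
    this.reset();
  }
  reset() {
    this.sum = this.lastLatency;
    this.count = 1;
    this.minimum = this.lastLatency;
    this.maximum = this.lastLatency;
  }
  // Called once per rendered quantum with its playout latency in seconds.
  record(latency) {
    this.lastLatency = latency;
    this.sum += latency;
    this.count++;
    this.minimum = Math.min(this.minimum, latency);
    this.maximum = Math.max(this.maximum, latency);
  }
  get average() {
    return this.sum / this.count;
  }
}
```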
</dl>
<h4>Updating the stats</h4>
<div id="update-audio-stats" algorithm="update audio stats">
Once per second, execute the
<a href="#update-audio-stats">update audio stats</a> algorithm:
1. If the state of {{[[audio context]]}} is not {{AudioContextState/running}}, abort these steps.
1. Let <var>canUpdate</var> be false.
1. Let <var>document</var> be the current [=this=]'s
[=relevant global object=]'s [=associated Document=].
If <var>document</var> is [=Document/fully active=] and <var>document</var>'s
[=Document/visibility state=] is `"visible"`, set <var>canUpdate</var> to
true.
1. Let <var>permission</var> be the [=permission state=] for the permission
associated with [="microphone"=] access.
If <var>permission</var> is "granted", set <var>canUpdate</var> to true.
1. If <var>canUpdate</var> is false, abort these steps.
1. Set {{[[underrun duration]]}} to the total duration (in seconds) of all
    [=underrun frames=] that the output device of {{[[audio context]]}} has
    played since the context's construction.
1. Set {{[[underrun events]]}} to the number of times that the output device
    of {{[[audio context]]}} has played an [=underrun frame=] immediately
    after a non-underrun frame since the context's construction.
1. Set {{[[total duration]]}} to the total duration of all frames (in seconds)
that {{[[audio context]]}} has played since its construction.
1. Set {{[[average latency]]}} to the average playout latency (in seconds) of
frames that {{[[audio context]]}} has played since
{{[[latency reset time]]}}.
1. Set {{[[minimum latency]]}} to the minimum playout latency (in seconds) of
frames that {{[[audio context]]}} has played since
{{[[latency reset time]]}}.
1. Set {{[[maximum latency]]}} to the maximum playout latency (in seconds) of
frames that {{[[audio context]]}} has played since
{{[[latency reset time]]}}.
</div>
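The gating conditions in the steps above can be sketched as follows. This
is non-normative; the function and its parameter names are hypothetical:

```javascript
// Whether the stats may update, per the "update audio stats" algorithm:
// the context must be running, and the document must either be fully
// active and visible, or the site must hold microphone permission.
function canUpdateStats({ contextRunning, fullyActive, visibilityState,
                          microphonePermission }) {
  if (!contextRunning) return false;
  let canUpdate = false;
  if (fullyActive && visibilityState === "visible") canUpdate = true;
  if (microphonePermission === "granted") canUpdate = true;
  return canUpdate;
}
```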

<h4>Privacy considerations of {{AudioPlayoutStats}}</h4>

<h5>Risk</h5>
Audio underrun information could be used to form a cross-site
covert channel between two cooperating sites.
One site could transmit information by intentionally causing audio glitches
(by causing very high CPU usage, for example) while the other site
could detect these glitches.
Such a covert channel requires a shared linearization point in the system
audio path (typically the audio mixer, whether in the OS or in the
browser), with effectively synchronous callbacks from that point and no
intermediate buffering that could absorb load spikes.

<h5 id="AudioPlayoutStats-mitigations">Mitigations</h5>
To inhibit the usage of such a covert channel, the API implements these
mitigations.
- The values returned by the API should not be updated more than once per
second.
- The API should be restricted to sites that fulfill at least one of the following
criteria:
1. The site has obtained
<a href="https://w3c.github.io/mediacapture-main/#dom-mediadevices-getusermedia">getUserMedia</a>
permission.

Note: The reasoning is that if a site has obtained
<a href="https://w3c.github.io/mediacapture-main/#dom-mediadevices-getusermedia">getUserMedia</a>
permission, it can receive glitch information or communicate
efficiently through use of the microphone, making access to the
information provided by {{AudioPlayoutStats}} redundant. These options
include detecting glitches through gaps in the microphone signal, or
communicating using human-inaudible sine waves. If microphone access is
ever made safer in this regard, this condition should be reconsidered.
1. The document is [=Document/fully active=] and its
[=Document/visibility state=] is `"visible"`.

Note: Assuming that neither cooperating site has microphone permission,
this criterion ensures that the site that receives the covert signal
must be visible, restricting the conditions under which the covert
channel can be used. It makes it impossible for sites to communicate
with each other using the covert channel while not visible.

<h4>Usage example</h4>
This example shows how the {{AudioPlayoutStats}} can be used to calculate audio
underrun and latency statistics, and what the statistics might be used for.
<pre line-numbers class="example" highlight="js">
let oldTotalDuration = audioContext.playoutStats.totalDuration;
let oldUnderrunDuration = audioContext.playoutStats.underrunDuration;
let oldUnderrunEvents = audioContext.playoutStats.underrunEvents;
audioContext.playoutStats.resetLatency();

// Wait while playing audio
...

// The number of seconds of audio played by the output device between the
// two measurements.
let deltaTotalDuration =
    audioContext.playoutStats.totalDuration - oldTotalDuration;
let deltaUnderrunDuration =
    audioContext.playoutStats.underrunDuration - oldUnderrunDuration;
let deltaUnderrunEvents =
    audioContext.playoutStats.underrunEvents - oldUnderrunEvents;

// underrun fraction stat over the last deltaTotalDuration seconds
let underrunFraction = deltaUnderrunDuration / deltaTotalDuration;
// underrun frequency stat over the last deltaTotalDuration seconds
let underrunFrequency = deltaUnderrunEvents / deltaTotalDuration;
// Average playout delay stat during the last deltaTotalDuration seconds
let playoutDelay = audioContext.playoutStats.averageLatency;

if (underrunFrequency > 0) {
  // Do something to prevent audio glitches, like lowering
  // WebAudio graph complexity.
}
if (playoutDelay > 0.2) {
  // Do something to reduce latency, like suggesting alternative
  // playout methods.
}

</pre>

<h2 id="processing-model">
Processing model</h2>
