108 changes: 108 additions & 0 deletions DecoderError/explainer.md
@@ -0,0 +1,108 @@
# Decoder Error

**Authors**
* [Nishitha Burman Dey](https://github.com/nishitha-burman)

Much of this explainer synthesizes and consolidates prior discussions and contributions from [Diego Perez Botero](https://github.com/Diego-Perez-Botero), [Henrik Boström](https://github.com/henbos), [Philipp Hancke](https://github.com/fippo), [Sun Shin](https://github.com/xingri), and other members of the WebRTC working group.

## Participate
* [Issue tracker](https://github.com/MicrosoftEdge/MSEdgeExplainers/labels/DecoderError)
* [Discussion forum](https://github.com/w3c/webrtc-extensions/issues/146)

## Introduction
Game streaming platforms like Xbox Cloud Gaming and Nvidia GeForce Now rely on hardware decoding in browsers to deliver low-latency, power-efficient experiences. However, there is currently no reliable way for these applications to detect when decoding silently falls back to software during a stream.

This proposal introduces a runtime event to notify applications when a decoder error or fallback occurs. The goal is to give developers actionable visibility into runtime behavior without exposing new fingerprinting vectors or hardware details.

## User-Facing Problem
End users of game streaming services may experience increased latency, degraded quality, and battery drain when the browser switches from hardware to software decoding. Developers currently lack a way to detect this fallback in real time without prompting users for camera/mic permissions. Developers previously relied on [`decoderImplementation`](https://w3c.github.io/webrtc-stats/#dom-rtcinboundrtpstreamstats-decoderimplementation) info, but as of Chromium M110 it requires [`getUserMedia()`](https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices/getUserMedia) permissions. This is not ideal for three reasons: the permission prompt is invasive; it is excessive, since it grants access to camera and microphone hardware that the app does not need; and it has a high failure rate, since users have little reason to grant the permission unless they want to use voice chat. This gap makes it difficult to diagnose performance regressions and provide troubleshooting guidance.
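
For context, the stats-based check described above can be sketched roughly as follows. `decoderImplementation` is an existing `RTCInboundRtpStreamStats` member, but per the paragraph above Chromium requires `getUserMedia()` permissions for it, so the sketch assumes the field may be absent. The helper name `readDecoderImplementation` is made up for illustration.

```javascript
// Sketch only: scan a stats report (an RTCStatsReport, or any Map-like object)
// for the video inbound-rtp entry and return its decoderImplementation.
// Returns null when no such entry exists or when the field is withheld.
function readDecoderImplementation(statsReport) {
  for (const stats of statsReport.values()) {
    if (stats.type === "inbound-rtp" && stats.kind === "video") {
      return stats.decoderImplementation ?? null;
    }
  }
  return null;
}

// Typical use in a page (not executed here):
//   const report = await pc.getStats();
//   const impl = readDecoderImplementation(report);
```

A `null` result is ambiguous: it can mean there is no video stream yet, or that the browser is withholding the field, which is part of why this route is no longer dependable.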

## Goals
* Enable developers to detect runtime decoder fallback from hardware to software in a non-invasive way without requiring additional permissions.
* Allow applications to diagnose regressions (e.g. codec negotiation issues, device-specific problems).
* Support user experience improvements by enabling apps to adapt (e.g. lowering resolution, re-negotiating codecs), alerting end users when software decode occurs, and displaying troubleshooting information.

## Non-goals
* Exposing vendor-specific hardware information.
* Exposing deterministic codec support/capabilities beyond what [`MediaCapabilities`](https://developer.mozilla.org/en-US/docs/Web/API/Media_Capabilities_API) already provides.
* Providing detailed telemetry such as frame-level error counts or decoder identifiers.

## User Research
Feedback from Xbox Cloud Gaming, Nvidia GeForce Now and similar partners shows:
* Fallback is common in the field, and developers lack visibility into when/why it occurs.
* Reliance on `getUserMedia()` to query `decoderImplementation` has a high failure rate because users often deny permissions that are irrelevant to media playback.
* Previous workarounds (e.g. guessing based on decode times) have proven unreliable and masked bugs.
* Relying on `MediaCapabilities` is insufficient because it only provides a static capability hint and does not reflect what happens at runtime, such as hardware decode failing mid-session and the browser silently falling back to software.
* Without this signal, developers cannot confidently diagnose or reduce fallback incidence.
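
To illustrate why `MediaCapabilities` is only a static hint, here is a minimal sketch of the up-front query. `decodingInfo()` with `type: "webrtc"` is part of the Media Capabilities API; the helper name `queryStaticDecodeHint`, the injected `mediaCapabilities` parameter, and the resolution, bitrate, and framerate values are illustrative assumptions.

```javascript
// Sketch only: ask Media Capabilities whether a WebRTC video configuration is
// expected to decode power-efficiently. The answer is computed once, before
// streaming starts, and is not updated if the decoder later changes.
async function queryStaticDecodeHint(mediaCapabilities, contentType) {
  const info = await mediaCapabilities.decodingInfo({
    type: "webrtc",
    video: {
      contentType, // e.g. "video/H264"
      width: 1920, // illustrative values, not recommendations
      height: 1080,
      bitrate: 10000000,
      framerate: 60,
    },
  });
  return { supported: info.supported, powerEfficient: info.powerEfficient };
}

// Typical use in a page (not executed here):
//   const hint = await queryStaticDecodeHint(navigator.mediaCapabilities, "video/H264");
```

Nothing in the returned object changes if hardware decode fails ten minutes into a session, which is the gap this proposal targets.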

## Proposed Approach
Introduce an event on [`RTCRtpReceiver`](https://developer.mozilla.org/en-US/docs/Web/API/RTCRtpReceiver) ([see slide 30](https://docs.google.com/presentation/d/1FpCAlxvRuC0e52JrthMkx-ILklB5eHszbk8D3FIuSZ0/edit?slide=id.g2452ff65d17_0_71#slide=id.g2452ff65d17_0_71)) that fires when:

Contributor

Should we give a name to the new event?

Contributor Author

In the example code I have it as decodererror, I'll add the API shape to make this clear. But from discussions above, this name may change.

* A decoder error occurs (e.g. bitstream decode failure)

i'd merge this sentence with the previous and then we have two alternatives listed

* The engine falls back from hardware to software decoding
* No software decoder is available (e.g. in the case of H.265)

This enables applications to alert users, re-negotiate codecs, and debug issues without requiring [`getUserMedia()`](https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices/getUserMedia) permissions.

### Example
```JavaScript
const pc = new RTCPeerConnection();

pc.ontrack = (event) => {
  const receiver = event.receiver;

  // "decodererror" is the proposed event name and may change.
  receiver.addEventListener("decodererror", () => {
    console.warn("Decoder error: decoder fell back or failed");

    // Log telemetry signal (logMetric is an application-defined helper)
    logMetric("decoder_error");

Contributor

Should we augment this example showing how different errors could be handled based on RTCRtpSenderErrorEvent::errorCode? For example: "software decoding fallback" vs "decoder unavailable".

Contributor Author

@nishitha-burman nishitha-burman Oct 7, 2025

I decided to not add the scenario specific error codes because it may add potential fingerprinting vectors. Right now the event fires when there is an error, fallback happens, or when software decoder is not available.

Contributor

But aren't they part of your proposal? I was reading the slide deck that you mentioned above. Just want to make sure that the example shows how clients could fully leverage this new event. I am not sure about this adding fingerprinting concerns though (I would say that it is not a risk).

Also, is there any value for the clients of this new API in being able to differentiate between "an error", "fallback happens", and when "software decoder is not available"?

Contributor

Error sounds like a terminal state. In the case of a decoder error, it might be terminal. However, in the case of a fallback, streaming may continue. Should we create separate events for decoder fallback and decoder error to differentiate between these states? After decoder fallback, is it possible to transition back to hardware decode? If so, do we need a way to signal this transition to inform developers that software decode is no longer used?

Contributor Author

You both raise a good point. Perhaps we can differentiate between fallback and error without adding new fingerprinting vectors.

@gabrielsanbrito to answer your question if there is value in differentiating, I do think there is value because site developers can take different action based on the issue and their scenarios. For example, if there is a decoder issue the developer can switch to a different codec or profile or trigger error UI for end users. And if decoder fallback happens then the developer can adapt quality dynamically, pause non-critical efforts, and warn users that performance may be reduced.

@SteveBeckerMSFT instead of a separate event, how about an error code that differentiates between the states?
As for transitioning back to hardware decode, I am not sure whether this is possible; I will look into it.

@SteveBeckerMSFT: decoderstatechange is something I'd consider more useful for cases where the decoder switches e.g. from AV1 to VP9 in a multiparty scenario, basically informing the application that the codec and its characteristic changed without the need for polling getStats. I like it!

decodeerror as name SGTM but I think we want an error (EncodingError, which is more specific than DataError and OperationError?) so we can differentiate between fatal ("you need to switch from H265 to H264") and non-fatal ("we went to SW and this is going to drain your battery"), similar to slide 20.
From that slide we want timestamp, which should be named rtpTimestamp because that is what pinpoints where in the bitstream the error occurred.

Contributor Author

@fippo is it common for the decoder to go between hardware <-> software and between codecs?

@nishitha-burman currently hardware->software due to decode errors can happen only once since there is no mechanism to switch back to hardware and retry yet.

Switching codecs is common in multiparty scenarios where clients might be picking the "best" codec for a certain group of peers (ideally hardware accelerated) and then a browser not supporting that codec joins and forces a downgrade.

Contributor Author

Thanks everyone for the feedback! I chatted with Steve and we are thinking of updating the API shape to the following:

partial interface RTCRtpReceiver {
    attribute EventHandler ondecoderstatechange;
};

interface RTCDecoderStateChangeEvent : Event {
    constructor(DOMString type, RTCDecoderStateChangeEventInit eventInitDict);

    // Media timeline reference
    readonly attribute unsigned long rtpTimestamp;

    // Codec now in effect after the change.
    readonly attribute DOMString codecString;

    // Align with MediaCapabilitiesInfo; powerEfficient changes primarily based on hardware/software decoder
    // https://www.w3.org/TR/media-capabilities/#media-capabilities-info
    readonly attribute boolean powerEfficient;
};


    // Notify the user in a simple way (showToast is an application-defined helper)
    showToast("Playback quality may be reduced");
  });
};

```
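
The review thread on this example floats an alternative shape: a `decoderstatechange` event carrying `rtpTimestamp`, `codecString`, and `powerEfficient`. None of that is specified or shipped; purely as a sketch of how an application might branch on such an event if that shape were adopted, with `classifyDecoderChange`, `watchDecoder`, and the returned labels all being hypothetical names:

```javascript
// Hypothetical: assumes the event shape discussed in review
// ({ rtpTimestamp, codecString, powerEfficient }). Not a shipped API.

// Compare the previous and current decoder state and name what changed.
// A codec change takes priority over a power-efficiency change.
function classifyDecoderChange(previous, current) {
  if (previous.codecString !== current.codecString) {
    return "codec-changed";
  }
  if (previous.powerEfficient && !current.powerEfficient) {
    return "fell-back-to-software";
  }
  return "no-action";
}

// Track the last known state for one receiver and report each transition.
function watchDecoder(receiver, initialState, onChange) {
  let last = initialState;
  receiver.addEventListener("decoderstatechange", (event) => {
    const next = {
      codecString: event.codecString,
      powerEfficient: event.powerEfficient,
    };
    onChange(classifyDecoderChange(last, next), event.rtpTimestamp);
    last = next;
  });
}
```

Keeping the classification in a pure function means the application logic can be unit-tested without a live `RTCPeerConnection`, whatever the final event name turns out to be.
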
## Alternatives Considered
1. Use [`decoderImplementation`](https://w3c.github.io/webrtc-stats/#dom-rtcinboundrtpstreamstats-decoderimplementation) info via WebRTC Stats API
   * Rejected because it now requires [`getUserMedia()`](https://developer.mozilla.org/en-US/docs/Web/API/MediaDevices/getUserMedia) permissions, which are invasive and have a high failure rate.
   * Requires unnecessary permissions (camera/microphone).
2. Use [`MediaCapabilitiesInfo.powerEfficient`](https://www.w3.org/TR/media-capabilities/#media-capabilities-info)
   * Rejected because this is a static hint, not a runtime signal.
   * Does not update when the browser silently switches from hardware to software.
3. Guess based on decode times
   * Unreliable and has masked bugs in production.
4. [Add `decoderFallback` field to `RTCInboundRtpStreamStats`](https://github.com/w3c/webrtc-stats/pull/725)
   * Rejected because relying on stats to trigger a change felt like an anti-pattern and the recommendation was to explore an event-driven solution. Additionally, there were concerns around fingerprinting.
   * [WebRTC March 2023 meeting – 21 March 2023](https://www.w3.org/2023/03/21-webrtc-minutes.html)
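
For comparison, the decode-time heuristic from alternative 3 might look like the sketch below. `totalDecodeTime` (seconds) and `framesDecoded` are existing `RTCInboundRtpStreamStats` counters; the helper name and the 8 ms default threshold are arbitrary assumptions, not values taken from any partner's implementation.

```javascript
// Sketch only: guess whether decoding is in software from the average
// per-frame decode time between two stats snapshots.
// Returns true/false for the guess, or null if no frames were decoded.
function guessSoftwareDecode(prev, curr, thresholdMs = 8) {
  const frames = curr.framesDecoded - prev.framesDecoded;
  if (frames <= 0) {
    return null; // nothing decoded in this interval
  }
  const seconds = curr.totalDecodeTime - prev.totalDecodeTime;
  const averageMs = (seconds / frames) * 1000;
  return averageMs > thresholdMs;
}
```

Any fixed threshold depends on resolution, codec, and device class, so a slow hardware decoder and a fast software decoder can land on the same side of it. That is consistent with the unreliability noted above.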

## Privacy Considerations
* The event does not expose hardware vendor or device identity, reducing fingerprinting risk.
* Does not reveal deterministic codec/hardware support.

### Counter-argument to fingerprinting concerns:
* **Information is already exposed via Media Capabilities**: Hardware/software decode status is already partially exposed via the [`MediaCapabilitiesInfo.powerEfficient`](https://www.w3.org/TR/media-capabilities/#media-capabilities-info) attribute. A “common implementation strategy” is to treat hardware usage as indicative of optimal power draw.
* **Fallback doesn’t directly reveal capability**: The fallback event does not deterministically expose hardware support, as software fallback may occur for various reasons, making it a dynamic and contextual signal rather than a static fingerprint. Software fallback may occur because:
  * Device lacks hardware support for the specific codec.
  * The hardware decoder is temporarily unavailable.

## Stakeholder Feedback
* Web Developers: Positive
  * [Xbox Cloud Gaming](https://github.com/w3c/webrtc-stats/pull/725#discussion_r1093134014) & Nvidia GeForce Now have direct use cases.
* Chromium: Positive; actively pursuing proposal.
* WebKit & Gecko: Overall positive feedback, but privacy/fingerprinting is a common concern.

Contributor

Do we have links to their official positions on this? Mozilla, WebKit

Contributor

We generally don't/can't request those until after the initial explainer is checked in.

And @xingri can tell us what GFN thinks!

@xingri xingri Oct 17, 2025

Thanks @fippo for sharing this.
Thanks @nishitha-burman for the proposal to surface decoder errors to developers.

This is a much-needed improvement for media diagnostics and reliability. We’ve had some internal discussions inside NVIDIA and would like to share the following feedback:

1. Differentiating Between Fallback and Error
   We agree with the comment in the PR that it’s important to distinguish between a decoder error and a fallback. These are conceptually different, and conflating them could lead to misinterpretation in telemetry or user-facing diagnostics. Additionally, we’d like clarity on whether an error that leads to a fallback would trigger multiple events. Understanding which events are terminal (i.e., indicate unrecoverable failure) versus transitional (e.g., fallback to software decoding) is crucial for building robust error handling logic.

2. What Triggers a Fallback?
   When discussing fallback scenarios, it would be helpful to clarify what conditions, besides decoder errors, might prompt a fallback to software decoding. For example:
   * Hardware limitations or unsupported configurations.
   * Performance-related decisions (e.g., slow decode).
   * Queue flushes or resource constraints.

   Providing a taxonomy of fallback triggers would help developers better interpret the context of these events.

3. Linking Decoder Anomalies to Outcomes
   We’re also interested in understanding how developers can determine whether a decoder anomaly resulted in a fallback or a failure. For instance:
   * Could slow decode or queue flushes be surfaced as errors?
   * Would there be a way to correlate anomalies with fallback decisions?

   It may be valuable to include metadata such as the frame number where the fallback occurred.

Contributor Author

Thank you for your feedback @xingri! I just replied to the conversation above with a new proposal based on the feedback we received. Let me know if that addresses your concerns, especially around what triggers a fallback.


Last discussed in the 2025-09-16 WebRTC WG Call: [Slides 17-21](https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/edit?slide=id.g37afa1cfe47_0_26#slide=id.g37afa1cfe47_0_26) & [minutes](https://www.w3.org/2025/09/16-webrtc-minutes.html)

## References & Acknowledgements
Many thanks for valuable feedback and advice from:
* [Steve Becker](https://github.com/SteveBeckerMSFT)
* [Nic Champagne Williamson](https://github.com/champnic)

Links to past working group meetings where this has been discussed:
* 2025-09-16 WebRTC WG Call: [Slides 17-21](https://docs.google.com/presentation/d/11rr8X4aOao1AmvyoDLX8o9CPCmnDHkWGRM3nB4Q_104/edit?slide=id.g37afa1cfe47_0_26#slide=id.g37afa1cfe47_0_26) & [minutes](https://www.w3.org/2025/09/16-webrtc-minutes.html)
* 2023-09-15 WebRTC WG Call: [Slides 25-31](https://docs.google.com/presentation/d/1FpCAlxvRuC0e52JrthMkx-ILklB5eHszbk8D3FIuSZ0/edit?slide=id.g2452ff65d17_0_71#slide=id.g2452ff65d17_0_71) & minutes
* 2023-03-21 WebRTC WG Call: [Slides 16-18](https://lists.w3.org/Archives/Public/www-archive/2023Mar/att-0004/WEBRTCWG-2023-03-21.pdf) & [minutes](https://www.w3.org/2023/03/21-webrtc-minutes.html)

1 change: 1 addition & 0 deletions README.md
@@ -100,6 +100,7 @@ we move them into the [Alumni section](#alumni-) below.
| [SelectiveClipboardFormatRead](ClipboardAPI/SelectiveClipboardFormatRead/explainer.md) | <a href="https://github.com/MicrosoftEdge/MSEdgeExplainers/labels/SelectiveClipboardFormatRead">![GitHub issues by-label](https://img.shields.io/github/issues/MicrosoftEdge/MSEdgeExplainers/SelectiveClipboardFormatRead?label=issues)</a> | [New Issue...](https://github.com/MicrosoftEdge/MSEdgeExplainers/issues/new?assignees=ragoulik&labels=SelectiveClipboardFormatRead&template=selective-clipboard-format-read.md&title=%5BSelective+Clipboard+Format+Read%5D+%3CTITLE+HERE%3E) | Editing |
| [Page Interaction Restriction Manager](PageInteractionRestrictionManager/explainer.md) | <a href="https://github.com/MicrosoftEdge/MSEdgeExplainers/labels/Page%20Interaction%20Restriction%20Manager">![GitHub issues by-label](https://img.shields.io/github/issues/MicrosoftEdge/MSEdgeExplainers/Page%20Interaction%20Restriction%20Manager?label=issues)</a> | [New issue...](https://github.com/MicrosoftEdge/MSEdgeExplainers/issues/new?assignees=jineens&labels=PageInteractionRestrictionManager&template=page-interaction-restriction-manager.md&title=%5BPage+Interaction+Restriction+Manager%5D+%3CTITLE+HERE%3E) | Enterprise |
| [DataTransferForInputEvent](Editing/input-event-dataTransfer-explainer.md) | <a href="https://github.com/MicrosoftEdge/MSEdgeExplainers/labels/DataTransferForInputEvent">![GitHub issues by-label](https://img.shields.io/github/issues/MicrosoftEdge/MSEdgeExplainers/DataTransferForInputEvent?label=issues)</a> | [New Issue...](https://github.com/MicrosoftEdge/MSEdgeExplainers/issues/new?assignees=pranavmodi&labels=DataTransferForInputEvent&template=data-transfer-for-input-event.md&title=%5BData+Transfer+For+Input+Event%5D+%3CTITLE+HERE%3E) | Editing |
| [Decoder Error](DecoderError/explainer.md) | <a href="https://github.com/MicrosoftEdge/MSEdgeExplainers/labels/DecoderError">![GitHub issues by-label](https://img.shields.io/github/issues/MicrosoftEdge/MSEdgeExplainers/DecoderError?label=issues)</a> | [New Issue...](https://github.com/MicrosoftEdge/MSEdgeExplainers/issues/new?assignees=nishitha-burman&labels=DecoderError&template=decoder-error.md&title=%5BDecoder+Error%5D+%3CTITLE+HERE%3E) | WebRTC |

# Brainstorming 🧠
