Skip to content

Refactor device selection: Rename to computePolicy, remove accelerated, and add fallback#923

Open
mingmingtasd wants to merge 1 commit intowebmachinelearning:mainfrom
mingmingtasd:policy
Open

Refactor device selection: Rename to computePolicy, remove accelerated, and add fallback#923
mingmingtasd wants to merge 1 commit intowebmachinelearning:mainfrom
mingmingtasd:policy

Conversation

@mingmingtasd
Copy link
Copy Markdown
Contributor

@mingmingtasd mingmingtasd commented Mar 18, 2026

To fix #911

Description:
This PR refactors the device selection preference API to establish a more extensible framework by replacing MLPowerPreference with MLComputePolicy in MLContextOptions.

Key changes included:

  1. Renamed preference enum: MLPowerPreference has been refactored to MLComputePolicy (and the corresponding MLContextOptions dictionary member updated to computePolicy).
  2. Clarified extensibility: The spec now explicitly states that the policy is designed to be extensible to accommodate future execution profiles, moving beyond just power and performance metrics.
  3. Removed "accelerated" option: Removed the accelerated value to eliminate ambiguity, as performance and power semantics are better covered by other profiles.
  4. Introduced "fallback" policy: Added a new "fallback" compute policy. This option prioritizes maximum compatibility over other considerations (typically running on a CPU) and is highly useful for verifying a model's numeric behavior without utilizing parallel accelerators like GPUs or NPUs.

The corresponding chromium CL is https://chromium-review.googlesource.com/c/chromium/src/+/7513189

PTAL, thanks! @huningxin


Preview | Diff

<dt>"<dfn enum-value>low-power</dfn>"</dt>
<dd>Prioritizes power consumption over other considerations such as execution speed.</dd>
<dt>"<dfn enum-value>fallback</dfn>"</dt>
<dd>Prioritizes maximum compatibility over other considerations, typically running on a CPU. This is useful for testing a model's numeric behavior without utilizing parallel accelerators like GPUs or NPUs.</dd>
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This policy is useful for audio processing that doesn't want to depend on accelerators due to latency reasons. Can the text be updated to mention this case?

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(just naming) Hmm, reading the description of this enum ("Prioritizes maximum compatibility...", "useful for testing a model's numeric behavior..."), some more immediately enlightening names to readers would be something like "compatible"/"compatibility"/"stable"/"precision". I would have not guessed that description from the word "fallback", which implies you fell back to a less capable device than the one you really wanted, when actually CPU may have been exactly what you wanted (that is, it was the primary preference, not a fallback). e.g.

enum MLComputePolicy {
  "default",
  "high-performance",
  "low-power",
  "compatible", // or stable/local...
};

This policy is useful for audio processing that doesn't want to depend on accelerators due to latency reasons

Totally, as I recall some teams in the past complaining about GPU overhead for background audio filtering in chat apps, preferring to keep compute more local on the CPU. Perhaps as a separate PR, we could add an explicit "low-latency" option too, which would be even clearer in intent for that scenario.

Also "precision" would be a useful preference, as some NPU's and GPU's chop off low bits (also, separate PR).

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 for both the "compatible" (in this PR), adding "low-latency" in a next PR, and adding a preference/hint for "precision" (to MLContextOptions?).

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Let me throw in "non-accelerated" here or even "cpu" if we don't like "fallback". Reading "compatible" makes me be mildly suspecting an accelerator could still be in the mix :)

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not tied to "fallback" but it does align with existing WebGPU naming.

Some opposition to "cpu" since it describes an implementation detail and not a policy. Definition of fallback as used in the WebGPU specification https://www.w3.org/TR/webgpu/#fallback-adapter:

An adapter may be considered a fallback adapter if it has significant performance caveats in exchange for some combination of wider compatibility, more predictable behavior, or improved privacy.

whatever name chosen should help describe the behavior and not the implementation. The spec does not require cpu execution so having that in the name would seem incorrect.

Copy link
Copy Markdown
Collaborator

@zolkis zolkis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks!

Copy link
Copy Markdown
Collaborator

@fdwr fdwr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider clearer naming, otherwise LGTM sir.

<dt>"<dfn enum-value>low-power</dfn>"</dt>
<dd>Prioritizes power consumption over other considerations such as execution speed.</dd>
<dt>"<dfn enum-value>fallback</dfn>"</dt>
<dd>Prioritizes maximum compatibility over other considerations, typically running on a CPU. This is useful for testing a model's numeric behavior without utilizing parallel accelerators like GPUs or NPUs.</dd>
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(just naming) Hmm, reading the description of this enum ("Prioritizes maximum compatibility...", "useful for testing a model's numeric behavior..."), some more immediately enlightening names to readers would be something like "compatible"/"compatibility"/"stable"/"precision". I would have not guessed that description from the word "fallback", which implies you fell back to a less capable device than the one you really wanted, when actually CPU may have been exactly what you wanted (that is, it was the primary preference, not a fallback). e.g.

enum MLComputePolicy {
  "default",
  "high-performance",
  "low-power",
  "compatible", // or stable/local...
};

This policy is useful for audio processing that doesn't want to depend on accelerators due to latency reasons

Totally, as I recall some teams in the past complaining about GPU overhead for background audio filtering in chat apps, preferring to keep compute more local on the CPU. Perhaps as a separate PR, we could add an explicit "low-latency" option too, which would be even clearer in intent for that scenario.

Also "precision" would be a useful preference, as some NPU's and GPU's chop off low bits (also, separate PR).

<dt>"<dfn enum-value>low-power</dfn>"</dt>
<dd>Prioritizes power consumption over other considerations such as execution speed.</dd>
<dt>"<dfn enum-value>fallback</dfn>"</dt>
<dd>Prioritizes maximum compatibility over other considerations, typically running on a CPU. This is useful for testing a model's numeric behavior without utilizing parallel accelerators like GPUs or NPUs.</dd>
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this wording is too implementation specific:

Prioritizes maximum compatibility over other considerations, typically running on a CPU. This is useful for testing a model's numeric behavior without utilizing parallel accelerators like GPUs or NPUs

and something more general similar to the WebGPU wording where either "compatibility, more predictable behavior, or improved privacy" can be prioritized would be more consistent with existing web specification.

Copy link
Copy Markdown

@mwyrzykowski mwyrzykowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would like to see the references to CPU removed from the non-normative wording https://github.com/webmachinelearning/webnn/pull/923/changes#r3010808209 but otherwise no objections to the API surface changes ✅

@mingmingtasd
Copy link
Copy Markdown
Contributor Author

I have no preference for the naming; I'll wait for the decision. For the description wording, I'm OK with not mentioning CPU to avoid being implementation-specific.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

accelerated should be prior to powerPreference for device selection

6 participants