New principle: Sourcing models and compute for inference #551

@martinthomson

Description

This was brought up in our discussions of w3ctag/design-reviews#991, w3ctag/design-reviews#1038, translation, and a few other new APIs.

The common theme here is that these capabilities tend to rely on the availability of large, or at least relatively large, ML models. There are essentially three availability states (sketched in code after this list):

  • The model is available locally.
  • The model is available, but it needs to be downloaded.
  • The model is available in the cloud somewhere.
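
A minimal sketch of how an API might surface these states to callers. The type and the `availability()` function here are illustrative assumptions for this discussion, not the surface of any shipped or specified API:

```ts
// Hypothetical availability states for a model-backed Web API.
// Names are illustrative only; they are not taken from any spec.
type ModelAvailability =
  | "available"     // model is present locally; inference can start now
  | "downloadable"  // model exists, but must be fetched before local use
  | "remote";       // inference would be proxied to a cloud-hosted model

// Assumed feature-detection call such an API could expose.
declare function availability(): Promise<ModelAvailability>;

async function prepare(): Promise<void> {
  switch (await availability()) {
    case "available":
      // Safe to run inference immediately.
      break;
    case "downloadable":
      // First use would consume bandwidth and storage; a caller might
      // want to defer this until the user actually asks for the feature.
      break;
    case "remote":
      // Input would leave the device, which users (and sites) may care about.
      break;
  }
}
```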

Then there is the question of what responsibility users have for providing compute resources for inference. Historically, we haven't done much to limit access to bandwidth and compute resources in user agents, but some newer ML applications are expensive enough that we might want to reconsider that.
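
One way an API could mediate that cost is to gate the expensive step behind an explicit user signal. This is a sketch under that assumption: it reuses the hypothetical `availability()` from above, adds a hypothetical `download()` step, and relies on the real `navigator.userActivation.isActive` flag as the consent signal:

```ts
// Hypothetical: only trigger an expensive model download in response to a
// user gesture, so the user implicitly consents to the bandwidth/compute cost.
declare function availability(): Promise<"available" | "downloadable" | "remote">;
declare function download(): Promise<void>; // illustrative, not a real API

async function onTranslateClicked(): Promise<void> {
  if (!navigator.userActivation.isActive) {
    // Without a recent user gesture, don't start a multi-gigabyte download.
    return;
  }
  if ((await availability()) === "downloadable") {
    await download();
  }
  // ...proceed with inference once the model is locally available.
}
```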

How these APIs might manage the choices that user agents make with respect to these things is worth some discussion, to see if we can arrive at some reasonable advice.
