
Commit f2ee475

Overhaul availability testing and add expected input languages for detector
Add an `expectedInputLanguages` option to language detector creation API. This allows the browser to download relevant material if necessary, or fail-fast if a language the web developer needs to support is not available.

Then, remove the `capabilities()` methods and the accompanying `AI*Capabilities` classes.

* For translator, the only useful capabilities API was `(await ai.translator.capabilities()).languagePairAvailable()`. We simplify this to `await ai.translator.availability()`. This design also avoids the complexity where we have to retrieve all the availability information for every combination of options during the call to `capabilities()`, for later sync access. Now we can just retrieve the relevant information during the call to `availability()`. Also, by unifying on using the same options for `create()` and `availability()`, we fix #24.

* For language detector, the capabilities supplied both `(await ai.languageDetector.capabilities()).available` and `(await ai.languageDetector.capabilities()).languageAvailable()`. We simplify this into `await ai.languageDetector.availability()`, which can either take no arguments (emulating `available`) or take the same `{ expectedInputLanguages }` argument as `create()` (emulating `languageAvailable()`).

See also webmachinelearning/writing-assistance-apis#22 and webmachinelearning/prompt-api#69.
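For code that has to run against builds from both before and after this change, the translator check described above could be wrapped in a small feature-detecting helper. This is only a sketch: the helper name `languagePairAvailability` and the `translatorFactory` parameter (standing in for `ai.translator`) are illustrative and not part of the API.

```js
// Hypothetical helper: returns "no", "after-download", or "readily" for a
// language pair, using whichever API shape the passed-in factory exposes.
// `translatorFactory` stands in for `ai.translator` (an assumption made so
// the function can also be exercised with a stub).
async function languagePairAvailability(translatorFactory, sourceLanguage, targetLanguage) {
  // New shape introduced by this commit: availability() takes the same
  // core options as create().
  if (typeof translatorFactory.availability === "function") {
    return translatorFactory.availability({ sourceLanguage, targetLanguage });
  }

  // Old shape removed by this commit: capabilities() fulfills with an object
  // that is then queried synchronously.
  const capabilities = await translatorFactory.capabilities();
  return capabilities.languagePairAvailable(sourceLanguage, targetLanguage);
}
```

In a page this would be called as `await languagePairAvailability(ai.translator, "en", "ja")`.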
1 parent 96c4724 commit f2ee475

2 files changed: +45 -64 lines changed


README.md

Lines changed: 41 additions & 61 deletions
@@ -77,59 +77,53 @@ Here `results` will be an array of `{ detectedLanguage, confidence }` objects, w
 
 The language being unknown is represented by `detectedLanguage` being null. The array will always contain at least 1 entry, although it could be for the unknown (`null`) language.
 
-### Capabilities, and a more realistic combined example
+### Language detection with expected input languages
 
-Both APIs provide promise-returning `capabilities()` methods which let you know, before calling `create()`, what is possible with the implementation. The capabilities object that the promise fulfills with has an `available` property which is one of `"no"`, `"after-download"`, or `"readily"`:
+If there are certain languages you need to be able to detect for your use case, you can include them in the `expectedInputLanguages` option when creating a language detector:
 
-* `"no"` means that the implementation does not support translation or language detection.
-* `"after-download"` means that the implementation supports translation or language detection, but it will have to download something (e.g. a machine learning model) before it can do anything.
-* `"readily"` means that the implementation supports translation or language detection, and at least the base model is available without any downloads.
+```js
+const detector = await ai.languageDetector.create({ expectedInputLanguages: ["en", "ja"] });
+```
+
+This will allow the implementation to download additional resources like language detection models if necessary, and will ensure that the promise is rejected with a `"NotSupportedError"` `DOMException` if the browser is unable to detect the given input languages.
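The rejection behavior described in the added paragraph could be handled along these lines. This is a minimal sketch, not part of the README: the helper name `createDetectorOrNull` is hypothetical, and the factory is passed in as a parameter (standing in for `ai.languageDetector`) purely so the function can be tried with a stub.

```js
// Hypothetical helper: try to create a detector for the languages we need,
// and return null instead of throwing when they are unsupported.
// `detectorFactory` stands in for `ai.languageDetector` (an assumption).
async function createDetectorOrNull(detectorFactory, expectedInputLanguages) {
  try {
    return await detectorFactory.create({ expectedInputLanguages });
  } catch (error) {
    // Only swallow the "not supported" case; anything else (for example an
    // aborted creation) is rethrown to the caller.
    if (error && error.name === "NotSupportedError") {
      return null;
    }
    throw error;
  }
}
```

A caller could treat a `null` result as the signal to fall back to another language detection strategy.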
 
-Each of these capabilities objects has further methods which give the state of specific translation or language detection capabilities:
+### Checking before creation, and a more realistic combined example
 
-* `languagePairAvailable(sourceLanguageTag, targetLanguageTag)`, for the `ai.translation.capabilities()` object
-* `languageAvailable(languageTag)`, for the `ai.languageDetection.capabilities()` object
+Both APIs provide the ability to know, before calling `create()`, what is possible with the implementation. This is done via `availability()` methods, which take the same options as `create()`. They return a promise, which fulfills with one of the following values:
 
-Both of these methods return `"no"`, `"after-download"`, or `"readily"`, which have the same meanings as above, except specialized to the specific arguments in question.
+* `"no"` means that the implementation does not support translation or language detection of the given language(s).
+* `"after-download"` means that the implementation supports translation or language detection of the given language(s), but it will have to download something (e.g., a machine learning model) as part of creating the associated object.
+* `"readily"` means that the implementation supports translation or language detection of the given language(s), without performing any downloads.
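The three values listed above can drive a per-language decision, which is roughly what the removed `languageAvailable("ja")` special case did. The sketch below assumes, as the commit message states, that `availability()` accepts the same `{ expectedInputLanguages }` option as `create()`. The function name `planLanguageDetection` and the shape of its return value are illustrative only.

```js
// Hypothetical helper: decide whether on-device detection can be used for
// the languages that matter to the site, and whether a download is implied.
// `detectorFactory` stands in for `ai.languageDetector` (an assumption).
async function planLanguageDetection(detectorFactory, requiredLanguages) {
  const availability = await detectorFactory.availability({
    expectedInputLanguages: requiredLanguages,
  });

  switch (availability) {
    case "readily":
      return { useLocal: true, needsDownload: false };
    case "after-download":
      return { useLocal: true, needsDownload: true };
    default:
      // "no", or any value this sketch does not recognize: use a fallback.
      return { useLocal: false, needsDownload: false };
  }
}
```

For example, `await planLanguageDetection(ai.languageDetector, ["ja"])` would indicate whether Japanese detection needs a cloud fallback before any detector is created.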
 
 Here is an example that adds capability checking to log more information and fall back to cloud services, as part of a language detection plus translation task:
 
 ```js
 async function translateUnknownCustomerInput(textToTranslate, targetLanguage) {
-  const languageDetectorCapabilities = await ai.languageDetector.capabilities();
-  const translatorCapabilities = await ai.translator.capabilities();
+  const canDetect = await ai.languageDetector.availability();
 
-  // If `languageDetectorCapabilities.available === "no"`, then assume the source language is the
+  // If there is no language detector, then assume the source language is the
   // same as the document language.
   let sourceLanguage = document.documentElement.lang;
 
   // Otherwise, let's detect the source language.
-  if (languageDetectorCapabilities.available !== "no") {
-    if (languageDetectorCapabilities.available === "after-download") {
+  if (canDetect !== "no") {
+    if (canDetect === "after-download") {
       console.log("Language detection is available, but something will have to be downloaded. Hold tight!");
     }
 
-    // Special-case check for Japanese since for our site it's particularly important.
-    if (languageDetectorCapabilities.languageAvailable("ja") === "no") {
-      console.warn("Japanese Language detection is not available. Falling back to cloud API.");
-      sourceLanguage = await useSomeCloudAPIToDetectLanguage(textToTranslate);
-    } else {
-      const detector = await ai.languageDetector.create();
-      const [bestResult] = await detector.detect(textToTranslate);
-
-      if (bestResult.detectedLanguage === null || bestResult.confidence < 0.4) {
-        // We'll just return the input text without translating. It's probably mostly punctuation
-        // or something.
-        return textToTranslate;
-      }
-      sourceLanguage = bestResult.detectedLanguage;
+    const detector = await ai.languageDetector.create();
+    const [bestResult] = await detector.detect(textToTranslate);
+
+    if (bestResult.detectedLanguage === null || bestResult.confidence < 0.4) {
+      // We'll just return the input text without translating. It's probably mostly punctuation
+      // or something.
+      return textToTranslate;
     }
+    sourceLanguage = bestResult.detectedLanguage;
   }
 
   // Now we've figured out the source language. Let's translate it!
-  // Note how we can just check `translatorCapabilities.languagePairAvailable()` instead of also checking
-  // `translatorCapabilities.available`.
-  const availability = translatorCapabilities.languagePairAvailable(sourceLanguage, targetLanguage);
+  const availability = await ai.translator.availability({ sourceLanguage, targetLanguage });
   if (availability === "no") {
     console.warn("Translation is not available. Falling back to cloud API.");
     return await useSomeCloudAPIToTranslate(textToTranslate, { sourceLanguage, targetLanguage });
@@ -232,7 +226,7 @@ enum AICapabilityAvailability { "readily", "after-download", "no" };
 [Exposed=(Window,Worker), SecureContext]
 interface AITranslatorFactory {
   Promise<AITranslator> create(AITranslatorCreateOptions options);
-  Promise<AITranslatorCapabilities> capabilities();
+  Promise<AICapabilityAvailability> availability(AITranslatorCreateCoreOptions options);
 };
 
 [Exposed=(Window,Worker), SecureContext]
@@ -246,19 +240,14 @@ interface AITranslator {
   undefined destroy();
 };
 
-[Exposed=(Window,Worker), SecureContext]
-interface AITranslatorCapabilities {
-  readonly attribute AICapabilityAvailability available;
-
-  AICapabilityAvailability languagePairAvailable(DOMString sourceLanguage, DOMString targetLanguage);
+dictionary AITranslatorCreateCoreOptions {
+  required DOMString sourceLanguage;
+  required DOMString targetLanguage;
 };
 
-dictionary AITranslatorCreateOptions {
+dictionary AITranslatorCreateOptions : AITranslatorCreateCoreOptions {
   AbortSignal signal;
   AICreateMonitorCallback monitor;
-
-  required DOMString sourceLanguage;
-  required DOMString targetLanguage;
 };
 
 dictionary AITranslatorTranslateOptions {
@@ -272,25 +261,24 @@ dictionary AITranslatorTranslateOptions {
 [Exposed=(Window,Worker), SecureContext]
 interface AILanguageDetectorFactory {
   Promise<AILanguageDetector> create(optional AILanguageDetectorCreateOptions options = {});
-  Promise<AILanguageDetectorCapabilities> capabilities();
+  Promise<AICapabilityAvailability> availability(optional AILanguageDetectorCreateCoreOptions options = {});
 };
 
 [Exposed=(Window,Worker), SecureContext]
 interface AILanguageDetector {
   Promise<sequence<LanguageDetectionResult>> detect(DOMString input,
                                                     optional AILanguageDetectorDetectOptions options = {});
 
+  readonly attribute FrozenArray<DOMString>? expectedInputLanguages;
+
   undefined destroy();
 };
 
-[Exposed=(Window,Worker), SecureContext]
-interface AILanguageDetectorCapabilities {
-  readonly attribute AICapabilityAvailability available;
-
-  AICapabilityAvailability languageAvailable(DOMString languageTag);
+dictionary AILanguageDetectorCreateCoreOptions {
+  sequence<DOMString> expectedInputLanguages;
 };
 
-dictionary AILanguageDetectorCreateOptions {
+dictionary AILanguageDetectorCreateOptions : AILanguageDetectorCreateCoreOptions {
   AbortSignal signal;
   AICreateMonitorCallback monitor;
 };
@@ -313,17 +301,9 @@ We're not clear on what the right model is here, and are discussing it in [issue
 
 ### Downloading
 
-The current design envisions that the following operations will _not_ cause downloads of language packs or other material like a language detection model:
-
-* `ai.translator.capabilities()` and the properties/methods of the returned object
-* `ai.languageDetector.capabilities()` and the properties/methods of the returned object
-
-The following _can_ cause downloads. In all cases, whether or not a call will initiate a download can be detected beforehand by checking the corresponding capabilities object.
-
-* `ai.translator.create()`
-* `ai.languageDetector.create()`
+The current design envisions that the `availability()` methods will _not_ cause downloads of language packs or other material like a language detection model. The `create()` methods, by contrast, _can_ cause downloads. In all cases, whether or not creation will initiate a download can be detected beforehand by calling the corresponding `availability()` method.
 
-After a developer has an `AITranslator` or `AILanguageDetector` object created by these methods, further calls are not expected to cause any downloads. (Although they might require internet access, if the implementation is not entirely on-device.)
+After a developer has an `AITranslator` or `AILanguageDetector` object, further calls are not expected to cause any downloads. (Although they might require internet access, if the implementation is not entirely on-device.)
 
 This design means that the implementation must have all information about the capabilities of its translation and language detection models available beforehand, i.e. "shipped with the browser". (Either as part of the browser binary, or through some out-of-band update mechanism that eagerly pushes updates.)
 
@@ -339,7 +319,7 @@ Some sort of mitigation may be necessary here. We believe this is adjacent to ot
 * Partitioning download status by top-level site, introducing a fake download (which takes time but does not actually download anything) for the second-onward site to download a language pack.
 * Only exposing a fixed set of languages to this API, e.g. based on the user's locale or the document's main language.
 
-As a first step, we require that detecting the availability of translation/detection be done via individual calls to `translationCapabilities.languagePairAvailable()` and `detectionCapabilities.languageAvailable()`. This allows browsers to implement possible mitigation techniques, such as detecting excessive calls to these methods and starting to return `"no"`.
+As a first step, we require that detecting the availability of translation/detection be done via individual calls to `ai.translator.availability()` and `ai.languageDetector.availability()`. This allows browsers to implement possible mitigation techniques, such as detecting excessive calls to these methods and starting to return `"no"`.
 
 Another way in which this API might enhance the web's fingerprinting surface is if translation and language detection models are updated separately from browser versions. In that case, differing results from different versions of the model provide additional fingerprinting bits beyond those already provided by the browser's major version number. Mandating that older browser versions not receive updates or be able to download models from too far into the future might be a possible remediation for this.
 
@@ -373,13 +353,13 @@ Should we simplify these down with convenience APIs that do both steps at once?
 
 We're open to this idea, but we think the existing complexity is necessary to support the design wherein translation and language detection models might not be already downloaded. By separating the two stages, we allow web developers to perform the initial creation-and-possibly-downloading steps early in their page's lifecycle, in preparation for later, hopefully-quick calls to APIs like `translate()`.
 
-Another possible simplification is to make the `capabilities()` APIs synchronous instead of asynchronous. This would be implementable by having the browser proactively load the capabilities information into the main thread's process, upon creation of the global object. We think this is not worthwhile, as it imposes a non-negligible cost on all global object creation, even when the APIs are not used.
+Another possible simplification is to make the `availability()` APIs synchronous instead of asynchronous. This would be implementable by having the browser proactively load the capabilities information into the main thread's process, upon creation of the global object. We think this is not worthwhile, as it imposes a non-negligible cost on all global object creation, even when the APIs are not used.
 
 ### Allowing unknown source languages for translation
 
 An earlier revision of this API included support for combining the language detection and translation steps into a single translation call, which did a best-guess on the source language. The idea was that this would possibly be more efficient than requiring the web developer to do two separate calls, and it could possibly even be done using a single model.
 
-We abandoned this design when it became clear that existing browsers have very decoupled implementations of translation vs. language detection, using separate models for each. This includes supporting different languages for language detection vs. for translation. So even if the translation model supported an unknown-source-language mode, it might not support the same inputs as the language detection model, which would create a confusing developer experience and be hard to signal in the capabilities API.
+We abandoned this design when it became clear that existing browsers have very decoupled implementations of translation vs. language detection, using separate models for each. This includes supporting different languages for language detection vs. for translation. So even if the translation model supported an unknown-source-language mode, it might not support the same inputs as the language detection model, which would create a confusing developer experience and be hard to signal in the API.
 
 ## Stakeholder feedback
 
index.bs

Lines changed: 4 additions & 3 deletions
@@ -2,7 +2,7 @@
 Title: Translator and Language Detector APIs
 Shortname: translation
 Level: None
-Status: w3c/UD
+Status: CG-DRAFT
 Group: webml
 Repository: webmachinelearning/translation-api
 URL: https://webmachinelearning.github.io/translation-api
@@ -11,9 +11,10 @@ Abstract: The translator and language detector APIs give web pages the ability
 Markup Shorthands: markdown yes, css no
 Complain About: accidental-2119 yes, missing-example-ids yes
 Assume Explicit For: yes
-Die On: warning
-Boilerplate: omit conformance
 Default Biblio Status: current
+Boilerplate: omit conformance
+Indent: 2
+Die On: warning
 </pre>
 
 Introduction {#intro}
