
Commit 5874683

Overhaul availability testing and add expected input languages
Remove the `ai.languageModel.capabilities()` method and its accompanying `AILanguageModelCapabilities` class. Instead, replace it with:

* `ai.languageModel.availability(options)`, which takes the same options as `ai.languageModel.create()` and returns the corresponding availability.
* `ai.languageModel.params()`, which returns the default and max params (currently top-K and temperature).

Additionally, add the `expectedInputLanguages` option to `create()` and `availability()`. Adding this option to `create()` allows the web developer to signal the expected input languages ahead of time, allowing the downloading of additional material, or fast-failing if the additional material cannot be supported. Adding this option to `availability()` replaces the `(await ai.languageModel.capabilities()).languageAvailable()` method.

Closes #29; see especially #29 (comment). See also webmachinelearning/writing-assistance-apis#22 and webmachinelearning/translation-api#31.
1 parent 47d9f17 commit 5874683
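The call patterns this commit introduces can be sketched against a hand-rolled stub. Everything inside the `ai` object below (the availability logic, the supported-language list, and the parameter values) is invented for illustration; the real methods are implementation-provided by the browser.

```javascript
// Stub of the post-commit API shape: availability(options) and params()
// replace the removed capabilities(). All behavior here is invented.
const ai = {
  languageModel: {
    async availability(options = {}) {
      const supportedLanguages = ["en", "ja", "ko"]; // assumed for this sketch
      const requested = options.expectedInputLanguages ?? [];
      return requested.every((tag) => supportedLanguages.includes(tag))
        ? "readily"
        : "no";
    },
    async params() {
      // Default and maximum sampling parameters, per the new params() method.
      return { defaultTopK: 3, maxTopK: 8, defaultTemperature: 1, maxTemperature: 2 };
    },
  },
};

const availability = await ai.languageModel.availability({
  expectedInputLanguages: ["en", "ja"],
});
const params = await ai.languageModel.params();
console.log(availability); // "readily" with this stub
console.log(params.maxTemperature); // 2 with this stub
```

Note that `availability()` takes only the "core" creation options; session-specific options such as `systemPrompt` are passed to `create()` alone.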

File tree

1 file changed (+66, −34 lines)


README.md

Lines changed: 66 additions & 34 deletions
@@ -173,28 +173,32 @@ console.log(await promptWithCalculator("What is 2 + 2?"));
 
 We'll likely explore more specific APIs for tool- and function-calling in the future; follow along in [issue #7](https://github.com/webmachinelearning/prompt-api/issues/7).
 
-### Configuration of per-session options
+### Configuration of per-session parameters
 
-In addition to the `systemPrompt` and `initialPrompts` options shown above, the currently-configurable options are [temperature](https://huggingface.co/blog/how-to-generate#sampling) and [top-K](https://huggingface.co/blog/how-to-generate#top-k-sampling). More information about the values for these parameters can be found using the `capabilities()` API explained [below](#capabilities-detection).
+In addition to the `systemPrompt` and `initialPrompts` options shown above, the currently-configurable model parameters are [temperature](https://huggingface.co/blog/how-to-generate#sampling) and [top-K](https://huggingface.co/blog/how-to-generate#top-k-sampling). The `params()` API gives the default and maximum values for these parameters.
+
+_However, see [issue #42](https://github.com/webmachinelearning/prompt-api/issues/42): sampling hyperparameters are not universal among models._
 
 ```js
 const customSession = await ai.languageModel.create({
   temperature: 0.8,
   topK: 10
 });
 
-const capabilities = await ai.languageModel.capabilities();
+const params = await ai.languageModel.params();
 const slightlyHighTemperatureSession = await ai.languageModel.create({
-  temperature: Math.max(
-    capabilities.defaultTemperature * 1.2,
-    capabilities.maxTemperature
+  temperature: Math.min(
+    params.defaultTemperature * 1.2,
+    params.maxTemperature
   ),
   topK: 10
 });
 
-// capabilities also contains defaultTopK and maxTopK.
+// params also contains defaultTopK and maxTopK.
 ```
 
+If the language model is not available at all in this browser, `params()` will fulfill with `null`.
+
 ### Session persistence and cloning
 
 Each language model session consists of a persistent series of interactions with the model:
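Since `params()` can fulfill with `null`, code that derives a temperature from its values needs a guard. The helper below is hypothetical (`clampTemperature` is not part of the API) and the parameter values are invented for the sketch; in real code they would come from `await ai.languageModel.params()`.

```javascript
// Hypothetical helper: clamp a requested temperature into the range that
// params() reports, handling the null case where no model is available.
function clampTemperature(requested, params) {
  if (params === null) return null; // no language model in this browser
  return Math.min(Math.max(requested, 0), params.maxTemperature);
}

// Assumed values standing in for the result of await ai.languageModel.params().
const params = { defaultTopK: 3, maxTopK: 8, defaultTemperature: 1, maxTemperature: 2 };

console.log(clampTemperature(params.defaultTemperature * 1.2, params)); // 1.2
console.log(clampTemperature(5, params)); // 2, capped at maxTemperature
console.log(clampTemperature(1, null)); // null
```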
@@ -316,31 +320,57 @@ session.addEventListener("contextoverflow", () => {
 });
 ```
 
-### Capabilities detection
+### Multilingual content and expected languages
 
-In all our above examples, we call `ai.languageModel.create()` and assume it will always succeed.
+The default behavior for a language model session assumes that the input languages are unknown. In this case, implementations will use whatever "base" capabilities they have available for the language model, and might throw `"NotSupportedError"` `DOMException`s if they encounter languages they don't support.
 
-However, sometimes a language model needs to be downloaded before the API can be used. In such cases, immediately calling `create()` will start the download, which might take a long time. The capabilities API gives you insight into the download status of the model:
+It's better practice, if possible, to supply the `create()` method with information about the expected input languages. This allows the implementation to download any necessary supporting material, such as fine-tunings or safety-checking models, and to immediately reject the promise returned by `create()` if the web developer needs to use languages that the browser is not capable of supporting:
 
 ```js
-const capabilities = await ai.languageModel.capabilities();
-console.log(capabilities.available);
+const session = await ai.languageModel.create({
+  systemPrompt: `
+    You are a foreign-language tutor for Japanese. The user is Korean. If necessary, either you or
+    the user might "break character" and ask for or give clarification in Korean. But by default,
+    prefer speaking in Japanese, and return to the Japanese conversation once any sidebars are
+    concluded.
+  `,
+  expectedInputLanguages: ["en" /* for the system prompt */, "ja", "ko"]
+});
 ```
 
-The `capabilities.available` property is a string that can take one of three values:
+Note that there is no way of specifying output languages, since these are governed by the language model's own decisions. Similarly, the expected input languages do not affect the context or prompt the language model sees; they only impact the process of setting up the session and performing appropriate downloads.
+
+### Testing available options before creation
+
+In the simple case, web developers should call `ai.languageModel.create()` and handle failures gracefully.
+
+However, if the web developer wants to provide a differentiated user experience, which lets users know ahead of time that the feature will not be possible or might require a download, they can use the promise-returning `ai.languageModel.availability()` method. This method lets developers know, before calling `create()`, what is possible with the implementation.
+
+The method will return a promise that fulfills with one of the following availability values:
+
+* `"no"` means that the implementation does not support the requested options, or does not support prompting a language model at all.
+* `"after-download"` means that the implementation supports the requested options, but it will have to download something (e.g., the language model itself, or a fine-tuning) before it can create a session using those options.
+* `"readily"` means that the implementation supports the requested options without requiring any new downloads.
 
-* `"no"`, indicating the device or browser does not support prompting a language model at all.
-* `"after-download"`, indicating the device or browser supports prompting a language model, but it needs to be downloaded before it can be used.
-* `"readily"`, indicating the device or browser supports prompting a language model and it’s ready to be used without any downloading steps.
+An example usage is the following:
 
-In the `"after-download"` case, developers might want to have users confirm before calling `create()` to start the download, since doing so uses up significant bandwidth and users might not be willing to wait for a large download before using the site or feature.
+```js
+const options = { expectedInputLanguages: ["en", "es"], temperature: 2 };
 
-Note that regardless of the return value of `available`, `create()` might also fail, if either the download fails or the session creation fails.
+const supportsOurUseCase = await ai.languageModel.availability(options);
 
-The capabilities API also contains other information about the model:
+if (supportsOurUseCase !== "no") {
+  if (supportsOurUseCase === "after-download") {
+    console.log("Sit tight, we need to do some downloading...");
+  }
 
-* `defaultTemperature`, `maxTemperature`, `defaultTopK`, and `maxTopK` properties giving information about the model's sampling parameters.
-* `languageAvailable(languageTag)`, which returns `"no"`, `"after-download"`, or `"readily"` to indicate whether the model supports conversing in a given human language.
+  const session = await ai.languageModel.create({ ...options, systemPrompt: "..." });
+  // ... Use session ...
+} else {
+  // Either the API overall, or the expected languages and temperature setting, is not available.
+  console.error("No language model for us :(");
+}
+```
 
 ### Download progress
 
@@ -403,7 +433,8 @@ enum AICapabilityAvailability { "readily", "after-download", "no" };
 [Exposed=(Window,Worker), SecureContext]
 interface AILanguageModelFactory {
   Promise<AILanguageModel> create(optional AILanguageModelCreateOptions options = {});
-  Promise<AILanguageModelCapabilities> capabilities();
+  Promise<AICapabilityAvailability> availability(optional AILanguageModelCreateCoreOptions options = {});
+  Promise<AILanguageModelParams?> params();
 };
 
 [Exposed=(Window,Worker), SecureContext]
@@ -418,6 +449,7 @@ interface AILanguageModel : EventTarget {
 
   readonly attribute unsigned long topK;
   readonly attribute float temperature;
+  readonly attribute FrozenArray<DOMString>? expectedInputLanguages;
 
   attribute EventHandler oncontextoverflow;
 
@@ -426,25 +458,25 @@ interface AILanguageModel : EventTarget {
 };
 
 [Exposed=(Window,Worker), SecureContext]
-interface AILanguageModelCapabilities {
-  readonly attribute AICapabilityAvailability available;
-  AICapabilityAvailability languageAvailable(DOMString languageTag);
-
-  // Always null if available === "no"
-  readonly attribute unsigned long? defaultTopK;
-  readonly attribute unsigned long? maxTopK;
-  readonly attribute float? defaultTemperature;
-  readonly attribute float? maxTemperature;
+interface AILanguageModelParams {
+  readonly attribute unsigned long defaultTopK;
+  readonly attribute unsigned long maxTopK;
+  readonly attribute float defaultTemperature;
+  readonly attribute float maxTemperature;
 };
 
-dictionary AILanguageModelCreateOptions {
+dictionary AILanguageModelCreateCoreOptions {
+  [EnforceRange] unsigned long topK;
+  float temperature;
+  sequence<DOMString> expectedInputLanguages;
+};
+
+dictionary AILanguageModelCreateOptions : AILanguageModelCreateCoreOptions {
   AbortSignal signal;
   AICreateMonitorCallback monitor;
 
   DOMString systemPrompt;
   sequence<AILanguageModelInitialPrompt> initialPrompts;
-  [EnforceRange] unsigned long topK;
-  float temperature;
 };
 
 dictionary AILanguageModelInitialPrompt {
@@ -489,7 +521,7 @@ To ensure the API can be used by web developers across multiple implementations,
 To actually get a response back from the model given a prompt, the following possible stages are involved:
 
 1. Download the model, if necessary.
-2. Establish a session, including configuring [per-session options](#configuration-of-per-session-options).
+2. Establish a session, including configuring per-session options and parameters.
 3. Add an initial prompt to establish context. (This will not generate a response.)
 4. Execute a prompt and receive a response.
 
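The four stages can be sketched end-to-end against a stub so the flow is concrete. The `ai` object and its echo behavior below are invented for illustration; they are not the real implementation.

```javascript
// Stages 1–4 against a stubbed ai.languageModel; the stub's behavior is invented.
const ai = {
  languageModel: {
    async create(options = {}) {
      // Stages 1–2: download if necessary and establish a configured session.
      // Stage 3: the system prompt is taken as context; no response is generated.
      const context = options.systemPrompt ?? "";
      return {
        async prompt(text) {
          // Stage 4: execute a prompt and produce a response.
          return `[${context.length} chars of context] echo: ${text}`;
        },
      };
    },
  },
};

const session = await ai.languageModel.create({
  systemPrompt: "You are a terse assistant.",
  temperature: 0.8,
  topK: 10,
});
console.log(await session.prompt("Hello"));
```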

0 commit comments
