
Commit e8c0f77

Merge pull request #985 from huggingface/v3-docs

Improve documentation (v3)

2 parents: d61848e + 96b30ae

File tree: 9 files changed, +332 -69 lines

README.md: 52 additions, 31 deletions
````diff
@@ -11,25 +11,19 @@
 </p>
 
 <p align="center">
-  <a href="https://www.npmjs.com/package/@huggingface/transformers">
-    <img alt="NPM" src="https://img.shields.io/npm/v/@huggingface/transformers">
-  </a>
-  <a href="https://www.npmjs.com/package/@huggingface/transformers">
-    <img alt="NPM Downloads" src="https://img.shields.io/npm/dw/@huggingface/transformers">
-  </a>
-  <a href="https://www.jsdelivr.com/package/npm/@huggingface/transformers">
-    <img alt="jsDelivr Hits" src="https://img.shields.io/jsdelivr/npm/hw/@huggingface/transformers">
-  </a>
-  <a href="https://github.com/huggingface/transformers.js/blob/main/LICENSE">
-    <img alt="License" src="https://img.shields.io/github/license/huggingface/transformers.js?color=blue">
-  </a>
-  <a href="https://huggingface.co/docs/transformers.js/index">
-    <img alt="Documentation" src="https://img.shields.io/website/http/huggingface.co/docs/transformers.js/index.svg?down_color=red&down_message=offline&up_message=online">
-  </a>
+  <a href="https://www.npmjs.com/package/@huggingface/transformers"><img alt="NPM" src="https://img.shields.io/npm/v/@huggingface/transformers"></a>
+  <a href="https://www.npmjs.com/package/@huggingface/transformers"><img alt="NPM Downloads" src="https://img.shields.io/npm/dw/@huggingface/transformers"></a>
+  <a href="https://www.jsdelivr.com/package/npm/@huggingface/transformers"><img alt="jsDelivr Hits" src="https://img.shields.io/jsdelivr/npm/hw/@huggingface/transformers"></a>
+  <a href="https://github.com/huggingface/transformers.js/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/github/license/huggingface/transformers.js?color=blue"></a>
+  <a href="https://huggingface.co/docs/transformers.js/index"><img alt="Documentation" src="https://img.shields.io/website/http/huggingface.co/docs/transformers.js/index.svg?down_color=red&down_message=offline&up_message=online"></a>
 </p>
 
 
-State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!
+<h3 align="center">
+  <p>State-of-the-art Machine Learning for the Web</p>
+</h3>
+
+Run 🤗 Transformers directly in your browser, with no need for a server!
 
 Transformers.js is designed to be functionally equivalent to Hugging Face's [transformers](https://github.com/huggingface/transformers) python library, meaning you can run the same pretrained models using a very similar API. These models support common tasks in different modalities, such as:
 - 📝 **Natural Language Processing**: text classification, named entity recognition, question answering, language modeling, summarization, translation, multiple choice, and text generation.
@@ -42,6 +36,22 @@ Transformers.js uses [ONNX Runtime](https://onnxruntime.ai/) to run models in th
 For more information, check out the full [documentation](https://huggingface.co/docs/transformers.js).
 
 
+## Installation
+
+
+To install via [NPM](https://www.npmjs.com/package/@huggingface/transformers), run:
+```bash
+npm i @huggingface/transformers
+```
+
+Alternatively, you can use it in vanilla JS, without any bundler, by using a CDN or static hosting. For example, using [ES Modules](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Modules), you can import the library with:
+```html
+<script type="module">
+  import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/[email protected]';
+</script>
+```
+
+
 ## Quick tour
 
 
@@ -72,9 +82,9 @@ out = pipe('I love transformers!')
 import { pipeline } from '@huggingface/transformers';
 
 // Allocate a pipeline for sentiment-analysis
-let pipe = await pipeline('sentiment-analysis');
+const pipe = await pipeline('sentiment-analysis');
 
-let out = await pipe('I love transformers!');
+const out = await pipe('I love transformers!');
 // [{'label': 'POSITIVE', 'score': 0.999817686}]
 ```
 
@@ -86,29 +96,40 @@ let out = await pipe('I love transformers!');
 
 You can also use a different model by specifying the model id or path as the second argument to the `pipeline` function. For example:
 ```javascript
 // Use a different model for sentiment-analysis
-let pipe = await pipeline('sentiment-analysis', 'Xenova/bert-base-multilingual-uncased-sentiment');
+const pipe = await pipeline('sentiment-analysis', 'Xenova/bert-base-multilingual-uncased-sentiment');
 ```
 
+By default, when running in the browser, the model will be run on your CPU (via WASM). If you would like
+to run the model on your GPU (via WebGPU), you can do this by setting `device: 'webgpu'`, for example:
+```javascript
+// Run the model on WebGPU
+const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english', {
+  device: 'webgpu',
+});
+```
 
-## Installation
+For more information, check out the [WebGPU guide](https://huggingface.co/docs/transformers.js/guides/webgpu).
 
+> [!WARNING]
+> The WebGPU API is still experimental in many browsers, so if you run into any issues,
+> please file a [bug report](https://github.com/huggingface/transformers.js/issues/new?title=%5BWebGPU%5D%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
 
-To install via [NPM](https://www.npmjs.com/package/@huggingface/transformers), run:
-```bash
-npm i @huggingface/transformers
-```
-
-Alternatively, you can use it in vanilla JS, without any bundler, by using a CDN or static hosting. For example, using [ES Modules](https://developer.mozilla.org/en-US/docs/Web/JavaScript/Guide/Modules), you can import the library with:
-```html
-<script type="module">
-  import { pipeline } from 'https://cdn.jsdelivr.net/npm/@huggingface/[email protected]';
-</script>
+In resource-constrained environments, such as web browsers, it is advisable to use a quantized version of
+the model to lower bandwidth and optimize performance. This can be achieved by adjusting the `dtype` option,
+which allows you to select the appropriate data type for your model. While the available options may vary
+depending on the specific model, typical choices include `"fp32"` (default for WebGPU), `"fp16"`, `"q8"`
+(default for WASM), and `"q4"`. For more information, check out the [quantization guide](https://huggingface.co/docs/transformers.js/guides/dtypes).
+```javascript
+// Run the model at 4-bit quantization
+const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english', {
+  dtype: 'q4',
+});
 ```
 
 
 ## Examples
 
-Want to jump straight in? Get started with one of our sample applications/templates:
+Want to jump straight in? Get started with one of our sample applications/templates, which can be found [here](https://github.com/huggingface/transformers.js-examples).
 
 | Name              | Description                      | Links                         |
 |-------------------|----------------------------------|-------------------------------|
````
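The two options this diff introduces (`device` and `dtype`) can also be combined in a single `pipeline()` call, as the new dtypes guide later in this commit does. A minimal sketch, reusing the model id from the diff above (the pairing of values is illustrative, not part of the README):

```javascript
import { pipeline } from '@huggingface/transformers';

// Combine the new v3 options: WebGPU execution with 4-bit weights
const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english', {
  device: 'webgpu',
  dtype: 'q4',
});

const out = await pipe('I love transformers!');
// e.g. [{ label: 'POSITIVE', score: 0.99... }]
```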

docs/scripts/build_readme.py: 9 additions, 19 deletions

````diff
@@ -13,33 +13,23 @@
 </p>
 
 <p align="center">
-  <a href="https://www.npmjs.com/package/@huggingface/transformers">
-    <img alt="NPM" src="https://img.shields.io/npm/v/@huggingface/transformers">
-  </a>
-  <a href="https://www.npmjs.com/package/@huggingface/transformers">
-    <img alt="NPM Downloads" src="https://img.shields.io/npm/dw/@huggingface/transformers">
-  </a>
-  <a href="https://www.jsdelivr.com/package/npm/@huggingface/transformers">
-    <img alt="jsDelivr Hits" src="https://img.shields.io/jsdelivr/npm/hw/@huggingface/transformers">
-  </a>
-  <a href="https://github.com/huggingface/transformers.js/blob/main/LICENSE">
-    <img alt="License" src="https://img.shields.io/github/license/huggingface/transformers.js?color=blue">
-  </a>
-  <a href="https://huggingface.co/docs/transformers.js/index">
-    <img alt="Documentation" src="https://img.shields.io/website/http/huggingface.co/docs/transformers.js/index.svg?down_color=red&down_message=offline&up_message=online">
-  </a>
+  <a href="https://www.npmjs.com/package/@huggingface/transformers"><img alt="NPM" src="https://img.shields.io/npm/v/@huggingface/transformers"></a>
+  <a href="https://www.npmjs.com/package/@huggingface/transformers"><img alt="NPM Downloads" src="https://img.shields.io/npm/dw/@huggingface/transformers"></a>
+  <a href="https://www.jsdelivr.com/package/npm/@huggingface/transformers"><img alt="jsDelivr Hits" src="https://img.shields.io/jsdelivr/npm/hw/@huggingface/transformers"></a>
+  <a href="https://github.com/huggingface/transformers.js/blob/main/LICENSE"><img alt="License" src="https://img.shields.io/github/license/huggingface/transformers.js?color=blue"></a>
+  <a href="https://huggingface.co/docs/transformers.js/index"><img alt="Documentation" src="https://img.shields.io/website/http/huggingface.co/docs/transformers.js/index.svg?down_color=red&down_message=offline&up_message=online"></a>
 </p>
 
 {intro}
 
-## Quick tour
-
-{quick_tour}
-
 ## Installation
 
 {installation}
 
+## Quick tour
+
+{quick_tour}
+
 ## Examples
 
 {examples}
````

docs/snippets/0_introduction.snippet: 5 additions, 1 deletion

````diff
@@ -1,5 +1,9 @@
 
-State-of-the-art Machine Learning for the web. Run 🤗 Transformers directly in your browser, with no need for a server!
+<h3 align="center">
+  <p>State-of-the-art Machine Learning for the Web</p>
+</h3>
+
+Run 🤗 Transformers directly in your browser, with no need for a server!
 
 Transformers.js is designed to be functionally equivalent to Hugging Face's [transformers](https://github.com/huggingface/transformers) python library, meaning you can run the same pretrained models using a very similar API. These models support common tasks in different modalities, such as:
 - 📝 **Natural Language Processing**: text classification, named entity recognition, question answering, language modeling, summarization, translation, multiple choice, and text generation.
````

docs/snippets/1_quick-tour.snippet: 30 additions, 3 deletions

````diff
@@ -26,9 +26,9 @@ out = pipe('I love transformers!')
 import { pipeline } from '@huggingface/transformers';
 
 // Allocate a pipeline for sentiment-analysis
-let pipe = await pipeline('sentiment-analysis');
+const pipe = await pipeline('sentiment-analysis');
 
-let out = await pipe('I love transformers!');
+const out = await pipe('I love transformers!');
 // [{'label': 'POSITIVE', 'score': 0.999817686}]
 ```
 
@@ -40,5 +40,32 @@ let out = await pipe('I love transformers!');
 You can also use a different model by specifying the model id or path as the second argument to the `pipeline` function. For example:
 ```javascript
 // Use a different model for sentiment-analysis
-let pipe = await pipeline('sentiment-analysis', 'Xenova/bert-base-multilingual-uncased-sentiment');
+const pipe = await pipeline('sentiment-analysis', 'Xenova/bert-base-multilingual-uncased-sentiment');
+```
+
+By default, when running in the browser, the model will be run on your CPU (via WASM). If you would like
+to run the model on your GPU (via WebGPU), you can do this by setting `device: 'webgpu'`, for example:
+```javascript
+// Run the model on WebGPU
+const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english', {
+  device: 'webgpu',
+});
+```
+
+For more information, check out the [WebGPU guide](/guides/webgpu).
+
+> [!WARNING]
+> The WebGPU API is still experimental in many browsers, so if you run into any issues,
+> please file a [bug report](https://github.com/huggingface/transformers.js/issues/new?title=%5BWebGPU%5D%20Error%20running%20MODEL_ID_GOES_HERE&assignees=&labels=bug,webgpu&projects=&template=1_bug-report.yml).
+
+In resource-constrained environments, such as web browsers, it is advisable to use a quantized version of
+the model to lower bandwidth and optimize performance. This can be achieved by adjusting the `dtype` option,
+which allows you to select the appropriate data type for your model. While the available options may vary
+depending on the specific model, typical choices include `"fp32"` (default for WebGPU), `"fp16"`, `"q8"`
+(default for WASM), and `"q4"`. For more information, check out the [quantization guide](/guides/dtypes).
+```javascript
+// Run the model at 4-bit quantization
+const pipe = await pipeline('sentiment-analysis', 'Xenova/distilbert-base-uncased-finetuned-sst-2-english', {
+  dtype: 'q4',
+});
 ```
````
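One detail the quick tour leaves implicit: the pipeline is constructed once and can then be reused across calls, including on an array of inputs. A small sketch (the second input and both predicted labels are illustrative, not from the snippet):

```javascript
import { pipeline } from '@huggingface/transformers';

// Create the pipeline once...
const classifier = await pipeline('sentiment-analysis');

// ...then reuse it, here with a batch of inputs
const results = await classifier([
  'I love transformers!',
  'This movie was a disappointment.',
]);
// One { label, score } object per input, e.g.:
// [{ label: 'POSITIVE', score: 0.99 }, { label: 'NEGATIVE', score: 0.98 }]
```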

docs/snippets/3_examples.snippet: 1 addition, 1 deletion

````diff
@@ -1,4 +1,4 @@
-Want to jump straight in? Get started with one of our sample applications/templates:
+Want to jump straight in? Get started with one of our sample applications/templates, which can be found [here](https://github.com/huggingface/transformers.js-examples).
 
 | Name              | Description                      | Links                         |
 |-------------------|----------------------------------|-------------------------------|
````

docs/source/_toctree.yml: 5 additions, 1 deletion

````diff
@@ -23,10 +23,14 @@
       title: Server-side Inference in Node.js
     title: Tutorials
 - sections:
+    - local: guides/webgpu
+      title: Running models on WebGPU
+    - local: guides/dtypes
+      title: Using quantized models (dtypes)
     - local: guides/private
       title: Accessing Private/Gated Models
     - local: guides/node-audio-processing
-      title: Server-side Audio Processing in Node.js
+      title: Server-side Audio Processing
   title: Developer Guides
 - sections:
     - local: api/transformers
````

docs/source/guides/dtypes.md: new file, 130 additions

# Using quantized models (dtypes)

Before Transformers.js v3, we used the `quantized` option to specify whether to use a quantized (q8) or full-precision (fp32) variant of the model by setting `quantized` to `true` or `false`, respectively. Now, we've added the ability to select from a much larger list with the `dtype` parameter.

The list of available quantizations depends on the model, but some common ones are: full-precision (`"fp32"`), half-precision (`"fp16"`), 8-bit (`"q8"`, `"int8"`, `"uint8"`), and 4-bit (`"q4"`, `"bnb4"`, `"q4f16"`).

<p align="center">
  <picture>
    <source media="(prefers-color-scheme: dark)" srcset="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/transformersjs-v3/dtypes-dark.jpg" style="max-width: 100%;">
    <source media="(prefers-color-scheme: light)" srcset="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/transformersjs-v3/dtypes-light.jpg" style="max-width: 100%;">
    <img alt="Available dtypes for mixedbread-ai/mxbai-embed-xsmall-v1" src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/transformersjs-v3/dtypes-dark.jpg" style="max-width: 100%;">
  </picture>
  <a href="https://huggingface.co/mixedbread-ai/mxbai-embed-xsmall-v1/tree/main/onnx">(e.g., mixedbread-ai/mxbai-embed-xsmall-v1)</a>
</p>

## Basic usage

**Example:** Run Qwen2.5-0.5B-Instruct in 4-bit quantization ([demo](https://v2.scrimba.com/s0dlcpv0ci))

```js
import { pipeline } from "@huggingface/transformers";

// Create a text generation pipeline
const generator = await pipeline(
  "text-generation",
  "onnx-community/Qwen2.5-0.5B-Instruct",
  { dtype: "q4", device: "webgpu" },
);

// Define the list of messages
const messages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "Tell me a funny joke." },
];

// Generate a response
const output = await generator(messages, { max_new_tokens: 128 });
console.log(output[0].generated_text.at(-1).content);
```
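The `dtype` option is not specific to text generation. As a hedged sketch, the embedding model pictured above could be loaded at half precision as follows (the `fp16` availability is read off the screenshot, and the pooling options follow the feature-extraction pipeline; the printed dimensions are illustrative):

```js
import { pipeline } from "@huggingface/transformers";

// Load the embedding model shown above with half-precision weights
const extractor = await pipeline(
  "feature-extraction",
  "mixedbread-ai/mxbai-embed-xsmall-v1",
  { dtype: "fp16" },
);

// Mean-pooled, normalized sentence embedding
const embeddings = await extractor("Hello world!", {
  pooling: "mean",
  normalize: true,
});
console.log(embeddings.dims); // e.g. [1, 384]
```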
## Per-module dtypes

Some encoder-decoder models, like Whisper or Florence-2, are extremely sensitive to quantization settings: especially of the encoder. For this reason, we added the ability to select per-module dtypes, which can be done by providing a mapping from module name to dtype.

**Example:** Run Florence-2 on WebGPU ([demo](https://v2.scrimba.com/s0pdm485fo))

```js
import { Florence2ForConditionalGeneration } from "@huggingface/transformers";

const model = await Florence2ForConditionalGeneration.from_pretrained(
  "onnx-community/Florence-2-base-ft",
  {
    dtype: {
      embed_tokens: "fp16",
      vision_encoder: "fp16",
      encoder_model: "q4",
      decoder_model_merged: "q4",
    },
    device: "webgpu",
  },
);
```

<p align="middle">
  <img src="https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/blog/transformersjs-v3/florence-2-webgpu.gif" alt="Florence-2 running on WebGPU" />
</p>

<details>
<summary>
See full code example
</summary>

```js
import {
  Florence2ForConditionalGeneration,
  AutoProcessor,
  AutoTokenizer,
  RawImage,
} from "@huggingface/transformers";

// Load model, processor, and tokenizer
const model_id = "onnx-community/Florence-2-base-ft";
const model = await Florence2ForConditionalGeneration.from_pretrained(
  model_id,
  {
    dtype: {
      embed_tokens: "fp16",
      vision_encoder: "fp16",
      encoder_model: "q4",
      decoder_model_merged: "q4",
    },
    device: "webgpu",
  },
);
const processor = await AutoProcessor.from_pretrained(model_id);
const tokenizer = await AutoTokenizer.from_pretrained(model_id);

// Load image and prepare vision inputs
const url = "https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/transformers/tasks/car.jpg";
const image = await RawImage.fromURL(url);
const vision_inputs = await processor(image);

// Specify task and prepare text inputs
const task = "<MORE_DETAILED_CAPTION>";
const prompts = processor.construct_prompts(task);
const text_inputs = tokenizer(prompts);

// Generate text
const generated_ids = await model.generate({
  ...text_inputs,
  ...vision_inputs,
  max_new_tokens: 100,
});

// Decode generated text
const generated_text = tokenizer.batch_decode(generated_ids, {
  skip_special_tokens: false,
})[0];

// Post-process the generated text
const result = processor.post_process_generation(
  generated_text,
  task,
  image.size,
);
console.log(result);
// { '<MORE_DETAILED_CAPTION>': 'A green car is parked in front of a tan building. The building has a brown door and two brown windows. The car is a two door and the door is closed. The green car has black tires.' }
```

</details>
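The guide names Whisper as another quantization-sensitive encoder-decoder model. A hedged sketch of the same per-module pattern applied to it (the model id and module names are assumptions mirroring the Florence-2 example above, not taken from this commit):

```js
import { pipeline } from "@huggingface/transformers";

// Hypothetical: keep Whisper's sensitive encoder at full precision
// and quantize only the decoder (module names assumed to mirror
// the Florence-2 example)
const transcriber = await pipeline(
  "automatic-speech-recognition",
  "onnx-community/whisper-tiny.en", // assumed model id
  {
    dtype: {
      encoder_model: "fp32",
      decoder_model_merged: "q4",
    },
    device: "webgpu",
  },
);

const output = await transcriber("audio.wav");
console.log(output.text);
```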
