Releases: huggingface/transformers.js

3.8.0

19 Nov 16:50
bf09aaf

🚀 Transformers.js v3.8 — SAM2, SAM3, EdgeTAM, Supertonic TTS

  • Add support for EdgeTAM in #1454

  • Add support for Supertonic TTS in #1459

    Example:

    import { pipeline } from '@huggingface/transformers';
    
    // Create a text-to-speech pipeline
    const tts = await pipeline('text-to-speech', 'onnx-community/Supertonic-TTS-ONNX');
    
    // Synthesize speech using a reference voice (speaker embedding)
    const input_text = 'This is really cool!';
    const audio = await tts(input_text, {
        speaker_embeddings: 'https://huggingface.co/onnx-community/Supertonic-TTS-ONNX/resolve/main/voices/F1.bin',
    });
    
    // Save the generated audio to a WAV file
    await audio.save('output.wav');
  • Add support for SAM2 and SAM3 (Tracker) in #1461
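
    The release notes don't include a snippet for these, but SAM-family checkpoints in Transformers.js follow a promptable point-input flow. Below is a minimal sketch using the long-standing SAM API (SamModel, AutoProcessor, post_process_masks) with a hypothetical SAM2 checkpoint id; the actual SAM2/SAM3 classes and repo names may differ, so check the Hub and docs.

    import { SamModel, AutoProcessor, RawImage } from '@huggingface/transformers';
    
    // NOTE: hypothetical repo id; check the Hub for the actual SAM2 ONNX conversions
    const model_id = 'onnx-community/sam2.1-hiera-tiny-ONNX';
    const model = await SamModel.from_pretrained(model_id);
    const processor = await AutoProcessor.from_pretrained(model_id);
    
    // Segment the object under a single (x, y) point prompt
    const image = await RawImage.read('https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/cats.png');
    const inputs = await processor(image, { input_points: [[[340, 250]]] });
    const outputs = await model(inputs);
    
    // Rescale the predicted masks back to the original image size
    const masks = await processor.post_process_masks(
        outputs.pred_masks, inputs.original_sizes, inputs.reshaped_input_sizes,
    );
    console.log(masks);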

  • Remove Metaspace add_prefix_space logic in #1451

  • ImageProcessor preprocess uses image_std for fill value by @NathanKolbas in #1455

Full Changelog: 3.7.6...3.8.0

3.7.6

20 Oct 19:44
4c908ec

Full Changelog: 3.7.5...3.7.6

3.7.5

02 Oct 13:58
c670bb9

What's new?

  • Add support for GraniteMoeHybrid in #1426
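
    GraniteMoeHybrid is a decoder-only architecture, so it should slot into the standard text-generation pipeline. A minimal sketch with a hypothetical ONNX repo id (check the Hub for an actual conversion):

    import { pipeline } from '@huggingface/transformers';
    
    // NOTE: hypothetical repo id
    const generator = await pipeline(
        'text-generation',
        'onnx-community/granite-4.0-tiny-preview-ONNX',
        { dtype: 'q4' },
    );
    
    // Generate a completion for a plain-text prompt
    const output = await generator('Q: What is a mixture-of-experts model?\nA:', {
        max_new_tokens: 128,
    });
    console.log(output[0].generated_text);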

Full Changelog: 3.7.4...3.7.5

3.7.4

29 Sep 17:40
d6b3998

What's new?

  • Correctly assign logits warpers in _get_logits_processor in #1422
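
    Logits warpers are the processors behind sampling options such as temperature, top_k, and top_p, so this fix is what makes those options take effect during generation. A quick illustration, reusing the LFM2 model featured elsewhere on this page:

    import { pipeline } from '@huggingface/transformers';
    
    const generator = await pipeline('text-generation', 'onnx-community/LFM2-350M-ONNX', { dtype: 'q4' });
    
    // do_sample: true routes generation through the logits warpers
    const output = await generator('Once upon a time,', {
        max_new_tokens: 64,
        do_sample: true,
        temperature: 0.7,
        top_k: 50,
    });
    console.log(output[0].generated_text);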

Full Changelog: 3.7.3...3.7.4

3.7.3

12 Sep 20:35
699dcb5

What's new?

  • Unify inference chains in #1399
  • Fix progress tracking bug by @kukudixiaoming in #1405
  • Add support for MobileLLM-R1 (llama4_text) in #1412
  • Add support for VaultGemma in #1413
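
    Both MobileLLM-R1 and VaultGemma are decoder-only models, so they run through the standard text-generation pipeline. A chat-style sketch with a hypothetical ONNX repo id (check the Hub for the actual conversions):

    import { pipeline, TextStreamer } from '@huggingface/transformers';
    
    // NOTE: hypothetical repo id
    const generator = await pipeline(
        'text-generation',
        'onnx-community/MobileLLM-R1-950M-ONNX',
        { dtype: 'q4' },
    );
    
    // Define the list of messages
    const messages = [
        { role: 'user', content: 'Write a haiku about autumn.' },
    ];
    
    // Generate a response, streaming tokens as they arrive
    const output = await generator(messages, {
        max_new_tokens: 256,
        streamer: new TextStreamer(generator.tokenizer, { skip_prompt: true, skip_special_tokens: true }),
    });
    console.log(output[0].generated_text.at(-1).content);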

Full Changelog: 3.7.2...3.7.3

3.7.2

15 Aug 17:58
28852a2

What's new?

  • Add support for DINOv3 in #1390

    See here for the full list of supported models.

    Example: Compute image embeddings

    import { pipeline } from '@huggingface/transformers';
    
    // Create an image feature extraction pipeline
    const image_feature_extractor = await pipeline(
        'image-feature-extraction',
        'onnx-community/dinov3-vits16-pretrain-lvd1689m-ONNX',
    );
    
    // Compute embeddings for an image
    const url = 'https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/cats.png';
    const features = await image_feature_extractor(url);
    console.log(features);

    Try it out using our online demo (video: dinov3.mp4).
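
    A common follow-up is to compare embeddings across images, for example with the library's cos_sim helper. The sketch below assumes the pipeline's pool option returns one pooled vector per image for this checkpoint; the second image URL is illustrative.

    import { pipeline, cos_sim } from '@huggingface/transformers';
    
    const extractor = await pipeline(
        'image-feature-extraction',
        'onnx-community/dinov3-vits16-pretrain-lvd1689m-ONNX',
    );
    
    // Embed two images and measure their cosine similarity
    const a = await extractor('https://huggingface.co/datasets/huggingface/documentation-images/resolve/main/cats.png', { pool: true });
    const b = await extractor('https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/tiger.jpg', { pool: true });
    console.log(cos_sim(a.tolist()[0], b.tolist()[0]));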

Full Changelog: 3.7.1...3.7.2

3.7.1

01 Aug 21:14
8d6c400

Full Changelog: 3.7.0...3.7.1

3.7.0

23 Jul 03:12
0feb5b7

🚀 Transformers.js v3.7 — Voxtral, LFM2, ModernBERT Decoder

🤖 New models

This update adds support for 3 new architectures:

Voxtral

Voxtral Mini is an enhancement of Ministral 3B, incorporating state-of-the-art audio input capabilities while retaining best-in-class text performance. It excels at speech transcription, translation and audio understanding. ONNX weights for Voxtral-Mini-3B-2507 can be found here. Learn more about Voxtral in the release blog post.

Try it out with our online demo (video: Voxtral.WebGPU.demo.mp4).

Example: Audio transcription

import { VoxtralForConditionalGeneration, VoxtralProcessor, TextStreamer, read_audio } from "@huggingface/transformers";

// Load the processor and model
const model_id = "onnx-community/Voxtral-Mini-3B-2507-ONNX";
const processor = await VoxtralProcessor.from_pretrained(model_id);
const model = await VoxtralForConditionalGeneration.from_pretrained(
    model_id,
    {
        dtype: {
            embed_tokens: "fp16", // "fp32", "fp16", "q8", "q4"
            audio_encoder: "q4", // "fp32", "fp16", "q8", "q4", "q4f16"
            decoder_model_merged: "q4", // "q4", "q4f16"
        },
        device: "webgpu",
    },
);

// Prepare the conversation
const conversation = [
    {
        "role": "user",
        "content": [
            { "type": "audio" },
            { "type": "text", "text": "lang:en [TRANSCRIBE]" },
        ],
    }
];
const text = processor.apply_chat_template(conversation, { tokenize: false });
const audio = await read_audio("https://huggingface.co/datasets/Xenova/transformers.js-docs/resolve/main/mlk.wav", 16000);
const inputs = await processor(text, audio);

// Generate the response
const generated_ids = await model.generate({
    ...inputs,
    max_new_tokens: 256,
    streamer: new TextStreamer(processor.tokenizer, { skip_special_tokens: true, skip_prompt: true }),
});

// Decode the generated tokens
const new_tokens = generated_ids.slice(null, [inputs.input_ids.dims.at(-1), null]);
const generated_texts = processor.batch_decode(
    new_tokens,
    { skip_special_tokens: true },
);
console.log(generated_texts[0]);
// I have a dream that one day this nation will rise up and live out the true meaning of its creed.

Added in #1373 and #1375.

LFM2

LFM2 is a new generation of hybrid models developed by Liquid AI, specifically designed for edge AI and on-device deployment. It sets a new standard in terms of quality, speed, and memory efficiency.

The models, which we have converted to ONNX, come in three different sizes: 350M, 700M, and 1.2B parameters.

Example: Text generation with LFM2-350M

import { pipeline, TextStreamer } from "@huggingface/transformers";

// Create a text generation pipeline
const generator = await pipeline(
  "text-generation",
  "onnx-community/LFM2-350M-ONNX",
  { dtype: "q4" },
);

// Define the list of messages
const messages = [
  { role: "system", content: "You are a helpful assistant." },
  { role: "user", content: "What is the capital of France?" },
];

// Generate a response
const output = await generator(messages, {
    max_new_tokens: 512,
    do_sample: false,
    streamer: new TextStreamer(generator.tokenizer, { skip_prompt: true, skip_special_tokens: true }),
});
console.log(output[0].generated_text.at(-1).content);
// The capital of France is Paris. It is a vibrant city known for its historical landmarks, art, fashion, and gastronomy.

Added in #1367 and #1369.

ModernBERT Decoder

These models form part of the Ettin suite: the first collection of paired encoder-only and decoder-only models trained with identical data, architecture, and training recipes. Ettin enables fair comparisons between encoder and decoder architectures across multiple scales, providing state-of-the-art performance for open-data models in their respective size categories.

The list of supported models can be found here.

import { pipeline, TextStreamer } from "@huggingface/transformers";

// Create a text generation pipeline
const generator = await pipeline(
  "text-generation",
  "onnx-community/ettin-decoder-150m-ONNX",
  { dtype: "fp32" },
);

// Generate a response
const text = "Q: What is the capital of France?\nA:";
const output = await generator(text, {
  max_new_tokens: 128,
  streamer: new TextStreamer(generator.tokenizer, { skip_prompt: true, skip_special_tokens: true }),
});
console.log(output[0].generated_text);

Added in #1371.

🛠️ Other improvements

  • Add special tokens in the text-generation pipeline if the tokenizer requires them in #1370

Full Changelog: 3.6.3...3.7.0

3.6.3

11 Jul 20:11
467f59c

What's new?

  • Bump @huggingface/jinja to version 0.5.1 for new chat template functionality in #1364
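
    Chat templates are rendered with @huggingface/jinja under the hood, via tokenizer.apply_chat_template. A minimal sketch; any Hub repo with a chat template in its tokenizer config works, and the id below is illustrative:

    import { AutoTokenizer } from '@huggingface/transformers';
    
    // Illustrative repo id; only the tokenizer files are downloaded here
    const tokenizer = await AutoTokenizer.from_pretrained('HuggingFaceTB/SmolLM2-135M-Instruct');
    
    const messages = [
        { role: 'system', content: 'You are a helpful assistant.' },
        { role: 'user', content: 'Hello!' },
    ];
    
    // Render the template to a prompt string instead of token ids
    const prompt = tokenizer.apply_chat_template(messages, {
        tokenize: false,
        add_generation_prompt: true,
    });
    console.log(prompt);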

Full Changelog: 3.6.2...3.6.3