Releases: huggingface/text-embeddings-inference

v1.8.0

05 Aug 08:31
2bff275

Notable Changes

  • Qwen3 support for 0.6B, 4B and 8B on CPU, MPS, and FlashQwen3 on CUDA and Intel HPUs
  • NomicBert MoE support
  • JinaAI Re-Rankers V1 support
  • Matryoshka Representation Learning (MRL)
  • Dense layer module support (after pooling)
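As a rough illustration of what Matryoshka Representation Learning enables on the consumer side: an MRL-trained embedding can be shortened by keeping its leading values and re-normalizing. The sketch below is pure Python and does not use the text-embeddings-inference API; the helper name `truncate_embedding` and the sample vector are hypothetical, and how the server itself exposes truncation is not described in these notes.

```python
import math


def truncate_embedding(embedding, dimensions):
    """Keep the first `dimensions` values and L2-normalize the result.

    Hypothetical helper for illustration only; it is not part of any
    text-embeddings-inference client.
    """
    if not 0 < dimensions <= len(embedding):
        raise ValueError("dimensions must be between 1 and len(embedding)")
    head = embedding[:dimensions]
    norm = math.sqrt(sum(x * x for x in head))
    if norm == 0.0:
        # A zero vector cannot be normalized; return it unchanged.
        return list(head)
    return [x / norm for x in head]


# Made-up 4-dimensional embedding, shortened to 2 dimensions.
full = [0.5, 0.5, 0.5, 0.5]
short = truncate_embedding(full, 2)
```

Re-normalizing after truncation keeps dot products comparable to cosine similarities, which is the usual reason to do it; this only makes sense for models actually trained with MRL.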

Note

Some of the changes above were already released in patch versions on top of v1.7.0, whereas Matryoshka Representation Learning (MRL) and Dense layer module support were added more recently and had not been released before.

What's Changed

New Contributors

Full Changelog: v1.7.0...v1.8.0

v1.7.4

07 Jul 12:33
6e900af

Notable Changes

Qwen3 did not work correctly on CPU / MPS when sending batched requests in FP16 precision, for two reasons. First, the FP32 minimum value was downcast to FP16, which led to null values; it is now manually set to the FP16 minimum value instead. Second, a to_dtype call was missing on the attention_bias when working with batches.
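To see why downcasting the FP32 minimum can be a problem, consider the following pure-Python sketch. It is not the project's code: `cast_to_fp16_range` only emulates the overflow behaviour of an FP32 to FP16 cast, and the softmax over a fully masked row is an assumed scenario chosen to show how an infinite mask value can turn into NaN.

```python
import math

FP32_MIN = -3.4028234663852886e38  # most negative finite float32
FP16_MIN = -65504.0                # most negative finite float16


def cast_to_fp16_range(x):
    """Emulate only the overflow part of an FP32 -> FP16 cast.

    Values outside the finite FP16 range become +/- infinity; rounding of
    in-range values is deliberately ignored in this sketch.
    """
    if x < FP16_MIN:
        return -math.inf
    if x > -FP16_MIN:
        return math.inf
    return x


def softmax(row):
    """Numerically stabilized softmax (subtracts the row maximum)."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


# A fully masked row, e.g. padding positions in a batch (assumed scenario).
bad_row = [cast_to_fp16_range(FP32_MIN)] * 3   # every entry is -inf
good_row = [cast_to_fp16_range(FP16_MIN)] * 3  # every entry stays finite

bad = softmax(bad_row)    # -inf - (-inf) is NaN, so every entry is NaN
good = softmax(good_row)  # finite inputs give a uniform distribution
```

Using the FP16 minimum directly keeps the mask value finite, so the subtraction inside the softmax stays well defined.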

What's Changed

Full Changelog: v1.7.3...v1.7.4

v1.7.3

30 Jun 10:54
fb80177

Notable Changes

Added Qwen3 support for Intel HPU, and fixed Qwen3 on CPU / Metal / CUDA.

What's Changed

New Contributors

Full Changelog: v1.7.2...v1.7.3

v1.7.2

16 Jun 06:44
a69cc2e

Notable change

  • Added support for Qwen3 embeddings

What's Changed

New Contributors

Full Changelog: v1.7.1...v1.7.2

v1.7.1

03 Jun 13:38
006e16b

What's Changed

New Contributors

Full Changelog: v1.7.0...v1.7.1

v1.7.0

08 Apr 11:54
72dac20

Notable changes

  • Major dependency upgrades (candle 0.5 -> 0.8 and related)
  • Added ModernBert support, by @kozistr!

What's Changed

New Contributors

Full Changelog: v1.6.1...v1.7.0

v1.6.1

28 Mar 08:47
875239e

What's Changed

New Contributors

Full Changelog: v1.6.0...v1.6.1

v1.6.0

13 Dec 15:52
57d8fc8

What's Changed

Full Changelog: v1.5.1...v1.6.0

v1.5.1

05 Nov 15:17
76b29f1

What's Changed

New Contributors

Full Changelog: v1.5.0...v1.5.1

v1.5.0

10 Jul 15:34
661a77f

Notable Changes

  • ONNX runtime for CPU deployments: greatly improves CPU deployment throughput
  • Add /similarity route
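These notes do not describe the request schema or the scoring used by the /similarity route. As a hedged sketch of what a sentence-similarity score typically computes, the pure-Python example below scores one source embedding against several candidates, assuming cosine similarity. The function names and the sample vectors are hypothetical and are not taken from the project.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # Similarity with a zero vector is undefined; report 0.0 here.
        return 0.0
    return dot / (norm_a * norm_b)


def score_against_source(source, candidates):
    """Score every candidate embedding against one source embedding."""
    return [cosine_similarity(source, c) for c in candidates]


# Made-up 2-dimensional embeddings, purely for illustration.
source = [1.0, 0.0]
candidates = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
scores = score_against_source(source, candidates)
```

An identical vector scores 1.0 and an orthogonal one scores 0.0, so higher values mean closer sentences under this assumption.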

What's Changed

New Contributors

Full Changelog: v1.4.0...v1.5.0