
Commit f08f91c
Default to Qwen/Qwen3-Embedding-0.6B in docs/ examples
1 parent 62e29e5

5 files changed: +11 −14 lines

docs/source/en/intel_container.md
Lines changed: 3 additions & 3 deletions

@@ -35,7 +35,7 @@ docker build . -f Dockerfile-intel --build-arg PLATFORM=$platform -t tei_cpu_ipe
 To deploy your model on an Intel® CPU, use the following command:
 
 ```shell
-model='BAAI/bge-large-en-v1.5'
+model='Qwen/Qwen3-Embedding-0.6B'
 volume=$PWD/data
 
 docker run -p 8080:80 -v $volume:/data tei_cpu_ipex --model-id $model
@@ -58,7 +58,7 @@ docker build . -f Dockerfile-intel --build-arg PLATFORM=$platform -t tei_xpu_ipe
 To deploy your model on an Intel® XPU, use the following command:
 
 ```shell
-model='BAAI/bge-large-en-v1.5'
+model='Qwen/Qwen3-Embedding-0.6B'
 volume=$PWD/data
 
 docker run -p 8080:80 -v $volume:/data --device=/dev/dri -v /dev/dri/by-path:/dev/dri/by-path tei_xpu_ipex --model-id $model --dtype float16
@@ -81,7 +81,7 @@ docker build . -f Dockerfile-intel --build-arg PLATFORM=$platform -t tei_hpu
 To deploy your model on an Intel® HPU (Gaudi), use the following command:
 
 ```shell
-model='BAAI/bge-large-en-v1.5'
+model='Qwen/Qwen3-Embedding-0.6B'
 volume=$PWD/data
 
 docker run -p 8080:80 -v $volume:/data --runtime=habana -e HABANA_VISIBLE_DEVICES=all -e MAX_WARMUP_SEQUENCE_LENGTH=512 tei_hpu --model-id $model --dtype bfloat16
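For reference, once any of these containers is up, the deployment can be smoke-tested against TEI's `/embed` route; a minimal sketch, assuming the `-p 8080:80` port mapping used in the commands above:

```shell
# Request an embedding for a single input; the response is a JSON array
# containing one embedding vector.
curl 127.0.0.1:8080/embed \
    -X POST \
    -d '{"inputs": "What is Deep Learning?"}' \
    -H 'Content-Type: application/json'
```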

docs/source/en/local_cpu.md
Lines changed: 2 additions & 3 deletions

@@ -47,10 +47,9 @@ cargo install --path router -F metal
 Once the installation is successfully complete, you can launch Text Embeddings Inference on CPU with the following command:
 
 ```shell
-model=BAAI/bge-large-en-v1.5
-revision=refs/pr/5
+model=Qwen/Qwen3-Embedding-0.6B
 
-text-embeddings-router --model-id $model --revision $revision --port 8080
+text-embeddings-router --model-id $model --port 8080
 ```
 
 <Tip>

docs/source/en/local_gpu.md
Lines changed: 2 additions & 3 deletions

@@ -58,8 +58,7 @@ cargo install --path router -F candle-cuda -F http --no-default-features
 You can now launch Text Embeddings Inference on GPU with:
 
 ```shell
-model=BAAI/bge-large-en-v1.5
-revision=refs/pr/5
+model=Qwen/Qwen3-Embedding-0.6B
 
-text-embeddings-router --model-id $model --revision $revision --port 8080
+text-embeddings-router --model-id $model --dtype float16 --port 8080
 ```

docs/source/en/local_metal.md
Lines changed: 2 additions & 3 deletions

@@ -38,10 +38,9 @@ cargo install --path router -F metal
 Once the installation is successfully complete, you can launch Text Embeddings Inference with Metal with the following command:
 
 ```shell
-model=BAAI/bge-large-en-v1.5
-revision=refs/pr/5
+model=Qwen/Qwen3-Embedding-0.6B
 
-text-embeddings-router --model-id $model --revision $revision --port 8080
+text-embeddings-router --model-id $model --port 8080
 ```
 
 Now you are ready to use `text-embeddings-inference` locally on your machine.

docs/source/en/quick_tour.md
Lines changed: 2 additions & 2 deletions

@@ -28,10 +28,10 @@ Next, install the [NVIDIA Container Toolkit](https://docs.nvidia.com/datacenter/
 
 ## Deploy
 
-Next it's time to deploy your model. Let's say you want to use [`BAAI/bge-large-en-v1.5`](https://huggingface.co/BAAI/bge-large-en-v1.5). Here's how you can do this:
+Next it's time to deploy your model. Let's say you want to use [`Qwen/Qwen3-Embedding-0.6B`](https://huggingface.co/Qwen/Qwen3-Embedding-0.6B). Here's how you can do this:
 
 ```shell
-model=BAAI/bge-large-en-v1.5
+model=Qwen/Qwen3-Embedding-0.6B
 volume=$PWD/data
 
 docker run --gpus all -p 8080:80 -v $volume:/data --pull always ghcr.io/huggingface/text-embeddings-inference:1.7 --model-id $model
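Once this container is running, the `/embed` route also accepts a batch of inputs in a single request; a minimal sketch, again assuming the `-p 8080:80` mapping above:

```shell
# POST a batch of inputs; one embedding vector is returned per input, in order.
curl 127.0.0.1:8080/embed \
    -X POST \
    -d '{"inputs": ["What is Deep Learning?", "What is vector search?"]}' \
    -H 'Content-Type: application/json'
```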
