Commit 539f322
Add support for voyage-4-nano embedding model
Add two new config fields to Qwen3 to support voyage-4-nano and similar models:
- `use_bidirectional_attention`: When true, disables causal masking
  for embedding models that use full bidirectional attention
- `num_labels`: When set, loads projection layer from linear.weight
  at safetensors root level (e.g., 1024 -> 2048 for voyage-4-nano)
Both fields are backwards compatible, defaulting to disabled behavior.
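
The behavior of the two fields can be sketched as follows. This is a minimal illustration using only the Rust standard library, not the code from this commit: the real backend operates on candle tensors, and the names `Qwen3ConfigSketch`, `attention_mask`, and `project` are hypothetical. It also assumes that `num_labels` is the projection's output dimension and that the projection has no bias, since the message mentions only `linear.weight`.

```rust
/// Hypothetical stand-in for the two new config fields (field names from the commit message).
#[derive(Debug, Default, Clone, PartialEq)]
pub struct Qwen3ConfigSketch {
    /// `false` (the default) keeps the usual causal mask.
    pub use_bidirectional_attention: bool,
    /// `None` (the default) means no projection layer is applied.
    pub num_labels: Option<usize>,
}

/// Build a `seq_len x seq_len` additive attention mask.
/// 0.0 means the key position is visible, -inf means it is masked out.
pub fn attention_mask(cfg: &Qwen3ConfigSketch, seq_len: usize) -> Vec<Vec<f32>> {
    (0..seq_len)
        .map(|query| {
            (0..seq_len)
                .map(|key| {
                    // Causal: a query only sees keys at or before its own position.
                    // Bidirectional: every key is visible to every query.
                    if cfg.use_bidirectional_attention || key <= query {
                        0.0
                    } else {
                        f32::NEG_INFINITY
                    }
                })
                .collect()
        })
        .collect()
}

/// Apply a linear projection `y = W x`, with `weight` shaped `[out_dim][in_dim]`.
/// Assumption: no bias term, and `num_labels` equals `out_dim`.
/// With `num_labels == None` the input is returned unchanged.
pub fn project(cfg: &Qwen3ConfigSketch, weight: &[Vec<f32>], x: &[f32]) -> Vec<f32> {
    match cfg.num_labels {
        None => x.to_vec(),
        Some(out_dim) => {
            assert_eq!(weight.len(), out_dim, "weight rows must equal num_labels");
            weight
                .iter()
                .map(|row| {
                    assert_eq!(row.len(), x.len(), "weight cols must equal input dim");
                    row.iter().zip(x).map(|(w, v)| w * v).sum::<f32>()
                })
                .collect()
        }
    }
}
```

With `Qwen3ConfigSketch::default()` the mask is causal and `project` is the identity, which mirrors the backwards-compatible defaults described above.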
Changes:
- backends/candle/src/models/qwen3.rs: Add config fields and CPU impl
- backends/candle/src/models/flash_qwen3.rs: Add CUDA/flash-attn impl
- backends/candle/tests/test_voyage_nano.rs: CPU tests with snapshots
- backends/candle/tests/test_flash_voyage_nano.rs: CUDA tests
- README.md, docs/source/en/supported_models.md: Add voyage-4-nano
Tested with voyageai/voyage-4-nano:
- Output dimension: 2048 (correct)
- Cosine similarity vs transformers: 0.999965
- Inference time: ~9ms on L4 GPU (vs 35ms with transformers)

1 parent cb9de7a commit 539f322
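
The cosine-similarity figure in the test notes compares this backend's embeddings against the transformers reference. The commit does not show how that comparison was scripted, so the following is only a sketch of the metric itself; `cosine_similarity` is a hypothetical helper written against plain slices rather than candle tensors.

```rust
/// Cosine similarity between two equal-length embedding vectors.
/// Returns `None` if the lengths differ or either vector has zero norm.
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> Option<f32> {
    if a.len() != b.len() {
        return None;
    }
    // Dot product of the two vectors.
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    // Euclidean norms of each vector.
    let norm_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let norm_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if norm_a == 0.0 || norm_b == 0.0 {
        return None;
    }
    Some(dot / (norm_a * norm_b))
}
```

A value close to 1.0 means the two implementations produce embeddings that point in nearly the same direction, regardless of any difference in scale.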
File tree
8 files changed, +8388 -3 lines changed
- backends/candle
  - src/models
  - tests
    - snapshots
- docs/source/en