Commit 2f1e3ee
Add bidirectional attention and projection layer support for voyage-4-nano
This change adds support for the voyageai/voyage-4-nano model, which is based on
the Qwen3 architecture with two key modifications:

1. Bidirectional attention (is_causal=False) instead of causal attention
   - Added `use_bidirectional_attention` config field (default: false)
   - When true, skips causal masking in attention
2. Projection layer (1024 -> 2048 dimensions)
   - Added `num_labels` config field for output projection dimension
   - When set, loads "linear.weight" and applies projection after final norm
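To make the first modification concrete, here is a minimal pure-Python sketch of what skipping the causal mask means for a single attention head. It is an illustration only, not the code from this commit: the function name `attention_weights` and the toy 3x3 score matrix are assumptions, and only the `use_bidirectional_attention` flag name comes from the commit message.

```python
import math


def attention_weights(scores, use_bidirectional_attention=False):
    """Row-wise softmax over raw attention scores for one head.

    scores[i][j] is the score of query position i against key position j.
    Under causal attention, position i may only attend to positions j <= i;
    with the flag set, every position attends to every other position.
    """
    weights = []
    for i, row in enumerate(scores):
        # Masked-out positions get -inf so they contribute exp(-inf) == 0.
        masked = [
            s if (use_bidirectional_attention or j <= i) else float("-inf")
            for j, s in enumerate(row)
        ]
        peak = max(masked)  # subtract the row max for numerical stability
        exps = [math.exp(s - peak) for s in masked]
        total = sum(exps)
        weights.append([e / total for e in exps])
    return weights


# Toy input: three tokens with identical scores everywhere.
scores = [[0.0, 0.0, 0.0] for _ in range(3)]
causal = attention_weights(scores)
bidirectional = attention_weights(scores, use_bidirectional_attention=True)
```

With uniform scores, the causal variant lets the first token see only itself, while the bidirectional variant spreads weight evenly over all three tokens in every row.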
voyage-4-nano config.json includes:
  "use_bidirectional_attention": true
  "num_labels": 2048
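The way these two fields might drive model loading can be sketched as follows. This is a hedged illustration rather than the commit's implementation: the helper names (`read_model_options`, `project`, `final_embedding`) are hypothetical, the `[out_dim][in_dim]` weight layout and the absence of a bias term are assumptions (the message only mentions "linear.weight"), and toy dimensions (2 -> 4) stand in for 1024 -> 2048.

```python
import json


def read_model_options(config_text):
    """Read the two fields, falling back to defaults when they are absent."""
    cfg = json.loads(config_text)
    return (
        bool(cfg.get("use_bidirectional_attention", False)),  # default: false
        cfg.get("num_labels"),  # None when absent; assumed to mean "no projection"
    )


def project(hidden, weight):
    """Apply a bias-free linear layer; weight is out_dim rows of in_dim values."""
    return [sum(w * h for w, h in zip(row, hidden)) for row in weight]


def final_embedding(normed, num_labels, weights):
    """Project the post-norm hidden state only when num_labels is set."""
    if num_labels is None:
        return normed
    weight = weights["linear.weight"]
    if len(weight) != num_labels:
        raise ValueError("linear.weight rows must equal num_labels")
    return project(normed, weight)


use_bidirectional, num_labels = read_model_options(
    '{"use_bidirectional_attention": true, "num_labels": 4}'
)
plain_options = read_model_options('{"hidden_size": 2}')

# Toy checkpoint: a 4x2 projection matrix under the assumed tensor name.
weights = {"linear.weight": [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [1.0, -1.0]]}
embedding = final_embedding([2.0, 3.0], num_labels, weights)
unprojected = final_embedding([2.0, 3.0], plain_options[1], weights)
```

A config without either field keeps the existing behaviour in this sketch: causal attention and an unprojected output.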
Both flash (CUDA) and non-flash implementations are updated.

1 parent: cb9de7a
2 files changed, 89 additions and 3 deletions.