Add bidirectional attention and projection layer support for Qwen3-based models #808
# What does this PR do?

This PR adds support for `voyageai/voyage-4-nano`, a Qwen3-based embedding model that uses bidirectional attention and a projection layer.

## Changes

### 1. Bidirectional Attention Support

- Added `use_bidirectional_attention` config field (default: `false`)
- When `true`, disables causal masking in the attention mechanism

### 2. Projection Layer Support

- Added `num_labels` config field for output projection dimension
- Loads `linear.weight` from safetensors root level and applies projection after final normalization

## Model Configuration

Models using these features should have in their `config.json`:

`{ "use_bidirectional_attention": true, "num_labels": 2048 }`

## Testing

Tested with `voyageai/voyage-4-nano`:

## Files Changed

- `backends/candle/src/models/flash_qwen3.rs` - CUDA/flash attention implementation
- `backends/candle/src/models/qwen3.rs` - CPU/Metal implementation + config struct
- `backends/candle/Cargo.toml` - Added cudarc dev-dependency for CUDA tests
- `backends/candle/tests/test_voyage_nano.rs` - CPU test with snapshots
- `backends/candle/tests/test_flash_voyage_nano.rs` - CUDA test with snapshots
- `README.md` - Added voyage-4-nano to supported models table
- `docs/source/en/supported_models.md` - Added voyage-4-nano to docs

## Before submitting

- `insta` snapshots?

## Who can review?
@Narsil @alvarobartt - This adds two new config fields to support the voyage-4-nano embedding model. The changes are backwards compatible: both fields default to the existing behavior (causal attention, no projection).
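
For reviewers who want a quick mental model of the two features, here is a minimal, dependency-free sketch. It is not the code in `qwen3.rs` or `flash_qwen3.rs`: the function names `attention_mask` and `project` are hypothetical, plain `Vec<f32>` stands in for candle tensors, and the projection is assumed to be a bias-free linear layer with weight shape `[num_labels, hidden_size]` (the description only mentions `linear.weight`). The real backends may express the masking differently, for example through a causal flag passed to the attention kernel.

```rust
/// Builds an additive attention mask for a sequence of `seq_len` tokens.
/// An entry of 0.0 means "query may attend to key"; -inf blocks the pair.
/// With `use_bidirectional_attention = false` this is the usual causal mask.
fn attention_mask(seq_len: usize, use_bidirectional_attention: bool) -> Vec<Vec<f32>> {
    (0..seq_len)
        .map(|query| {
            (0..seq_len)
                .map(|key| {
                    // Causal: a query only sees keys at or before its position.
                    // Bidirectional: every query sees every key.
                    if use_bidirectional_attention || key <= query {
                        0.0
                    } else {
                        f32::NEG_INFINITY
                    }
                })
                .collect()
        })
        .collect()
}

/// Applies an assumed bias-free linear projection to one hidden vector:
/// out[i] = sum_j weight[i][j] * hidden[j].
/// `weight` has one row per output dimension (i.e. `num_labels` rows).
fn project(hidden: &[f32], weight: &[Vec<f32>]) -> Vec<f32> {
    weight
        .iter()
        .map(|row| row.iter().zip(hidden).map(|(w, h)| w * h).sum::<f32>())
        .collect()
}
```

In this sketch, `attention_mask(n, true)` yields an all-zero mask, so every token contributes to every other token's representation, while `project` maps the normalized hidden state of size `hidden_size` to an output of size `num_labels`.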