# WoRA (Weighted-Direction Low-Rank Adaptation) Implementation for PEFT

## Summary
This pull request adds support for WoRA (Weighted-Direction Low-Rank Adaptation), a novel extension of DoRA that introduces learnable scalar parameters (alpha and beta) to create a weighted combination of the base weights and LoRA adapters. WoRA provides more fine-grained control over the adaptation process compared to standard LoRA and DoRA.
Fixes #2861
## Analysis and Understanding

### WoRA Formula
WoRA extends DoRA by introducing two learnable scalar parameters:
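In sketch form, assuming WoRA keeps DoRA's column-wise weight normalization (the $\lVert\cdot\rVert_c$ below is that norm, and the exact arrangement of the scalars is an assumption rather than a quote from the code), the adapted weight can be written as:

$$
W' = m \cdot \frac{\beta\, W_0 + \alpha \cdot \text{scaling} \cdot BA}{\lVert \beta\, W_0 + \alpha \cdot \text{scaling} \cdot BA \rVert_c}
$$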
where:

- `m` is the learned magnitude vector (from DoRA)
- `W₀` is the base weight matrix
- `BA` is the LoRA decomposition (B × A)
- `α` (alpha) controls the LoRA contribution
- `β` (beta) controls the base weight contribution
- `scaling` is the LoRA scaling factor

### Key Insights
1. **LoraVariant Pattern**: The existing DoRA implementation uses a clean separation between:
   - Layer classes (`wora.py`) that handle forward computation
   - Variant classes (`variants.py`) that handle initialization and variant-specific logic
2. **Parameter Naming Convention**: PEFT automatically marks parameters as trainable if their names contain "lora_". This is why the new scalars are named `lora_wora_alpha` and `lora_wora_beta` (see the sketch after this list).
3. **ParameterDict Storage**: Using `nn.ParameterDict` ensures the per-adapter parameters are registered with the module, included in its `state_dict()`, and moved along with the model on device and dtype changes.
4. **Layer-Specific Challenges**: Embedding and convolutional layers require special handling (see the challenges section below).
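To illustrate points 2 and 3, here is a toy module (hypothetical names, not the actual PEFT classes) showing how ParameterDict-stored scalars whose names contain "lora_" stay trainable under PEFT-style freezing:

```python
import torch
import torch.nn as nn

class ToyWoraModule(nn.Module):
    """Hypothetical stand-in for a WoRA-enabled LoRA layer."""
    def __init__(self):
        super().__init__()
        self.base_layer = nn.Linear(8, 8)
        # ParameterDict registers one scalar per adapter; the entries appear in
        # state_dict() and follow .to(device) / .to(dtype) calls.
        self.lora_wora_alpha = nn.ParameterDict({"default": nn.Parameter(torch.tensor(1.0))})
        self.lora_wora_beta = nn.ParameterDict({"default": nn.Parameter(torch.tensor(1.0))})

module = ToyWoraModule()
# PEFT-style freezing: only parameters whose names contain "lora_" stay trainable.
for name, param in module.named_parameters():
    param.requires_grad = "lora_" in name

print([(n, p.requires_grad) for n, p in module.named_parameters()])
# base_layer.* -> False, lora_wora_alpha/beta.* -> True
```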
## Implementation Approach

### 1. Core Architecture (wora.py)
Created four main layer classes:
- `WoraLinearLayer`: Base implementation for linear transformations
- `WoraEmbeddingLayer`: Handles token embeddings with proper matrix transposition
- `_WoraConvNdLayer`: Base class for convolutional layers
- `WoraConv1dLayer`, `WoraConv2dLayer`, `WoraConv3dLayer`: Specialized conv layers

**Key Design Decisions:**
- Alpha and beta enter the weight-norm computation only as detached scalars (via `.item()`) to avoid affecting the norm computation; elsewhere they stay as tensors so gradients can flow (see the forward-pass sketch below)
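For reference, here is a minimal sketch of what the linear forward path could look like under the formula sketched earlier; the function name `wora_linear_forward`, the argument layout, and the exact detaching strategy are illustrative assumptions, not the code in `wora.py`:

```python
import torch
import torch.nn.functional as F

def wora_linear_forward(x, base_weight, lora_A, lora_B, magnitude, alpha, beta, scaling):
    # Weighted combination of the frozen base weight and the LoRA delta.
    delta = lora_B @ lora_A                                   # (out_features, in_features)
    directed = beta * base_weight + alpha * scaling * delta
    # The norm uses detached scalar copies of alpha/beta, so it only rescales the
    # result and does not route gradients through the normalization itself
    # (analogous to DoRA's detached weight norm).
    with torch.no_grad():
        norm_basis = beta.item() * base_weight + alpha.item() * scaling * delta
        weight_norm = norm_basis.norm(p=2, dim=1, keepdim=True)
    new_weight = magnitude.view(-1, 1) * directed / weight_norm
    return F.linear(x, new_weight)
```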
### 2. Variant Classes (variants.py)

Implemented five variant classes following PEFT's LoraVariant pattern:

- `WoraLinearVariant`
- `WoraEmbeddingVariant`
- `WoraConv1dVariant`, `WoraConv2dVariant`, `WoraConv3dVariant`

Each variant handles the following (a skeleton sketch follows the list):
- `init()`: Creating and initializing WoRA-specific parameters
- `forward()`: Calling the appropriate layer forward method
- `merge_safe()` / `merge_unsafe()`: Merging adapters with base weights
- `unmerge()`: Restoring original weights
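A schematic of how one of these variant classes could be laid out, following the method names listed above; the signatures, type hints, and static-method layout are illustrative placeholders rather than PEFT's exact LoraVariant API:

```python
from typing import Any

import torch

class WoraLinearVariant:
    """Illustrative skeleton only; the real class lives in variants.py."""

    @staticmethod
    def init(module: Any, adapter_name: str) -> None:
        # Create lora_wora_alpha / lora_wora_beta (and the DoRA-style magnitude)
        # for this adapter and mark them trainable.
        ...

    @staticmethod
    def forward(module: Any, adapter_name: str, x: torch.Tensor, result: torch.Tensor) -> torch.Tensor:
        # Delegate to the WoraLinearLayer forward computation and combine with `result`.
        ...

    @staticmethod
    def merge_safe(module: Any, adapter_name: str, orig_weight: torch.Tensor) -> torch.Tensor:
        # Return a merged copy of the base weight; merge_unsafe would modify it in place.
        ...

    @staticmethod
    def unmerge(module: Any, adapter_name: str, orig_weight: torch.Tensor) -> torch.Tensor:
        # Undo the merge and return the restored weight.
        ...
```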
### 3. Parameter Initialization (layer.py)

Modified three key methods to initialize WoRA parameters:

- `LoraLayer.update_layer()`: Base implementation for Linear layers
- `Embedding.update_layer()`: Special handling for embedding layers
- `_ConvNd.update_layer()`: Handling for convolutional layers

**Initialization Pattern:**
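As a sketch of that pattern (assuming both scalars start at a neutral 1.0 and using a hypothetical helper name `_init_wora_params`), the per-adapter setup looks roughly like this:

```python
import torch
import torch.nn as nn

def _init_wora_params(layer: nn.Module, adapter_name: str) -> None:
    """Hypothetical helper mirroring what each update_layer() override does."""
    # Lazily create the ParameterDicts on first use.
    if not hasattr(layer, "lora_wora_alpha"):
        layer.lora_wora_alpha = nn.ParameterDict()
        layer.lora_wora_beta = nn.ParameterDict()
    # One learnable scalar per adapter; 1.0 is an assumed neutral starting value.
    layer.lora_wora_alpha[adapter_name] = nn.Parameter(torch.tensor(1.0))
    layer.lora_wora_beta[adapter_name] = nn.Parameter(torch.tensor(1.0))
    # Explicitly keep them trainable (relevant for overrides that skip super()).
    layer.lora_wora_alpha[adapter_name].requires_grad_(True)
    layer.lora_wora_beta[adapter_name].requires_grad_(True)
```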
### 4. Configuration (config.py)
Added a `use_wora` boolean flag to `LoraConfig` with proper validation:

- Defaults to `False` for backward compatibility
- Set to `True` to enable WoRA (see the example below)
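Enabling the flag would then look something like this (a usage sketch; `use_wora` is the flag added in this PR, the other arguments are ordinary LoRA settings):

```python
from peft import LoraConfig, get_peft_model

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    use_wora=True,   # the new flag added in this PR; defaults to False
)
# peft_model = get_peft_model(base_model, config)  # applied as usual
```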
### 5. Testing (test_lora_variants.py)

Added comprehensive tests:

- `test_variant_is_applied_to_layers`: Verifies WoRA variants are correctly applied to all layer types
- `test_wora_params_have_gradients`: Ensures alpha and beta parameters receive gradients during backpropagation (sketched below)
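The gradient check boils down to something like the following simplified sketch (it assumes a transformers-style output with `.logits`; the real test is parametrized over layer types):

```python
import torch

def assert_wora_params_receive_grads(model: torch.nn.Module, batch: dict) -> None:
    """Simplified version of the gradient-flow check."""
    logits = model(**batch).logits          # assumes a transformers-style model output
    logits.float().sum().backward()         # any scalar loss will do here
    for name, param in model.named_parameters():
        if "lora_wora_alpha" in name or "lora_wora_beta" in name:
            assert param.grad is not None, f"{name} did not receive a gradient"
```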
## Key Technical Challenges and Solutions

### Challenge 1: Gradient Flow for Alpha and Beta

**Problem**: The initial implementation used `.item()` to convert the Parameters to scalars throughout the computation, breaking gradient flow.

**Solution**: Keep `alpha` and `beta` as tensors in the forward computation so autograd can track them, converting them to scalars only where the weight norm is computed.
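The difference is easy to see in isolation (an illustrative snippet, not code from the PR):

```python
import torch

alpha = torch.nn.Parameter(torch.tensor(1.0))
w = torch.randn(3, 3)

broken = alpha.item() * w   # .item() yields a plain float; autograd loses track of alpha
kept = alpha * w            # keeping the Parameter in the graph preserves gradient flow

kept.sum().backward()
print(alpha.grad)           # a real gradient, not None
```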
### Challenge 2: Embedding Layer Matrix Dimensions
**Problem**: Embedding layers store `lora_embedding_A` and `lora_embedding_B` with shapes that need transposition before use.

**Solution**:

- Use `lora_embedding_A.T` and `lora_embedding_B.T` when building the adapter delta (illustrated below)
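Concretely, with PEFT's usual embedding LoRA shapes (the concrete dimensions below are only an example), the transposed product lines up with the embedding weight:

```python
import torch

num_embeddings, embedding_dim, r = 100, 16, 4
lora_embedding_A = torch.randn(r, num_embeddings)   # stored as (r, num_embeddings)
lora_embedding_B = torch.randn(embedding_dim, r)    # stored as (embedding_dim, r)

# Transposing both factors yields a delta matching the (num_embeddings, embedding_dim)
# embedding weight, ready to enter the weighted WoRA combination.
delta = lora_embedding_A.T @ lora_embedding_B.T
assert delta.shape == (num_embeddings, embedding_dim)
```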
### Challenge 3: Parameter Initialization in Override Methods

**Problem**: The `Embedding` and `_ConvNd` classes override `update_layer()` without calling `super()`, so they missed the WoRA parameter initialization.

**Solution**:

- Added the WoRA parameter initialization directly to each override, calling `requires_grad_(True)` to ensure trainability
### Challenge 4: Conv Layer Forward Pass

**Problem**: Convolutional layers have more complex forward logic, with bias handling and reshaping requirements.

**Solution**:

- Handle the bias term separately and reshape the magnitude/norm tensors so they broadcast over the convolution weight dimensions, mirroring the existing DoRA conv layers (see the sketch below)
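A rough Conv2d sketch of that idea, assuming the same weighted-direction combination as the linear case (the function name and the detached-norm detail are illustrative):

```python
import torch
import torch.nn.functional as F

def wora_conv2d_forward(x, base_weight, delta, magnitude, alpha, beta, scaling, bias=None):
    # Weighted combination of the frozen conv weight and the LoRA delta
    # (both shaped (out_channels, in_channels, kH, kW)).
    directed = beta * base_weight + alpha * scaling * delta
    # Per-output-channel norm, detached, reshaped to broadcast over the kernel dims.
    weight_norm = directed.flatten(1).norm(p=2, dim=1).detach().view(-1, 1, 1, 1)
    new_weight = magnitude.view(-1, 1, 1, 1) * directed / weight_norm
    # The bias is applied by F.conv2d itself and is untouched by the reweighting.
    return F.conv2d(x, new_weight, bias=bias)
```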
## Verification and Testing

### Test Coverage
The implementation includes two parametrized tests that cover:
- **Variant Application Test**: Verifies that the WoRA variant classes are attached to every supported layer type (Linear, Embedding, Conv1d/2d/3d)
- **Gradient Flow Test**: Verifies that the `lora_wora_alpha` and `lora_wora_beta` parameters receive gradients during backpropagation
### Test Results

All tests pass successfully.

## Files Modified

- `src/peft/tuners/lora/config.py`: Added the `use_wora` configuration parameter
- `src/peft/tuners/lora/layer.py`: Added WoRA parameter initialization in the `update_layer` methods
- `src/peft/tuners/lora/wora.py`: Implemented the WoRA layer classes
- `src/peft/tuners/lora/variants.py`: Implemented the WoRA variant classes
- `tests/test_lora_variants.py`: Added comprehensive WoRA tests

## Backward Compatibility
This implementation maintains full backward compatibility: `use_wora` defaults to `False`, so existing LoRA and DoRA configurations are unaffected.
cc: @BenjaminBossan