1 change: 1 addition & 0 deletions README.md
@@ -257,6 +257,7 @@ For detailed explanations, parameter descriptions, and use cases for each method
| [**Multi-SLERP** (`multislerp`)](docs/merge_methods.md#multi-slerp-multislerp) | Barycentric SLERP for multiple models. | ≥2 | * | Spherical interpolation for >2 models. |
| [**Karcher Mean** (`karcher`)](docs/merge_methods.md#karcher-mean-karcher) | Riemannian barycenter of model parameters. | ≥2 | - | Geometrically sound averaging on manifolds. |
| [**Task Arithmetic** (`task_arithmetic`)](docs/merge_methods.md#task-arithmetic-task_arithmetic) | Linearly combine "task vectors" (differences from a base). | ≥2 | ✓ | Transferring/combining fine-tuned skills. |
| [**Core Space** (`core_space`)](docs/merge_methods.md#core-space-core_space) | SVD-aligned LoRA merging in compact core subspace. | ≥2 | ✓ | Efficient LoRA merging, heterogeneous ranks, subspace alignment. |
| [**TIES** (`ties`)](docs/merge_methods.md#ties-merging-ties) | Task arithmetic + sparsification & sign consensus. | ≥2 | ✓ | Merging many models, reducing interference. |
| [**DARE** (`dare_linear`, `dare_ties`)](docs/merge_methods.md#dare-dare_linear-dare_ties) | Task arithmetic + random pruning & rescaling. | ≥2 | ✓ | Robust skill retention, similar to TIES. |
| [**DELLA** (`della`, `della_linear`)](docs/merge_methods.md#della-della-della_linear) | Task arithmetic + adaptive magnitude-based pruning. | ≥2 | ✓ | Prioritizing important changes, reducing interference. |
41 changes: 41 additions & 0 deletions docs/merge_methods.md
@@ -12,6 +12,7 @@
- [Karcher Mean (`karcher`)](#karcher-mean-karcher)
- [Task Vector Methods](#task-vector-methods)
- [Task Arithmetic (`task_arithmetic`)](#task-arithmetic-task_arithmetic)
  - [Core Space (`core_space`)](#core-space-core_space)
- [TIES-Merging (`ties`)](#ties-merging-ties)
- [DARE (`dare_linear`, `dare_ties`)](#dare-dare_linear-dare_ties)
- [DELLA (`della`, `della_linear`)](#della-della-della_linear)
@@ -149,6 +150,46 @@ This guide provides detailed information about the various model merging algorithms

**Reference:** [Editing Models with Task Arithmetic](https://arxiv.org/abs/2212.04089)

### Core Space (`core_space`)

**Concept**: Merges LoRA-adapted models by projecting them into a shared, aligned core space using SVD-based reference bases. Operates in a compact subspace for efficiency while preserving information.

**Algorithm**:

1. Extract the LoRA matrices (B_i, A_i) from each model, where ΔW_i = B_i @ A_i
2. Compute reference bases via SVD: concatenate all B matrices horizontally and A matrices vertically, then compute orthonormal bases U_B and V_A
3. Project to core space: Core_i = U_B^T @ B_i @ A_i @ V_A
4. Merge in core space using weighted average
5. Reconstruct: ΔW_merged = U_B @ Core_merged @ V_A^T, then W_final = W_base + ΔW_merged
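
The five steps above can be sketched numerically (a minimal illustration with random matrices, not mergekit's implementation; dimensions, ranks, and weights are made up). Because U_B and V_A span the column and row spaces of every adapter's factors, the projection is lossless and merging in core space matches merging the full ΔW_i:

```python
import numpy as np

rng = np.random.default_rng(0)
d, k = 16, 12            # output / input dims of the adapted weight
ranks = [4, 2]           # heterogeneous LoRA ranks are fine

# Step 1: per-model LoRA factors, ΔW_i = B_i @ A_i
Bs = [rng.standard_normal((d, r)) for r in ranks]
As = [rng.standard_normal((r, k)) for r in ranks]

# Step 2: reference bases via SVD of the stacked factors
U_B, _, _ = np.linalg.svd(np.concatenate(Bs, axis=1), full_matrices=False)   # d x R
_, _, VT_A = np.linalg.svd(np.concatenate(As, axis=0), full_matrices=False)  # R' x k
V_A = VT_A.T                                                                 # k x R'

# Step 3: project each model's update into the shared core subspace
cores = [U_B.T @ B @ A @ V_A for B, A in zip(Bs, As)]

# Step 4: weighted average in core space (equal weights here)
weights = [0.5, 0.5]
core_merged = sum(w * c for w, c in zip(weights, cores))

# Step 5: reconstruct the merged delta and apply it to the base weight
dW = U_B @ core_merged @ V_A.T
W_base = rng.standard_normal((d, k))
W_final = W_base + dW

# Sanity check: the core-space merge equals the direct merge of full deltas
dW_direct = sum(w * (B @ A) for w, B, A in zip(weights, Bs, As))
assert np.allclose(dW, dW_direct)
```

The efficiency gain comes from step 4 operating on R x R' core matrices (here 6 x 6) rather than the full d x k deltas.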

**Inputs**: Requires 2 or more models, plus one `base_model`.

**Parameters**:

- `weight` (per-model, float, default: 1.0): Relative weight for each model's contribution. Note: the current implementation uses equal weights regardless of this setting.

**Use Cases**:

- Efficiently merging multiple LoRA adapters
- Multi-task model creation from specialized adapters
- When adapters have different ranks
- Resource-constrained environments

**Example**:

```yaml
models:
- model: meta-llama/Llama-2-7b-hf
- model: username/llama2-lora-math
- model: username/llama2-lora-code

merge_method: core_space
base_model: meta-llama/Llama-2-7b-hf
dtype: bfloat16
```

**Reference**: [Accurate and Efficient Low-Rank Model Merging in Core Space](https://arxiv.org/abs/2509.17786) (Panariello et al., NeurIPS 2025)

### TIES-Merging (`ties`)

**Concept:** Builds on Task Arithmetic by sparsifying task vectors and applying a sign consensus algorithm. This helps to resolve interference when merging multiple models and retain more of their individual strengths.
11 changes: 11 additions & 0 deletions examples/core_space.yml
@@ -0,0 +1,11 @@
models:
- model: gpt2
parameters:
weight: 0.5
- model: gpt2
parameters:
weight: 1.0

merge_method: core_space
base_model: gpt2
dtype: float32