4 changes: 4 additions & 0 deletions docs/merge_methods.md
@@ -347,6 +347,10 @@ Finally, the (variance-selected, calculated-weighted, and sign-agreed) task vect

- `scale` (per-model, optional): A scalar to multiply the tensor by. Useful for scaling specific layers, e.g., `{"filter": "down_proj", "value": 0.5}`

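- `noise_scale` (per-model, optional): A scalar multiplier for Gaussian noise added to the tensor. No noise is added when it is unset or `0`.
- `noise_seed` (per-model, optional): Seed for the random generator that produces the noise, so that the same noise can be reproduced.
- `noise_variance` (per-model, optional): When set, the noise is additionally multiplied by the tensor's standard deviation times this value, so `true` scales the noise by the standard deviation itself.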
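
The interaction of the three noise parameters can be sketched in plain Python. This is an illustrative stand-in rather than the mergekit implementation: `add_noise` is a hypothetical helper, a list of floats replaces the tensor, and `random.Random` and `statistics.stdev` replace `torch.Generator` and `Tensor.std()`.

```python
import random
import statistics
from typing import List, Optional


def add_noise(
    values: List[float],
    noise_scale: Optional[float] = None,
    noise_seed: Optional[int] = None,
    noise_variance: Optional[float] = None,
) -> List[float]:
    """Hypothetical plain-Python analogue of the per-tensor noise step."""
    # No noise when noise_scale is unset or zero.
    if noise_scale is None or noise_scale == 0.0:
        return list(values)

    # A seeded generator makes the noise reproducible across runs.
    rng = random.Random(noise_seed) if noise_seed is not None else random.Random()
    noise = [rng.gauss(0.0, 1.0) * noise_scale for _ in values]

    # Optionally scale the noise by the (sample) standard deviation of the values.
    if noise_variance:
        std = statistics.stdev(values)
        noise = [n * std * noise_variance for n in noise]

    return [v + n for v, n in zip(values, noise)]


weights = [0.5, -1.0, 2.0, 0.25]
first = add_noise(weights, noise_scale=0.1, noise_seed=42)
second = add_noise(weights, noise_scale=0.1, noise_seed=42)
print(first == second)  # same seed, same noise
```

With the same `noise_seed`, two calls produce identical output, which is the property the seed is meant to provide.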

---

## Summary
28 changes: 27 additions & 1 deletion mergekit/merge_methods/passthrough.py
@@ -31,6 +31,27 @@ def execute(self, tensors: Dict[ModelReference, torch.Tensor]) -> torch.Tensor:
        if scale is not None:
            tensor = tensor * scale

        noise_scale = self.tensor_parameters[model].data.get("noise_scale", None)
        if noise_scale is not None and noise_scale != 0.0:
            noise_seed = self.tensor_parameters[model].data.get("noise_seed", 42)
            noise_generator = torch.Generator()
            if noise_seed is not None:
                noise_generator = noise_generator.manual_seed(int(noise_seed))
                print("applying noise_seed")

            print(f"Noise Generator Seed: {noise_generator.initial_seed()}")
            random_tensor = torch.empty_like(tensor).normal_(generator=noise_generator)
            noisy_tensor = random_tensor * noise_scale

            noise_variance = self.tensor_parameters[model].data.get("noise_variance", False)
            if noise_variance is not None and noise_variance != 0.0:
                noisy_tensor = noisy_tensor * (tensor.std() * noise_variance)
                print("applying noise_variance")

            tensor = tensor + noisy_tensor

            print(f"noise_scale={noise_scale}, noise_seed={noise_seed}, noise_variance={noise_variance}")

Bug: Silence Debugging Noise

Multiple print() statements for debugging noise injection were left in the code. These will clutter output during production merges and should be removed or replaced with proper logging.

Author

To remove before merging
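
If any of this output is worth keeping after the cleanup, one option is to route it through the standard `logging` module at debug level instead of `print()`. A minimal sketch, assuming a hypothetical logger name and a hypothetical helper `log_noise_settings` (neither is part of mergekit):

```python
import logging
from typing import Optional

# Hypothetical logger name, chosen for illustration only.
logger = logging.getLogger("passthrough_noise")


def log_noise_settings(
    noise_scale: float,
    noise_seed: Optional[int],
    noise_variance: Optional[float],
) -> None:
    # %-style arguments are only formatted if DEBUG is enabled for this logger,
    # so ordinary merges stay quiet.
    logger.debug(
        "noise_scale=%s, noise_seed=%s, noise_variance=%s",
        noise_scale,
        noise_seed,
        noise_variance,
    )
```

At the default log level nothing is emitted. Lowering this logger to `logging.DEBUG` (for example with `logger.setLevel`) together with a configured handler brings the message back when debugging.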


        return tensor

    def group_label(self) -> Optional[str]:
@@ -46,7 +67,12 @@ def pretty_name(self) -> Optional[str]:
        return "Passthrough"

    def tensor_parameters(self) -> List[ConfigParameterDef]:
        return [ConfigParameterDef(name="scale", required=False, default_value=None)]
        return [
            ConfigParameterDef(name="scale", required=False, default_value=None),
            ConfigParameterDef(name="noise_scale", required=False, default_value=None),
            ConfigParameterDef(name="noise_variance", required=False, default_value=None),
            ConfigParameterDef(name="noise_seed", required=False, default_value=None)

Bug: Default Seed Value Not Honored

The noise_seed parameter has default_value=None in its definition, but line 36 expects it to default to 42. When not provided by the user, the parameter will be None in the dict, so .get("noise_seed", 42) returns None instead of 42, causing the seed to be None rather than the intended default of 42.

Author

I remember getting confused by this one. ConfigParameterDef seems to cast the default value into a string?
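
The behaviour described in the bug report can be reproduced with a plain dict, independent of how `ConfigParameterDef` handles its default. A small sketch, assuming the parameter dict really does contain the key with a `None` value as the report states; `resolve_seed` and `DEFAULT_NOISE_SEED` are hypothetical names for illustration:

```python
from typing import Any, Dict

DEFAULT_NOISE_SEED = 42  # the fallback the diff appears to intend


def resolve_seed(params: Dict[str, Any]) -> int:
    # dict.get only falls back when the key is missing, not when the stored
    # value is None, so the None case has to be checked explicitly.
    seed = params.get("noise_seed")
    return int(seed) if seed is not None else DEFAULT_NOISE_SEED


params = {"noise_seed": None}
print(params.get("noise_seed", 42))  # the stored None wins over the fallback
print(resolve_seed(params))          # explicit check applies the fallback
```

The `int(...)` conversion also accepts a numeric string such as `"7"`, which would cover the case where the value arrives as text.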

        ]

    def make_task(
        self,