Commit f4213a2

Switch the tensor override helper to a local variable
Tested with this configuration in BatchedExecutorSimple:

    parameters.GpuLayerCount = 99;
    parameters.TensorBufferOverrides = new List<Abstractions.TensorBufferOverride>
    {
        new(@"blk\.(2[6-9]|[3-4][0-9]).*", "CPU")
    };

I chose this configuration because it sped up Qwen-3-30B-A3B by a factor of 10 on my machine (though the gain would likely be smaller for batching, since it's an MoE).
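To make the override pattern concrete, here is a minimal sketch that checks which tensor names it selects. The tensor names are illustrative only, `TargetFor` is a hypothetical helper, and the sketch assumes the native side applies the pattern as an unanchored search in the same way .NET's `Regex.IsMatch` does.

```csharp
using System;
using System.Text.RegularExpressions;

// Illustrative tensor names in the "blk.<layer>.<tensor>" style; not read from a real model.
string[] names =
{
    "token_embd.weight",
    "blk.2.attn_q.weight",
    "blk.25.ffn_up_exps.weight",
    "blk.26.ffn_up_exps.weight",
    "blk.47.ffn_down_exps.weight",
};

foreach (var name in names)
    Console.WriteLine($"{name} -> {TargetFor(name)}");

// Hypothetical helper: returns "CPU" when the pattern from the commit message matches,
// otherwise a placeholder label meaning that no override applies.
static string TargetFor(string name)
{
    // Assumption: an unanchored search mirrors how the pattern is applied natively.
    return Regex.IsMatch(name, @"blk\.(2[6-9]|[3-4][0-9]).*") ? "CPU" : "(no override)";
}
```

Under these assumptions, the pattern selects blocks 26 through 49, so the lower-numbered blocks stay on the GPU when `GpuLayerCount` offloads everything else.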
1 parent 3c8d239 commit f4213a2

1 file changed: +1 -2 lines changed

LLama/Extensions/IModelParamsExtensions.cs

Lines changed: 1 addition & 2 deletions
@@ -11,8 +11,6 @@ namespace LLama.Extensions;
 /// </summary>
 public static class IModelParamsExtensions
 {
-    private static LLamaTensorBufferOverrideHelper bufferOverrideHelper = new();
-
     /// <summary>
     /// Convert the given `IModelParams` into a `LLamaModelParams`
     /// </summary>
@@ -50,6 +48,7 @@ public static IDisposable ToLlamaModelParams(this IModelParams @params, out LLam
         // Add tensor buffer overrides, if any
         if (@params.TensorBufferOverrides.Count > 0)
         {
+            var bufferOverrideHelper = new LLamaTensorBufferOverrideHelper();
             disposer.Add(bufferOverrideHelper);
 
             foreach (var tensorOverride in @params.TensorBufferOverrides)
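The commit does not state why the helper moved from a static field to a local variable. One plausible reading, offered here as an assumption, is that a shared instance registered with a per-call disposer would already be disposed on later calls. The sketch below shows that general hazard using hypothetical stand-in types (`FakeHelper`, `SharedFactory`, `LocalFactory`); it does not use the real `LLamaTensorBufferOverrideHelper`.

```csharp
using System;

// Shared instance: disposing it after the first call also affects the second call.
var first = SharedFactory.Get();
first.Dispose();
var second = SharedFactory.Get();
Console.WriteLine($"shared helper already disposed on second call: {second.Disposed}");

// Local instance: every call gets a fresh helper.
var a = LocalFactory.Get();
a.Dispose();
var b = LocalFactory.Get();
Console.WriteLine($"local helper already disposed on second call: {b.Disposed}");

// Hypothetical stand-in for a helper that owns resources.
sealed class FakeHelper : IDisposable
{
    public bool Disposed { get; private set; }
    public void Dispose() => Disposed = true;
}

// Mirrors the removed static field: one instance for all callers.
static class SharedFactory
{
    private static readonly FakeHelper Instance = new();
    public static FakeHelper Get() => Instance;
}

// Mirrors the added local variable: a new instance per call.
static class LocalFactory
{
    public static FakeHelper Get() => new FakeHelper();
}
```

With a per-call instance, each disposer owns exactly the helper it registered, so disposing one set of model parameters cannot affect another.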
