jukofyork (Collaborator) commented Jul 14, 2025

This just adds the equivalent option for LoRAs that already exists for control vectors, e.g.:

Using --lora-layer-range 0 59 on a LoRA with 64 pairs of A/B tensors:

llama_adapter_lora_init_impl: loading lora adapter from 'qwq-32b-writer-lora-F32.gguf' ...
llama_adapter_lora_init_impl: Metal_Mapped LoRA buffer size =   120.00 MiB
llama_adapter_lora_init_impl: loaded 120 tensors from lora file

It does change the function signature of llama_adapter_lora_init:

    // Load a LoRA adapter from file
    // il_start and il_end are the layer range the lora should apply to (both inclusive)
    LLAMA_API struct llama_adapter_lora * llama_adapter_lora_init(
            struct llama_model * model,
                    const char * path_lora,
                       int32_t   il_start,
                       int32_t   il_end);

but it is only called from common.cpp here:

    // load and optionally apply lora adapters
    if (!params.lora_adapters.empty()) {
        if (params.lora_layer_start < 0) params.lora_layer_start = 0;
        if (params.lora_layer_end   < 0) params.lora_layer_end   = llama_model_n_layer(model);

        for (auto & la : params.lora_adapters) {
            llama_adapter_lora_ptr lora;
            lora.reset(llama_adapter_lora_init(model, la.path.c_str(), params.lora_layer_start, params.lora_layer_end));
            if (lora == nullptr) {
                LOG_ERR("%s: failed to apply lora adapter '%s'\n", __func__, la.path.c_str());
                llama_free(lctx);
                llama_model_free(model);
                return iparams;
            }

            la.ptr = lora.get();
            iparams.lora.emplace_back(std::move(lora)); // copy to list of loaded adapters
        }
    }

(just something to be aware of if any external tool calls this as an API, etc.)
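For external tools that wrap this API (bindings, plugins), the old two-argument behavior can be preserved with a thin wrapper that fills in the full layer range by default, mirroring the clamping logic in common.cpp above. A minimal sketch, where `lib` is a hypothetical FFI handle exposing the two functions from llama.h (names as in the source, the wrapper itself is an assumption):

```python
def adapter_lora_init(lib, model, path, il_start=None, il_end=None):
    """Wrap the new four-argument llama_adapter_lora_init, defaulting to the
    full layer range [0, n_layer] so existing two-argument callers keep the
    old apply-to-all-layers behavior."""
    if il_start is None:
        il_start = 0
    if il_end is None:
        il_end = lib.llama_model_n_layer(model)
    return lib.llama_adapter_lora_init(model, path, il_start, il_end)
```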

ngxson (Collaborator) commented Jul 14, 2025

I think both the LoRA and control vector parts of llama.cpp see little usage, so we should not make them too complicated.

Even without this --lora-layer-range, users can easily slice the unwanted layers out of the LoRA GGUF. So it may not be worth adding a lot of code to the project just for that. And if we add it, most users will not use it anyway, as it is not as intuitive as controlling the scale of each adapter.

The ability to specify a layer range is needed for control vectors because, in most cases, the model breaks when one is applied to all layers. Obviously this is also why no one actually uses them in production. By contrast, LoRA adapters generally work fine when applied to all layers by default.
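The slicing ngxson describes amounts to filtering a LoRA GGUF's tensors by their block index. A minimal sketch of the selection logic, assuming the usual `blk.<n>.` naming convention for per-layer tensors (actually rewriting the file, e.g. with the `gguf` Python package's reader/writer, is omitted here):

```python
import re

# Per-layer tensors in llama.cpp GGUFs are conventionally named "blk.<n>.<...>".
_BLK_RE = re.compile(r"\bblk\.(\d+)\.")

def keep_tensor(name: str, il_start: int, il_end: int) -> bool:
    """Return True if a tensor should survive trimming to [il_start, il_end]
    (both inclusive, matching the PR's semantics). Tensors without a block
    index (e.g. token_embd) are always kept."""
    m = _BLK_RE.search(name)
    if m is None:
        return True
    return il_start <= int(m.group(1)) <= il_end

def trim_layer_range(names, il_start, il_end):
    """Filter a list of tensor names down to the requested layer range."""
    return [n for n in names if keep_tensor(n, il_start, il_end)]
```

With this filter, trimming a 64-layer LoRA to layers 0-59 would keep 60 of the 64 A/B pairs, matching the effect of `--lora-layer-range 0 59` in the example above.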

jukofyork (Collaborator, Author) replied:

> I think both the LoRA and control vector parts of llama.cpp see little usage, so we should not make them too complicated.
>
> Even without this --lora-layer-range, users can easily slice the unwanted layers out of the LoRA GGUF. So it may not be worth adding a lot of code to the project just for that. And if we add it, most users will not use it anyway, as it is not as intuitive as controlling the scale of each adapter.
>
> The ability to specify a layer range is needed for control vectors because, in most cases, the model breaks when one is applied to all layers. Obviously this is also why no one actually uses them in production. By contrast, LoRA adapters generally work fine when applied to all layers by default.

No problem, and I agree it's not hard to trim the layers from the GGUF directly if needed.

I'll close this now.

jukofyork closed this Jul 14, 2025