
Conversation


makaveli10 commented Oct 24, 2025

This PR adds support for instruction fine-tuning using LoRA adapters with efficient masked loss computation.

Key Features:

  • Assistant-Only Loss: the --assistant-loss-only flag enables training only on assistant responses, ignoring system/user prompts in conversation datasets
  • Masked Cross-Entropy Loss: CPU and Vulkan shader implementations for computing loss only on assistant tokens (a rough sketch follows this list)
  • Count-Equal Op: CPU and Vulkan shader for masked token counting, used for accuracy metrics during training
  • ChatML Template Support: native support for conversation-format datasets with automatic ChatML rendering
  • Custom Chat Templates: support for Jinja chat templates via --chat-template custom-template.jinja to render JSON conversation datasets (compatible with HuggingFace apply_chat_template)
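
For intuition, here is a minimal CPU-side sketch of the two masked ops above. This is illustrative only: the PR implements them as ggml ops with CPU and Vulkan backends, and the function names, signatures, and tensor layout below are assumptions, not the merged code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// logits: [n_tokens * n_vocab] row-major, labels: [n_tokens],
// mask[i] = 1 if token i belongs to an assistant response, else 0.
static float masked_cross_entropy(const std::vector<float>   & logits,
                                  const std::vector<int32_t> & labels,
                                  const std::vector<int8_t>  & mask,
                                  int n_vocab) {
    double  loss     = 0.0;
    int64_t n_masked = 0;
    for (size_t i = 0; i < labels.size(); ++i) {
        if (!mask[i]) continue; // skip system/user prompt tokens
        const float * row = logits.data() + i * n_vocab;
        // log-softmax of the label logit, computed stably via the row max
        float max_logit = row[0];
        for (int v = 1; v < n_vocab; ++v) max_logit = std::max(max_logit, row[v]);
        double sum_exp = 0.0;
        for (int v = 0; v < n_vocab; ++v) sum_exp += std::exp(row[v] - max_logit);
        loss += -(row[labels[i]] - max_logit - std::log(sum_exp));
        ++n_masked;
    }
    // mean over assistant tokens only
    return n_masked ? (float) (loss / n_masked) : 0.0f;
}

// count_equal-style accuracy: argmax predictions compared against labels,
// again restricted to assistant tokens.
static int64_t masked_count_equal(const std::vector<float>   & logits,
                                  const std::vector<int32_t> & labels,
                                  const std::vector<int8_t>  & mask,
                                  int n_vocab) {
    int64_t correct = 0;
    for (size_t i = 0; i < labels.size(); ++i) {
        if (!mask[i]) continue;
        const float * row = logits.data() + i * n_vocab;
        int best = 0;
        for (int v = 1; v < n_vocab; ++v) if (row[v] > row[best]) best = v;
        if (best == labels[i]) ++correct;
    }
    return correct;
}
```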

This enables efficient instruction fine-tuning of language models using LoRA adapters, with proper loss masking for conversational datasets and applications.
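
To illustrate the two template bullets above: a conversation record in the HF-style messages layout (the exact JSON schema the loader expects is an assumption here, based on the stated apply_chat_template compatibility) and its standard ChatML rendering. With --assistant-loss-only, only the tokens of the assistant response contribute to the loss.

```json
[
  {"role": "user",      "content": "What is LoRA?"},
  {"role": "assistant", "content": "A low-rank adapter method for fine-tuning."}
]
```

which renders (standard ChatML) as:

```
<|im_start|>user
What is LoRA?<|im_end|>
<|im_start|>assistant
A low-rank adapter method for fine-tuning.<|im_end|>
```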

This PR is built on top of #43, so we should land that first.

makaveli10 force-pushed the lora-instruct-ft branch 2 times, most recently from 8a29771 to f74cb64 on October 24, 2025 at 09:48
- Add masked loss computation on assistant responses only

- Implement Vulkan masked cross-entropy loss shader & count_equal shader
- Support default ChatML template & custom jinja chat templates

zoq commented Oct 27, 2025

LR scheduler sample commands (a sketch of the corresponding schedules follows the examples below):

constant (default):
./build/bin/llama-finetune-lora -m Qwen3_0.6B.Q8_0.gguf -f trump.txt -ngl 999 -c 256 -b 256 -ub 256 --flash-attn off

cosine:
./build/bin/llama-finetune-lora -m Qwen3_0.6B.Q8_0.gguf -f trump.txt -ngl 999 -c 256 -b 256 -ub 256 --flash-attn off --learning-rate 3e-4 --lr-scheduler cosine --lr-min 1e-5 --weight-decay 0.01

linear:
./build/bin/llama-finetune-lora -m Qwen3_0.6B.Q8_0.gguf -f trump.txt -ngl 999 -c 256 -b 256 -ub 256 --flash-attn off --lr-scheduler linear --lr-min 1e-5
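
For reference, these are the conventional formulas for the three schedules, with --learning-rate as the peak rate and --lr-min as the floor. A sketch of the usual math, not necessarily the exact code merged here.

```cpp
#include <cmath>
#include <string>

// Learning rate at a given optimizer step for --lr-scheduler {constant,cosine,linear}.
float lr_at_step(const std::string & sched, float lr_max, float lr_min, int step, int total_steps) {
    const float PI = 3.14159265358979f;
    // training progress in [0, 1]
    float t = total_steps > 0 ? (float) step / (float) total_steps : 0.0f;
    if (sched == "cosine") {
        // cosine decay from lr_max down to lr_min
        return lr_min + 0.5f * (lr_max - lr_min) * (1.0f + std::cos(PI * t));
    }
    if (sched == "linear") {
        // linear decay from lr_max down to lr_min
        return lr_max + (lr_min - lr_max) * t;
    }
    return lr_max; // "constant" (default)
}
```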


zoq commented Oct 27, 2025

Added --warmup-ratio to match the HF training parameters; if --warmup-steps is set, it overrides --warmup-ratio.
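
A sketch of that precedence (parameter names and the "negative means unset" convention are illustrative, not the merged code):

```cpp
// --warmup-ratio is converted to a step count from the total number of
// optimizer steps; an explicit --warmup-steps value takes precedence.
int resolve_warmup_steps(int warmup_steps, float warmup_ratio, int total_steps) {
    if (warmup_steps >= 0) {
        return warmup_steps;                   // explicit --warmup-steps overrides the ratio
    }
    return (int) (warmup_ratio * total_steps); // e.g. 0.03 * 10000 -> 300 warmup steps
}
```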

gianni-cor merged commit e9825e6 into tetherto:temp-latest-finetuning on Nov 5, 2025
35 of 47 checks passed
