
Conversation


makaveli10 commented Oct 24, 2025

This PR adds support for instruction fine-tuning using LoRA adapters with efficient masked loss computation.

Key Features:

  • Assistant-Only Loss: the --assistant-loss-only flag enables training only on assistant responses, ignoring system/user prompts in conversation datasets
  • Masked Cross-Entropy Loss: CPU and Vulkan shader implementations for computing loss only on assistant tokens (a rough sketch follows this list)
  • Count-Equal Op: CPU and Vulkan shader for masked token counting, used for accuracy metrics during training
  • ChatML Template Support: native support for conversation-format datasets with automatic ChatML rendering
  • Custom Chat Templates: support for Jinja chat templates via --chat-template custom-template.jinja to render JSON conversation datasets (compatible with HuggingFace apply_chat_template)
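
For intuition, here is a minimal CPU-side sketch of the two masked ops above. This is illustrative only: the PR implements them as ggml ops with CPU and Vulkan backends, and the function names, signatures, and tensor layout below are assumptions, not the merged code.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// logits: [n_tokens * n_vocab] row-major, labels: [n_tokens],
// mask[i] = 1 if token i belongs to an assistant response, else 0.
static float masked_cross_entropy(const std::vector<float>   & logits,
                                  const std::vector<int32_t> & labels,
                                  const std::vector<int8_t>  & mask,
                                  int n_vocab) {
    double  loss     = 0.0;
    int64_t n_masked = 0;
    for (size_t i = 0; i < labels.size(); ++i) {
        if (!mask[i]) continue; // skip system/user prompt tokens
        const float * row = logits.data() + i * n_vocab;
        // log-softmax of the label logit, computed stably via the row max
        float max_logit = row[0];
        for (int v = 1; v < n_vocab; ++v) max_logit = std::max(max_logit, row[v]);
        double sum_exp = 0.0;
        for (int v = 0; v < n_vocab; ++v) sum_exp += std::exp(row[v] - max_logit);
        loss += -(row[labels[i]] - max_logit - std::log(sum_exp));
        ++n_masked;
    }
    // mean over assistant tokens only
    return n_masked ? (float) (loss / n_masked) : 0.0f;
}

// count_equal-style accuracy: argmax predictions compared against labels,
// again restricted to assistant tokens.
static int64_t masked_count_equal(const std::vector<float>   & logits,
                                  const std::vector<int32_t> & labels,
                                  const std::vector<int8_t>  & mask,
                                  int n_vocab) {
    int64_t correct = 0;
    for (size_t i = 0; i < labels.size(); ++i) {
        if (!mask[i]) continue;
        const float * row = logits.data() + i * n_vocab;
        int best = 0;
        for (int v = 1; v < n_vocab; ++v) if (row[v] > row[best]) best = v;
        if (best == labels[i]) ++correct;
    }
    return correct;
}
```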

This enables efficient instruction fine-tuning of language models using LoRA adapters, with proper loss masking for conversational datasets and applications.
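
To illustrate the two template bullets above: a conversation record in the HF-style messages layout (the exact JSON schema the loader expects is an assumption here, based on the stated apply_chat_template compatibility) and its standard ChatML rendering. With --assistant-loss-only, only the tokens of the assistant response contribute to the loss.

```json
[
  {"role": "user",      "content": "What is LoRA?"},
  {"role": "assistant", "content": "A low-rank adapter method for fine-tuning."}
]
```

which renders (standard ChatML) as:

```
<|im_start|>user
What is LoRA?<|im_end|>
<|im_start|>assistant
A low-rank adapter method for fine-tuning.<|im_end|>
```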

This PR is built on top of #43, so we should land that first.

makaveli10 force-pushed the lora-instruct-ft branch 2 times, most recently from 8a29771 to f74cb64 on October 24, 2025 at 09:48
- Add masked loss computation on assistant responses only

- Implement Vulkan masked cross-entropy loss shader & count_equal shader
- Support default ChatML template & custom jinja chat templates

zoq commented Oct 27, 2025

LR scheduler sample commands (a sketch of the corresponding schedules follows the examples below):

constant (default):
./build/bin/llama-finetune-lora -m Qwen3_0.6B.Q8_0.gguf -f trump.txt -ngl 999 -c 256 -b 256 -ub 256 --flash-attn off

cosine:
./build/bin/llama-finetune-lora -m Qwen3_0.6B.Q8_0.gguf -f trump.txt -ngl 999 -c 256 -b 256 -ub 256 --flash-attn off --learning-rate 3e-4 --lr-scheduler cosine --lr-min 1e-5 --weight-decay 0.01

linear:
./build/bin/llama-finetune-lora -m Qwen3_0.6B.Q8_0.gguf -f trump.txt -ngl 999 -c 256 -b 256 -ub 256 --flash-attn off --lr-scheduler linear --lr-min 1e-5
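
For reference, these are the conventional formulas for the three schedules, with --learning-rate as the peak rate and --lr-min as the floor. A sketch of the usual math, not necessarily the exact code merged here.

```cpp
#include <cmath>
#include <string>

// Learning rate at a given optimizer step for --lr-scheduler {constant,cosine,linear}.
float lr_at_step(const std::string & sched, float lr_max, float lr_min, int step, int total_steps) {
    const float PI = 3.14159265358979f;
    // training progress in [0, 1]
    float t = total_steps > 0 ? (float) step / (float) total_steps : 0.0f;
    if (sched == "cosine") {
        // cosine decay from lr_max down to lr_min
        return lr_min + 0.5f * (lr_max - lr_min) * (1.0f + std::cos(PI * t));
    }
    if (sched == "linear") {
        // linear decay from lr_max down to lr_min
        return lr_max + (lr_min - lr_max) * t;
    }
    return lr_max; // "constant" (default)
}
```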


zoq commented Oct 27, 2025

Added --warmup-ratio to match the HF training parameters; if --warmup-steps is set, it overrides --warmup-ratio.
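
A sketch of that precedence (parameter names and the "negative means unset" convention are illustrative, not the merged code):

```cpp
// --warmup-ratio is converted to a step count from the total number of
// optimizer steps; an explicit --warmup-steps value takes precedence.
int resolve_warmup_steps(int warmup_steps, float warmup_ratio, int total_steps) {
    if (warmup_steps >= 0) {
        return warmup_steps;                   // explicit --warmup-steps overrides the ratio
    }
    return (int) (warmup_ratio * total_steps); // e.g. 0.03 * 10000 -> 300 warmup steps
}
```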

gianni-cor merged commit e9825e6 into tetherto:temp-latest-finetuning on Nov 5, 2025
35 of 47 checks passed
