
@makaveli10

  • Fix geglu backward
  • Add geglu_back test
  • Add support for using the default chat-template from the model being fine-tuned, which now supports Gemma as well. This allows instruction finetuning to run without a Jinja chat-template, while still working with one when provided.

- Fix the CPU implementation: it now correctly computes gelu_backward(gate, grad) instead of splitting the computation across two halves
- Update the Vulkan shader to match the corrected implementation with proper gelu_backward
- Add a test for the geglu_back op

The previous implementation incorrectly assumed geglu_back operated on concatenated
tensors and split them. The correct implementation computes the GELU backward pass
element-wise on the gate values.
- Add auto-detection for Gemma format (<start_of_turn>model\n...<end_of_turn>)
- Falls back to ChatML format for other models
- Uses the model's default chat-template, i.e. no Jinja chat-template is needed

This enables instruction finetuning on any model.
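
As an illustration of the detection described above, here is a minimal sketch assuming the check is a simple substring search on the template string for Gemma's turn markers; the enum and function names are hypothetical, not taken from the patch:

```cpp
// Hypothetical sketch of the format fallback logic.
#include <string>

enum class chat_format { GEMMA, CHATML };

// Detect Gemma-style templates by their turn markers
// (<start_of_turn>model\n...<end_of_turn>); otherwise fall back to ChatML.
chat_format detect_chat_format(const std::string & tmpl) {
    if (tmpl.find("<start_of_turn>") != std::string::npos &&
        tmpl.find("<end_of_turn>")   != std::string::npos) {
        return chat_format::GEMMA;
    }
    return chat_format::CHATML;
}
```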
@gianni-cor merged commit 10fd931 into tetherto:temp-latest-finetuning on Nov 20, 2025
36 of 47 checks passed