Can I finetune Qwen3-Embedding-4B using axolotl? #2860
Replies: 2 comments 1 reply
-
Hello, unfortunately, Axolotl does not support embedding models at the moment.
-
Thanks for the clarification! Just to confirm: would it be possible to implement a custom sentence_pairs dataset type for embedding training while still using the standard Axolotl Docker image? My plan is to use Axolotl’s training infrastructure and workflow, but extend it with custom logic to support this kind of binary relevance data for embedding models (see the sketch below). Thanks!
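For concreteness, here is a minimal sketch of what such custom logic could look like if run inside the same container with the sentence-transformers library rather than through Axolotl's own trainer, since Axolotl has no built-in sentence_pairs type. It assumes the question/document/label records shown in the original post below, and that the Qwen3-Embedding-4B checkpoint loads through sentence-transformers; the file name pairs.json, the batch size, and the output path are illustrative:

import json

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Load the binary relevance pairs (file name is illustrative).
with open("pairs.json") as f:
    records = json.load(f)

# Each record becomes a (question, document) pair with a 0/1 relevance label.
train_examples = [
    InputExample(texts=[r["question"], r["document"]], label=float(r["label"]))
    for r in records
]

model = SentenceTransformer("Qwen/Qwen3-Embedding-4B")
loader = DataLoader(train_examples, shuffle=True, batch_size=8)

# ContrastiveLoss treats label 1 as a positive pair and label 0 as a negative
# pair, pulling positives together and pushing negatives apart in embedding space.
train_loss = losses.ContrastiveLoss(model)

model.fit(train_objectives=[(loader, train_loss)], epochs=1, warmup_steps=100)
model.save("qwen3-embedding-4b-pairs")

OnlineContrastiveLoss is a common drop-in replacement for ContrastiveLoss when the dataset contains many easy pairs, since it only backpropagates through the hard ones.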
-
I’d like to ask if it’s possible to finetune the Qwen3-Embedding-4B model using Axolotl.
My goal is to train it on a dataset structured like this:
[
  {
    "question": "some query here",
    "document": "a related or unrelated document",
    "label": 1
  },
  {
    "question": "another query",
    "document": "possibly unrelated document",
    "label": 0
  },
  ...
]
This is essentially a binary relevance dataset for training an embedding model to distinguish relevant vs. non-relevant pairs.
Is this data format compatible with Axolotl?
What model_type, data_type, or Axolotl config settings should I use for this kind of task?
Thanks in advance!
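For reference, once a model has been fine-tuned on pairs like these, scoring relevance reduces to a cosine similarity between the two embeddings. A minimal sketch, reusing the illustrative output path from the training sketch above:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("qwen3-embedding-4b-pairs")  # illustrative path

q_emb = model.encode("some query here", convert_to_tensor=True)
d_emb = model.encode("a related or unrelated document", convert_to_tensor=True)

# Higher cosine similarity indicates a more relevant pair.
print(util.cos_sim(q_emb, d_emb).item())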