Can I finetune Qwen3-Embedding-4B using axolotl? #2860
Replies: 2 comments 1 reply
-
Hello, unfortunately, Axolotl does not support embedding models at the moment.
-
Thanks for the clarification! Just to confirm: would it be possible to implement a custom sentence_pairs dataset type for embedding training while still using the standard Axolotl Docker image? My plan is to use Axolotl’s training infrastructure and workflow, but extend it with custom logic to support this kind of binary relevance data for embedding models (see the sketch below). Thanks!
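For concreteness, here is a minimal sketch of what such custom logic could look like if run inside the same container with the sentence-transformers library rather than through Axolotl's own trainer, since Axolotl has no built-in sentence_pairs type. It assumes the question/document/label records shown in the original post below, and that the Qwen3-Embedding-4B checkpoint loads through sentence-transformers; the file name pairs.json, the batch size, and the output path are illustrative:

import json

from torch.utils.data import DataLoader
from sentence_transformers import SentenceTransformer, InputExample, losses

# Load the binary relevance pairs (file name is illustrative).
with open("pairs.json") as f:
    records = json.load(f)

# Each record becomes a (question, document) pair with a 0/1 relevance label.
train_examples = [
    InputExample(texts=[r["question"], r["document"]], label=float(r["label"]))
    for r in records
]

model = SentenceTransformer("Qwen/Qwen3-Embedding-4B")
loader = DataLoader(train_examples, shuffle=True, batch_size=8)

# ContrastiveLoss treats label 1 as a positive pair and label 0 as a negative
# pair, pulling positives together and pushing negatives apart in embedding space.
train_loss = losses.ContrastiveLoss(model)

model.fit(train_objectives=[(loader, train_loss)], epochs=1, warmup_steps=100)
model.save("qwen3-embedding-4b-pairs")

OnlineContrastiveLoss is a common drop-in replacement for ContrastiveLoss when the dataset contains many easy pairs, since it only backpropagates through the hard ones.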
-
I’d like to ask if it’s possible to finetune the Qwen3-Embedding-4B model using Axolotl.
My goal is to train it on a dataset structured like this:
[
  {
    "question": "some query here",
    "document": "a related or unrelated document",
    "label": 1
  },
  {
    "question": "another query",
    "document": "possibly unrelated document",
    "label": 0
  },
  ...
]
This is essentially a binary relevance dataset for training an embedding model to distinguish relevant vs. non-relevant pairs.
Is this data format compatible with Axolotl?
What model_type, data_type, or Axolotl config settings should I use for this kind of task?
Thanks in advance!
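For reference, once a model has been fine-tuned on pairs like these, scoring relevance reduces to a cosine similarity between the two embeddings. A minimal sketch, reusing the illustrative output path from the training sketch above:

from sentence_transformers import SentenceTransformer, util

model = SentenceTransformer("qwen3-embedding-4b-pairs")  # illustrative path

q_emb = model.encode("some query here", convert_to_tensor=True)
d_emb = model.encode("a related or unrelated document", convert_to_tensor=True)

# Higher cosine similarity indicates a more relevant pair.
print(util.cos_sim(q_emb, d_emb).item())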