Dataset format for multi-turn conversations in DPO #2417
Replies: 3 comments 5 replies
-
Hello, thanks for the question. For DPO, the dataset is usually like so https://axolotl-ai-cloud.github.io/axolotl/docs/rlhf.html#chat_template.default It would only have the last turn being different for the model to be able to choose chosen/rejected, not multi-turn difference. Given your dataset, I would advise converting it into ones like above (splitting your dataset into smaller sections with the last message being the different one) or doing SFT instead. |
Beta Was this translation helpful? Give feedback.
-
Basically split the conversation into turns then for each turn have the history as the question and the last response as the answer |
Beta Was this translation helpful? Give feedback.
-
Hey @NanoCode012 .I wanted to clarify something regarding the dpo_dataset_format, Is it acceptable for the messages field to contain a full multi-turn conversation, with alternating messages between the user and the assistant, effectively capturing the full dialogue history? { |
Beta Was this translation helpful? Give feedback.
Uh oh!
There was an error while loading. Please reload this page.
-
Can you help on how to format the dataset for multi-turn dpo? Below is an example of the first record from my dataset containing three keys: system,chosen and rejected
{'system': 'You are a helpful and knowledgeable AI assisting in patient-doctor conversations.',
'chosen': [{'role': 'assistant',
'content': 'Hello, I am Dr. Smith. Can you tell me what brings you to the hospital today?'},
{'role': 'user',
'content': 'Yes, I have been feeling very weak and sick for the past two weeks. I have a persistent fever and dry cough.'},
{'role': 'assistant', 'content': 'I see. And how is your breathing?'},
{'role': 'user',
'content': "It's been shallow and rapid, especially when I am at rest. And I get severely breathless even with minor physical activities."},
{'role': 'assistant',
'content': 'Okay. I understand. You were given physical therapy, right?'},
{'role': 'user',
'content': 'Yes, they focused on educating me about dyspnea-relieving positions and the importance of regular mobilization and deep-breathing exercises.'}],
{'role': 'assistant', 'content': "That's good. },
'rejected': [{'role': 'assistant',
'content': "Hey, I'm Dr. Smith. What's up?"},
{'role': 'user',
'content': 'Yes, I have been feeling very weak and sick for the past two weeks. I have a persistent fever and dry cough.'},
{'role': 'assistant', 'content': 'So, breathing okay?'},
{'role': 'user',
'content': "It's been shallow and rapid, especially when I am at rest. And I get severely breathless even with minor physical activities."},
{'role': 'assistant', 'content': 'Right. You had physical therapy, yeah?'},
{'role': 'user',
'content': 'Yes, they focused on educating me about dyspnea-relieving positions and the importance of regular mobilization and deep-breathing exercises.'},
{'role': 'assistant', 'content': "That is good"},
}]}
Beta Was this translation helpful? Give feedback.
All reactions