Blend for Super 3 and Training Questions

Hi, I noticed that both the raw and tiny blends are empty for super_v3. Will you be sharing the blends with their proportions at some point? Also, the paper mentions two phases, Phase 1 at 256k and Phase 2 at 512k. Do you have the samplings for those as well? Finally, the chat template strips reasoning data from multi-turn samples, right? How is this controlled during training? Thank you!

Blend for Super 3 and Training Questions #119

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Blend for Super 3 and Training Questions #119

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions