Creative commons training data #90

samuk · 2025-01-28T10:36:54Z

Is the intention to train this on Common Corpus / open training data? https://huggingface.co/blog/Pclanglais/common-models

ATaylorAerospace · 2025-02-02T07:10:40Z

> Is the intention to train this on the common corpus/ open training data? https://huggingface.co/blog/Pclanglais/common-models

@samuk The datasets requested for Open-R1 are domain-specific datasets intended for fine-tuning with methods such as RLHF and GRPO, whereas Common Corpus is generally used for pre-training.

The following domain-specific datasets are currently requested for Open-R1:
Math #24
Code #28
Medicine #31
Law #50
Linguistics #60
Fine Arts #61
