Dump raw training data for the LLM-jp-3 series

Dump raw training data for the LLM-jp-3 series. Each training instance should include at least the following fields:
- `token_ids`: A list of token IDs for the training instance
- `training_step`: Training step at which the training instance was processed
- `dataset`: Name of the dataset from which the instance was sourced
- `document_ids`: IDs of the documents associated with the training instance
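As a rough illustration of what one dumped record could look like, here is a minimal sketch that serializes each training instance as one JSON object per line. The JSONL format, the `TrainingInstance` class, the helper names, and the choice of strings for `document_ids` are all assumptions made for illustration; the issue itself only specifies the four field names.

```python
import json
from dataclasses import asdict, dataclass
from typing import List


@dataclass
class TrainingInstance:
    """One dumped training instance (hypothetical container for the fields above)."""

    token_ids: List[int]      # token IDs for the training instance
    training_step: int        # step at which the instance was processed
    dataset: str              # name of the source dataset
    document_ids: List[str]   # assumed to be strings; the real ID type may differ


def to_jsonl_line(instance: TrainingInstance) -> str:
    """Serialize one instance as a single JSON line."""
    return json.dumps(asdict(instance), ensure_ascii=False)


def from_jsonl_line(line: str) -> TrainingInstance:
    """Parse one JSON line back into an instance."""
    return TrainingInstance(**json.loads(line))


# Example with made-up values; "example_corpus" and "doc-0" are placeholders.
example = TrainingInstance(
    token_ids=[1, 2, 3],
    training_step=10,
    dataset="example_corpus",
    document_ids=["doc-0"],
)
print(to_jsonl_line(example))
```

A line-oriented format like this would let the dump be streamed and filtered by `training_step` or `dataset` without loading everything into memory, though a columnar format may suit very large dumps better.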


Dump raw training data for the LLM-jp-3 series #46
