feat: Add utility to create responses datasets based on HF datasets #208

@mayabar

Description

Create a new utility under the cmd folder that takes two input parameters: a Hugging Face dataset (we need to define which formats will be supported) and a model name to use for tokenization. The utility should generate a SQLite file that maps each prompt's hash value to the appropriate response.

Datasets created by this utility will be placed in the llm-d organization in HF - https://huggingface.co/llm-d/datasets
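
To make the expected output concrete, here is a minimal sketch of the prompt-hash-to-response mapping described above. Everything in it is an assumption for discussion, not a decided design: the table name `responses`, its two columns, the use of SHA-256 over the JSON-encoded token list, and the `whitespace_tokenize` stand-in (the real utility would use the tokenizer of the given model, and would read prompt/response pairs from the HF dataset rather than an in-memory list).

```python
import hashlib
import json
import sqlite3
from typing import Callable, Iterable, List, Optional, Tuple


def whitespace_tokenize(text: str) -> List[str]:
    """Hypothetical stand-in for the model tokenizer named in the issue."""
    return text.split()


def prompt_hash(prompt: str, tokenize: Callable[[str], list]) -> str:
    """Hash the token sequence of a prompt (SHA-256 is an assumption)."""
    tokens = tokenize(prompt)
    # JSON gives a deterministic byte encoding for lists of strings or ints.
    payload = json.dumps(tokens, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def build_responses_db(
    db_path: str,
    pairs: Iterable[Tuple[str, str]],
    tokenize: Callable[[str], list] = whitespace_tokenize,
) -> int:
    """Write (prompt, response) pairs to SQLite; return the stored row count."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS responses ("
                "prompt_hash TEXT PRIMARY KEY, response TEXT NOT NULL)"
            )
            conn.executemany(
                "INSERT OR REPLACE INTO responses (prompt_hash, response) "
                "VALUES (?, ?)",
                ((prompt_hash(p, tokenize), r) for p, r in pairs),
            )
        return conn.execute("SELECT COUNT(*) FROM responses").fetchone()[0]
    finally:
        conn.close()


def lookup_response(
    db_path: str,
    prompt: str,
    tokenize: Callable[[str], list] = whitespace_tokenize,
) -> Optional[str]:
    """Return the stored response for a prompt, or None if it is unknown."""
    conn = sqlite3.connect(db_path)
    try:
        row = conn.execute(
            "SELECT response FROM responses WHERE prompt_hash = ?",
            (prompt_hash(prompt, tokenize),),
        ).fetchone()
        return row[0] if row else None
    finally:
        conn.close()
```

Hashing the token sequence rather than the raw text means two prompts that tokenize identically share one row, which is why the model name matters as an input. Whether duplicates should overwrite (as `INSERT OR REPLACE` does here) or be rejected is an open question for the design.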
