
NbAiLab/mmlu-translate


MMLU Translation Tools

This repository provides tools for translating the Massive Multitask Language Understanding (MMLU) dataset from English to Norwegian, along with evaluation scripts.

The final dataset is available on Hugging Face.

Contents

  • Research Protocol: Details the translation process, quality scoring, and evaluation strategy.
  • Translation Quality Evaluation: Evaluation of translation quality.
  • Translation Scripts: Tools to accurately translate MMLU questions while preserving structure and meaning.
  • Evaluation Tools: Built upon lm-evaluation-harness to assess translation quality and model performance on both Norwegian and English datasets.

Overview

The MMLU dataset includes over 14,000 multiple-choice questions across 57 subjects. The translations aim to preserve the original difficulty and context of each question for Norwegian audiences.

Usage

This repository exists mainly to build the dataset, but the finished dataset can also be used directly for evaluation with lm-evaluation-harness. Example usage:

git clone --depth 1 https://github.com/EleutherAI/lm-evaluation-harness
cd lm-evaluation-harness
pip install -e .

cd ..
git clone https://github.com/NbAiLab/mmlu-translate
cd mmlu-translate

lm_eval \
  --model hf \
  --model_args pretrained=<org>/<model> \
  --tasks global_mmlu_full_nb \
  --include_path ./mmlu-translate/tasks \
  --output_path results/mmlu-translate/0-shot/<org>/<model> \
  --log_samples \
  --show_config \
  --write_out \
  --batch_size auto \
  --num_fewshot 0
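
To compare several few-shot settings, the command above can be generated rather than retyped. The following is a minimal sketch, not part of this repository: the helper names `results_dir` and `eval_command` are hypothetical, and the output layout is assumed to follow the `results/mmlu-translate/<n>-shot/<org>/<model>` pattern from the example. It uses `--output_path`, the harness's full name for the output option, and only prints the command, so nothing runs until you execute the printed line yourself.

```shell
#!/bin/sh
# Sketch only: helper names are hypothetical, not provided by mmlu-translate.

# Build the output directory for a given shot count and model id,
# assuming the layout results/mmlu-translate/<n>-shot/<org>/<model>.
results_dir() {
  printf 'results/mmlu-translate/%s-shot/%s\n' "$1" "$2"
}

# Print (do not run) the lm_eval command for one few-shot setting.
# $1 = number of few-shot examples, $2 = model id such as <org>/<model>.
eval_command() {
  printf 'lm_eval --model hf --model_args pretrained=%s' "$2"
  printf ' --tasks global_mmlu_full_nb --include_path ./mmlu-translate/tasks'
  printf ' --output_path %s' "$(results_dir "$1" "$2")"
  printf ' --batch_size auto --num_fewshot %s\n' "$1"
}
```

For example, looping `for shots in 0 5; do eval_command "$shots" your-org/your-model; done` prints one command per setting, each writing to its own results directory. The model id and the shot counts are placeholders.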

License

This project is licensed under the MIT License.
