This repository provides tools for translating the Massive Multitask Language Understanding (MMLU) dataset from English to Norwegian, along with evaluation scripts.
The final dataset is available on HuggingFace.
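If you only need the data, it can be loaded directly with the `datasets` library. A minimal sketch; the dataset ID below is a placeholder for the actual identifier published on the Hub:

```python
# Minimal sketch for loading the translated dataset from the Hugging Face Hub.
# The dataset ID is a placeholder -- substitute the identifier of the
# published NbAiLab release.
from datasets import load_dataset

dataset = load_dataset("<org>/<dataset>", split="test")
print(dataset[0])
```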
- Research Protocol: Details the translation process, quality scoring, and evaluation strategy.
- Translation Quality Evaluation: Reports on the quality assessment of the Norwegian translations.
- Translation Scripts: Tools to translate MMLU questions accurately while preserving structure and meaning (see the sketch after this list).
- Evaluation Tools: Built upon lm-evaluation-harness to assess translation quality and model performance on both Norwegian and English datasets.
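To illustrate the structure-preserving idea, one way a translation step can work is to translate the question and each answer option independently, so that option order and the gold answer index never change. A minimal sketch, assuming MMLU-style records; `translate_text` is a hypothetical stand-in for whatever MT backend the scripts actually use:

```python
# Minimal sketch of structure-preserving translation, assuming MMLU-style
# records with "question" and "choices" fields. Only the text fields are
# translated; option order and the gold answer index are left untouched.

def translate_text(text: str, target_lang: str = "nb") -> str:
    """Hypothetical stand-in for the real machine-translation backend."""
    # Replace this placeholder with an actual MT call.
    return text

def translate_record(record: dict) -> dict:
    return {
        **record,
        "question": translate_text(record["question"]),
        "choices": [translate_text(choice) for choice in record["choices"]],
        # "answer" (the gold index) is carried over unchanged via **record.
    }
```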
The MMLU dataset includes over 14,000 multiple-choice questions across 57 subjects. High-quality translations ensure that the original difficulty and context are maintained for Norwegian audiences.
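For orientation, a single MMLU-style record looks roughly like the following; the field names follow the common `cais/mmlu` schema on the Hub and are an assumption about the translated release:

```python
# Illustrative MMLU-style record (field names follow the common cais/mmlu
# schema; the translated release may differ). The content is made up.
example = {
    "subject": "geography",                     # one of the 57 subjects
    "question": "Hva er hovedstaden i Norge?",  # translated question text
    "choices": ["Bergen", "Oslo", "Trondheim", "Stavanger"],
    "answer": 1,                                # index of the correct choice
}
```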
This repository is primarily for building the dataset, but the finished dataset can also be used directly for evaluation together with lm-evaluation-harness. Example usage:
```bash
# Install lm-evaluation-harness and fetch the task definitions.
git clone --depth 1 https://github.com/EleutherAI/lm-evaluation-harness
cd lm-evaluation-harness
pip install -e .
cd ..
git clone https://github.com/NbAiLab/mmlu-translate

# Run the Norwegian MMLU task (0-shot) from the parent directory.
lm_eval \
  --model hf \
  --model_args pretrained=<org>/<model> \
  --tasks global_mmlu_full_nb \
  --include_path ./mmlu-translate/tasks \
  --output_path results/mmlu-translate/0-shot/<org>/<model> \
  --log_samples \
  --show_config \
  --write_out \
  --batch_size auto \
  --num_fewshot 0
```
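After the run finishes, lm-evaluation-harness writes JSON results files under the output path. A minimal sketch for summarizing them; the exact file layout and the `acc,none` metric key match recent harness versions and may differ in older ones:

```python
# Minimal sketch for summarizing a finished run. lm_eval writes JSON files
# under the --output_path directory; the "acc,none" metric key matches
# recent lm-evaluation-harness versions and may differ in older ones.
import glob
import json

for path in glob.glob("results/mmlu-translate/0-shot/**/*.json", recursive=True):
    with open(path) as f:
        run = json.load(f)
    for task, metrics in run.get("results", {}).items():
        print(f"{task}: acc={metrics.get('acc,none')}  ({path})")
```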
This project is licensed under the MIT License.