This repository contains the public data and code package for a preregistered study of multilingual emotion attribution by large language models (LLMs). It is prepared for reviewers and readers who want to inspect the study design, reproduce the reported analyses, and verify the repair-aware official dataset.
The associated manuscript should be read through the journal submission or publication site. Manuscript source files, review-response materials, and local submission artifacts are intentionally not included in this public repository.
The study examines how six LLMs assign four emotion scores to literary texts under three full language conditions. A language condition changes the instruction language, persona description, title, author name, and text content together.
- Study ID: `models6_multilingual_2026-03-06`
- Design: 6 models x 3 languages x 3 texts x 4 personas x 3 trials
- Total API calls: 648
- Languages: Japanese (`ja`), English (`en`), Traditional Chinese (`zh`)
- Emotions: `interesting`, `surprise`, `sadness`, `anger`
- Main model: OLS with HC3 robust standard errors, fitted separately for each emotion
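To make the main model concrete, HC3 robust standard errors reweight each squared residual by its leverage before forming the covariance sandwich. The sketch below is a minimal NumPy illustration on synthetic data, not the repository's own analysis code (which lives in `emotion_runner`); the function name and toy data are ours:

```python
import numpy as np

def ols_hc3(X, y):
    """OLS coefficients with HC3 heteroskedasticity-robust standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Leverage h_i = x_i (X'X)^{-1} x_i'
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    # HC3 weights: squared residuals inflated by (1 - h_i)^{-2}
    omega = (resid / (1.0 - h)) ** 2
    # Sandwich covariance: (X'X)^{-1} X' diag(omega) X (X'X)^{-1}
    cov = XtX_inv @ (X.T * omega) @ X @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Toy example with deliberately heteroskedastic noise
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(size=200) * (1 + np.abs(x))
X = np.column_stack([np.ones_like(x), x])
beta, se = ols_hc3(X, y)
```

In the study itself this estimator is fitted separately for each of the four emotion scores.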
The Japanese source texts are in the public domain. The English and Traditional Chinese versions used in this study were machine-translated by the authors for experimental purposes and are included to support reproducibility.
Use the repaired combined dataset for the main analysis:
- `outputs/models6_multilingual_2026-03-06_combined01/raw_public.jsonl`
- `outputs/models6_multilingual_2026-03-06_combined01/parsed_wide.csv`
- `outputs/models6_multilingual_2026-03-06_combined01/parsed_long.csv`
- `outputs/models6_multilingual_2026-03-06_combined01/repair_failed_job_keys.txt`
Quality status:
- 648/648 unique job keys
- 648/648 successful API requests
- 648/648 parse-success rows
- all model x lang cells satisfy the preregistered quality threshold
- provenance: 631 records from the main run and 17 records from the failure-only patch run
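The record and job-key counts above can be re-derived directly from the raw log. The sketch below is a minimal, hedged illustration: the JSONL layout is taken from this README, but the field name `job_key` is an assumption about the log schema and may need adjusting to the actual files:

```python
import json

def summarize_raw_log(path):
    """Count total records and unique job keys in a JSONL raw log.

    NOTE: the "job_key" field name is an assumed schema detail; adjust it
    to the actual key used in raw_public.jsonl.
    """
    total = 0
    keys = set()
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            total += 1
            keys.add(record.get("job_key"))
    return total, len(keys)

# For the official combined dataset, both counts are expected to be 648:
# summarize_raw_log(
#     "outputs/models6_multilingual_2026-03-06_combined01/raw_public.jsonl")
```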
raw_public.jsonl is a redacted public raw log. It preserves job metadata, request prompts, raw assistant text, parse results, and basic usage metadata, but removes provider response IDs, provider-side response bodies, fine-grained cost details, and unnecessary transport metadata. The parsed CSV files preserve their original schema for analysis compatibility, but the response_id values are replaced with REDACTED.
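A quick way to confirm the redaction described above is to scan a parsed CSV and check that every value in its `response_id` column is the literal string `REDACTED`. This is a minimal sketch; only the `response_id` column name comes from this README, and no other schema details are assumed:

```python
import csv

def check_redaction(csv_path, column="response_id"):
    """Return True if every row's response_id column holds "REDACTED"."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        return all(row.get(column) == "REDACTED" for row in reader)

# Example (against the official dataset):
# check_redaction(
#     "outputs/models6_multilingual_2026-03-06_combined01/parsed_wide.csv")
```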
- `docs/REVIEWER_GUIDE.md`: shortest reviewer-oriented verification path
- `docs/DATA_AND_REPRODUCIBILITY.md`: dataset layout and reproduction workflow
- `docs/preregistration/`: preregistration and repair amendment
- `emotion_runner/`: experiment runner, parser, analysis, and asset generation code
- `outputs/models6_multilingual_2026-03-06_combined01/`: official public dataset and analysis outputs
- `prompt_template.md`: fixed multilingual prompt and stimulus specification
- `models.yaml`: preregistered model list
- `tests/`: unit tests for prompt generation, parsing, repair workflow, and analysis helpers
Create a Python environment and install the analysis dependencies:

```
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

Run the tests:

```
pytest -q
```

Recreate the analysis outputs from the official combined dataset:
```
python -m emotion_runner analyze \
  --input-wide outputs/models6_multilingual_2026-03-06_combined01/parsed_wide.csv \
  --outdir outputs/models6_multilingual_2026-03-06_combined01/analysis \
  --study-id models6_multilingual_2026-03-06
```

The repository also includes the runner needed to reproduce or extend the API collection workflow. Re-running the full experiment requires an OpenRouter API key and may produce different outputs because hosted LLMs can change over time. Do not overwrite the official combined01 dataset; use a new output directory for reruns.
Source code is released under the MIT License. Datasets, documentation, prompt/stimulus specifications, analysis outputs, and machine-translated texts are released under CC BY 4.0 unless otherwise noted.
If you use this repository, please cite the associated paper and this repository release. A CITATION.cff file is included for software citation metadata.