This repository contains the public data and code package for a preregistered study of multilingual emotion attribution by large language models (LLMs). It is prepared for reviewers and readers who want to inspect the study design, reproduce the reported analyses, and verify the repair-aware official dataset.
The associated manuscript should be read through the journal submission or publication site. Manuscript source files, review-response materials, and local submission artifacts are intentionally not included in this public repository.
The study examines how six LLMs assign four emotion scores to literary texts under three full language conditions. A language condition changes the instruction language, persona description, title, author name, and text content together.
- Study ID: `models6_multilingual_2026-03-06`
- Design: 6 models x 3 languages x 3 texts x 4 personas x 3 trials
- Total API calls: 648
- Languages: Japanese (`ja`), English (`en`), Traditional Chinese (`zh`)
- Emotions: `interesting`, `surprise`, `sadness`, `anger`
- Main model: OLS with HC3 robust standard errors, fitted separately for each emotion
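To make the main model concrete, HC3 robust standard errors reweight each squared residual by its leverage before forming the covariance sandwich. The sketch below is a minimal NumPy illustration on synthetic data, not the repository's own analysis code (which lives in `emotion_runner`); the function name and toy data are ours:

```python
import numpy as np

def ols_hc3(X, y):
    """OLS coefficients with HC3 heteroskedasticity-robust standard errors."""
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    # Leverage h_i = x_i (X'X)^{-1} x_i'
    h = np.einsum("ij,jk,ik->i", X, XtX_inv, X)
    # HC3 weights: squared residuals inflated by (1 - h_i)^{-2}
    omega = (resid / (1.0 - h)) ** 2
    # Sandwich covariance: (X'X)^{-1} X' diag(omega) X (X'X)^{-1}
    cov = XtX_inv @ (X.T * omega) @ X @ XtX_inv
    return beta, np.sqrt(np.diag(cov))

# Toy example with deliberately heteroskedastic noise
rng = np.random.default_rng(0)
x = rng.normal(size=200)
y = 1.0 + 2.0 * x + rng.normal(size=200) * (1 + np.abs(x))
X = np.column_stack([np.ones_like(x), x])
beta, se = ols_hc3(X, y)
```

In the study itself this estimator is fitted separately for each of the four emotion scores.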
The Japanese source texts are in the public domain. The English and Traditional Chinese versions used in this study were machine-translated by the authors for experimental purposes and are included to support reproducibility.
Use the repaired combined dataset for the main analysis:
- `outputs/models6_multilingual_2026-03-06_combined01/raw_public.jsonl`
- `outputs/models6_multilingual_2026-03-06_combined01/parsed_wide.csv`
- `outputs/models6_multilingual_2026-03-06_combined01/parsed_long.csv`
- `outputs/models6_multilingual_2026-03-06_combined01/repair_failed_job_keys.txt`
Quality status:
- 648/648 unique job keys
- 648/648 successful API requests
- 648/648 parse-success rows
- all model x lang cells satisfy the preregistered quality threshold
- provenance: 631 records from the main run and 17 records from the failure-only patch run
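The record and job-key counts above can be re-derived directly from the raw log. The sketch below is a minimal, hedged illustration: the JSONL layout is taken from this README, but the field name `job_key` is an assumption about the log schema and may need adjusting to the actual files:

```python
import json

def summarize_raw_log(path):
    """Count total records and unique job keys in a JSONL raw log.

    NOTE: the "job_key" field name is an assumed schema detail; adjust it
    to the actual key used in raw_public.jsonl.
    """
    total = 0
    keys = set()
    with open(path, encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if not line:
                continue
            record = json.loads(line)
            total += 1
            keys.add(record.get("job_key"))
    return total, len(keys)

# For the official combined dataset, both counts are expected to be 648:
# summarize_raw_log(
#     "outputs/models6_multilingual_2026-03-06_combined01/raw_public.jsonl")
```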
raw_public.jsonl is a redacted public raw log. It preserves job metadata, request prompts, raw assistant text, parse results, and basic usage metadata, but removes provider response IDs, provider-side response bodies, fine-grained cost details, and unnecessary transport metadata. The parsed CSV files preserve their original schema for analysis compatibility, but the response_id values are replaced with REDACTED.
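A quick way to confirm the redaction described above is to scan a parsed CSV and check that every value in its `response_id` column is the literal string `REDACTED`. This is a minimal sketch; only the `response_id` column name comes from this README, and no other schema details are assumed:

```python
import csv

def check_redaction(csv_path, column="response_id"):
    """Return True if every row's response_id column holds "REDACTED"."""
    with open(csv_path, newline="", encoding="utf-8") as fh:
        reader = csv.DictReader(fh)
        return all(row.get(column) == "REDACTED" for row in reader)

# Example (against the official dataset):
# check_redaction(
#     "outputs/models6_multilingual_2026-03-06_combined01/parsed_wide.csv")
```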
- `docs/REVIEWER_GUIDE.md`: shortest reviewer-oriented verification path
- `docs/DATA_AND_REPRODUCIBILITY.md`: dataset layout and reproduction workflow
- `docs/preregistration/`: preregistration and repair amendment
- `emotion_runner/`: experiment runner, parser, analysis, and asset generation code
- `outputs/models6_multilingual_2026-03-06_combined01/`: official public dataset and analysis outputs
- `prompt_template.md`: fixed multilingual prompt and stimulus specification
- `models.yaml`: preregistered model list
- `tests/`: unit tests for prompt generation, parsing, repair workflow, and analysis helpers
Create a Python environment and install the analysis dependencies:

```
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```

Run the tests:

```
pytest -q
```

Recreate the analysis outputs from the official combined dataset:
```
python -m emotion_runner analyze \
  --input-wide outputs/models6_multilingual_2026-03-06_combined01/parsed_wide.csv \
  --outdir outputs/models6_multilingual_2026-03-06_combined01/analysis \
  --study-id models6_multilingual_2026-03-06
```

The repository also includes the runner needed to reproduce or extend the API collection workflow. Re-running the full experiment requires an OpenRouter API key and may produce different outputs because hosted LLMs can change over time. Do not overwrite the official combined01 dataset; use a new output directory for reruns.
Source code is released under the MIT License. Datasets, documentation, prompt/stimulus specifications, analysis outputs, and machine-translated texts are released under CC BY 4.0 unless otherwise noted.
If you use this repository, please cite the associated paper and this repository release. A CITATION.cff file is included for software citation metadata.