Natural language processing course: Analysis and comparison of translation errors and biases in LLMs
-
ABOUT:
The purpose of this project is to analyze and compare translation errors and biases in large language models (LLMs). We evaluate how different models handle translation and look for common errors such as mistranslations, omissions, and cultural misinterpretations. We also examine biases that can emerge in translation, with a focus on political bias. By systematically comparing multiple LLMs, we assess translation quality using both automated metrics and human evaluation, with the broader aim of improving the fairness and accuracy of AI-driven translation systems.
-
REQUIREMENTS:
- Python 3.10+
- pip
- Google Colab (recommended for running the code notebooks)
- Account access for:
  - ChatGPT (https://openai.com/chatgpt/overview/)
  - DeepSeek (https://www.deepseek.com)
  - Hugging Face (https://huggingface.co) - for creating a token and downloading the MistralAI model
-
PROJECT FILES:
- data_for_translation/translations.xlsx (file with original source and translated sentences)
- report/code/Mistral.ipynb (code used for translation with MistralAI)
- report/code/COMET.ipynb (code used for COMET evaluation)
-
CRITERIA USED FOR TRANSLATION ANALYSIS:
-
Sentences were compared in terms of:
- lexical fidelity,
- tone shifts (emphasis or neutralization),
- addition or omission of ideological markers.
-
Translation changes were categorized by:
- neutralization (softening emotionally charged words),
- shift (reframing with political implication),
- preservation (faithful to source text),
- no answer (model did not provide a translation),
- incorrect translation (the output is incomprehensible).
-
HOW TO RUN THE PROJECT:
- Step 1: translate the text
  - Load the translation file (data_for_translation/translations.xlsx).
  - Perform the translation with each model: for ChatGPT use https://openai.com/chatgpt/overview/, for DeepSeek use https://www.deepseek.com, and for MistralAI run report/code/Mistral.ipynb in Google Colab (see the first sketch after these steps).
  - Note: to run the MistralAI notebook, you must create an access token on Hugging Face and save it as a secret in Google Colab under the name HF_TOKEN.
- Step 2: evaluate the translations
  - Run report/code/COMET.ipynb in Colab to compute COMET scores (see the second sketch after these steps).
  - Note: if installing unbabel-comet fails because of the numpy version, run the first cell (pip install "numpy<2.0.0"), restart the session, and then rerun the unbabel-comet installation cell.
  - Perform the human evaluation using the criteria listed above.
- Step 3: analyze the results
  - Based on the COMET scores and the human evaluation, draw conclusions about translation quality across models and analyze any bias that appears in the translations (see the third sketch after these steps).
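For reference, a minimal sketch of the MistralAI translation step. The actual code lives in report/code/Mistral.ipynb; the checkpoint name, the "source" column, the prompt, and the output file below are illustrative assumptions, not the notebook's exact contents.

    import pandas as pd
    from google.colab import userdata          # Colab secrets API
    from transformers import pipeline

    hf_token = userdata.get("HF_TOKEN")        # token stored as a Colab secret

    # Load the source sentences (the column name "source" is an assumption).
    df = pd.read_excel("data_for_translation/translations.xlsx")

    generator = pipeline(
        "text-generation",
        model="mistralai/Mistral-7B-Instruct-v0.2",  # assumed checkpoint
        token=hf_token,
        device_map="auto",
    )

    translations = []
    for sentence in df["source"]:
        prompt = f"Translate the following sentence into English:\n{sentence}"
        out = generator(prompt, max_new_tokens=128, return_full_text=False)
        translations.append(out[0]["generated_text"].strip())

    df["mistral_translation"] = translations
    df.to_excel("translations_mistral.xlsx", index=False)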
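Similarly, a minimal sketch of the COMET step using the unbabel-comet package (report/code/COMET.ipynb is the reference implementation; the column names and the checkpoint choice are assumptions).

    import pandas as pd
    from comet import download_model, load_from_checkpoint

    df = pd.read_excel("data_for_translation/translations.xlsx")

    # Download a reference-based COMET checkpoint (wmt22-comet-da is a
    # common default; the notebook may use a different one).
    model_path = download_model("Unbabel/wmt22-comet-da")
    model = load_from_checkpoint(model_path)

    # COMET expects one dict per segment: source, machine translation, reference.
    data = [
        {"src": row["source"], "mt": row["translation"], "ref": row["reference"]}
        for _, row in df.iterrows()
    ]

    output = model.predict(data, batch_size=8, gpus=1)  # gpus=0 on CPU
    print("Segment scores:", output.scores)
    print("System-level score:", output.system_score)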
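Finally, a minimal sketch of the analysis step, assuming the COMET scores and human evaluation labels have been merged into one table (the file name and column names are hypothetical).

    import pandas as pd

    df = pd.read_excel("results.xlsx")  # hypothetical merged results file

    # Average COMET score per model.
    print(df.groupby("model")["comet_score"].mean())

    # Share of neutralization / shift / preservation labels per model.
    print(pd.crosstab(df["model"], df["human_category"], normalize="index"))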
-
TEAM:
Tjaša Nadoh
Urška Roblek
-
REFERENCES:
Navigli, R., Conia, S., & Ross, B. (2023). Biases in Large Language Models: Origins, Inventory, and Discussion. Journal of Data and Information Quality, 15(2), Article 10.
Barclay, P. J., & Sami, A. (2024). Investigating Markers and Drivers of Gender Bias in Machine Translations. arXiv preprint arXiv:2403.11896.