
A Weakly Supervised Preference Alignment Framework for Robust Ancient Chinese Translation


This project focuses on enhancing the translation of Ancient Chinese into Modern Chinese, specifically addressing linguistic variation and model robustness through a weakly supervised preference alignment framework.


📅 News

  • [2025-12]: Released the Robust-Erya benchmark on Hugging Face Hub!
  • [Coming Soon]: The core implementation of Stage 2 (Preference Alignment) will be fully open-sourced upon official paper acceptance.

📊 Datasets

We use two main datasets in our research. Detailed descriptions, the noise taxonomy, and data samples can be found on our Hugging Face page.

1. Erya Benchmark

The foundation of our training and standard evaluation.

2. Robust-Erya Benchmark

An extension of the Erya test suite designed for robustness evaluation. It covers 5 domains with 3 categories of noise at 5 intensity levels.

👉 Access and download the dataset here: Hugging Face: thinklis/Robust-Erya
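For programmatic access, the benchmark can be loaded with the `datasets` library and then sliced by noise condition. A minimal sketch, where the column names `noise_type` and `noise_level` are illustrative assumptions rather than the published schema:

```python
# Hypothetical access sketch; the real column names on the Hub may differ.
# from datasets import load_dataset
# examples = load_dataset("thinklis/Robust-Erya", split="test")

def filter_by_noise(examples, noise_type=None, max_level=5):
    """Keep rows matching a noise category, up to a given intensity level."""
    return [
        ex for ex in examples
        if (noise_type is None or ex["noise_type"] == noise_type)
        and ex["noise_level"] <= max_level
    ]
```

Iterating each noise category across intensity levels 1 to 5 with a helper like this yields the per-condition subsets needed for a robustness curve.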


⚙️ Training Framework

Our framework follows a two-stage training paradigm:

  • Stage 1 (Supervised Fine-Tuning): Initial SFT using the LLaMA-Factory framework on base models (e.g., LLaMA-3, Qwen2.5, InternLM3).
  • Stage 2 (Preference Alignment): Our proposed framework aligns model outputs with human-preferred stylistic and linguistic norms using weak supervision. (Code for Stage 2 is currently withheld for the review process).
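Since the Stage 2 code is withheld, the exact objective is not shown here. As background, preference alignment over (chosen, rejected) output pairs is commonly formulated as a DPO-style loss; the sketch below is that generic textbook objective, not this paper's method:

```python
import math

def dpo_loss(policy_logp_chosen, policy_logp_rejected,
             ref_logp_chosen, ref_logp_rejected, beta=0.1):
    """Generic DPO loss for one preference pair (NOT this paper's Stage 2
    objective). Inputs are summed token log-probabilities of the preferred
    ("chosen") and dispreferred ("rejected") translations under the
    trainable policy and a frozen reference model.
    """
    # Implicit reward margin between the chosen and rejected continuations.
    margin = beta * ((policy_logp_chosen - ref_logp_chosen)
                     - (policy_logp_rejected - ref_logp_rejected))
    # -log(sigmoid(margin)): shrinks as the policy prefers the chosen output.
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```

At initialization, when policy and reference agree, the margin is zero and the loss is log 2; it decreases as the policy assigns relatively more probability to the preferred translation.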

🚀 Usage

1. Inference

Inference and generation are performed via LLaMA-Factory. Please download the Robust-Erya test files from Hugging Face and follow the LLaMA-Factory Prediction Guide to generate translations and obtain initial BLEU-4 scores.
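LLaMA-Factory reports BLEU-4 during prediction. For intuition about what that score measures, here is a minimal sentence-level BLEU-4 sketch using whitespace tokenization and simple add-one smoothing; real evaluations should rely on the toolkit's own scorer or a library such as sacrebleu:

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Multiset of n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def bleu4(hypothesis, reference):
    """Toy sentence-level BLEU-4 (whitespace tokens, add-one smoothing)."""
    hyp, ref = hypothesis.split(), reference.split()
    precisions = []
    for n in range(1, 5):
        hyp_ng, ref_ng = ngrams(hyp, n), ngrams(ref, n)
        overlap = sum(min(count, ref_ng[g]) for g, count in hyp_ng.items())
        total = max(sum(hyp_ng.values()), 1)
        if overlap == 0:
            # Add-one smoothing avoids log(0) for unmatched higher-order n-grams.
            overlap, total = 1, total + 1
        precisions.append(overlap / total)
    # Brevity penalty discourages overly short hypotheses.
    bp = 1.0 if len(hyp) > len(ref) else math.exp(1 - len(ref) / max(len(hyp), 1))
    return bp * math.exp(sum(math.log(p) for p in precisions) / 4)
```

An exact match scores 1.0; shorter or divergent hypotheses score strictly below it.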

2. Evaluation

For advanced semantic evaluation, we employ an LLM-as-a-Judge approach:

  • evaluation/llm_judge_vllm.py: A Python script that utilizes LLMs to evaluate the semantic alignment and translation fluency between the model outputs and ground-truth references.
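The actual prompt and scoring rubric live in `evaluation/llm_judge_vllm.py`. The sketch below only illustrates the general LLM-as-a-Judge pattern: a prompt template plus a parser for the judge's verdict. The template wording and the 1-5 scale are assumptions for illustration, not the script's actual rubric:

```python
import re

# Hypothetical judge prompt; the real template in llm_judge_vllm.py may differ.
JUDGE_TEMPLATE = (
    "You are a strict translation judge.\n"
    "Reference (modern Chinese): {reference}\n"
    "Candidate translation: {candidate}\n"
    "Rate semantic alignment and fluency on a 1-5 scale and end your answer "
    "with the line 'Score: N'."
)

def build_judge_prompt(reference: str, candidate: str) -> str:
    """Fill the judge template for one (reference, candidate) pair."""
    return JUDGE_TEMPLATE.format(reference=reference, candidate=candidate)

def parse_judge_score(reply: str):
    """Extract the integer verdict 'Score: N' (1-5) from the judge's reply;
    return None when no valid verdict is present."""
    match = re.search(r"Score:\s*([1-5])\b", reply)
    return int(match.group(1)) if match else None
```

Constraining the judge to a fixed `Score: N` line keeps parsing deterministic even when the model adds free-form commentary before the verdict.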

📜 Citation

If you find this project or the Robust-Erya dataset helpful, please cite this repository:

@misc{GuWenAlign2025,
  author = {Thinklis},
  title = {A Weakly Supervised Preference Alignment Framework for Robust Ancient Chinese Translation},
  year = {2025},
  publisher = {GitHub},
  journal = {GitHub repository},
  howpublished = {\url{https://github.com/thinklis/GuWen-Align}}
}
