Classical Chinese Translation with QLoRA & Prompt Engineering

Classical Chinese translation based on QLoRA and prompt engineering

Headline result: QLoRA fine-tuning reaches a perplexity of 6.56 on public_test.json, about 31x lower than the base model's zero-shot perplexity of 204.07.

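This project focuses on bidirectional translation between Classical Chinese and modern Chinese. By fine-tuning Qwen3-4B with QLoRA and studying prompt engineering in depth, it achieves lower perplexity on this task than a larger general-purpose model, Llama3-Taiwan-8B.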

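🏆 Project Highlights

Efficient fine-tuning: 4-bit NF4 quantization with QLoRA makes high-quality training possible on limited compute.
Performance: After fine-tuning, PPL drops to 6.56, compared with 204.07 for the zero-shot base model.
Prompt engineering: Experiments show that direct-instruction prompts suit this translation task better than role-playing prompts.
Benchmark: On this domain-specific task, the fine-tuned 4B model outperforms a general-purpose 8B model.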

📊 Performance Analysis

I evaluated several models and prompting strategies on public_test.json and compared their perplexity (PPL) scores (lower is better):

Model	Method	Perplexity (lower is better)	PPL increase vs. fine-tuned model
Qwen3-4B (Fine-tuned)	QLoRA (r=64)	6.56	Baseline
Llama3-Taiwan-8B	Few-Shot (4-shot)	11.02	+68%
Llama3-Taiwan-8B	Zero-Shot (Optimized)	17.73	+170%
Qwen3-4B (Base)	Zero-Shot (Optimized)	204.07	+3011%
Qwen3-4B (Base)	Few-Shot (4-shot)	273.95	+4076%
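
The last column can be reproduced from the perplexities alone. The sketch below recomputes each row's increase relative to the fine-tuned model; the dictionary keys and function name are labels chosen here for illustration, not identifiers from the repository.

```python
# Perplexities copied from the table above.
BASELINE_PPL = 6.56  # Qwen3-4B fine-tuned with QLoRA

reported_ppl = {
    "Llama3-Taiwan-8B few-shot": 11.02,
    "Llama3-Taiwan-8B zero-shot": 17.73,
    "Qwen3-4B base zero-shot": 204.07,
    "Qwen3-4B base few-shot": 273.95,
}


def relative_increase(ppl: float, baseline: float = BASELINE_PPL) -> float:
    """Return how much higher `ppl` is than `baseline`, in percent."""
    return (ppl / baseline - 1.0) * 100.0


increases = {name: relative_increase(ppl) for name, ppl in reported_ppl.items()}

if __name__ == "__main__":
    for name, pct in increases.items():
        print(f"{name}: +{pct:.0f}%")
```

Because perplexity is an exponentiated average loss, these percentages grow quickly; the ratio of two perplexities (for example 204.07 / 6.56, about 31) is often the easier figure to interpret.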

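Observations:

QLoRA dominates: Fine-tuning on domain-specific data lets a small model (4B) reach a much lower perplexity than a larger general-purpose model (8B): 6.56 versus 11.02 at best.

Effect of prompting: Few-shot examples help the 8B model substantially (17.73 → 11.02), but for the base 4B model they may instead introduce noise and raise PPL (204.07 → 273.95).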

🛠️ Methodology

QLoRA Configuration

To train on a consumer-grade GPU, I used the following configuration:

Quantization: 4-bit NF4 (4-bit NormalFloat)
LoRA Rank (r): 64
LoRA Alpha: 16
Dropout: 0.05
Target Modules: All Linear Layers (q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj)
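
As a rough illustration, the settings above might map onto the `peft` and `transformers` configuration objects as follows. This is a sketch, not the contents of train.py: the `bias` and `task_type` values are assumptions, the names `LORA_KWARGS`, `BNB_KWARGS`, and `build_configs` are made up here, and the library imports are deferred so the constants can be inspected without a GPU stack installed.

```python
# Hyperparameters stated in this README.
LORA_KWARGS = {
    "r": 64,
    "lora_alpha": 16,
    "lora_dropout": 0.05,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    # Assumptions (not stated in the README):
    "bias": "none",
    "task_type": "CAUSAL_LM",
}

BNB_KWARGS = {
    "load_in_4bit": True,
    "bnb_4bit_quant_type": "nf4",
}

# Standard LoRA scales the low-rank update by alpha / r: 16 / 64 = 0.25.
LORA_SCALING = LORA_KWARGS["lora_alpha"] / LORA_KWARGS["r"]


def build_configs():
    """Instantiate the config objects (requires torch, transformers, and peft)."""
    import torch
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    bnb_config = BitsAndBytesConfig(
        **BNB_KWARGS,
        # bfloat16 compute matches the precision listed under Technical Details.
        bnb_4bit_compute_dtype=torch.bfloat16,
    )
    lora_config = LoraConfig(**LORA_KWARGS)
    return bnb_config, lora_config
```

With alpha smaller than the rank, the adapter update is scaled down by 0.25, which makes a high-rank adapter behave more conservatively at a given learning rate.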

Prompt Engineering Insights

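In my experiments, prompt design had a large effect on the results:

✅ Best strategy (Direct Instruction): "這是文言文與現代文的翻譯問題，請根據下方指令與要翻譯的內容，提供正確的回答..." ("This is a translation problem between Classical and modern Chinese. Based on the instruction and the text to translate below, provide the correct answer...")
❌ Weaker strategy (Role Playing): "你是一個專業助理..." ("You are a professional assistant..."), or even "你是一個中文專業教授" ("You are a professor of Chinese")
Conclusion: For translation, a direct, explicit instruction works better than giving the model a persona.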

🚀 Getting Started

1. Project Structure

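hw2/
├── README.md                # This README
├── requirements.txt         # Python package dependencies
├── download.sh             # Script that downloads the pre-trained model
├── train.py                # Training script
├── predict.py              # Inference script
├── ppl.py                  # Perplexity evaluation script
├── utils.py                # Utility functions
├── run.sh                  # Inference runner script
└── data/                   # Dataset directory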

2. Environment Setup

# Create a virtual environment
python -m venv venv
source venv/bin/activate  # Linux/Mac
# venv\Scripts\activate  # Windows

# Install dependencies
pip install -r requirements.txt

3. Download Pre-trained Model

Run the download script to fetch the trained LoRA adapter (PPL 6.56):

bash download.sh

The script extracts the adapter into the ./adapter_checkpoint directory.

💻 Usage

Training

To retrain the model:

python train.py

Default hyperparameters: effective batch size = 24, learning rate = 8e-5, epochs = 4

Inference

Use run.sh for quick inference:

# Usage: bash run.sh <model_path> <adapter_path> <input_json> <output_json>
bash run.sh "Qwen/Qwen3-4B" "./adapter_checkpoint" "data/private_test.json" "predictions.json"

Evaluation

Compute the model's perplexity:

python ppl.py \
    --base_model_path "Qwen/Qwen3-4B" \
    --peft_path "./adapter_checkpoint" \
    --test_data_path "data/public_test.json"
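
For readers unfamiliar with the metric, the sketch below shows what a perplexity score measures: the exponential of the mean negative log-likelihood per token. It is a standalone illustration, not the code in ppl.py; which tokens ppl.py scores (for example, only the answer tokens) and how it averages across examples are not stated here, and the function name is hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp(mean negative log-likelihood per token).

    `token_log_probs` holds the natural-log probabilities the model
    assigned to each reference token.
    """
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    mean_nll = -sum(token_log_probs) / len(token_log_probs)
    return math.exp(mean_nll)


# A model that gives every reference token probability 1/4 has a
# perplexity of 4 (up to floating-point rounding).
uniform = [math.log(0.25)] * 10

if __name__ == "__main__":
    print(perplexity(uniform))
```

Read this way, a PPL of 6.56 means the fine-tuned model is, on average, about as uncertain as if it were choosing uniformly among six or seven tokens at each step.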

🔧 Technical Details

Base Model: Qwen/Qwen3-4B
Optimizer: AdamW
Scheduler: Cosine (warmup_ratio=0.1)
Precision: bfloat16 mixed precision
Max Length: 512 tokens
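
To make the schedule concrete, here is a minimal sketch of linear warmup followed by cosine decay, using the warmup ratio listed above and the peak learning rate of 8e-5 from the training defaults. The total step count is a hypothetical value, the function name is made up, and the training library's exact rounding of the warmup step count may differ.

```python
import math

PEAK_LR = 8e-5       # learning rate from the training defaults in this README
WARMUP_RATIO = 0.1   # warmup ratio from the list above
TOTAL_STEPS = 1000   # hypothetical; depends on dataset size and batch size


def lr_at(step: int, total_steps: int = TOTAL_STEPS,
          peak_lr: float = PEAK_LR, warmup_ratio: float = WARMUP_RATIO) -> float:
    """Learning rate at `step`: linear warmup, then cosine decay to zero."""
    warmup_steps = int(total_steps * warmup_ratio)
    if step < warmup_steps:
        # Ramp linearly from 0 up to the peak learning rate.
        return peak_lr * step / max(1, warmup_steps)
    # Fraction of the post-warmup phase completed, from 0.0 to 1.0.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


if __name__ == "__main__":
    for step in (0, 50, 100, 550, 1000):
        print(step, lr_at(step))
```

The learning rate peaks at the end of warmup (step 100 under these assumptions) and decays smoothly to zero by the final step.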

Prompt Format

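Standard format used for fine-tuning and inference (the first line translates to: "You are a professional assistant; the following is a conversation between a user and the assistant. Respond to the user's needs, questions, and information with helpful, correct answers."):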

你是一個專業助理，以下是用戶和助理的對話。你要對用戶的需求、問題與資訊提供有用、正確的回答。
USER: {instruction}
ASSISTANT:

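(Note: although experiments showed that other prompts perform better in the zero-shot setting, training uses this standard format to keep fine-tuning consistent.)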
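
A small helper along these lines can fill the template. It is a sketch: `build_prompt` and `SYSTEM_PREFIX` are hypothetical names rather than functions from utils.py, the exact whitespace between the three parts is assumed from the layout shown above, and the sample instruction is made up for illustration.

```python
# System line copied verbatim from the standard format above.
SYSTEM_PREFIX = (
    "你是一個專業助理，以下是用戶和助理的對話。"
    "你要對用戶的需求、問題與資訊提供有用、正確的回答。"
)


def build_prompt(instruction: str) -> str:
    """Wrap an instruction in the USER/ASSISTANT format shown above."""
    # Newline separators are an assumption based on the README layout.
    return f"{SYSTEM_PREFIX}\nUSER: {instruction}\nASSISTANT:"


if __name__ == "__main__":
    # Hypothetical instruction, not taken from the dataset.
    print(build_prompt("將下列句子翻譯成現代文：學而時習之，不亦說乎？"))
```

Using the same builder for training and inference keeps the two formats from drifting apart, which is the consistency concern the note above describes.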

⚠️ Troubleshooting

OOM (Out of Memory): Reduce per_device_train_batch_size, and raise gradient_accumulation_steps to keep the effective batch size unchanged.
Bitsandbytes Error: Make sure your CUDA version is compatible with bitsandbytes (CUDA 11.8+ recommended).
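
For the OOM tip, the following sketch enumerates the per-device batch size and accumulation step pairs that preserve the effective batch size of 24 from the training defaults. It assumes a single GPU by default, and the function name is made up for illustration; smaller per-device batches use less memory at the cost of more accumulation steps per update.

```python
EFFECTIVE_BATCH_SIZE = 24  # from the training defaults in this README


def accumulation_options(effective: int = EFFECTIVE_BATCH_SIZE, num_gpus: int = 1):
    """List (per_device_batch_size, gradient_accumulation_steps) pairs.

    Each pair satisfies per_device * accumulation * num_gpus == effective.
    """
    options = []
    for per_device in range(1, effective + 1):
        samples_per_micro_step = per_device * num_gpus
        if effective % samples_per_micro_step == 0:
            options.append((per_device, effective // samples_per_micro_step))
    return options


if __name__ == "__main__":
    for per_device, accumulation in accumulation_options():
        print(f"per_device_train_batch_size={per_device}, "
              f"gradient_accumulation_steps={accumulation}")
```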

👤 Author

Student ID: b11109005
Course: Applied Deep Learning 2025
Task: Exercise 2 - Language Model Fine-tuning
