Commit a39fc75 · Update README.md (1 parent: f7e3856)

1 file changed: README.md (+215 / -49 lines)
# AIXpert: Factual Preference Alignment for Large Language Models

### A Modular Benchmark & Training Framework for Factual-Aware DPO

<p align="center">
  <b>🧠 Factual Alignment · 🧪 Preference Optimization · ⚙️ Reproducible AI Engineering</b>
</p>

<p align="center">
  <b>📄 Paper:</b> <i>In Preparation</i>
  &nbsp;|&nbsp;
  <b>📊 Base Dataset:</b>
  <a href="https://huggingface.co/datasets/Skywork/Skywork-Reward-Preference-80K-v0.1">
    Skywork-Reward-Preference-80K
  </a>
  &nbsp;|&nbsp;
  <b>🏛️ Affiliation:</b> Vector Institute for Artificial Intelligence
</p>

---

## 🧭 About

**AIXpert Preference Alignment** is a full-stack **research and engineering framework** for studying and improving **factual alignment in preference-optimized Large Language Models (LLMs)**.

The project introduces **Factual-DPO**, a factuality-aware extension of **Direct Preference Optimization (DPO)** that incorporates:

* Explicit factuality supervision
* Synthetic hallucination inversion
* Margin-based factual penalties

The repository provides **end-to-end infrastructure** for:

* Dataset construction
* Multi-model preference fine-tuning
* Automated factuality evaluation

All components are **config-driven**, reproducible, and aligned with the **Vector Institute AI Engineering Template**.

---
## ✨ Key Contributions

* 🔍 Binary factuality supervision integrated into preference learning
* 🧪 Synthetic hallucination inversion pairs
* 📐 Δ-margin factual penalties for controllable hallucination suppression
* ⚙️ Fully config-driven data, training, and evaluation pipelines
* 📊 Multi-model × multi-Δ benchmarking at scale

---

## 📦 Repository Structure

```
aixpert/
├── src/aixpert/
│   ├── config/              # Central config.yaml
│   ├── data_construction/   # 8-stage factual dataset pipeline
│   ├── training/            # Original-DPO & Factual-DPO training
│   ├── evaluation/          # GPT-4o-mini judge evaluation
│   └── utils/               # Shared helpers
├── README.md
└── pyproject.toml
```

---

## 🧠 What Is Factual-DPO?

Standard DPO aligns models to **human preferences**, but does not explicitly discourage **hallucinated yet preferred responses**.

**Factual-DPO** introduces a factuality-aware margin:

* Each preference tuple includes `(h_w, h_l)` factuality indicators
* A penalty λ is applied when the preferred response is less factual
* Optimization pressure shifts toward **factually correct preferences**

➡️ Result: **Lower hallucination rates without sacrificing preference alignment**
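
The exact objective is defined in the training code and the forthcoming paper; the snippet below is only a minimal PyTorch-style sketch of how a factuality-aware term could be added to the standard DPO loss, using the `(h_w, h_l)` indicators above. The function name and the precise form of the penalty are illustrative assumptions, not the repository's implementation.

```python
import torch
import torch.nn.functional as F


def factual_dpo_loss(
    policy_chosen_logps: torch.Tensor,
    policy_rejected_logps: torch.Tensor,
    ref_chosen_logps: torch.Tensor,
    ref_rejected_logps: torch.Tensor,
    h_w: torch.Tensor,  # 1.0 if the chosen (preferred) response is factual, else 0.0
    h_l: torch.Tensor,  # 1.0 if the rejected response is factual, else 0.0
    beta: float = 0.1,
    delta: float = 10.0,  # penalty strength; assumed here to play the role of the Δ / λ term
) -> torch.Tensor:
    """Illustrative factuality-aware DPO loss (not the repository's exact code)."""
    # Standard DPO implicit-reward margin between chosen and rejected responses.
    margin = (policy_chosen_logps - ref_chosen_logps) - (
        policy_rejected_logps - ref_rejected_logps
    )

    # Penalty is active only when the preferred response is *less* factual than
    # the rejected one (h_w = 0, h_l = 1); the exact form is an assumption.
    penalty = delta * torch.clamp(h_l - h_w, min=0.0)

    # Non-factual "wins" must clear an extra hurdle before the pair is satisfied.
    return -F.logsigmoid(beta * margin - penalty).mean()
```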
---

## 🔬 Skywork → Factual-DPO Data Construction Pipeline

This repository contains a complete **eight-stage pipeline** for converting the **Skywork Reward-Preference-80K** dataset into **balanced, factual-aware DPO datasets**.

### Pipeline Stages

| Stage | Description                              |
| ----- | ---------------------------------------- |
| 1     | Skywork extraction & de-duplication      |
| 2     | Preference pair conversion               |
| 3     | Binary factuality scoring (GPT-4o-mini)  |
| 4     | Canonical DPO transformation             |
| 5     | Synthetic hallucination generation       |
| 6     | Dataset merging                          |
| 7     | Balanced bucket construction             |
| 8     | Optional preference flipping             |

All paths and parameters are defined in:

```
src/aixpert/config/config.yaml
```
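
As an illustration of stage 3, the sketch below shows how binary factuality scoring with GPT-4o-mini might be issued through the OpenAI API. The judge prompt wording and the `0/1` output convention are assumptions for illustration, not the pipeline's exact code.

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

JUDGE_PROMPT = (
    "You are a strict factuality judge. Given a question and an answer, "
    "reply with '1' if the answer is factually correct and '0' otherwise."
)


def score_factuality(question: str, answer: str) -> int:
    """Return a binary factuality label (illustrative sketch, not the repo's code)."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": JUDGE_PROMPT},
            {"role": "user", "content": f"Question: {question}\nAnswer: {answer}"},
        ],
        temperature=0.0,
    )
    text = response.choices[0].message.content.strip()
    return 1 if text.startswith("1") else 0


# Each preference pair then carries (h_w, h_l) labels for its chosen/rejected responses.
```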
---

## ⚙️ Configuration-Driven Design

Every component — **datasets, models, hyperparameters, outputs, and evaluation** — is controlled via:

```
src/aixpert/config/config.yaml
```

Loaded using:

```python
from utils.config_loader import load_config

# Load the central config.yaml into a single configuration object.
cfg = load_config()
```
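
Downstream modules can then read their settings from the returned object. The keys below are hypothetical examples for illustration, not the actual contents of `config.yaml`:

```python
# Hypothetical keys, shown only to illustrate the access pattern.
model_id = cfg["training"]["model_id"]
delta = cfg["training"]["delta"]
output_dir = cfg["paths"]["output_dir"]
```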
This enables:

* Full reproducibility
* Multi-model automation
* Zero hard-coded paths

---

## 🏋️ Training Pipelines

### 1️⃣ Original-DPO (Baseline)

```bash
python -m aixpert.training.run_dpo_training \
    --model "google/gemma-2-9b-it"
```

Trains standard DPO using Skywork preferences.
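
The baseline builds on Hugging Face TRL's `DPOTrainer`. A minimal, self-contained sketch of such a setup is shown below; the dataset file, hyperparameters, and output path are placeholders, and the repository's actual training script may differ.

```python
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer
from trl import DPOConfig, DPOTrainer

model_id = "google/gemma-2-9b-it"
model = AutoModelForCausalLM.from_pretrained(model_id)
tokenizer = AutoTokenizer.from_pretrained(model_id)

# Expects a preference dataset with "prompt", "chosen", and "rejected" columns.
train_dataset = load_dataset("json", data_files="dpo_train.json", split="train")

args = DPOConfig(
    output_dir="outputs/gemma2-9b-dpo",   # placeholder path
    beta=0.1,                             # strength of the implicit KL regularization
    per_device_train_batch_size=1,
    gradient_accumulation_steps=8,
    num_train_epochs=1,
)

trainer = DPOTrainer(
    model=model,
    args=args,
    train_dataset=train_dataset,
    processing_class=tokenizer,  # older TRL versions use `tokenizer=` instead
)
trainer.train()
```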
---

### 2️⃣ Factual-DPO (Δ-Margin Training)

```bash
python -m aixpert.training.run_factual_training \
    --model_id "google/gemma-2-9b-it" \
    --short "gemma2-9b" \
    --delta 10
```

Each Δ value produces a **separate fine-tuned model**.
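
Because each Δ corresponds to its own run, multi-Δ sweeps can be scripted around the documented CLI. A small sketch (the Δ values here are arbitrary examples):

```python
import subprocess

# Sweep a few illustrative Δ values; each call trains and saves a separate model.
for delta in [1, 5, 10]:
    subprocess.run(
        [
            "python", "-m", "aixpert.training.run_factual_training",
            "--model_id", "google/gemma-2-9b-it",
            "--short", "gemma2-9b",
            "--delta", str(delta),
        ],
        check=True,
    )
```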
---

## 📊 Evaluation Pipeline

Evaluation is performed using **GPT-4o-mini as an LLM-as-a-Judge**.

### Metrics

| Metric      | Meaning                                      |
| ----------- | -------------------------------------------- |
| factuality  | Mean factual score                           |
| halluc_rate | % of outputs below the factuality threshold  |
| win_rate    | Win rate of the Δ-model vs. the DPO baseline |
| count       | Number of prompts evaluated                  |
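
A minimal sketch of how these aggregates can be computed from per-prompt judge scores; the field names and the 0.5 threshold are illustrative assumptions, not the evaluator's exact code:

```python
def aggregate(scores, wins, threshold=0.5):
    """Aggregate per-prompt results into the metrics above (illustrative sketch).

    scores: per-prompt factuality scores in [0, 1]
    wins:   per-prompt booleans, True if the Δ-model beat the baseline
    """
    count = len(scores)
    factuality = sum(scores) / count                            # mean factual score
    halluc_rate = sum(s < threshold for s in scores) / count    # share below threshold
    win_rate = sum(wins) / count                                # Δ-model vs. baseline
    return {
        "factuality": factuality,
        "halluc_rate": halluc_rate,
        "win_rate": win_rate,
        "count": count,
    }
```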
Run evaluation:

```bash
python -m aixpert.evaluation.evaluations.run_all_evaluations
```

Outputs:

```
eval_results.json
```

---

## 🧪 Supported Models

* Gemma-2 (2B, 9B)
* Qwen-2.5 / Qwen-3
* LLaMA-3.x
* Any TRL-compatible causal LLM

Models are registered centrally in `config.yaml`.

---

## 🧰 Frameworks & Tooling

* **Hugging Face TRL** — DPO reference implementation
* **Unsloth** — QLoRA optimization
* **BitsAndBytes** — 4-bit quantization
* **Flash-Attention-2**
* **Weights & Biases** — experiment tracking
* **Accelerate** — multi-GPU orchestration
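
As an example of how these pieces typically compose (not necessarily the repository's exact setup), a 4-bit QLoRA-style model load with BitsAndBytes and Flash-Attention-2 might look like this; the model id is one of the supported models listed above:

```python
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,                      # BitsAndBytes 4-bit quantization
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-2-9b-it",
    quantization_config=bnb_config,
    attn_implementation="flash_attention_2",  # requires flash-attn to be installed
    torch_dtype=torch.bfloat16,
    device_map="auto",
)
```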
---

## 📚 Dataset Attribution & Credits

This project **builds upon and extends** the **Skywork Reward-Preference-80K** dataset.

> **We do not claim ownership of the Skywork dataset.**
> All credit belongs to the original authors.

If you use this repository, **please cite Skywork**:

```bibtex
@article{liu2024skywork,
  title={Skywork-Reward: Bag of Tricks for Reward Modeling in LLMs},
  author={Liu, Chris Yuhao and Zeng, Liang and Liu, Jiacai and Yan, Rui and He, Jujie and Wang, Chaojie and Yan, Shuicheng and Liu, Yang and Zhou, Yahui},
  journal={arXiv preprint arXiv:2410.18451},
  year={2024}
}
```

For dataset-related concerns, please contact the **Skywork authors** via their paper or Hugging Face repository.

---

## 📖 Citation (AIXpert / Factual-DPO)

A citation for this work will be released with the accompanying paper.

---

## 📬 Contact

For questions, collaborations, or issues:

* Open a GitHub Issue
* Contact the maintainers via the Vector Institute

---

### 🚀 AIXpert advances **factually aligned, preference-optimized language models** through principled data construction, training, and evaluation.

**We invite researchers and practitioners to build upon this framework.**
