ClinAlign is a clinician-grounded healthcare alignment framework that scales from instance rubrics to a reusable library of clinical principles, enabling robust preference alignment and inference-time self-revision for medical LLMs.
📃 Paper
- Clinician-verified rubrics (HealthRubrics): a physician-validated preference dataset built by having clinicians revise and finalize LLM-drafted, checkable rubrics.
- Reusable principle library (HealthPrinciples): clinician consensus distilled into 119 broadly reusable principles, organized by clinical dimension (urgency / uncertainty / expertise / task type).
- Scalable supervision: principles can be converted into per-question rubrics for new, unlabeled medical queries, which scales training data without per-instance clinician authoring.
- Inference-time alignment tool: retrieve matched principles → generate rubric references → guide iterative self-revision at test time.
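
The retrieve → rubric → self-revision flow in the last bullet can be pictured with a minimal Python sketch. This is not the ClinAlign implementation: the tag-overlap retrieval rule, the `tags` field, and the `make_rubric`, `critique`, and `revise` callables are all hypothetical stand-ins for what would be model calls in practice.

```python
from typing import Callable, List, Set


def retrieve_principles(query_tags: Set[str], library: List[dict], top_k: int = 3) -> List[dict]:
    """Rank principles by tag overlap with the query (hypothetical matching rule)."""
    scored = [(len(query_tags & set(p["tags"])), p) for p in library]
    scored = [sp for sp in scored if sp[0] > 0]          # drop principles with no overlap
    scored.sort(key=lambda sp: sp[0], reverse=True)      # stable sort keeps library order on ties
    return [p for _, p in scored[:top_k]]


def self_revise(
    question: str,
    draft: str,
    principles: List[dict],
    make_rubric: Callable[[str, List[dict]], List[str]],
    critique: Callable[[str, List[str]], List[str]],
    revise: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> str:
    """Revise a draft until no rubric item is unmet or the round budget runs out."""
    rubric = make_rubric(question, principles)   # principles -> per-question rubric
    answer = draft
    for _ in range(max_rounds):
        unmet = critique(answer, rubric)         # rubric items the answer still misses
        if not unmet:
            break
        answer = revise(answer, unmet)           # one revision pass per round
    return answer
```

In this sketch the three callables would each wrap an LLM call; bounding the loop with `max_rounds` keeps test-time cost predictable even when the critique never comes back empty.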
🤗 Models: ClinAlign-4B • ClinAlign-30B-A3B
Generate Scenario Classification:
python run_scenario_gen.py \
--mode api \
--model_name gpt-5.1 \
--model_api_url "https://your-provider.example.com/v1/chat/completions" \
--model_api_key "YOUR_KEY" \
--input_path data/input.jsonl \
--output_dir outputs \
--temperature 0.2 \
--concurrency 50 \
--overwrite
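
Before launching a run with `--concurrency 50`, it can help to confirm that the file passed to `--input_path` is well-formed JSONL. The sketch below checks only that each non-empty line parses as a JSON object; it does not validate field names, because the schema `run_scenario_gen.py` expects is not described here. The helper name `check_jsonl` is made up for illustration.

```python
import json
from pathlib import Path


def check_jsonl(path):
    """Return (count of valid records, list of (line_number, error message))."""
    valid, errors = 0, []
    with Path(path).open(encoding="utf-8") as fh:
        for lineno, line in enumerate(fh, start=1):
            if not line.strip():
                continue  # tolerate blank lines
            try:
                record = json.loads(line)
            except json.JSONDecodeError as exc:
                errors.append((lineno, exc.msg))
                continue
            if not isinstance(record, dict):
                errors.append((lineno, "record is not a JSON object"))
                continue
            valid += 1
    return valid, errors
```

Calling `check_jsonl("data/input.jsonl")` returns the number of usable records together with the line numbers that need fixing.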
Extract Principle Pool:
python attach_principle_pool.py \
--in_jsonl /path/to/input.jsonl \
--principles_json principles_v1.json \
--out_jsonl /path/to/output.with_pool.jsonl
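
As a rough picture of what attaching a principle pool could involve, the sketch below matches each record's scenario labels against per-principle constraints. The schema is entirely hypothetical (a `scenario` dict on each record, an `applies_to` dict on each principle, and a `principle_pool` output field); the actual formats used by `attach_principle_pool.py` and `principles_v1.json` may differ.

```python
def matches(principle, scenario):
    """True if every dimension the principle constrains agrees with the scenario."""
    return all(
        scenario.get(dim) in allowed
        for dim, allowed in principle["applies_to"].items()
    )


def attach_pool(records, principles):
    """Return copies of the records with a list of matching principle ids attached."""
    out = []
    for rec in records:
        scenario = rec.get("scenario", {})
        pool = [p["id"] for p in principles if matches(p, scenario)]
        out.append({**rec, "principle_pool": pool})  # hypothetical output field
    return out
```

Under this assumed schema, a principle with an empty `applies_to` is treated as universal and lands in every record's pool.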
Generate Principles-Based Rubrics:
python gen_rubric_from_principles.py \
--mode api \
--model_name gpt-5.1 \
--model_api_url "https://your-provider.example.com/v1/chat/completions" \
--model_api_key "your_api_key_here" \
--input_path data/input.jsonl \
--output_dir outputs \
--temperature 0.2 \
--concurrency 50
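
To avoid pasting the API key into a shell command or a saved script, the same invocation can be assembled from Python with the key read from the environment. This is a sketch: the flags are the ones shown in the command above, while the environment variable name `CLINALIGN_API_KEY` and the helper `rubric_command` are assumptions made for illustration.

```python
import os
import subprocess


def rubric_command(input_path, output_dir, api_url,
                   model_name="gpt-5.1", env_var="CLINALIGN_API_KEY"):
    """Build the argument list for gen_rubric_from_principles.py."""
    key = os.environ.get(env_var)  # hypothetical variable name
    if not key:
        raise RuntimeError(f"set {env_var} before running")
    return [
        "python", "gen_rubric_from_principles.py",
        "--mode", "api",
        "--model_name", model_name,
        "--model_api_url", api_url,
        "--model_api_key", key,
        "--input_path", input_path,
        "--output_dir", output_dir,
        "--temperature", "0.2",
        "--concurrency", "50",
    ]


# Example launch (commented out so the sketch has no side effects):
# subprocess.run(
#     rubric_command("data/input.jsonl", "outputs",
#                    "https://your-provider.example.com/v1/chat/completions"),
#     check=True,
# )
```

Passing the arguments as a list rather than a single string means no shell quoting is needed for the URL or paths.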

