This is the code for BACo, proposed in *Optimizing Diversity and Quality through Base–Aligned Model Collaboration* (arXiv).
The repo is currently under construction; more code, data, and instructions will be added soon.
The code requires Python 3.12 or above.
We recommend using conda to set up the environment:
conda create --name baco python=3.12
conda activate baco
pip install -r requirements.txt
Use vLLM to host the models.
# serve base model
bash scripts/serve_base_1gpu.sh
# serve aligned model
bash scripts/serve_aligned_1gpu.sh
Note: the port used for each model appears at the start of the shell output. Please take note of it, as we will use it later. (We assign models to available ports on the host machine, so each hosting instance may be assigned a different port.)
bash scripts/serve_reward_1gpu.sh
Note: take note of the port that is used.
bash scripts/run_baco.sh
The detailed command is:
python -m src.baco.inference.run \
    --dataset_name dataset_name \
    --subset subset_name \
    --base_host base_host_url \
    --aligned_host aligned_host_url \
    --base_model base_model_name \
    --aligned_model aligned_model_name \
    --num_sample num_of_sampled_prompts \
    --base_temperature base_model_temperature \
    --aligned_temperature aligned_model_temperature \
    --exp experiment_type \
    --rerun --num_threads num_of_threads_in_parallel \
    --router router_name \
    --output_root_dir output_root_dir \
    --top_prob_thres threshold_value
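As a rough intuition for the `--top_prob_thres` flag, a top-probability router can be sketched as follows. This is a toy illustration, not the repo's actual router: the function name and the exact routing rule are assumptions made for the example.

```python
def route_token(base_top_prob: float, threshold: float = 0.1) -> str:
    """Toy routing rule: if the base model is confident about its next
    token (top probability at or above the threshold), keep the base
    token for diversity; otherwise fall back to the aligned model for
    quality. An illustrative guess, not BACo's exact logic."""
    return "base" if base_top_prob >= threshold else "aligned"

# Per-token routing over a sequence of (made-up) base-model top probabilities
top_probs = [0.45, 0.08, 0.30, 0.05]
print([route_token(p, threshold=0.1) for p in top_probs])
# -> ['base', 'aligned', 'base', 'aligned']
```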
The outputs are at:
- Checkpoint Dir Path: saves all probabilities and model contributions; generation can also be resumed from here if it is accidentally killed (i.e., ckpt_dir, e.g., ckpts/baco/runs/0/outputs_verbose/baco_top_prob_rev+punc_rule/novelty-bench-curated/tbase_1.0_tnudge_1.0_thres_0.1/all_info).
- Output Dir Path: saves the clean outputs into a JSONL file, including prompt_id, model, config, prompts, and all generations (i.e., output_dir, e.g., outputs/baco/runs/0/novelty-bench-curated/baco_base_Meta-Llama-3-8B_align_Meta-Llama-3-8B-Instruct/prob+punc_thres_0.1/input-num_4_samples).
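For downstream processing, the output JSONL can be loaded one record per line. The field names below are taken from the description above; treat the exact schema as an assumption until you inspect a real file:

```python
import json

def load_generations(path: str) -> list[dict]:
    """Load BACo outputs from a JSONL file, one JSON record per line.
    Expected keys (per the README): prompt_id, model, config,
    prompts, and the generations."""
    with open(path, encoding="utf-8") as f:
        return [json.loads(line) for line in f if line.strip()]

# Usage (path is illustrative):
# records = load_generations("outputs/baco/runs/0/.../generations.jsonl")
# print(records[0]["prompt_id"], records[0]["model"])
```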
If you want to manually inspect the outputs, we provide two visualizations that also highlight each token with the model (base or aligned) that contributed it.
Visualize directly in terminal:
python -m src.baco.inference.text_visualization --data_dir <ckpt_dir> --n <number_of_outputs_to_visualize>
<ckpt_dir> should be set to the Checkpoint Dir Path (from the inference stage).
Or you can visualize in a notebook with a better interface: run src/baco/inference/text_visualization.ipynb, and update data_dir in the first cell with your ckpt_dir before running.
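The idea behind the terminal highlighting can be sketched with plain ANSI colors. This is a minimal standalone example; the repo's visualizer and its data format may differ, and the token/label pairs here are made up:

```python
# Color tokens by contributing model using ANSI escape codes.
COLORS = {"base": "\033[94m", "aligned": "\033[92m"}  # blue / green
RESET = "\033[0m"

def highlight(tokens: list[tuple[str, str]]) -> str:
    """Join (token, model) pairs into one string, coloring each token
    by which model (base or aligned) contributed it."""
    return "".join(f"{COLORS[model]}{tok}{RESET}" for tok, model in tokens)

print(highlight([("Hello", "aligned"), (" world", "base"), ("!", "aligned")]))
```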
Evaluate the given generations with all the diversity and quality metrics.
bash scripts/run_eval.sh <output_dir> <align_port> <reward_port>
<output_dir> should be set to the Output Dir Path (from the inference stage); <align_port> is the port of the aligned model (the 4-digit number from the hosting stage); <reward_port> is the port of the reward model.
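To get a sense of what a diversity metric measures, distinct-n (the fraction of unique n-grams pooled across generations) can be computed as follows. This is a standard metric given as background; the repo's evaluation scripts may use different metrics and tokenization:

```python
def distinct_n(generations: list[str], n: int = 2) -> float:
    """Distinct-n: unique n-grams divided by total n-grams, pooled
    over all generations. Whitespace tokenization for simplicity."""
    ngrams = []
    for text in generations:
        tokens = text.split()
        ngrams.extend(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return len(set(ngrams)) / len(ngrams) if ngrams else 0.0

print(distinct_n(["the cat sat", "the cat ran"], n=2))  # -> 0.75
```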
This repo is built on top of Nudging.