Run bias control experiments for authority, bandwagon, framing, and confirmation biases.
cd examples/unified_bias
# Run prompt-based experiment (fast)
python run_pipelines.py --bias authority --method prompt --test
# Run RepE experiment (requires training)
python run_pipelines.py --bias authority --method repe --test
# Run all models in batch
python run_batch.py

| File | Purpose |
|---|---|
| run_pipelines.py | Main entry point - Run single experiments |
| pipelines.py | Core logic for prompt/RepE pipelines |
| run_batch.py | Batch runner for multiple models |
| utils_bias.py | Data management and utilities |
| bias.py | run_pipelines.py |
- Separation of concerns: Prompt and RepE are separate pipelines
- Reusable training: RepReader trained once, used for all experiments
- Clean data flow: Generated data for training, original data for testing
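
The "trained once, used for all experiments" principle can be illustrated with a small cache. This is only a sketch: `get_rep_reader`, `run_experiments`, and the dictionary standing in for a trained RepReader are hypothetical names for illustration and are not taken from `pipelines.py`.

```python
from functools import lru_cache
from typing import List, Tuple

TRAIN_CALLS: List[str] = []  # records every (fake) training run for inspection


@lru_cache(maxsize=None)
def get_rep_reader(bias: str) -> dict:
    """Stand-in for RepReader training; the result is cached per bias type."""
    TRAIN_CALLS.append(bias)
    # Hypothetical placeholder for whatever a trained reader would hold.
    return {"bias": bias, "direction": f"{bias}_direction"}


def run_experiments(bias: str, experiments: List[str]) -> List[Tuple[str, str]]:
    """Run each experiment with the same cached reader."""
    results = []
    for name in experiments:
        reader = get_rep_reader(bias)  # trains on the first call only
        results.append((name, reader["direction"]))
    return results


results = run_experiments("authority", ["milgram", "stanford"])
print(results)
print(TRAIN_CALLS)  # training happened a single time for "authority"
```

The cache is keyed on the bias type, so switching to another bias triggers one more training run and nothing else.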
cd examples/unified_bias

# Basic usage (prompt-based)
python run_pipelines.py --bias authority --method prompt
# Test mode (faster, fewer scenarios)
python run_pipelines.py --bias bandwagon --method prompt --test
# With custom temperature
python run_pipelines.py --bias framing --method prompt --temp 0.7

# Basic usage (RepE)
python run_pipelines.py --bias authority --method repe
# With specific operators
python run_pipelines.py --bias confirmation --method repe --operators linear_comb,projection
# Test mode
python run_pipelines.py --bias framing --method repe --test

| Option | Values | Description |
|---|---|---|
| --bias | authority, bandwagon, framing, confirmation | Which bias type to test |
| --method | prompt, repe | Experiment method |
| --test | flag | Run in test mode (4 scenarios only) |
| --temp | float (e.g., 0.7) | Temperature for generation |
| --model | model name | Specific model to use |
| --operators | comma-separated | RepE operators (e.g., linear_comb,projection) |
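
For readers who want to see how these options fit together, here is a minimal `argparse` sketch that mirrors the table above. It is not the actual parser in `run_pipelines.py`: which options are required, the default values, and the helper name `parse_operators` are all assumptions.

```python
import argparse
from typing import List

BIASES = ["authority", "bandwagon", "framing", "confirmation"]
METHODS = ["prompt", "repe"]


def parse_operators(value: str) -> List[str]:
    """Split a comma-separated operator list, e.g. 'linear_comb,projection'."""
    return [op.strip() for op in value.split(",") if op.strip()]


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(description="Run a single bias control experiment")
    # Assumption: --bias and --method are mandatory; the real script may default them.
    parser.add_argument("--bias", choices=BIASES, required=True, help="Which bias type to test")
    parser.add_argument("--method", choices=METHODS, required=True, help="Experiment method")
    parser.add_argument("--test", action="store_true", help="Run in test mode")
    parser.add_argument("--temp", type=float, default=None, help="Temperature for generation")
    parser.add_argument("--model", default=None, help="Specific model to use")
    parser.add_argument("--operators", type=parse_operators, default=None,
                        help="Comma-separated RepE operators")
    return parser


args = build_parser().parse_args(
    ["--bias", "confirmation", "--method", "repe", "--operators", "linear_comb,projection"]
)
print(args.bias, args.method, args.operators)
```

Passing an explicit argument list to `parse_args` keeps the sketch independent of the command line it is run from.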
The original datasets (used for testing) are located in ../../data/.
| Bias Type | Experiments |
|---|---|
| Authority | authority_MilgramS.json (milgram), authority_StanPri.json (stanford) |
| Bandwagon | bandwagon_Asch.json (asch), bandwagon_Hotel.json (hotel) |
| Framing | framing_Asian.json (asian), framing_Invest.json (invest) |
| Confirmation | confirmation_BiasInfo.json (bias_info), confirmation_Wason.json (wason) |
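
The table above can be written down as a lookup from bias type and experiment key to file path. The sketch below only builds paths; the names `EXPERIMENT_FILES` and `experiment_paths` are illustrative, and how `utils_bias.py` actually resolves these files is not shown here.

```python
from pathlib import Path
from typing import Dict

DATA_DIR = Path("../../data")  # relative to examples/unified_bias

# Bias type -> experiment key -> dataset file name (copied from the table).
EXPERIMENT_FILES: Dict[str, Dict[str, str]] = {
    "authority": {"milgram": "authority_MilgramS.json", "stanford": "authority_StanPri.json"},
    "bandwagon": {"asch": "bandwagon_Asch.json", "hotel": "bandwagon_Hotel.json"},
    "framing": {"asian": "framing_Asian.json", "invest": "framing_Invest.json"},
    "confirmation": {"bias_info": "confirmation_BiasInfo.json", "wason": "confirmation_Wason.json"},
}


def experiment_paths(bias: str, data_dir: Path = DATA_DIR) -> Dict[str, Path]:
    """Return {experiment key: dataset path} for one bias type."""
    if bias not in EXPERIMENT_FILES:
        raise ValueError(f"Unknown bias type: {bias!r}")
    return {key: data_dir / name for key, name in EXPERIMENT_FILES[bias].items()}


for key, path in experiment_paths("authority").items():
    print(key, path)
```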
The generated datasets (used for RepReader training) are located in ../../data_generated/.
- authority_generated_20250810_160938.json
- bandwagon_generated.json
- framing_generated.json
- confirmation_generated.json
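
Because the authority file carries a timestamp suffix while the others do not, a glob is one way to locate the generated file for a given bias type. This is a sketch under that assumption; `find_generated_dataset` is a hypothetical helper, not a function from `utils_bias.py`.

```python
from pathlib import Path
from typing import Optional


def find_generated_dataset(bias: str, data_dir: Path) -> Optional[Path]:
    """Return the last '<bias>_generated*.json' file in name order, or None.

    Names are compared lexicographically, so suffixes in YYYYMMDD_HHMMSS
    form sort chronologically and the most recent file comes last.
    """
    candidates = sorted(data_dir.glob(f"{bias}_generated*.json"))
    return candidates[-1] if candidates else None


path = find_generated_dataset("authority", Path("../../data_generated"))
print(path if path is not None else "no generated dataset found")
```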
The prompt pipeline runs these steps:
- Load model and tokenizer
- Load test scenarios from original datasets
- Run prompt-based control experiments
- Save results to {bias_type}_plots/
The RepE pipeline runs these steps:
- Load model and tokenizer
- Load generated data for RepReader training
- Train RepReader to detect bias directions
- Evaluate RepReader accuracy
- Load test scenarios from original datasets
- Run RepE control experiments with trained RepReader
- Save results to {bias_type}_plots/
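
The step order of the two pipelines can be sketched as plain lists driven by a small runner. The step names and the `run_steps` function are made up for illustration, and the handlers are no-op placeholders; the real logic lives in `pipelines.py` and may be organized differently.

```python
from typing import Callable, Dict, List

PROMPT_STEPS: List[str] = [
    "load_model", "load_test_scenarios", "run_prompt_control", "save_results",
]
REPE_STEPS: List[str] = [
    "load_model", "load_generated_data", "train_rep_reader", "evaluate_rep_reader",
    "load_test_scenarios", "run_repe_control", "save_results",
]


def run_steps(step_names: List[str],
              handlers: Dict[str, Callable[[dict], None]],
              context: dict) -> dict:
    """Call each handler in order, recording finished steps in the context."""
    for name in step_names:
        handlers[name](context)
        context.setdefault("completed", []).append(name)
    return context


def noop(context: dict) -> None:
    """Placeholder; a real handler would load models, train, or plot."""


handlers = {name: noop for name in set(PROMPT_STEPS) | set(REPE_STEPS)}
context = run_steps(REPE_STEPS, handlers, {"bias": "authority"})
print(context["completed"])
```

Written this way, it is easy to see that training happens on generated data before any original test scenario is loaded.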
Each run produces the following outputs:
- Plots: {bias_type}_plots/*.png
- Data: {bias_type}_plots/*_plot_data.json
- Logs: Console output
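
To inspect the saved data after a run, the JSON files can be collected with a glob. This sketch assumes only the file naming pattern given above; the structure inside each JSON file is not documented here, so the loader returns it untouched. `load_plot_data` is an illustrative name.

```python
import json
from pathlib import Path
from typing import Dict


def load_plot_data(bias_type: str, base_dir: Path = Path(".")) -> Dict[str, object]:
    """Load every '*_plot_data.json' file from '<bias_type>_plots/'."""
    plots_dir = base_dir / f"{bias_type}_plots"
    data: Dict[str, object] = {}
    for path in sorted(plots_dir.glob("*_plot_data.json")):
        with path.open(encoding="utf-8") as f:
            data[path.name] = json.load(f)
    return data


loaded = load_plot_data("authority")
print(f"Loaded {len(loaded)} plot data file(s)")
```

If the directory does not exist yet, the glob yields nothing and the function returns an empty dictionary.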
$ cd examples/unified_bias
$ python run_pipelines.py --bias authority --method prompt --test
=== PROMPT PIPELINE :: authority ===
--- 1. Loading Model and Tokenizer ---
Model loaded on device: cuda
--- Preparing Generated Authority Dataset ---
Using dataset: ../../data_generated/authority_generated_20250810_160938.json
Training data size: 512, Test data size: 256
--- Loading Original Authority Scenarios ---
Loaded 4 scenarios for milgram
Loaded 4 scenarios for stanford
--- Prompt Control :: Authority - Milgram ---
Running experiments...
✓ Results saved
--- Prompt Control :: Authority - Stanford ---
Running experiments...
✓ Results saved
[Prompt] Results saved in authority_plots/