LLM Framework for Goal-oriented Dialogue Systems

This is the official repo for the paper: "Guideline Compliance in Task-Oriented Dialogue: The Chained Prior Approach".

The code base relies on the Hugging Face Transformers library.

News:

Our paper has been accepted to NAACL 2025 (Findings)!

Data

In this work we use two datasets: ABCD (Chen et al., 2021) and MultiWOZ 2.2 (Zang et al., 2020) (pending).

Create folder Structure

Create the following folder structure to hold all the data:

<Project Directory>/
└── data/
    ├── raw 
    └── processed

mkdir -p data/raw
mkdir -p data/processed

Copy Action Mapping files

In this work we map action names to human-written names (e.g., "pull up customer account" instead of "pull-up-account"). This code base includes the mappings that were used in all our experiments on both datasets.

cp ${Clone_Directory}/resources/abcd_action_mappings.json data/raw
cp ${Clone_Directory}/resources/multiwoz_action_mappings.json data/raw
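
To illustrate what the mapping does, here is a minimal sketch. It assumes the mapping file is a flat JSON object from raw action name to human-written name; the actual schema of abcd_action_mappings.json may differ, and `humanize_action` is a hypothetical helper, not a function from this code base.

```python
import json

# Hypothetical excerpt of a mapping file. The single pair shown is the
# example given in this README; the flat-object schema is an assumption.
EXAMPLE_MAPPING_JSON = '{"pull-up-account": "pull up customer account"}'


def humanize_action(action, mapping):
    """Return the human-written name for an action, or the raw name if unmapped."""
    return mapping.get(action, action)


mapping = json.loads(EXAMPLE_MAPPING_JSON)
print(humanize_action("pull-up-account", mapping))
print(humanize_action("unknown-action", mapping))
```

Falling back to the raw name keeps unmapped actions visible instead of failing, which may or may not match how the processing scripts behave.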

Download ABCD Dataset

Since ABCD is not available on Hugging Face Datasets, download it manually:

cd data/raw
wget https://github.com/asappresearch/abcd/raw/master/data/abcd_v1.1.json.gz
wget https://raw.githubusercontent.com/asappresearch/abcd/master/data/guidelines.json
wget https://raw.githubusercontent.com/asappresearch/abcd/master/data/ontology.json
wget https://raw.githubusercontent.com/asappresearch/abcd/master/data/utterances.json
gunzip abcd_v1.1.json.gz
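
If wget or gunzip is not available, the same steps can be sketched with the Python standard library. The URLs are the ones listed above; the helper names `gunzip_file` and `download_abcd` are made up for this illustration and are not part of the repository.

```python
import gzip
import shutil
import urllib.request
from pathlib import Path

# Same files as the wget commands above.
ABCD_URLS = [
    "https://github.com/asappresearch/abcd/raw/master/data/abcd_v1.1.json.gz",
    "https://raw.githubusercontent.com/asappresearch/abcd/master/data/guidelines.json",
    "https://raw.githubusercontent.com/asappresearch/abcd/master/data/ontology.json",
    "https://raw.githubusercontent.com/asappresearch/abcd/master/data/utterances.json",
]


def gunzip_file(src: Path) -> Path:
    """Decompress a .gz file next to itself and delete the archive, like gunzip."""
    dst = src.with_suffix("")  # abcd_v1.1.json.gz -> abcd_v1.1.json
    with gzip.open(src, "rb") as f_in, open(dst, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    src.unlink()
    return dst


def download_abcd(raw_dir: Path = Path("data/raw")) -> None:
    """Download every ABCD file into raw_dir and unpack the gzipped one."""
    raw_dir.mkdir(parents=True, exist_ok=True)
    for url in ABCD_URLS:
        dest = raw_dir / url.rsplit("/", 1)[-1]
        urllib.request.urlretrieve(url, dest)
        if dest.suffix == ".gz":
            gunzip_file(dest)
```

Calling `download_abcd()` from the project directory should leave the same four JSON files in data/raw as the shell commands do.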

Install requirements

# Activate your virtual environment
# Chained prior module relies on aalpy
pip install -r requirements.txt

Create Datasets for ABCD

bash generate_data.sh

Once the script above runs successfully, you should see the following files in the processed data folder:

<Project Directory>/
└── data/
    └── processed 
       ├── train_workflow_discovery_abcd.json 
       ├── dev_workflow_discovery_abcd.json 
       ├── test_workflow_discovery_abcd.json 
       ├── train_AST_abcd.json 
       ├── dev_AST_abcd.json 
       ├── test_AST_abcd.json 
       ├── train_CDS_abcd.json 
       ├── dev_CDS_abcd.json 
       ├── test_CDS_abcd.json 
       ├── train_workflow_discovery_multiwoz.json 
       ├── validation_workflow_discovery_multiwoz.json 
       └── test_workflow_discovery_multiwoz.json
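
A quick way to confirm that data generation completed is to check for the files listed above. This is a minimal sketch; `missing_files` is a hypothetical helper, and the file names are taken from the listing above.

```python
from pathlib import Path

# The twelve file names from the listing above.
EXPECTED_FILES = [
    f"{split}_{task}_abcd.json"
    for task in ("workflow_discovery", "AST", "CDS")
    for split in ("train", "dev", "test")
] + [
    f"{split}_workflow_discovery_multiwoz.json"
    for split in ("train", "validation", "test")
]


def missing_files(processed_dir="data/processed"):
    """Return the expected file names that are not present in processed_dir."""
    root = Path(processed_dir)
    return [name for name in EXPECTED_FILES if not (root / name).is_file()]


if __name__ == "__main__":
    missing = missing_files()
    print("All files present." if not missing else f"Missing: {missing}")
```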

Train Policy Model

Run

# set cuda device
export CUDA_VISIBLE_DEVICES=2,3
python train.py --experiment_name abcdASTWOActionFull \
  --model_name_or_path t5-small \
  --do_eval \
  --do_predict \
  --num_train_epochs 100 \
  --train_file ./data/processed/train_AST_abcd-full.json \
  --validation_file ./data/processed/dev_AST_abcd-full.json \
  --test_file ./data/processed/test_AST_abcd-full.json \
  --text_column input \
  --summary_column target \
  --per_device_train_batch_size 32 \
  --per_device_eval_batch_size 32 \
  --predict_with_generate \
  --output_dir ./results/ \
  --save_strategy epoch \
  --source_prefix "Predict AST: " \
  --max_source_length 1024 \
  --max_target_length 256 \
  --val_max_target_length 256 \
  --learning_rate 5e-5 \
  --warmup_steps 500 \
  --use_ast_metrics \
  --use_fast_tokenizer False
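
The --text_column, --summary_column and --source_prefix flags suggest how each record becomes a training pair. The sketch below assumes train.py follows the common convention of Hugging Face sequence-to-sequence example scripts, where the prefix is prepended to the source text; this README does not confirm that, and both `build_example` and the record contents are hypothetical.

```python
def build_example(record, text_column="input", summary_column="target",
                  source_prefix="Predict AST: "):
    """Pair the prefixed source text with its target string.

    Assumption: the prefix is simply concatenated in front of the source
    column, as seq2seq example scripts commonly do.
    """
    return source_prefix + record[text_column], record[summary_column]


# Placeholder record; real contents come from the files produced by generate_data.sh.
record = {"input": "<dialogue context>", "target": "<action sequence>"}
source, target = build_example(record)
print(source)
print(target)
```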

Evaluate

# set cuda device
export CUDA_VISIBLE_DEVICES=6,7
python train.py --experiment_name abcdASTWAction \
  --model_name_or_path results/abcdASTWAction_input_target_t5-small/checkpoint-30500 \
  --do_predict \
  --train_file ./data/processed/test_AST_abcd_50.json \
  --validation_file ./data/processed/dev_AST_abcd-full.json \
  --test_file ./data/processed/test_AST_abcd_w_action_full.json \
  --text_column input \
  --summary_column target \
  --per_device_train_batch_size 32 \
  --per_device_eval_batch_size 32 \
  --predict_with_generate \
  --output_dir ./results/ \
  --save_strategy epoch \
  --source_prefix "Predict AST: " \
  --max_source_length 1024 \
  --max_target_length 256 \
  --val_max_target_length 256 \
  --learning_rate 5e-5 \
  --warmup_steps 500 \
  --use_fast_tokenizer False \
  --use_ast_metrics \
  --num_beams 4

Note:

The --num_beams parameter sets the number of beams for beam search; it must be set to 4 for evaluation.
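
To show why the number of beams can change the decoded output, here is a toy beam search over a made-up two-step model. It is an illustration only, not the Transformers implementation (which also handles end-of-sequence tokens, length penalties and batching); `TOY_MODEL` and its probabilities are invented for this example.

```python
import math

# Made-up conditional model: prefix (tuple of tokens) -> next-token probabilities.
TOY_MODEL = {
    (): {"a": 0.6, "b": 0.4},
    ("a",): {"x": 0.55, "y": 0.45},
    ("b",): {"x": 0.9, "y": 0.1},
}


def beam_search(model, num_beams, length):
    """Keep the num_beams highest-scoring partial sequences at every step."""
    beams = [((), 0.0)]  # (prefix, cumulative log-probability)
    for _ in range(length):
        candidates = [
            (prefix + (token,), score + math.log(p))
            for prefix, score in beams
            for token, p in model[prefix].items()
        ]
        candidates.sort(key=lambda c: c[1], reverse=True)
        beams = candidates[:num_beams]
    return beams


greedy = beam_search(TOY_MODEL, num_beams=1, length=2)[0][0]
wide = beam_search(TOY_MODEL, num_beams=2, length=2)[0][0]
print(greedy, wide)
```

With one beam the search commits to "a" first and ends at ('a', 'x') with probability 0.33; with two beams it keeps "b" alive and finds ('b', 'x') with probability 0.36. Results obtained with a different beam count are therefore not directly comparable.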

Cite:

@inproceedings{wen-etal-2025-guideline,
    title = "Guideline Compliance in Task-Oriented Dialogue: The Chained Prior Approach",
    author = "Wen, Xiangyu and Zhong, Jianyuan and Xu, Zhijian and Xu, Qiang",
    booktitle = "Findings of the Association for Computational Linguistics: NAACL 2025",
    year = "2025",
    url = "https://aclanthology.org/2025.findings-naacl.377/",
    pages = "6750--6776",
}
