This project implements a pipeline to convert clinical text guidelines into a structured decision tree, generate synthetic clinical vignettes, and traverse the decision tree to simulate classification or referral suggestions based on the vignettes.
- `text_headache_guideline.txt`
  The original guideline document, describing headache diagnosis and referral criteria in free text.
- `text2tree.py`
  Parses the raw guideline text and converts it into a set of decision tree substructures.
  - Output: `decision_trees_output_full.json`
  - Note: The guideline text is split into smaller segments to improve model performance during parsing.
- `merge_tree.py`
  Merges all subtrees into a unified structured decision tree.
  - Output: `merged_decision_tree.json`
- `generate_vignette.py`
  Automatically generates clinical vignettes with various combinations of positive and negative features.
  - Output: Multiple JSON files containing vignettes in the `vignettes_part/` directory
- `merge_data.py`
  Merges all vignette JSON files into a single CSV for easy processing and analysis.
  - Output: `vignettes_data.csv`
- `traverse_tree.py`
  Main execution script that:
  - Loads the merged decision tree
  - Uses each vignette as input
  - Generates prompts for each decision node
  - Traverses the tree according to model decisions
  - Output: Results are saved in the `results/` directory
- `results.ipynb`
  Jupyter Notebook for evaluating the outputs. Includes analysis of the traversal and classification performance.
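
The merge step performed by `merge_data.py` can be pictured with the minimal sketch below. It is not the script's actual code. It assumes each file in `vignettes_part/` holds either one vignette object or a list of them, with flat fields; the function name `merge_vignettes` and the `source_file` column are hypothetical and used only for illustration.

```python
import csv
import json
from pathlib import Path


def merge_vignettes(parts_dir, out_csv):
    """Collect vignette records from every JSON file in parts_dir into one CSV.

    Returns the number of rows written. Hypothetical helper, not the
    project's own implementation.
    """
    rows = []
    for path in sorted(Path(parts_dir).glob("*.json")):
        with path.open(encoding="utf-8") as f:
            data = json.load(f)
        # Assumption: a file holds either one vignette object or a list of them.
        records = data if isinstance(data, list) else [data]
        for record in records:
            # Keep the originating file name so each row stays traceable.
            rows.append({"source_file": path.name, **record})

    # Use the union of all keys, in first-seen order, so no field is dropped.
    fieldnames = []
    for row in rows:
        for key in row:
            if key not in fieldnames:
                fieldnames.append(key)

    with open(out_csv, "w", newline="", encoding="utf-8") as f:
        # restval fills fields that a given vignette does not have.
        writer = csv.DictWriter(f, fieldnames=fieldnames, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Under these assumptions, calling `merge_vignettes("vignettes_part", "vignettes_data.csv")` would write one row per vignette. Vignettes with nested fields would need flattening first.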
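
The traversal loop described for `traverse_tree.py` can be sketched as follows. The tree schema is an assumption, since the real layout of `merged_decision_tree.json` is not documented here: the sketch uses a dict of nodes keyed by id, where decision nodes carry `question`, `yes`, and `no`, and leaves carry `outcome`. The `decide` callable stands in for the model call, and `build_prompt` and `traverse` are hypothetical names.

```python
def build_prompt(vignette, question):
    """Combine one vignette and one node question into a yes/no prompt."""
    return (
        "Clinical vignette:\n" + vignette + "\n\n"
        "Question: " + question + "\nAnswer yes or no."
    )


def traverse(tree, vignette, decide, root="root", max_steps=100):
    """Walk the tree from root to a leaf.

    tree   -- assumed schema: {node_id: {"question", "yes", "no"}} for
              decision nodes and {node_id: {"outcome"}} for leaves
    decide -- stand-in for the model: takes a prompt, returns True or False
    Returns (outcome, list of visited node ids).
    """
    node_id = root
    path = []
    for _ in range(max_steps):
        node = tree[node_id]
        path.append(node_id)
        if "outcome" in node:  # leaf reached: stop and report the result
            return node["outcome"], path
        answer = decide(build_prompt(vignette, node["question"]))
        node_id = node["yes"] if answer else node["no"]
    # Guard against malformed trees that loop back on themselves.
    raise RuntimeError("no leaf reached; the tree may contain a cycle")
```

Returning the visited path alongside the outcome makes it possible to see at which node a traversal diverged from the expected route, which helps when analysing results. The `max_steps` guard is a design choice for this sketch, not something the project is known to use.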