-
Notifications
You must be signed in to change notification settings - Fork 0
jodie-c/DCC-Tree
Folders and files
Name | Name | Last commit message | Last commit date | |
---|---|---|---|---|
Repository files navigation
############################################## DCC-Tree ############################################## DCC-Tree is a method for exploring the posterior distribution of Bayesian decision trees that combines the work of the Divide Conquer Combine algorithm (Zhou et al. (2020)) with Hamiltonian Monte Carlo sampling. ############################################## Requirements/Installation ############################################## DCC-Tree is implemented in Python. Use the following commands to create the required environment to run the method. - ensure you are in the correct directory - run ./scripts/setup.sh via terminal/command prompt Manual changes have been made to the NumPyro package when running setup.sh, which need to be copied across correctly to run the method. ############################################## Usage ############################################## Once requirements have been installed the DCC-Tree method can be run. Use the following commands: Activate environment from the correct folder (the dcc-env must be active to correctly use DCC-Tree) using: source activate ./envs/dcc-env Example usage: python ./src/dcc-tree/DCC-Tree.py --dataset toy-class --out_dir ./results/ --init_id 1 --h_init 0.1 --h_final 0.001 --num_chains 16 --num_samps 100 --num_warmup 1000 --T0 100 --T 200 --save 1 --alpha 0.95 --beta 1.0 --M 10 python ./src/dcc-tree/DCC-Tree.py --dataset toy-non-sym --out_dir ./results/ --init_id 1 --tag 1 --h_init 0.1 --h_final 0.001 --num_chains 16 --num_samps 100 --num_warmup 1000 --T0 100 --T 500 --save 1 --alpha 0.95 --beta 1.0 --M 10 For further help and information on arguments: python ./src/dcc-tree/DCC-Tree.py -h All methods (both final runs and cross-validation) were run using shell scripts located in the "scripts" directory. Uncomment the relevant lines corresponding to the desired method script to run. ############################################## Training/Evaluation ############################################## Running the DCC-Tree.py file will run the training. Tree information/statistics are saved and output in a pickle file in the specified output directory (changed using the --out_dir argument). Performance metrics can be computed from this using the convert.py file. This will convert the output results to a matlab data file. There are several synthetic testing datasets that can be selected via the --dataset argument, available options are listed in the load_data method in utils.py. More information about formatting datasets into the appropriate structure can be found in ./datasets/README.txt Results from the paper were generated via CPU cores, either using Intel Cascade Lake CPU's (OS: Rocky Linux release 8.9 (Green Obsidian)) via a HPC cluster or using a MacBook Pro (16-inch, 2021) with an Apple M1 Pro processor.
About
No description, website, or topics provided.
Resources
Stars
Watchers
Forks
Releases
No releases published
Packages 0
No packages published