Python code to reproduce all plots in:
❝Bayesian optimization of nanoporous materials❞ A. Deshwal, C. Simon, J. R. Doppa. Molecular Systems Design & Engineering. (2021) link preprint
the Python 3 libraries required for the project are listed in requirements.txt. use Jupyter Notebook or Jupyter Lab to run the Python 3 code in the *.ipynb files.
our paper relies on data from Mercado et al., available from Materials Cloud. we downloaded properties.tgz there and untarred it, giving properties.csv in new/. this is the data we use.
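the untar step can also be done from Python with the standard-library tarfile module. this is a minimal sketch, assuming properties.tgz has already been downloaded into the working directory. the helper name extract_archive and its default arguments are hypothetical and not part of this repository.

```python
import tarfile
from pathlib import Path


def extract_archive(archive="properties.tgz", out_dir="new"):
    """Unpack a gzipped tarball into out_dir and return the extracted file paths.

    Hypothetical helper: the default names only mirror those in the README.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive, "r:gz") as tar:
        # Record regular files only (skip directory entries).
        names = [m.name for m in tar.getmembers() if m.isfile()]
        if hasattr(tarfile, "data_filter"):
            # Newer Python versions: refuse unsafe members such as absolute paths.
            tar.extractall(out, filter="data")
        else:
            tar.extractall(out)
    return [out / name for name in names]


if __name__ == "__main__":
    # Only runs if the archive is present in the current directory.
    if Path("properties.tgz").exists():
        for path in extract_archive():
            print(path)
```

if the archive stores properties.csv at its top level, this should leave new/properties.csv in place for prepare_Xy.ipynb.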
run the code in the Jupyter Notebook prepare_Xy.ipynb to prepare the data and write inputs_and_outputs.pkl, which is read in by the other Notebooks. in this Notebook, you can set the number of runs nb_runs, the number of iterations per run nb_iterations, and, if you wish, a flag downsample_data for testing.
run the following Jupyter Notebooks, which will write search results to .pkl files.
- random_search.ipynb for random search
- evol_search.ipynb for evolutionary search (CMA-ES)
- random_forest_run.ipynb for one-shot supervised machine learning (via random forests). run twice, once with the flag diversify_training = True, the other with diversify_training = False.
- BO_run.ipynb for Bayesian optimization. run three times, with which_acquisition set to "EI", "max y_hat", and "max sigma".
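to illustrate what the three which_acquisition settings mean, the sketch below scores candidates from a surrogate model's predicted mean y_hat and uncertainty sigma. it is an illustration under stated assumptions, not the code in BO_run.ipynb: the function names, the maximization convention, and the toy numbers are all hypothetical, and "EI" is taken to be the standard closed-form expected improvement under a Gaussian posterior.

```python
import math


def expected_improvement(y_hat, sigma, y_best):
    """Closed-form expected improvement (maximization) for a Gaussian posterior."""
    if sigma <= 0.0:
        # No uncertainty: the improvement is deterministic.
        return max(y_hat - y_best, 0.0)
    z = (y_hat - y_best) / sigma
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return (y_hat - y_best) * cdf + sigma * pdf


def acquisition_scores(y_hats, sigmas, y_best, which_acquisition):
    """Score every candidate; a higher score means 'evaluate this one next'."""
    if which_acquisition == "EI":
        # Balances high predicted value against high uncertainty.
        return [expected_improvement(m, s, y_best) for m, s in zip(y_hats, sigmas)]
    if which_acquisition == "max y_hat":
        # Pure exploitation: trust the predicted mean.
        return list(y_hats)
    if which_acquisition == "max sigma":
        # Pure exploration: go where the model is least certain.
        return list(sigmas)
    raise ValueError(f"unknown acquisition: {which_acquisition!r}")


def pick_next(y_hats, sigmas, y_best, which_acquisition):
    """Return the index of the candidate with the highest acquisition score."""
    scores = acquisition_scores(y_hats, sigmas, y_best, which_acquisition)
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Toy posterior over three candidates (made-up numbers).
    y_hats = [1.0, 0.9, 0.2]
    sigmas = [0.05, 0.5, 1.0]
    y_best = 1.0  # best value observed so far
    for name in ("EI", "max y_hat", "max sigma"):
        print(name, pick_next(y_hats, sigmas, y_best, name))
```

with these toy numbers the three settings select three different candidates, which is the kind of behavioral difference the three BO runs are meant to compare.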
each .ipynb can be run on a desktop computer. the BO code takes the longest, at ~10 min per run.
finally, run viz.ipynb to read in the *.pkl files output from the search runs and visualize the results.
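to inspect the search results outside of viz.ipynb, the *.pkl files can be gathered with the standard library. this is a minimal sketch: the helper name load_search_results is hypothetical, and it assumes nothing about what each pickle contains, since that structure is defined by the Notebooks that wrote it.

```python
import pickle
from pathlib import Path


def load_search_results(results_dir="."):
    """Map each *.pkl file stem in results_dir to its unpickled contents.

    Hypothetical helper. Only unpickle files you created yourself:
    pickle can execute arbitrary code on load.
    """
    results = {}
    for path in sorted(Path(results_dir).glob("*.pkl")):
        with path.open("rb") as f:
            results[path.stem] = pickle.load(f)
    return results


if __name__ == "__main__":
    # Print the name and top-level type of every pickle in the current directory.
    for name, obj in load_search_results().items():
        print(name, type(obj).__name__)
```

note that this picks up every .pkl file in the directory, including inputs_and_outputs.pkl, so filter by name if only the search results are wanted.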
see synthetic_example.ipynb for the toy GP plots in the paper.