BioE-KimLab/aS0T1pred: code for the ML training, preprocessors and weights for the article "A fragment based approach towards curating, comparing and developing machine learning models applied in photochemistry" (DOI: 10.1039/d5sc05615b)

Introduction

This repository contains the code for hyperparameter tuning (arch_LR_optimization), the database analysis based on the molecular fragmentation algorithm described in the article (provided as a standalone package in fragmentation), and the code for training the selected models on each database and running predictions with them (selected_models).

Development Environment Setup

The environment used to develop the models and run all the training reported in the manuscript is described in original_conda_reqs.txt. It was created by cloning an environment provided on the specific HPC resources used and then installing the missing libraries. Because the cloned environment contained libraries tuned to those HPC resources, reproducing it elsewhere is not easy. After multiple attempts we developed a minimal version of the environment that is much easier to install. In all the tests we carried out, models trained with the minimal version achieved results similar to those obtained with the original environment; the only notable difference was the time needed to complete the training.

Minimal Environment Setup

To set up a minimal working environment for running the predictions or training the models:

First, create the environment using:

conda create -p path/to/my/env --file conda_reqs.txt

or, if you prefer to refer to your conda environments by name:

conda create -n my_env_name --file conda_reqs.txt

Next, activate the environment and install the remaining dependencies with pip:

python -m pip install -r pip_reqs.txt
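The steps above can also be wrapped in a small script. The following is a minimal sketch, not part of the repository: the function names check_req_files and setup_minimal_env are hypothetical, and it assumes conda is on the PATH and that conda_reqs.txt and pip_reqs.txt sit in the directory passed as the second argument. It uses conda run in place of an interactive conda activate so that it also works in non-interactive shells.

```shell
#!/usr/bin/env bash
# Hypothetical helpers; the names are illustrative and not part of the repository.

# Return 0 only if both requirement files exist in the given directory.
check_req_files() {
    local repo_dir="${1:-.}"
    local f missing=0
    for f in conda_reqs.txt pip_reqs.txt; do
        if [ ! -f "$repo_dir/$f" ]; then
            echo "missing: $repo_dir/$f" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Create the named environment, then install the pip requirements inside it.
setup_minimal_env() {
    local env_name="$1"
    local repo_dir="${2:-.}"
    check_req_files "$repo_dir" || return 1
    conda create -y -n "$env_name" --file "$repo_dir/conda_reqs.txt" || return 1
    # conda run executes the command inside the environment, so no
    # `conda activate` is needed in a non-interactive shell.
    conda run -n "$env_name" python -m pip install -r "$repo_dir/pip_reqs.txt"
}
```

After sourcing the script from the repository root, a call such as setup_minimal_env my_env_name . would perform both installation steps and stop early if either requirements file is missing.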

Databases and Training Data

The databases have been uploaded to Zenodo under the DOI:

