Open
Labels: documentation (improvements or additions to documentation), feature (new feature to implement)
Description
We should add example usage of molecular fingerprints, in particular:
- Molecular property prediction (classification, regression, multioutput classification), with MoleculeNet and TDC benchmarks
- Visualization and clustering
- Virtual screening, e.g. with https://github.com/rdkit/benchmarking_platform
These examples should also cover computing fingerprints, hyperparameter tuning, and parallelization. We should cover hashed fingerprints, descriptors, 3D variants that require conformers, etc.
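To make the "hashed fingerprints" item concrete for readers new to the idea, here is a minimal toy sketch of the hashing step: fragments are hashed into a fixed-length bit vector, so different fragments can collide on the same bit. It is an assumption-laden illustration only. The function name `toy_hashed_fingerprint` is hypothetical, and it hashes raw SMILES substrings rather than real atom environments, so it is not a chemically meaningful fingerprint and not how scikit-fingerprints or RDKit compute them.

```python
import hashlib


def toy_hashed_fingerprint(smiles: str, n_bits: int = 64, max_len: int = 3) -> list[int]:
    """Toy example: hash SMILES substrings of length 1..max_len into n_bits bits.

    Real fingerprints hash substructures (e.g. circular atom environments),
    not text fragments; this only demonstrates the hashing/folding idea.
    """
    bits = [0] * n_bits
    for size in range(1, max_len + 1):
        for start in range(len(smiles) - size + 1):
            fragment = smiles[start:start + size]
            # Use a stable hash; Python's built-in hash() is salted per process.
            digest = hashlib.sha256(fragment.encode("utf-8")).digest()
            index = int.from_bytes(digest[:8], "big") % n_bits
            bits[index] = 1  # collisions simply set the same bit again
    return bits


fp = toy_hashed_fingerprint("CCO", n_bits=64)
print(len(fp), sum(fp))
```

A smaller `n_bits` makes collisions more likely, which is the trade-off the fingerprint comparison tutorial could explore with real fingerprints.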
Proposed tutorials list:
- introduction
- comparing different fingerprints (e.g. types, outputs, computation time)
- scikit-learn pipelines (e.g. concatenating fingerprints, normalizing, preprocessing)
- conformers and 3D fingerprints
- hyperparameter tuning
- different dataset splits
- loading built-in datasets, benchmarking
- distances and similarities, kNN, bulk functions
- custom fingerprints
- fingerprints for peptides, custom fingerprints using FASTA
- molecular filters, custom filters
- introduction to fingerprints and scikit-fingerprints for different backgrounds, e.g. chemists, chemoinformaticians, GNN researchers, ML scientists
- virtual screening, similarity searching, classification
- applicability domain checkers
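For the "distances and similarities, kNN, bulk functions" and similarity searching tutorials, a dependency-free sketch of the underlying computation may be a useful starting point. The helper names below (`tanimoto_similarity`, `bulk_tanimoto_similarity`, `top_k_similar`) and the tiny hand-written bit vectors are illustrative assumptions, not the library's API or real fingerprints. Returning 1.0 for two all-zero vectors is a convention chosen here.

```python
from typing import Sequence


def tanimoto_similarity(a: Sequence[int], b: Sequence[int]) -> float:
    """Tanimoto (Jaccard) similarity of two binary vectors: |A & B| / |A | B|."""
    if len(a) != len(b):
        raise ValueError("fingerprints must have the same length")
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    # Convention assumed here: two empty fingerprints count as identical.
    return both / either if either else 1.0


def bulk_tanimoto_similarity(queries, library):
    """Pairwise similarity matrix with shape (len(queries), len(library))."""
    return [[tanimoto_similarity(q, m) for m in library] for q in queries]


def top_k_similar(query, library, k=3):
    """Return (index, similarity) for the k most similar library entries."""
    scored = [(i, tanimoto_similarity(query, fp)) for i, fp in enumerate(library)]
    # Sort by descending similarity; break ties by library index.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


# Hand-written toy bit vectors standing in for binary fingerprints.
library = [
    [1, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 1, 0, 0],
    [1, 0, 0, 0, 1, 0, 0, 1],
]
query = [1, 1, 0, 0, 1, 0, 0, 1]

hits = top_k_similar(query, library, k=2)
print(hits)
```

In the actual tutorials, the toy vectors would be replaced by computed fingerprints, and the pure-Python loops by the library's vectorized bulk functions.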