MANU - Systematic HPO Benchmark for Molecular GNNs
Systematic Hyperparameter Optimization for Molecular Property Prediction with Graph Neural Networks
Overview
A comprehensive benchmark comparing seven HPO algorithms (including TPE/Bayesian optimization) for GNN-based ADMET property prediction across six datasets from the Therapeutics Data Commons (TDC). Includes comparisons with foundation models (ChemBERTa, MolCLR) and multi-seed statistical validation.
Key Statistics
| Metric | Value |
|---|---|
| Datasets | 6 (4 ADME + 2 Toxicity) |
| Total Molecules | 11,805 |
| HPO Algorithms | 7 (Random, PSO, ABC, GA, SA, HC, TPE) |
| Trials per Run | 50 |
| Total HPO Runs | 42 |
| Total Model Evaluations | 2,100 |
| Multi-Seed Validation | 5 seeds per dataset |
| Foundation Models | ChemBERTa (fine-tuned), MolCLR, Morgan-FP |
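The run and evaluation totals follow directly from the benchmark design (one HPO run per algorithm per dataset, 50 trials each). A minimal check of that arithmetic, with illustrative variable names that are not taken from the repository:

```python
# Benchmark design figures as stated in the Key Statistics table.
n_algorithms = 7      # Random, PSO, ABC, GA, SA, HC, TPE
n_datasets = 6        # 4 ADME + 2 Toxicity
trials_per_run = 50

# One HPO run per (algorithm, dataset) pair.
total_hpo_runs = n_algorithms * n_datasets

# Each run evaluates one model per trial.
total_model_evaluations = total_hpo_runs * trials_per_run

print(total_hpo_runs, total_model_evaluations)
```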
Key Findings
Random Search competitive for regression - Wins on 2/4 ADME datasets (Caco2, Clearance_Microsome)
TPE excels on complex clearance tasks - Best on Clearance_Hepatocyte (RMSE 47.52 vs 68.22 for Random Search)
Metaheuristic algorithms excel on classification - SA wins on Tox21, ABC wins on hERG
ChemBERTa fine-tuning improves toxicity prediction - AUC 0.791 on hERG, 0.735 on Tox21
No universal winner - Algorithm selection should be task-dependent
50 trials is sufficient - Diminishing returns beyond this budget
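For readers unfamiliar with the Random Search baseline, the following is a minimal sketch of a 50-trial random search loop. The search space, the hyperparameter names, and `toy_objective` are all hypothetical stand-ins (the real benchmark would train a GNN and return a validation metric); only the 50-trial budget comes from this README.

```python
import math
import random

# Hypothetical search space; the benchmark's actual hyperparameters may differ.
SEARCH_SPACE = {
    "hidden_dim": [64, 128, 256],
    "num_layers": [2, 3, 4, 5],
    "dropout": (0.0, 0.5),                 # sampled uniformly
    "log10_learning_rate": (-4.0, -2.0),   # sampled log-uniformly
}


def sample_config(rng):
    """Draw one configuration independently of all previous trials."""
    return {
        "hidden_dim": rng.choice(SEARCH_SPACE["hidden_dim"]),
        "num_layers": rng.choice(SEARCH_SPACE["num_layers"]),
        "dropout": rng.uniform(*SEARCH_SPACE["dropout"]),
        "learning_rate": 10 ** rng.uniform(*SEARCH_SPACE["log10_learning_rate"]),
    }


def toy_objective(config):
    """Stand-in for 'train a GNN, return validation RMSE' (lower is better)."""
    return (
        (config["dropout"] - 0.2) ** 2
        + (math.log10(config["learning_rate"]) + 3.0) ** 2
        + 0.01 * abs(config["num_layers"] - 3)
        + 0.001 * abs(config["hidden_dim"] - 128) / 64
    )


def random_search(objective, n_trials=50, seed=0):
    """Evaluate n_trials random configurations and keep the best one."""
    rng = random.Random(seed)
    best_config, best_score = None, float("inf")
    history = []  # best score seen so far after each trial
    for _ in range(n_trials):
        config = sample_config(rng)
        score = objective(config)
        if score < best_score:
            best_config, best_score = config, score
        history.append(best_score)
    return best_config, best_score, history


best_config, best_score, history = random_search(toy_objective, n_trials=50, seed=0)
print(best_config, best_score)
```

Plotting `history` against the trial index is one simple way to look for the diminishing returns mentioned above, since the curve records the best score found after each trial.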
Results (50 Trials)
ADME Regression (Test RMSE - lower is better)
| Dataset | Random | PSO | ABC | GA | SA | HC | TPE | ChemBERTa-FT |
|---|---|---|---|---|---|---|---|---|
| Caco2_Wang | 0.0027 | 0.0031 | 0.0029 | 0.0031 | 0.0029 | 0.0030 | 0.526 | 0.506 |
| Half_Life_Obach | 22.31 | 21.66 | 21.66 | 21.66 | 23.70 | 24.52 | 98.47 | 21.99 |
| Clearance_Hepatocyte | 68.22 | 70.21 | 72.04 | 71.34 | 72.04 | 72.04 | 47.52 | 49.39 |
| Clearance_Microsome | 38.75 | 42.76 | 42.29 | 42.29 | 40.94 | 41.63 | 39.04 | 43.25 |
Toxicity Classification (Test AUC-ROC - higher is better)
| Dataset | Random | PSO | ABC | GA | SA | HC | TPE | ChemBERTa-FT |
|---|---|---|---|---|---|---|---|---|
| Tox21 | 0.713 | 0.692 | 0.735 | 0.735 | 0.743 | 0.652 | 0.742 | 0.735 |
| hERG | 0.747 | 0.747 | 0.825 | 0.747 | 0.802 | 0.821 | 0.745 | 0.791 |
Winner Summary
| Algorithm | Wins | Datasets |
|---|---|---|
| Random Search | 2/6 | Caco2, Clearance_Microsome |
| PSO | 1/6 | Half_Life (tie with ABC, GA) |
| TPE | 1/6 | Clearance_Hepatocyte |
| SA | 1/6 | Tox21 |
| ABC | 1/6 | hERG |
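The winner summary can be reproduced from the 50-trial result tables. The sketch below copies the reported scores for the seven HPO algorithms (the ChemBERTa-FT column is left out, since the summary only ranks HPO algorithms) and picks the minimum RMSE or maximum AUC-ROC per dataset. The names `ALGORITHMS`, `RESULTS`, and `winners` are illustrative and not part of the repository.

```python
ALGORITHMS = ["Random", "PSO", "ABC", "GA", "SA", "HC", "TPE"]

# (direction, scores) per dataset; scores are in ALGORITHMS order.
# "min" = test RMSE (lower is better), "max" = test AUC-ROC (higher is better).
RESULTS = {
    "Caco2_Wang": ("min", [0.0027, 0.0031, 0.0029, 0.0031, 0.0029, 0.0030, 0.526]),
    "Half_Life_Obach": ("min", [22.31, 21.66, 21.66, 21.66, 23.70, 24.52, 98.47]),
    "Clearance_Hepatocyte": ("min", [68.22, 70.21, 72.04, 71.34, 72.04, 72.04, 47.52]),
    "Clearance_Microsome": ("min", [38.75, 42.76, 42.29, 42.29, 40.94, 41.63, 39.04]),
    "Tox21": ("max", [0.713, 0.692, 0.735, 0.735, 0.743, 0.652, 0.742]),
    "hERG": ("max", [0.747, 0.747, 0.825, 0.747, 0.802, 0.821, 0.745]),
}


def winners(direction, scores):
    """Return every algorithm that attains the best score (ties included)."""
    best = min(scores) if direction == "min" else max(scores)
    return [algo for algo, score in zip(ALGORITHMS, scores) if score == best]


for dataset, (direction, scores) in RESULTS.items():
    print(dataset, winners(direction, scores))
```

Returning a list rather than a single name keeps the three-way tie on Half_Life_Obach visible instead of silently crediting whichever algorithm comes first.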
Multi-Seed Validation (n=5 seeds)
Statistical robustness with 95% confidence intervals: