@@ -7,12 +7,12 @@ the consistency between different experimental references/measurements. This dat
 
 A subset of the
 molecules in B3DB has numerical `logBB` values (1058 compounds), while the whole dataset
-has categorical (BBB+ or BBB-) BBB permeability labels (7807 compounds). Some physicochemical properties
+has categorical (BBB+ or BBB-) BBB permeability labels (7807 compounds prior to v1.0.0 and 7982 compounds after). Some physicochemical properties
 of the molecules are also provided.
 
 ## Citation
 
-Please use the following citation in any publication using our *B3DB* dataset:
+Please use the following citations in any publication using our *B3DB* dataset:
 
 ```md
 @article{Meng_A_curated_diverse_2021,
@@ -26,6 +26,18 @@ year = {2021},
 url = {https://www.nature.com/articles/s41597-021-01069-5},
 publisher = {Springer Nature}
 }
+
+@article{Meng_B3clf_2025,
+author = {Meng, Fanwang and Chen, Jitian and Collins-Ramirez, Juan Samuel and Ayers, Paul W.},
+doi = {xxx},
+journal = {xxx},
+number = {xxx},
+title = {B3clf: A Resampling-Integrated Machine Learning Framework to Predict Blood-Brain Barrier Permeability},
+volume = {x},
+year = {xxx},
+url = {xxx},
+publisher = {xxx}
+}
 ```
 
 ## Features of *B3DB*
@@ -63,6 +75,17 @@ from B3DB import B3DB_DATA_DICT
 # 'B3DB_regression_extended'
 # 'B3DB_classification'
 # 'B3DB_classification_extended'
+# "B3DB_classification_external"
+df_b3db_reg = B3DB_DATA_DICT["B3DB_regression"]
+df_b3db_reg.head()
+# NO. compound_name ... group comments
+# 0 1 moxalactam ... A NaN
+# 1 2 schembl614298 ... A NaN
+# 2 3 morphine-6-glucuronide ... A NaN
+# 3 4 2-[4-(5-bromo-3-methylpyridin-2-yl)butylamino]... ... A NaN
+# 4 5 NaN ... A NaN
+
+# [5 rows x 10 columns]
 
 ```
 
@@ -111,3 +134,9 @@ Detailed procedures for data curation can be found in [data curation section](da
 
 The materials and data under this repo are distributed under the
 [CC0 Licence](http://creativecommons.org/publicdomain/zero/1.0/).
+
+## ChangeLog
+
+- 2025Aug16, the B3DB dataset is available via PyPI.
+- 2025Aug16, we have added a new set of 171 BBB+ and 4 BBB- compounds to the dataset since
+  version 1.1.0.
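
The compound counts quoted in this change can be cross-checked with a little arithmetic. The snippet below is a minimal sketch and is not part of the commit: the constant and function names are hypothetical, and the numbers are copied from the README text and ChangeLog entries in the diff.

```python
# Counts quoted in the README text and ChangeLog of this diff.
CLASSIFICATION_BEFORE = 7807  # labelled compounds before the update
ADDED_BBB_PLUS = 171          # new BBB+ compounds listed in the ChangeLog
ADDED_BBB_MINUS = 4           # new BBB- compounds listed in the ChangeLog
CLASSIFICATION_AFTER = 7982   # labelled compounds after the update


def expected_total(before: int, added_plus: int, added_minus: int) -> int:
    """Return the dataset size after adding the newly labelled compounds."""
    return before + added_plus + added_minus


total = expected_total(CLASSIFICATION_BEFORE, ADDED_BBB_PLUS, ADDED_BBB_MINUS)

# True when the ChangeLog additions account for the new total in the README.
print(total == CLASSIFICATION_AFTER)
```

The two figures agree under the assumption that the 175 added compounds are the only difference between the two dataset sizes.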