Saber.load_dataset() should be able to pull a dataset from pubannotation.org when given a project's URL.
E.g.
saber.load_dataset('http://pubannotation.org/projects/AGAC_training/annotations.tgz')
should download the dataset to ~/saber/datasets, convert it to the CoNLL 2003 format, and load it into a Dataset object. Furthermore, if this URL is ever supplied again, load_dataset() should use the cached version of the dataset in ~/saber/datasets.
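A minimal sketch of the download-and-cache step described above, not Saber's actual implementation. The helper names `cache_dir_for` and `fetch_dataset`, the one-directory-per-project cache layout, and the injectable `download` parameter are all assumptions made for illustration. Conversion to CoNLL 2003 and construction of the Dataset object are left out.

```python
import os
import tarfile
import urllib.request
from urllib.parse import urlparse


def cache_dir_for(url, root="~/saber/datasets"):
    """Map a PubAnnotation project URL to a local cache directory.

    Hypothetical layout: one directory per project name, e.g.
    .../projects/AGAC_training/annotations.tgz -> <root>/AGAC_training
    """
    parts = [p for p in urlparse(url).path.split("/") if p]
    if "projects" not in parts or parts.index("projects") + 1 >= len(parts):
        raise ValueError(f"Not a PubAnnotation project URL: {url}")
    project = parts[parts.index("projects") + 1]
    return os.path.join(os.path.expanduser(root), project)


def fetch_dataset(url, root="~/saber/datasets",
                  download=urllib.request.urlretrieve):
    """Download and unpack the archive once; reuse the cached copy afterwards."""
    target = cache_dir_for(url, root)

    # Cache hit: a non-empty project directory already exists.
    if os.path.isdir(target) and os.listdir(target):
        return target

    # Cache miss: download the archive, unpack it, then drop the archive.
    os.makedirs(target, exist_ok=True)
    archive = os.path.join(target, "annotations.tgz")
    download(url, archive)
    with tarfile.open(archive, "r:gz") as tar:
        # Archives from untrusted sources should be validated before extraction.
        tar.extractall(target)
    os.remove(archive)
    return target
```

With something like this in place, load_dataset() could call fetch_dataset() first and then run the conversion on the returned directory, so a repeated URL never touches the network.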
Since pubannotation.org hosts most of the popular BioNLP datasets, this would nearly eliminate the need to maintain datasets locally.