Having preprocessed data sets at hand makes experimenting much easier. Links to online data can break, as happened for Persephone-related materials: #226. The issue was fixed quickly, but in the medium and long term the answer lies in stable hosting (plus long-term archiving) of preprocessed data sets.
Some data sets preprocessed by @gw17 for experiments in 2020 are up here:
https://github.com/gw17/sltu_corpora
It's fine to have those in different places, ideally with some sort of inventory somewhere (in wiki form?). Or could the Persephone / Elpis team also offer hosting solutions?