If the purpose is to develop TTS that can run on resource-constrained embedded systems, then you really need to do something about all the dependencies pulled in by the setup script. I was trying to benchmark this on a system that only has a 16 GB disk, and the install eats up almost all of the remaining free space.
As it stands, this is utterly unusable on the very systems it's supposed to be used on.