Because performance degrades severely with many small files, Functionalizer spawns a Hadoop file system (HDFS) cluster and stores its checkpoint data there.
This inflates the size of the Functionalizer Docker container, since a full Hadoop installation is required, and it forces us to use larger SSD storage on the nodes. We should look into storing checkpoints somewhere else.