It was recently observed that foundation is faster than text/bytestring even though it allocates a lot more: ndmitchell/weeder#27
I suspect that if you allocated less you'd be way faster. You might want to use https://hackage.haskell.org/package/weigh to benchmark your allocations, as that would let you measure and confirm progress, and prevent regressions.
Given this project, I suspect this will cause you to reinvent the package first...
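
In case it helps, here is a rough sketch of what a weigh benchmark could look like. It only measures `text` and `bytestring`, since I haven't checked which foundation functions you'd want to compare or whether their results have the `NFData` instances that `func` needs. The labels and the `input` value are made up for illustration.

```haskell
-- Minimal allocation benchmark sketch using the weigh package.
-- Assumes the weigh, text and bytestring packages are available.
module Main (main) where

import qualified Data.ByteString.Char8 as B
import qualified Data.Text as T
import Weigh (func, mainWith)

-- Hypothetical input, chosen only for illustration.
input :: String
input = replicate 1000 'a'

main :: IO ()
main = mainWith $ do
  -- Each 'func' applies the function to the argument, forces the result,
  -- and reports it as one row of weigh's output table.
  func "Text.pack" T.pack input
  func "ByteString.pack" B.pack input
  -- A foundation equivalent would go here as another 'func' line.
```

Running it prints a table of bytes allocated and GC counts per case, so you can compare the numbers before and after a change.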