A large part of the time spent running cabal test all (one possible benchmark) goes to checkPred, which in turn spends most of its time unpacking the indexed tuples in App.
Implementing arity-specific constructors should be straightforward, and it can be made backwards compatible with a clever use of pattern synonyms.
App1 and App2 should be sufficient, as all built-in functions have arity 1 or 2. Supporting higher arities and keeping code generic over arity is still a goal, but for performance-sensitive parts it may make sense to avoid the extra unpacking.
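To make the idea concrete, here is a minimal sketch of how it could look. It is not the library's actual definition: Fn, List, SomeApp, viewApp, mkApp, eval and arity are hypothetical stand-ins, and the real App almost certainly carries more constraints. The sketch assumes that App currently stores its arguments in a type-indexed list, and shows App1/App2 as real constructors with App kept alive as a bidirectional pattern synonym.

```haskell
{-# LANGUAGE DataKinds #-}
{-# LANGUAGE ExplicitForAll #-}
{-# LANGUAGE GADTs #-}
{-# LANGUAGE KindSignatures #-}
{-# LANGUAGE PatternSynonyms #-}
{-# LANGUAGE TypeOperators #-}
{-# LANGUAGE ViewPatterns #-}

import Data.Kind (Type)

-- Hypothetical stand-in for the indexed tuple of arguments.
data List (f :: Type -> Type) (as :: [Type]) where
  Nil  :: List f '[]
  (:>) :: f a -> List f as -> List f (a ': as)
infixr 5 :>

-- Hypothetical built-in functions, indexed by domain and range.
data Fn (dom :: [Type]) (rng :: Type) where
  Negate :: Fn '[Int] Int
  Add    :: Fn '[Int, Int] Int

-- Terms with arity-specific application nodes: arguments are stored
-- directly in the constructor, so no list has to be unpacked.
data Term a where
  Lit  :: a -> Term a
  App1 :: Fn '[a] r -> Term a -> Term r
  App2 :: Fn '[a, b] r -> Term a -> Term b -> Term r

-- Helper that hides the argument types so a view function can return them.
data SomeApp rng where
  SomeApp :: Fn dom rng -> List Term dom -> SomeApp rng

-- Repack an application node into the old list-based shape.
viewApp :: Term rng -> Maybe (SomeApp rng)
viewApp (App1 f x)   = Just (SomeApp f (x :> Nil))
viewApp (App2 f x y) = Just (SomeApp f (x :> y :> Nil))
viewApp (Lit _)      = Nothing

-- Build the arity-specific node from the old list-based arguments.
mkApp :: Fn dom rng -> List Term dom -> Term rng
mkApp f (x :> Nil)      = App1 f x
mkApp f (x :> y :> Nil) = App2 f x y
mkApp _ _               = error "mkApp: only arities 1 and 2 are supported in this sketch"

-- Backwards-compatible App: existing code can keep matching on and
-- constructing App, at the cost of repacking the arguments.
pattern App :: forall rng. () => forall dom. () => Fn dom rng -> List Term dom -> Term rng
pattern App f args <- (viewApp -> Just (SomeApp f args))
  where
    App f args = mkApp f args

-- Semantics of the toy built-ins, split by arity.
sem1 :: Fn '[a] r -> a -> r
sem1 Negate = negate

sem2 :: Fn '[a, b] r -> a -> b -> r
sem2 Add = (+)

-- Hot path: matches App1/App2 directly and never builds a List.
eval :: Term a -> a
eval (Lit x)      = x
eval (App1 f x)   = sem1 f (eval x)
eval (App2 f x y) = sem2 f (eval x) (eval y)

-- Legacy-style code: still written against the list-based App.
arity :: Term a -> Int
arity (App _ args) = len args
arity _            = 0

len :: List f as -> Int
len Nil       = 0
len (_ :> xs) = 1 + len xs
```

In this sketch, code that is generic over arity keeps using App and pays for the repacking, while performance-sensitive code such as eval matches App1 and App2 directly. Whether this actually speeds up checkPred would still need to be measured against the benchmark.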