When rewriting the stress tests, it would be great to verify two things: that they reproduce old (already fixed) bugs when those fixes are reverted, and that they detect bugs we embed manually.
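One way to check this is to run the same stress test twice: once against the correct code, where it must pass, and once against a build with a known bug switched back on, where it must fail. The sketch below is only an illustration of that idea under assumed names: `RingBuffer`, its `inject_bug` flag, and the `stress` helper are all hypothetical and do not come from the project's code. The "bug" is an invented off-by-one in the wraparound index.

```python
import random
from collections import deque


class RingBuffer:
    """Hypothetical fixed-capacity FIFO used as the code under test.

    inject_bug=True re-enables an invented off-by-one in the write index,
    standing in for an old bug or a manually embedded one.
    """

    def __init__(self, capacity, inject_bug=False):
        self.buf = [None] * capacity
        self.capacity = capacity
        self.head = 0
        self.size = 0
        self.inject_bug = inject_bug

    def push(self, x):
        if self.size == self.capacity:
            return False
        # Embedded bug: wrap one slot too early (assumes capacity >= 2).
        mod = self.capacity - 1 if self.inject_bug else self.capacity
        self.buf[(self.head + self.size) % mod] = x
        self.size += 1
        return True

    def pop(self):
        if self.size == 0:
            return None
        x = self.buf[self.head]
        self.head = (self.head + 1) % self.capacity
        self.size -= 1
        return x


def stress(make_buffer, steps=10_000, seed=0, capacity=4):
    """Run random pushes and pops against a deque model.

    Returns the index of the first step where the buffer and the model
    disagree, or None if they agree on every step.
    """
    rng = random.Random(seed)  # fixed seed keeps failures reproducible
    buf = make_buffer(capacity)
    model = deque()
    for step in range(steps):
        if rng.random() < 0.5:
            expected = len(model) < capacity
            if expected:
                model.append(step)  # step doubles as a unique value
            if buf.push(step) != expected:
                return step
        else:
            expected = model.popleft() if model else None
            if buf.pop() != expected:
                return step
    return None


def check_stress_test_is_sensitive():
    # The stress test must pass on the correct implementation...
    assert stress(RingBuffer) is None
    # ...and must fail once the known bug is embedded again.
    assert stress(lambda c: RingBuffer(c, inject_bug=True)) is not None


if __name__ == "__main__":
    check_stress_test_is_sensitive()
    print("stress test detects the embedded bug")
```

If the second assertion fails, the rewritten stress test is too weak to catch that class of bug. Keeping each old bug behind a switch like this (or as a small patch applied in CI) would let the check run for every future rewrite.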