⚡️ Speed up function compare_test_results by 37% in PR #687 (granular-async-instrumentation)
#732
⚡️ This pull request contains optimizations for PR #687
If you approve this dependent PR, these changes will be merged into the original PR branch `granular-async-instrumentation`.
📄 37% (0.37x) speedup for `compare_test_results` in `codeflash/verification/equivalence.py`
⏱️ Runtime: 742 microseconds → 540 microseconds (best of 5 runs)
📝 Explanation and details
The optimized code achieves a 37% speedup through several key micro-optimizations in the `comparator` function, which is the performance bottleneck (consuming 80% of runtime):

Primary Optimization - Identity Check: Added `if orig is new: return True` at the start of `comparator`. This short-circuits expensive recursive comparisons when objects are identical in memory, which happens frequently when comparing the same data structures.

Loop Optimizations: Replaced `all()` generator expressions with explicit `for` loops in multiple places:
- `all(comparator(elem1, elem2, superset_obj) for elem1, elem2 in zip(orig, new))` to a loop with early return on first mismatch
- `all(k in new and comparator(v, new[k], superset_obj) for k, v in orig.items())` to explicit iteration

This eliminates the overhead of generator creation and the `all()` function call, while enabling faster short-circuit evaluation.

Why This Works: The `all()` function with generators creates additional Python objects and function call overhead. Direct loops with early returns are more efficient, especially when mismatches occur early in the comparison (which triggers the short-circuit behavior).

Test Case Performance: The optimizations are particularly effective for test cases with:

The optimizations maintain identical behavior while reducing function call overhead and memory allocations during the comparison process.
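The two changes described above can be illustrated with a minimal sketch. This is not the actual `comparator` from `codeflash/verification/equivalence.py`: the function names `comparator_with_all` and `comparator_with_loops` are hypothetical, the `superset_obj` parameter is omitted, and only lists, tuples, and dicts are handled. It is meant only to show the shape of the rewrite under those assumptions.

```python
from typing import Any


def comparator_with_all(orig: Any, new: Any) -> bool:
    """Hypothetical 'before' shape: recursion driven by all() over generators."""
    if type(orig) is not type(new):
        return False
    if isinstance(orig, (list, tuple)):
        return len(orig) == len(new) and all(
            comparator_with_all(a, b) for a, b in zip(orig, new)
        )
    if isinstance(orig, dict):
        return len(orig) == len(new) and all(
            k in new and comparator_with_all(v, new[k]) for k, v in orig.items()
        )
    return orig == new


def comparator_with_loops(orig: Any, new: Any) -> bool:
    """Hypothetical 'after' shape: identity check first, then explicit loops."""
    # Identity check: the same object in memory needs no recursive comparison.
    if orig is new:
        return True
    if type(orig) is not type(new):
        return False
    if isinstance(orig, (list, tuple)):
        if len(orig) != len(new):
            return False
        # Explicit loop: no generator object, return on the first mismatch.
        for a, b in zip(orig, new):
            if not comparator_with_loops(a, b):
                return False
        return True
    if isinstance(orig, dict):
        if len(orig) != len(new):
            return False
        for k, v in orig.items():
            if k not in new or not comparator_with_loops(v, new[k]):
                return False
        return True
    return orig == new


if __name__ == "__main__":
    shared = {"a": [1, 2, 3], "b": (4, 5)}
    # Same object on both sides: the identity check returns immediately.
    print(comparator_with_loops(shared, shared))
    # Mismatch at index 1: the loop returns without visiting index 2.
    print(comparator_with_loops([1, 2, 3], [1, 9, 3]))
```

In this sketch the identity check does change the result for a value that is not equal to itself, such as a single float NaN object compared against itself. Whether that matters for the real `comparator` depends on how it treats such values, which the description above does not state.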
✅ Correctness verification report:
⚙️ Existing Unit Tests and Runtime
- `test_codeflash_capture.py::test_codeflash_capture_basic`
- `test_codeflash_capture.py::test_codeflash_capture_multiple_helpers`
- `test_codeflash_capture.py::test_codeflash_capture_recursive`
- `test_codeflash_capture.py::test_codeflash_capture_super_init`
- `test_codeflash_capture.py::test_instrument_codeflash_capture_and_run_tests`
- `test_comparator.py::test_compare_results_fn`
- `test_instrument_all_and_run.py::test_bubble_sort_behavior_results`
- `test_instrument_all_and_run.py::test_class_method_full_instrumentation`
- `test_instrumentation_run_results_aiservice.py::test_class_method_full_instrumentation`
- `test_instrumentation_run_results_aiservice.py::test_class_method_test_instrumentation_only`
- `test_pickle_patcher.py::test_run_and_parse_picklepatch`

To edit these changes, `git checkout codeflash/optimize-pr687-2025-09-13T01.22.10` and push.