implement support for hypothesis as a test generator #857
- Modified generate_hypothesis_tests() to return the temp directory Path
- Added hypothesis_tests_dir tracking in FunctionOptimizer
- Extended cleanup_generated_files() to remove hypothesis test directories
- Added hypothesis_tests_dirs list in Optimizer to track all directories
- Updated cleanup_temporary_paths() to clean up hypothesis test directories
- Ensures cleanup on success, errors, and KeyboardInterrupt
- Changed temp dir prefix to 'codeflash_hypothesis_' for clarity
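The cleanup behaviour listed in this commit can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `DirTracker`, `make_hypothesis_tests_dir`, and `run_with_cleanup` are hypothetical names, and only the `codeflash_hypothesis_` prefix and the `hypothesis_tests_dirs` / `cleanup_temporary_paths` names come from the commit message.

```python
import shutil
import tempfile
from pathlib import Path


def make_hypothesis_tests_dir() -> Path:
    """Create a temp directory using the prefix named in the commit message."""
    return Path(tempfile.mkdtemp(prefix="codeflash_hypothesis_"))


class DirTracker:
    """Hypothetical stand-in for the Optimizer's directory bookkeeping."""

    def __init__(self) -> None:
        self.hypothesis_tests_dirs = []  # every directory created so far

    def track(self, path: Path) -> Path:
        self.hypothesis_tests_dirs.append(path)
        return path

    def cleanup_temporary_paths(self) -> None:
        # ignore_errors keeps cleanup best-effort if a directory is already gone.
        for path in self.hypothesis_tests_dirs:
            shutil.rmtree(path, ignore_errors=True)
        self.hypothesis_tests_dirs.clear()


def run_with_cleanup(tracker: DirTracker, work):
    """Run work(tracker); clean up on success, errors, and KeyboardInterrupt."""
    try:
        return work(tracker)
    finally:
        # A finally block also runs when KeyboardInterrupt propagates.
        tracker.cleanup_temporary_paths()
```

Putting the cleanup in a `finally` block is one way to cover all three exit paths with a single call site; the PR may structure this differently.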
…ai/codeflash into feat/hypothesis-tests
The optimized code achieves a **32% speedup** by eliminating redundant data structures and reducing iteration overhead through two key optimizations:

**1. Single-pass aggregation instead of list accumulation:**
- **Original**: Uses `defaultdict(list)` to collect all `FunctionTestInvocation` objects per test function, then later iterates through these lists to compute failure flags with `any(not ex.did_pass for ex in orig_examples)`
- **Optimized**: Uses plain dicts with 2-element lists `[count, had_failure]` to track both example count and failure status in a single pass, eliminating the need to store individual test objects or re-scan them

**2. Reduced memory allocation and access patterns:**
- **Original**: Creates and stores complete lists of test objects (up to 9,458 objects in large test cases), then performs expensive `any()` operations over these lists
- **Optimized**: Uses compact 2-item lists per test function, avoiding object accumulation and expensive linear scans

The line profiler shows the key performance gains:
- Lines with `any(not ex.did_pass...)` in the original (10.1% and 10.2% of total time) are completely eliminated
- The `setdefault()` operations replace the more expensive `defaultdict(list).append()` calls
- Overall reduction from storing ~9,458 objects to just tracking summary statistics

**Best performance gains** occur in test cases with:
- **Large numbers of examples per test function** (up to 105% faster for `test_large_scale_all_fail`)
- **Many distinct test functions** (up to 75% faster for `test_large_scale_some_failures`)
- **Mixed pass/fail scenarios** where the original's `any()` operations were most expensive

The optimization maintains identical behavior while reducing the state kept per test function group from O(examples) to O(1) and removing the second linear scan over each group. Every example is still visited once, so the total work remains linear in the number of examples.
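The two aggregation strategies described above can be compared in a small sketch. The `Invocation` dataclass is a hypothetical stand-in for `FunctionTestInvocation` (only `did_pass` is taken from the description; the `test_function` field is an assumption), and both helper names are illustrative rather than the PR's own functions.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Invocation:
    """Hypothetical stand-in for FunctionTestInvocation."""
    test_function: str
    did_pass: bool


def summarize_two_pass(invocations):
    """Original shape: accumulate objects per test function, then re-scan."""
    grouped = defaultdict(list)
    for inv in invocations:
        grouped[inv.test_function].append(inv)
    return {
        name: (len(examples), any(not ex.did_pass for ex in examples))
        for name, examples in grouped.items()
    }


def summarize_single_pass(invocations):
    """Optimized shape: keep only [count, had_failure] per test function."""
    summary = {}
    for inv in invocations:
        entry = summary.setdefault(inv.test_function, [0, False])
        entry[0] += 1          # example count
        if not inv.did_pass:
            entry[1] = True    # failure flag, set at most once per group
    return {name: (count, failed) for name, (count, failed) in summary.items()}
```

Both functions return the same mapping of test function name to `(count, had_failure)`; the single-pass version never holds more than two values per group, which is where the memory saving comes from.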
⚡️ Codeflash found optimizations for this PR: 📄 32% (0.32x) speedup
Co-authored-by: codeflash-ai[bot] <148906541+codeflash-ai[bot]@users.noreply.github.com>
This PR is now faster! 🚀 Kevin Turcios accepted my code suggestion above.
…25-10-26T20.37.41 ⚡️ Speed up function `_compare_hypothesis_tests_semantic` by 32% in PR #857 (`feat/hypothesis-tests`)
This PR is now faster! 🚀 @KRRT7 accepted my optimizations from:
PR Type
Enhancement, Tests
Description
Add Hypothesis test generation pipeline
Discover and track Hypothesis tests
Compare Hypothesis results semantically
Cleanup Hypothesis temp directories
File Walkthrough
8 files
- Add helper to build qualified function path
- AST-based discovery and Hypothesis test support
- Introduce Hypothesis test type and label
- Integrate Hypothesis generation, merge, and cleanup
- Track and clean up Hypothesis test directories
- Use helper for qualified function path; filter discovery
- Semantic comparison for Hypothesis test results
- Implement Hypothesis ghostwriter generation and filtering

1 file
- No-op whitespace change in is_pr_draft

1 file
- Unit tests for deterministic Hypothesis adjustments

1 file
- Add Hypothesis dependency