Current Version: See CHANGELOG.md for the latest release and version history.
This document outlines the planned development trajectory for Tach. Items are aspirational and subject to change based on community feedback and technical discoveries.
gantt
title Tach Development Phases
dateFormat YYYY-MM
section Foundation
0.1.x Core Infrastructure :done, 2026-01, 2026-02
section Compatibility
0.2.x Plugin Ecosystem :done, 2026-02, 2026-04
0.3.x Database Integration :2026-04, 2026-06
section Fixtures
0.4.x Hierarchical Zygotes :done, 2026-06, 2026-08
0.5.x Developer Experience :2026-08, 2026-10
section Performance
0.6.x Configuration :2026-10, 2026-12
0.7.x Memory Snapshotting :2027-01, 2027-03
section Platform
0.8.x CI/CD + Sub-Interpreters :2027-03, 2027-06
0.9.x Stability :active, 2027-06, 2027-08
section Release
0.10.x Beta 1 :2027-08, 2027-09
0.11.x Beta 2 + RC :2027-09, 2027-10
1.0.0 Production :milestone, 2027-10, 0d
flowchart TB
subgraph Phase1["Phase 1: Foundation ✅ COMPLETE"]
direction TB
P1_1["0.1.1 Docs & Polish"]
P1_2["0.1.2 Test Compatibility"]
P1_3["0.1.3 Error Handling"]
P1_4["0.1.4 Dependency Updates"]
P1_5["0.1.5 Tooling Research"]
P1_1 --> P1_2
P1_2 --> P1_3
P1_3 --> P1_4
P1_4 --> P1_5
end
subgraph Phase2["Phase 2: Plugin Compatibility ✅ COMPLETE"]
direction TB
P2_0["0.2.0 Hook Framework ✅"]
P2_1["0.2.1 pytest-django ✅"]
P2_2["0.2.2 pytest-asyncio ✅"]
P2_3["0.2.3 pytest-mock/env/timeout + Django Markers ✅"]
P2_4["0.2.4 Landlock V4-V6 ✅"]
P2_5["0.2.5 Plugin Stabilization ✅"]
P2_0 --> P2_1
P2_0 --> P2_2
P2_0 --> P2_3
P2_1 --> P2_3
P2_0 --> P2_4
P2_1 --> P2_5
P2_2 --> P2_5
P2_3 --> P2_5
P2_4 -.->|optional| P2_5
end
subgraph Phase3["Phase 3: Database Integration"]
direction TB
P3_0["0.3.0 Django DB<br>(Transaction Rollback)"]
P3_1["0.3.1 SQLAlchemy<br>(Session Mgmt)"]
P3_2["0.3.2 Connection Mgmt<br>(FD Teleportation)"]
P3_3["0.3.3 Additional DBs<br>(Postgres/MySQL/SQLite)"]
P3_0 --> P3_2
P3_1 --> P3_2
P3_2 --> P3_3
end
subgraph Phase4["Phase 4: Fixture Lifecycle"]
direction TB
P4_0["0.4.0 Session Fixtures<br>(Shared Memory Cache) ✅"]
P4_1["0.4.1 Module Fixtures<br>(Boundary Detection) ✅"]
P4_2["0.4.2 Class Fixtures ✅"]
P4_3["0.4.3 Autouse Injection<br>(Auto-inject autouse=True)"]
P4_4["0.4.4 Parametrized Fixtures<br>(Expand params at discovery)"]
P4_5["0.4.5 Zygote Warmup<br>(Configurable pre-imports)"]
P4_6["0.4.6 Zygote Pool<br>(Per-scope pools)"]
P4_0 --> P4_1
P4_1 --> P4_2
P4_2 --> P4_5
P4_3 --> P4_5
P4_4 --> P4_5
P4_5 --> P4_6
end
subgraph Phase5["Phase 5: Developer Experience"]
direction TB
P5_0["0.5.0 Enhanced Tracebacks ✅<br>(Colorization done)"]
P5_1["0.5.1 Debug Mode"]
P5_2["0.5.2 Interactive Debugging<br>(pdb/breakpoint)"]
P5_3["0.5.3 Watch Mode Enhancements<br>(Targeted re-discovery)"]
P5_4["0.5.4 Smart Watch Filtering<br>(.tachignore support)"]
P5_5["0.5.5 Log Capture<br>(Structured parsing)"]
P5_6["0.5.6 Coverage Optimization ✅<br>(PEP 669 done)"]
P5_1 --> P5_2
P5_3 --> P5_4
end
subgraph Phase6["Phase 6: Configuration"]
direction TB
P6_0["0.6.0 pyproject.toml Schema"]
P6_1["0.6.1 ENV_DENYLIST<br>(Security filtering)"]
P6_2["0.6.2 Toxicity Config<br>(Configurable blocklist)"]
P6_3["0.6.3 Plugin Config<br>(Priority/disabled)"]
P6_4["0.6.4 Scheduler Persistence<br>(Resume interrupted runs)"]
P6_5["0.6.5 Config Profiles"]
P6_0 --> P6_1
P6_0 --> P6_2
P6_0 --> P6_3
P6_0 --> P6_4
P6_4 --> P6_5
end
subgraph Phase7["Phase 7: Performance"]
direction TB
P7_0["0.7.0 Test History Store ✅<br>(SQLite duration cache)"]
P7_1["0.7.1 Memory Optimization<br>(Snapshot Compression)"]
P7_2["0.7.2 UFFD Write-Protect<br>(Dirty Page Tracking)"]
P7_3["0.7.3 Vectorized Restore<br>(Batch UFFDIO_COPY)"]
P7_4["0.7.4 TLS Calibration ✅<br>(Sentinel scan done)"]
P7_5["0.7.5 Adaptive Scheduling ✅<br>(Duration Prediction)"]
P7_6["0.7.6 Lazy Loading<br>(On-demand Import)"]
P7_7["0.7.7 Advanced Snapshots<br>(Kernel LKM Research)"]
P7_8["0.7.8 UFFD_EVENT_FORK<br>(Fork Tracking)"]
P7_9["0.7.9 UFFD_EVENT_REMAP<br>(mremap Tracking)"]
P7_0 --> P7_5
P7_1 --> P7_2
P7_2 --> P7_3
P7_3 --> P7_7
P7_1 --> P7_8
P7_8 --> P7_9
end
subgraph Phase8["Phase 8: Platform Integration"]
direction TB
P8_0["0.8.0 GitHub Actions<br>(Annotations/Summary)"]
P8_1["0.8.1 JUnit XML ✅<br>(Already implemented)"]
P8_2["0.8.2 Other CI Platforms<br>(TeamCity/Azure DevOps)"]
P8_3["0.8.3 Coverage Formats<br>(Cobertura/HTML)"]
P8_4["0.8.4 Sub-Interp Architecture<br>(Design: Zygote hybrid)"]
P8_5["0.8.5 Sub-Interpreters<br>(PEP 684 Experimental)"]
P8_6["0.8.6 Sub-Interp State Reset<br>(Module re-init)"]
P8_0 --> P8_2
P8_2 --> P8_3
P8_4 --> P8_5
P8_5 --> P8_6
end
subgraph Phase9["Phase 9: Stability"]
direction TB
P9_0["0.9.0 Crash Recovery ✅<br>(SIGCHLD detection)"]
P9_1["0.9.1 Signal Routing<br>(Debug mode handling)"]
P9_2["0.9.2 CleanupGuard<br>(Mutex poison immunity)"]
P9_3["0.9.3 UFFD FD Limits<br>(Per-worker tracking)"]
P9_4["0.9.4 Snapshot Memory<br>(Golden page budget)"]
P9_5["0.9.5 OverlayFS Cleanup<br>(Upperdir pruning)"]
P9_6["0.9.6 Seccomp Limits<br>(BPF instruction count)"]
P9_7["0.9.7 Protocol Versioning<br>(Upgrade path)"]
P9_8["0.9.8 Stress Testing<br>(10k+ Tests)"]
P9_0 --> P9_1
P9_1 --> P9_2
P9_2 --> P9_3
P9_3 --> P9_4
P9_4 --> P9_5
P9_5 --> P9_6
P9_6 --> P9_7
P9_7 --> P9_8
end
subgraph Phase10["Phase 10: Release 🟣 MILESTONE"]
direction TB
P10_0["0.10.0 Beta 1<br>(Feature Freeze)"]
P10_1["0.10.1 Beta 1 Fixes"]
P10_2["0.11.0 Beta 2"]
P10_3["0.11.1 RC1"]
P10_4["0.11.2 RC2"]
P10_5["1.0.0 Production<br>(API Stability)"]
P10_0 --> P10_1
P10_1 --> P10_2
P10_2 --> P10_3
P10_3 --> P10_4
P10_4 --> P10_5
end
subgraph Future["Future (Post-1.0)"]
%% Details in "Future Phases (Post-1.0)" table below
direction LR
F0["1.1.x Maintenance"]
F1["1.2.x Features"]
F2["0.12.x Remote Execution"]
F3["0.13.x Test Sharding ✅<br>(Shipped in 0.9.0)"]
F4["0.14.x Visual Testing"]
F5["0.15.x AI-Powered"]
F6["0.16.x Mutation Testing"]
F7["0.17.x Property-Based"]
F8["0.18.x Contract Testing"]
F9["0.19.x Benchmarking"]
F10["0.20.x Observability"]
end
%% Phase Dependencies (Phase 1 enables parallel work)
Phase1 --> Phase2
Phase1 --> Phase5
Phase1 --> Phase6
Phase1 --> Phase7
Phase1 --> Phase8
Phase1 --> Phase9
Phase2 --> Phase3
Phase3 --> Phase4
Phase4 --> Phase10
Phase9 --> Phase10
Phase10 --> Future
%% Cross-phase dependencies
P2_1 -.->|"Django fixtures"| P3_0
P3_0 -.->|"DB transactions"| P4_0
P5_6 -.->|"PEP 669 coverage"| P8_3
P7_8 -.->|"fork tracking"| P9_0
%% Note: Some nodes (P5_0, P5_5, P5_6, P7_4, P7_6, P8_1) are intentionally disconnected
%% They represent completed standalone work or items that can start independently
%% Styling
classDef done fill:#22c55e,stroke:#16a34a,color:#fff
classDef inProgress fill:#f59e0b,stroke:#d97706,color:#fff
classDef canStart fill:#3b82f6,stroke:#1d4ed8,color:#fff
classDef pending fill:#94a3b8,stroke:#64748b,color:#fff
classDef milestone fill:#8b5cf6,stroke:#7c3aed,color:#fff,stroke-width:2px
class P1_1,P1_2,P1_3,P1_4,P1_5 done
class P2_0,P2_1,P2_2,P2_3,P5_0,P5_6,P7_4,P8_1 done
class P6_0,P6_1,P6_2,P6_4,P8_0,P9_2,P9_5,P9_7 done
class P5_1 done
class P5_3,P5_5,P6_3,P6_5,P7_1,P7_6,P9_6 canStart
class P2_4,P2_5 done
class P3_0,P3_1 done
class P3_2,P3_3 pending
class P4_0,P4_1,P4_2 done
class P4_3,P4_4,P4_5,P4_6 pending
class P5_2,P5_4 pending
class P6_3,P6_4,P6_5 pending
class P7_0,P7_5 done
class P7_2,P7_3,P7_7,P7_8,P7_9 pending
class P8_2,P8_3,P8_4,P8_5,P8_6 pending
class P9_0 done
class P9_1,P9_3,P9_4,P9_8 pending
class P10_0,P10_1,P10_2,P10_3,P10_4 pending
class P10_5 milestone
class F0,F1,F2,F4,F5,F6,F7,F8,F9,F10 pending
class F3 done
Legend: 🟢 Done | 🟠 In Progress | 🔵 Can Start Now | ⚪ Pending | 🟣 Milestone
Current Status (v0.9.0):
- Phase 1 (0.1.x): Complete
- Phase 2 (0.2.x): Complete
- Phase 3 (0.3.x): 0.3.0 + 0.3.1 done
- Phase 4 (0.4.x): 0.4.0-0.4.2 done (scope-aware fixture lifecycle for session/module/class)
- Phase 5 (0.5.x): 0.5.0-0.5.1 + 0.5.6 done
- Phase 6 (0.6.x): 0.6.0-0.6.4 done
- Phase 7 (0.7.x): 0.7.0 (history store) + 0.7.4 (TLS calibration) + 0.7.5 (adaptive scheduling) done
- Phase 8 (0.8.x): 0.8.0-0.8.6 done (GitHub Actions, JUnit XML, bench, ~68 CLI flags)
- Phase 9 (0.9.x): 0.9.0 (SIGCHLD crash detection) + 0.9.2 (CleanupGuard) + 0.9.5 (stale cleanup) + 0.9.7 (protocol versioning) done
- Test sharding (--shard) shipped early in 0.9.0 (originally planned post-1.0)
- 1083 tests, scope-aware fixtures, SIGCHLD crash detection, adaptive scheduling
- Phase 10: Not started
Research Foundation: This roadmap is informed by 12 research papers and competitive analysis of 10+ Rust-Python test tools. See research/README.md for paper analysis and implementation mapping, and external-research.md for competitive landscape.
| Tool | Approach | Startup | Tach Advantage |
|---|---|---|---|
| pytest-xdist | execnet workers | ~50-100ms | 1000x faster isolation |
| pytest-forked | fork() per test | ~500-1000μs | 10x faster reset |
| Maelstrom | Container per test | 50-100ms | 1000x faster startup |
| rtest/karva | No isolation | N/A | Full isolation + fixtures |
| snob | Test selection only | N/A | Full execution engine |
Key Insight: No existing tool combines Tach's speed (<50μs reset), isolation (userfaultfd), and compatibility (full pytest fixtures). See external-research.md §2.3 for detailed analysis.
| Feature | Tach | pytest-xdist | Maelstrom | snob | rtest |
|---|---|---|---|---|---|
| Per-test isolation | userfaultfd | None | Containers | None | None |
| Reset time | <50μs | N/A | 50-100ms | N/A | N/A |
| Full fixtures | Yes | Yes | Limited | N/A | No |
| Test selection | Planned | No | No | Yes | No |
| Distributed | Planned | Yes | Yes | No | No |
| Mutation testing | Planned | No | No | No | No |
| Static discovery | Yes | No | Yes | Yes | Yes |
Key Differentiator: Tach is the only tool combining sub-millisecond isolation with full pytest fixture support. Competitors sacrifice either speed (Maelstrom) or compatibility (rtest, karva).
Full Matrix: See container-compatibility.md for Docker, Podman, and Kubernetes configurations with capability requirements.
Full Matrix: See ../python-compatibility.md for Python 3.10-3.14 support, PyPy status, and free-threading implications.
Full Matrix: See isolation-landlock.md for Landlock ABI V1-V6 requirements and isolation-userfaultfd.md for userfaultfd kernel requirements.
Critical (Blocking Adoption):
- Plugin Shim (0.2.x): pytest-django, pytest-asyncio, pytest-mock support
Complete. Session effects, hook interception, marker extraction all working.
- Database Rollback (0.3.x): Transaction savepoint/rollback for Django ORM, SQLAlchemy
Database tests are ~40% of enterprise test suites. Memory snapshots don't restore DB state.
- Session/Module Fixtures (0.4.x): Fixtures persisting across tests
Complete in 0.9.0. Scope-aware scheduling with skip_reset preserves fixture state across module/class boundaries.
Important (Adoption Friction):
- pytest.raises/warns: Exception and warning assertion helpers
- Parametrized Fixtures: @pytest.fixture(params=[...])
- Marker Expressions: Full -m expression support (-m "slow and not db")
- conftest.py Hooks: pytest_configure, pytest_collection_modifyitems
Nice-to-Have (Competitive Edge):
- Test Impact Analysis: snob-style "only run affected tests" mode
Ref: alexpasmantier/snob - dependency graph analysis
Implementation approach:
- Build code-to-test dependency graph during discovery
- Track which source files affect which tests via import analysis
- Integrate with git diff for "affected tests only" mode
- Cache dependency graph with file hash invalidation
- Provide --affected CLI flag for CI integration
- Flaky Test Detection: nextest-style retry and flakiness tracking
- Distributed Execution: Maelstrom-style cluster mode for CI farms
Based on external-research.md §24:
From snob (Test Impact Analysis):
- Dependency graph analysis for test selection
- Git commit range integration (--affected --commit-range HEAD~5..HEAD)
- Cache dependency graph with file hash invalidation
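The snob-style selection described above can be sketched in a few lines. This is a minimal illustration and not Tach's implementation: it follows direct imports only, the file names and module names are hypothetical, and the planned --affected flag is not modeled. The content hash shows one way a cached graph entry could be invalidated when a file changes.

```python
import ast
import hashlib


def imported_modules(source: str) -> set[str]:
    """Collect absolute module names imported by a piece of Python source."""
    modules: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            modules.add(node.module)
    return modules


def content_hash(source: str) -> str:
    """Hash file contents so a cached graph entry can be invalidated on change."""
    return hashlib.sha256(source.encode("utf-8")).hexdigest()


def build_graph(test_sources: dict[str, str]) -> dict[str, dict]:
    """Map each test file to its direct imports and the hash they came from."""
    return {
        path: {"hash": content_hash(src), "imports": imported_modules(src)}
        for path, src in test_sources.items()
    }


def affected_tests(graph: dict[str, dict], changed_modules: set[str]) -> list[str]:
    """Return test files that directly import any changed module."""
    return sorted(p for p, entry in graph.items() if entry["imports"] & changed_modules)


# Hypothetical test files, given inline so the sketch runs without a repository.
TESTS = {
    "tests/test_models.py": "import app.models\n",
    "tests/test_views.py": "from app.views import index\n",
    "tests/test_misc.py": "import os\n",
}
GRAPH = build_graph(TESTS)
SELECTED = affected_tests(GRAPH, {"app.models"})
```

A real implementation would also need transitive imports and a mapping from changed file paths (for example from git diff) to module names, both of which are left out here.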
From nextest (CI Integration):
- Test partitioning for parallel CI jobs
- Flaky test detection with automatic retry
- Progress reporting UX patterns
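Test partitioning for parallel CI jobs can be illustrated with a stable hash over test IDs. This is a sketch under stated assumptions: the function names are hypothetical, and nothing here is claimed to match how Tach's --shard or nextest actually assign tests.

```python
import hashlib


def shard_of(test_id: str, total: int) -> int:
    """Map a test ID to a 1-based shard using a hash that is stable across runs."""
    digest = hashlib.sha256(test_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % total + 1


def select_shard(test_ids: list[str], index: int, total: int) -> list[str]:
    """Return the tests belonging to shard `index` of `total` (both 1-based)."""
    if not 1 <= index <= total:
        raise ValueError(f"shard index {index} outside 1..{total}")
    return [t for t in test_ids if shard_of(t, total) == index]


# Hypothetical test IDs split across three CI jobs.
IDS = [f"tests/test_mod.py::test_{i}" for i in range(20)]
SHARDS = [select_shard(IDS, i, 3) for i in (1, 2, 3)]
```

A cryptographic hash is used instead of Python's built-in hash() because string hashing is salted per process, which would make different CI jobs disagree about which shard a test belongs to.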
From Maelstrom (Distribution):
- Broker/worker architecture for cluster mode
- OCI-like container images for reproducibility
- Cross-node result aggregation
From pymute (Quality):
- Mutation testing integration
- Parallel mutant execution patterns
- Quality score reporting
| Version | Research Phase | Primary Paper | Key Deliverable |
|---|---|---|---|
| 0.1.x | Static Discovery | Python Testing Engine Rust Breakthroughs | AST-based test discovery eliminating "Import Tax" |
| 0.2.x | Plugin Isolation | Project Tach Compatibility Layer Blueprint | Shadow plugin shim with syscall interception |
| 0.3.x | Database Safety | Fork Safety of Python C-Extensions | Transactional rollback, connection dispose pattern |
| 0.4.x | Zygote Hierarchy | Forklift, Python Monorepo Zygote Tree Design | DAAC clustering for hierarchical pre-initialization |
| 0.5.x | Observability | Rust-CPython Execution Blueprint Research | PEP 669 low-impact monitoring integration |
| 0.6.x | Zero-Copy Loading | Zero-Copy Python Module Loading | mmap-based bytecode loading bypassing importlib |
| 0.7.x | Memory Snapshots | Python Memory Snapshotting with Userfaultfd | userfaultfd + MADV_DONTNEED microsecond reset |
| 0.8.x+ | Cross-Platform | Cross-Platform Process Cloning Research | mach_vm_remap (macOS), NT Section Objects (Windows) |
Before 1.0.0, verify all critical research requirements are met.
Tooling and Container Compatibility (Q1 2026):
| Requirement | Status | Documentation |
|---|---|---|
| .ignore File Interactions | Done | tooling-conflicts.md |
| Container Sandbox Behavior | Done | container-compatibility.md |
| Ignored Test Categories (24 total) | Done | test-discovery-analysis.md |
Original Research Requirements:
| Requirement | Research Source | External Ref | Status |
|---|---|---|---|
| Allocator Quiesce (thread.tcache.flush) | Memory Snapshotting with Userfaultfd | jemalloc mallctl | Pending |
| Toxicity Detection (fork-unsafe patterns) | Static Analysis for Toxic Python Modules | POSIX fork() | Pending |
| Namespace Isolation (CLONE_NEWNS/NET) | Compatibility Layer Blueprint | Landlock docs | Pending |
| Database Dispose (connection pools) | Fork Safety of Python C-Extensions | — | Pending |
| TLS Restoration (mimalloc, Python 3.13+) | Userfaultfd and CPython Allocator | mimalloc | Done |
| TLS Calibration (sentinel scan) | Userfaultfd and CPython Allocator | See src/isolation/calibration.rs | Done |
| Landlock Path Canonicalization | Compatibility Layer Blueprint | PathFd TOCTOU safety | Done |
| Seccomp Blacklist (22 syscalls) | Compatibility Layer Blueprint | See src/isolation/sandbox.rs | Done |
| Iron Dome Integration | Compatibility Layer Blueprint | apply_iron_dome() in sandbox.rs | Done |
| Graceful Degradation (kernel < 5.13) | Compatibility Layer Blueprint | SandboxStatus::NotEnforced handling | Done |
| GIL Management (py.allow_threads()) | — | PyO3 Parallelism | Pending |
| PyO3 0.26+ API Migration | — | PyO3 Migration | Pending |
| TLS Segment Registration (fs_base) | Userfaultfd and CPython Allocator | arch_prctl(2) | Pending |
| Free-Threaded Python (3.13t/3.14t) | — | py-free-threading | Pending |
Complete Index: See README.md for the full documentation map.
| Category | Count | Key Documents |
|---|---|---|
| Deep Dives | 7 | isolation-deep-dive.md, discovery-deep-dive.md, execution-deep-dive.md |
| Isolation Modules | 4 | isolation-landlock.md, isolation-seccomp.md, isolation-userfaultfd.md |
| Research & Analysis | 6 | external-research.md, topic-archive.md, container-compatibility.md |
| User Documentation | 7 | ../quickstart.md, ../configuration.md, ../troubleshooting.md |
Detailed specs in external-research.md and topic-archive.md
| Version | Feature | Learn From |
|---|---|---|
| 0.12.x | Remote Execution | Maelstrom broker/worker |
| 0.13.x | Test Sharding | nextest --shard N/M |
| 0.14.x | Visual Testing | Playwright snapshots |
| 0.15.x | AI-Powered | Flaky detection, test gen |
| 0.16.x | Mutation Testing | pymute patterns |
| 0.17.x | Property-Based | hypothesis integration |
| 0.18.x | Contract Testing | OpenAPI validation |
| 0.19.x | Benchmarking | @benchmark marker |
| 0.20.x | Observability | OpenTelemetry, Prometheus |
Status: All 5 milestones delivered. See CHANGELOG.md for release details.
Research Foundation: Implements the "Kineton" engine from Python Testing Engine Rust Breakthroughs.
| Version | Focus | Key Deliverables |
|---|---|---|
| 0.1.1 | Docs & Polish | Examples directory, quickstart guide, shell completions, --dry-run |
| 0.1.2 | Test Compatibility | pytest.raises/warns/approx, traceback formatting, timeout handling |
| 0.1.3 | Error Handling | Error categorization (E001-E020), --diagnose flag, remediation suggestions |
| 0.1.4 | Dependencies | PyO3 0.27.2, Rust 2024 Edition, Python 3.14 support |
| 0.1.5 | Tooling Research | .ignore conflicts, container compatibility, test discovery analysis |
Implementation Details: For complete task breakdown and research references, see git history for v0.1.1-v0.1.5 tags.
Focus: Shadow plugin shim for pytest ecosystem integration without full pluggy support.
Research Foundation: Implements the "Matrix Layer" from Project Tach Compatibility Layer Blueprint for syscall isolation.
- "Isolation without overhead requires moving from userspace interception to kernel-level integration—combined with a pragmatic plugin shim that records and replays pytest internals" — Project Tach Compatibility Layer Blueprint
- "Every syscall that modifies global state is transparently isolated per-worker with <5% overhead" — Project Tach Compatibility Layer Blueprint
The 0.2.x series introduces a plugin compatibility layer that intercepts common pytest plugin hooks. This is NOT full pluggy support - instead, we implement targeted shims for the most popular plugins.
Development Flow: See the main flowchart at the top of this document for task dependencies. Items 0.2.1-0.2.4 can be developed in parallel after 0.2.0 is complete.
Target: Core infrastructure for intercepting pytest hooks.
Status: Complete
Completed: Hook registry types with Serde, 10 builtin hook specs, hook detection in conftest.py, marker extraction from decorators (with JSON output), autouse fixture detection, path canonicalization for hook matching, SysPathAction enum (type-safe), session effects IPC bridge (Zygote → Supervisor → Workers), debug logging for effect application, pytest_sessionstart in SESSION_HOOKS, HookEffect enum with all variants, toxicity integration for global-state-modifying hooks, conftest inheritance resolution, effect recording for pytest_configure/sessionstart, effect replay in workers, IPC protocol extension, plugin detection and warning system, HookResult type and aggregation strategies, HookCaller with PyO3 bridge, hook dependency graph, plugin shim registry.
- Design hook interception architecture
Ref: "Most pytest plugins perform one of three actions: Metadata modification, Fixture setup, or Reporting. Only (1) and (2) must be captured" — Project Tach Compatibility Layer Blueprint
- Hook registry for tracking available hooks
- Hook types with Serde derives for IPC serialization
- 10 builtin hook specs (pytest_configure, pytest_runtest_*, etc.)
- Hook caller that invokes registered handlers
- Hook result aggregation (first-result, all-results)
- Hook wrapper specifications
- Implement conftest.py discovery and loading
- Scan for conftest.py in test directories (existing)
- Parse hook function definitions
- Extract pytest markers from @pytest.mark.* decorators
- Detect autouse fixtures
- Build hook dependency graph
- Handle conftest inheritance
- pytest_configure(config) - Plugin configuration
- pytest_collection_modifyitems(items) - Test collection modification
Ref: "By recording effects in the parent and replaying them in the child, Tach avoids the need to re-run complex plugin logic in every worker" — Project Tach Compatibility Layer Blueprint
- pytest_runtest_setup(item) - Pre-test setup
- pytest_runtest_teardown(item) - Post-test teardown
- pytest_runtest_makereport(item, call) - Result reporting
- pytest_sessionstart(session) - Session initialization
- pytest_sessionfinish(session) - Session cleanup
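The record-and-replay idea quoted above can be sketched as follows. The Effect class, its two kinds, and the JSON wire format are hypothetical stand-ins for illustration; the roadmap only states that a HookEffect enum with Serde serialization exists on the Rust side, not what its variants look like.

```python
import json
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class Effect:
    """One recorded side effect. The kinds used here are hypothetical examples."""
    kind: str   # "set_env" or "sys_path_insert"
    key: str    # variable name, or list index as a string
    value: str


def serialize(effects: list[Effect]) -> str:
    """Encode recorded effects for transfer from the parent to a worker."""
    return json.dumps([asdict(e) for e in effects])


def replay(payload: str, environ: dict[str, str], path: list[str]) -> None:
    """Apply recorded effects to a worker's environment and import path."""
    for raw in json.loads(payload):
        effect = Effect(**raw)
        if effect.kind == "set_env":
            environ[effect.key] = effect.value
        elif effect.kind == "sys_path_insert":
            path.insert(int(effect.key), effect.value)
        else:
            raise ValueError(f"unknown effect kind: {effect.kind}")


# Parent side: effects observed while running a session hook once.
RECORDED = [
    Effect("set_env", "DJANGO_SETTINGS_MODULE", "proj.settings"),
    Effect("sys_path_insert", "0", "/srv/app"),
]
PAYLOAD = serialize(RECORDED)

# Worker side: plain containers stand in for os.environ and sys.path.
WORKER_ENV: dict[str, str] = {}
WORKER_PATH: list[str] = ["/usr/lib/python"]
replay(PAYLOAD, WORKER_ENV, WORKER_PATH)
```

The point of the pattern is that the hook body runs once in the parent, while each worker only applies a small list of data records.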
- Detect installed pytest plugins via pkg_resources
- Create plugin shim registry
Ref: "The Tach supervisor creates a per-worker isolated namespace at clone time" — Project Tach Compatibility Layer Blueprint
- Log warnings for unsupported plugins
- Allow disabling specific plugins via config
- Support plugin ordering/priority
Status: ✅ COMPLETE (Core Infrastructure)
Target: First-class Django test support.
Parallelization: Can be developed in parallel with 0.2.2, 0.2.3, and 0.2.3.1. Only requires 0.2.0 (hook framework) to be complete. No dependencies on other 0.2.x versions.
Note: Marker detection (django_db, urls, etc.) is already implemented in core discovery. Tests marked with @pytest.mark.django_db are detected and the marker name is available in TestCase.markers. The items below are about executing the marker behavior.
- @pytest.mark.django_db - Basic marker detection and savepoint isolation
- DjangoDbSetup HookEffect for database configuration
- SAVEPOINT-based transaction rollback in harness
- pytest-django registered as "Supported" plugin
- Integration tests in tests/gauntlet_django/
See GitHub issues for tracking:
- transaction=True - Use real transactions (#40)
- reset_sequences=True - Reset auto-increment (#36)
- databases=['default', 'secondary'] - Multi-db (#38)
- @pytest.mark.urls('myapp.test_urls') - URL override (#35)
- @pytest.mark.ignore_template_errors - Template error handling (#35)
See #39 for tracking:
- client - Django test client
- rf - Request factory
- admin_client - Logged-in admin client
- admin_user - Admin user instance
- django_user_model - User model class
- django_username_field - Username field name
- settings - Settings override context manager
- live_server - Live server URL
- db - Database access fixture
- transactional_db - Transactional database
- Hook into Django's transaction management (savepoint-based)
- Preserve database connections across test resets
- Handle database migrations in test database
- Support --reuse-db flag for faster test runs (#37)
- Support --create-db flag for fresh database (#37)
- Handle multi-database configurations (#38)
- Support database aliases (#38)
Target: Native async/await test support.
Parallelization: Can be developed in parallel with 0.2.1, 0.2.3, and 0.2.3.1. Only requires 0.2.0 (hook framework) to be complete. No dependencies on other 0.2.x versions.
- Detect async test functions (async def test_...)
Already implemented in core discovery - TestCase.is_async field
- Detect async fixtures (@pytest.fixture on async functions)
- Handle sync tests that use async fixtures
- Support async context managers
- Handle async generators
- Create event loop per test (default)
Ref: "To solve this, we employ tokio::task::LocalSet to pin interpreter-specific tasks to their originating thread" — Rust-CPython Execution Blueprint Research
- Support session-scoped event loop via marker
- Properly cleanup event loop after test
- Handle asyncio.run() calls within tests
- Support custom event loop policies
- Handle uvloop integration
- @pytest.mark.asyncio - Mark async tests
- @pytest.mark.asyncio(loop_scope="session") - Shared loop
- @pytest.mark.asyncio(loop_scope="module") - Module loop
- Automatic async test detection mode
- Run async tests with proper timeout handling
- Support await in async fixtures
- Handle async context managers in fixtures
- Proper cancellation on test timeout
- Support gather/wait patterns
- Handle TaskGroup cleanup
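The "event loop per test" default and "cancellation on test timeout" items can be illustrated with the standard library alone. This is a minimal sketch, not Tach's runner: run_async_test and the two sample coroutines are hypothetical names, and fixtures, markers, and shared loop scopes are not modeled.

```python
import asyncio


def run_async_test(test_func, timeout: float) -> str:
    """Run one coroutine test in a fresh event loop and report its outcome."""

    async def runner() -> None:
        # wait_for cancels the test coroutine if it exceeds the timeout.
        await asyncio.wait_for(test_func(), timeout)

    try:
        # asyncio.run creates a new loop, runs the coroutine, then closes the loop.
        asyncio.run(runner())
    except asyncio.TimeoutError:
        return "timeout"
    except AssertionError:
        return "failed"
    return "passed"


async def sample_fast() -> None:
    await asyncio.sleep(0)
    assert 1 + 1 == 2


async def sample_hangs() -> None:
    await asyncio.sleep(60)


OUTCOMES = {
    "fast": run_async_test(sample_fast, timeout=1.0),
    "hangs": run_async_test(sample_hangs, timeout=0.05),
}
```

Because every call gets its own loop, pending tasks or loop-level state from one test cannot leak into the next; a session- or module-scoped loop would trade that isolation for shared async fixtures.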
Target: Support for commonly used pytest plugins and additional Django markers.
Status: ✅ COMPLETE
Parallelization: Can be developed in parallel with 0.2.1 and 0.2.2. Only requires 0.2.0 (hook framework) to be complete.
- mocker fixture providing unittest.mock wrappers
Works natively via pytest's fixture resolution. Tach does not intercept.
- mocker.patch() context manager
- mocker.patch.object() method
- mocker.patch.dict() dictionary patching
- mocker.spy() for call tracking
- mocker.stub() for stub creation
- Automatic mock cleanup after each test
Handled by pytest-mock's built-in teardown
- Support mocker.stopall()
- Read [pytest_env] from pyproject.toml
- Set environment variables before test collection
- Support variable expansion ({VAR})
Note: Uses {VAR} format per pytest-env convention, not ${VAR}
- Preserve original values for restoration
Note: pytest-env does NOT restore values by design. This requirement was incorrect.
- Support conditional env vars
Basic support via expansion. Full conditional syntax deferred.
@pytest.mark.timeout(30)marker support - Global timeout via config
- Timeout methods: signal, thread
Note: Tach uses supervisor-level process termination (SIGTERM/SIGKILL)
- Timeout callback for custom handling
Implemented via
timeout_hookin config - Per-phase timeouts (setup, call, teardown)
Note: Current implementation is aggregate timeout. Per-phase is future enhancement.
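Supervisor-level termination with an aggregate timeout can be sketched with subprocess. This is an illustration only: run_with_timeout and its grace parameter are hypothetical, and a child Python process stands in for a Tach worker.

```python
import subprocess
import sys


def run_with_timeout(cmd: list[str], timeout: float, grace: float = 1.0) -> str:
    """Run `cmd`; on timeout send SIGTERM, then SIGKILL if it does not exit."""
    proc = subprocess.Popen(cmd)
    try:
        code = proc.wait(timeout=timeout)
        return "passed" if code == 0 else "failed"
    except subprocess.TimeoutExpired:
        proc.terminate()          # SIGTERM on POSIX
        try:
            proc.wait(timeout=grace)
        except subprocess.TimeoutExpired:
            proc.kill()           # SIGKILL on POSIX
            proc.wait()
        return "timeout"


# A quick child and a child that would sleep far past its budget.
QUICK = [sys.executable, "-c", "pass"]
SLEEPER = [sys.executable, "-c", "import time; time.sleep(30)"]
TIMEOUT_RESULTS = (run_with_timeout(QUICK, 10.0), run_with_timeout(SLEEPER, 0.3))
```

Since the deadline is enforced from outside the worker, it covers setup, call, and teardown as one budget, which matches the aggregate behavior described in the note.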
- @pytest.mark.urls('myapp.test_urls') - Override ROOT_URLCONF per test
- @pytest.mark.ignore_template_errors - Suppress template errors
- Positional argument extraction in Rust scanner
- URL cache clearing on override/restore
- Template debug mode toggle
The following django_db marker options require deeper database transaction support:
- transaction=True - Use real transactions (not savepoints)
- reset_sequences=True - Reset auto-increment sequences
- databases=['default', 'secondary'] - Multi-database support
- Detect pytest-cov and warn about Tach's native coverage
Ref: "employs PEP 669 (Low-Impact Monitoring) to achieve observability with negligible overhead" — Rust-CPython Execution Blueprint Research
- Suggest using --coverage flag instead
- Disable pytest-cov when Tach coverage is active
- Support coverage configuration options
- Detect pytest-xdist and warn about Tach's native parallelism
Ref: "Objects passed between orchestrator and worker processes must be serialized, a CPU-intensive operation that often negates the benefits of parallelism for short-running tests" — Python Testing Engine Rust Breakthroughs
- Support -n flag as alias for --workers
- Ignore xdist-specific markers gracefully
Target: Use Landlock for network isolation when available, reducing reliance on CLONE_NEWNET.
Status: ✅ COMPLETE
Parallelization: Fully independent. Can be developed at any time after 0.2.0. This is a kernel feature enhancement with no dependencies on plugin shims (0.2.1-0.2.3).
- Detect Landlock ABI V4+ at runtime
- Implement TCP bind restrictions per worker
Workers should only bind to assigned port ranges
- Implement TCP connect restrictions
Block outbound connections except to localhost and configured hosts
- Graceful fallback to CLONE_NEWNET on older kernels
[tool.tach.network]
allow_localhost = true
allow_connect = ["api.example.com:443"]
allow_bind_ports = [8000, 8080] # Empty = no binding allowed
External Ref: Landlock Kernel Docs - Network
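How the three [tool.tach.network] keys might be evaluated is sketched below. The NetworkPolicy class and its methods are hypothetical, and the matching rules (exact "host:port" strings, a fixed set of loopback names) are assumptions made for illustration. Landlock's network rules are keyed by TCP port, so how host names in allow_connect would map onto kernel rules is left open here.

```python
from dataclasses import dataclass, field

LOCALHOST = {"localhost", "127.0.0.1", "::1"}


@dataclass
class NetworkPolicy:
    """In-memory form of the [tool.tach.network] table shown above."""
    allow_localhost: bool = True
    allow_connect: list[str] = field(default_factory=list)     # "host:port" entries
    allow_bind_ports: list[int] = field(default_factory=list)  # empty = no binding

    def may_connect(self, host: str, port: int) -> bool:
        """Allow loopback if enabled, otherwise require an exact host:port match."""
        if self.allow_localhost and host in LOCALHOST:
            return True
        return f"{host}:{port}" in self.allow_connect

    def may_bind(self, port: int) -> bool:
        """Allow binding only to explicitly listed ports."""
        return port in self.allow_bind_ports


# Same values as the example configuration.
POLICY = NetworkPolicy(
    allow_localhost=True,
    allow_connect=["api.example.com:443"],
    allow_bind_ports=[8000, 8080],
)
```

With these values, POLICY.may_connect("api.example.com", 443) is allowed while port 80 on the same host is not, and a default NetworkPolicy() refuses every bind.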
Target: Ensure plugin shims work correctly with real-world projects.
Status: ✅ COMPLETE
Parallelization: SEQUENTIAL - Must wait for 0.2.1, 0.2.2, and 0.2.3 to complete. This version tests and stabilizes all plugin shims, so the plugins must exist first.
- Create plugin compatibility test suite
- Test against popular open-source Django projects
- Test against popular async projects (FastAPI, aiohttp)
- Document plugin compatibility matrix
- Create plugin integration tests
- Benchmark plugin overhead
- Optimize hook dispatch path
- Cache conftest.py parsing results
- Lazy-load plugin shims
Focus: Transaction rollback and connection handling for database-heavy test suites.
Research Foundation: Addresses the "Fork-Safety Paradox" from Fork Safety of Python C-Extensions and database isolation from Rust-Python Test Isolation Blueprint.
- "The fundamental assumptions of fork()—specifically regarding memory isolation and state duplication—are incompatible with the complex internal threading pools, global state mutexes, and hardware contexts managed by modern C libraries" — Fork Safety of Python C-Extensions
- "Ensure that any connection pool created in the parent is explicitly discarded in the child process immediately after startup" — Fork Safety of Python C-Extensions
- "Injecting SAVEPOINT and ROLLBACK TO SAVEPOINT to make DB tests I/O-free" — Rust-Python Test Isolation Blueprint
The 0.3.x series focuses on database test isolation. The key insight is that database state cannot be restored via memory snapshots - we need to hook into the database driver level to rollback transactions.
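The savepoint approach can be demonstrated with the standard library's sqlite3 module. This is a minimal sketch under stated assumptions: the rolled_back helper and the savepoint name are hypothetical, and SQLite stands in for whatever database a real suite uses.

```python
import sqlite3
from contextlib import contextmanager


@contextmanager
def rolled_back(conn: sqlite3.Connection, name: str = "test_start"):
    """Run a block inside a savepoint and always roll it back afterwards."""
    conn.execute(f"SAVEPOINT {name}")
    try:
        yield conn
    finally:
        # Undo everything since the savepoint, then drop the savepoint itself.
        conn.execute(f"ROLLBACK TO SAVEPOINT {name}")
        conn.execute(f"RELEASE SAVEPOINT {name}")


def count_users(conn: sqlite3.Connection) -> int:
    return conn.execute("SELECT COUNT(*) FROM users").fetchone()[0]


# isolation_level=None stops sqlite3 from opening transactions implicitly.
CONN = sqlite3.connect(":memory:", isolation_level=None)
CONN.execute("CREATE TABLE users (name TEXT)")
CONN.execute("INSERT INTO users VALUES ('seed')")

with rolled_back(CONN):
    CONN.execute("INSERT INTO users VALUES ('written by test')")
    DURING = count_users(CONN)

AFTER = count_users(CONN)
```

The rollback runs in a finally block, so the database returns to its pre-test state whether the test body passes or raises.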
Target: Django ORM transaction rollback.
- Hook into django.db.transaction.atomic()
Ref: "Regardless of success or failure, Tach injects ROLLBACK TO SAVEPOINT tach_test_start. This instantly reverts the database state" — Rust-Python Test Isolation Blueprint
- Wrap each test in a savepoint
- Rollback savepoint after test completion
- Handle nested transactions correctly
- Support transaction.on_commit() hooks
- Handle transaction.non_atomic_requests
- Track all database aliases in use
- Apply transaction wrapping to all databases
- Handle cross-database queries
- Support database routers
- Handle read replicas
- Keep database connections alive across tests
Ref: "Ensure that any connection pool created in the parent is explicitly discarded in the child process immediately after startup" — Fork Safety of Python C-Extensions
- Reset connection state without closing
- Handle connection pool exhaustion
- Reconnect on connection drop
- Monitor connection health
- Detect migration state at startup
- Skip migration if test database exists and is current
- Support --create-db flag to force recreation
- Handle migration conflicts gracefully
- Support migration squashing
Target: SQLAlchemy session management.
Status: ✅ COMPLETE
- Hook into Session.commit() to prevent actual commits
- Wrap sessions in nested transactions (savepoints)
- Handle Session.rollback() within tests
- Support scoped session patterns
- Handle session-per-request patterns
- Detect SQLAlchemy engine configuration
- Apply connection pooling optimizations
- Handle multiple engines (read replicas, etc.)
- Support async SQLAlchemy (asyncpg, aiosqlite)
- Handle engine disposal
- Detect Alembic migration configuration
- Verify migration state matches expected
- Support running migrations before tests
- Handle migration downgrade on test database
- Support migration branching
Target: Advanced connection pool handling.
- Keep connection pools alive across worker restarts
- Implement FD handover via SCM_RIGHTS
Ref: "Pass FDs to worker processes via Unix sockets. Reconstruct connection objects from FDs" — Project Tach Compatibility Layer Blueprint
- Handle pool size limits correctly
- Monitor connection health
- Support connection aging
- Capture database connection file descriptors
- Pass FDs to worker processes via Unix sockets
- Reconstruct connection objects from FDs
- Handle SSL connections specially
Ref: "SSL error: decryption failed or bad record mac" — Fork Safety of Python C-Extensions
- Support connection metadata transfer
- Verify connection validity before test
- Detect stale connections
- Reconnect automatically on failure
- Log connection pool statistics
- Emit metrics for monitoring
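The FD handover step can be shown with socket.send_fds and socket.recv_fds (Python 3.9+, Unix only), which wrap SCM_RIGHTS ancillary messages. In this sketch a socket pair stands in for the supervisor/worker channel and a pipe stands in for a database connection; both ends live in one process purely to keep the example self-contained.

```python
import os
import socket

# A connected Unix socket pair stands in for the supervisor/worker channel.
supervisor, worker = socket.socketpair(socket.AF_UNIX, socket.SOCK_STREAM)

# A pipe stands in for a database connection's file descriptor.
read_end, write_end = os.pipe()

# Supervisor side: send the descriptor as SCM_RIGHTS ancillary data.
socket.send_fds(supervisor, [b"conn"], [write_end])

# Worker side: receive a new descriptor referring to the same open file.
label, received_fds, _flags, _addr = socket.recv_fds(worker, 1024, 1)
os.write(received_fds[0], b"ping")
ECHO = os.read(read_end, 4)

# Close every descriptor, including the duplicate created by the transfer.
for fd in (read_end, write_end, *received_fds):
    os.close(fd)
supervisor.close()
worker.close()
```

Only the descriptor crosses the socket. Driver-level state such as a TLS session or authentication handshake is not carried along, which is consistent with the list treating SSL connections as a special case.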
Target: Support for other database systems.
- Support PostgreSQL savepoints natively
- Handle advisory locks
- Support LISTEN/NOTIFY cleanup
- Handle temp tables correctly
- Support PostgreSQL-specific types
- Handle pg_dump/pg_restore for fixtures
- Support MySQL savepoints
- Handle MySQL-specific locking
- Support MySQL 8.0+ features
- Handle character set issues
- Support MariaDB extensions
- In-memory database optimization
- File-based database snapshotting
- Handle WAL mode correctly
- Support shared cache mode
- Handle SQLite concurrent access
- Hook into PyMongo sessions
- Transaction support (requires replica set)
- Collection cleanup approach for non-transactional
- Document limitations
- Support Motor (async MongoDB)
- Support Redis transactions
- Handle Redis pub/sub cleanup
- Support Redis Cluster
- Handle connection pooling
- Auto-detect gRPC usage in test dependencies
- Set `GRPC_ENABLE_FORK_SUPPORT=1` environment variable
Ref: "gRPC fork safety requires GRPC_ENABLE_FORK_SUPPORT=1 and epoll1 polling" — Fork Safety of Python C-Extensions
- Verify `epoll1` polling engine compatibility
- Warn if active RPCs detected before fork
gRPC fork support only works with no active RPCs
- Document gRPC-specific test patterns
External Ref: gRPC Fork Support
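One way the environment setup could look, as a sketch only: the helper below builds an environment mapping with the two gRPC settings named in the reference above. `grpc_fork_env` is a hypothetical name, and using `setdefault` so that explicit user settings win is an assumption about the desired policy.

```python
import os


def grpc_fork_env(base_env):
    """Return a copy of base_env with gRPC fork-support settings filled in."""
    env = dict(base_env)
    # Respect values the user already set; only fill in missing ones.
    env.setdefault("GRPC_ENABLE_FORK_SUPPORT", "1")
    env.setdefault("GRPC_POLL_STRATEGY", "epoll1")
    return env
```

The variables only take effect if they are set before `grpc` is first imported, for example via `os.environ.update(grpc_fork_env(os.environ))` early in zygote startup.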
Focus: Proper handling of session-scoped and module-scoped fixtures.
Research Foundation: Implements "Hierarchical Zygote Trees" from Forklift and Python Monorepo Zygote Tree Design using DAAC clustering.
- "By moving beyond the traditional single-zygote model to a tiered, hierarchical structure, the proposed system maximizes memory sharing via Copy-on-Write (CoW) mechanisms" — Python Monorepo Zygote Tree Design
- "The root node contains universally shared modules (e.g., os, sys). Child nodes branch off to specialize (e.g., a 'Data Science Zygote' adds numpy)" — Python Monorepo Zygote Tree Design
- "A novel 'Dependency-Aware Agglomerative Clustering' (DAAC) algorithm that synthesizes the dependency graph into an optimal initialization tree" — Python Monorepo Zygote Tree Design
The 0.4.x series addresses one of the biggest gaps in the current implementation: fixtures that should persist across multiple tests. Session-scoped fixtures in particular are tricky because they must survive worker restarts.
Target: Fixtures that persist for the entire test session.
Status: Complete (v0.9.0) - Session-scoped autouse fixtures execute in zygote before fork.
- Identify session-scoped fixtures at discovery time
Ref: "The forked process receives the list of modules to add via a pipe. It imports them. This process becomes the 'DataScience Zygote'" — Python Monorepo Zygote Tree Design
- Execute session fixtures before any tests run
- Store fixture values in shared memory
Ref: "This 'Zero-Copy' approach reduces the overhead of data transfer from O(N) (serialization) to O(1) (pointer passing)" — Rust-Python Test Isolation Blueprint
- Make values available to all workers
- Handle fixture dependencies
- Define serialization protocol for fixture values
- Handle pickle-able objects directly
Ref: "Objects passed between orchestrator and worker processes must be serialized (pickled) and deserialized, a CPU-intensive operation" — Python Testing Engine Rust Breakthroughs
- Support custom serializers for complex objects
- Handle non-serializable fixtures (connections, etc.)
- Support cloudpickle for lambda functions
- Track session fixture finalizers
- Run finalizers after all tests complete
- Handle finalizer errors gracefully
- Support async finalizers
- Ensure finalizer ordering
Target: Fixtures that persist for a single module.
Status: Complete (v0.9.0) - Scheduler groups tests by module and dispatches sequentially with skip_reset.
- Group tests by module at scheduling time
Ref: "In this model, zygotes are specialized at different levels of a dependency tree. A root zygote might hold the OS-level dependencies; a second-level zygote might import pandas and numpy" — Rust Static Analysis for Toxic Python Modules
- Track module transitions during execution
- Trigger fixture finalization on module change
- Handle module re-entry
- Setup module fixtures before first test in module
- Cache fixture values during module execution
- Teardown fixtures when leaving module
- Handle module import errors gracefully
- Support fixture reuse within module
- Batch tests from same module to same worker
Ref: "We define a Weight Vector W where W[j] corresponds to the estimated cost of module m_j. These weights are derived from heuristics or optional historical profiling data" — Python Monorepo Zygote Tree Design
- Minimize fixture setup/teardown overhead
- Share module fixtures between workers when safe
- Prefetch module fixtures
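The module-grouping step can be illustrated with pytest-style node IDs. This is a sketch under the assumption that tests are identified as `path::name`; the function name `group_by_module` is hypothetical, and Tach's scheduler may represent tests differently.

```python
def group_by_module(node_ids):
    """Group node IDs by their module path, preserving first-seen order."""
    groups = {}
    for node_id in node_ids:
        # Everything before the first "::" is the module's file path.
        module_path = node_id.split("::", 1)[0]
        groups.setdefault(module_path, []).append(node_id)
    return groups
```

Each resulting batch can then be dispatched to a single worker, so module-scoped fixtures are set up once per batch instead of once per test.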
Target: Fixtures that persist for a test class.
Status: Complete (v0.9.0) - Scheduler groups tests by class and dispatches sequentially with skip_reset.
- Group tests by class at scheduling time
- Track class transitions during execution
- Handle class inheritance correctly
- Support nested test classes
- Setup class fixtures before first test in class
- Cache fixture values during class execution
- Teardown fixtures when leaving class
- Handle setup_class/teardown_class methods
Target: Complete fixture compatibility with pytest.
- Detect `@pytest.fixture(autouse=True)`
- Automatically apply to matching tests
- Respect fixture scope for autouse
- Handle autouse in conftest.py
- Support conditional autouse
- Build fixture dependency graph
Ref: "A novel 'Dependency-Aware Agglomerative Clustering' (DAAC) algorithm that synthesizes the dependency graph into an optimal initialization tree" — Python Monorepo Zygote Tree Design
- Teardown in reverse dependency order
- Handle circular dependencies
- Support `yield` fixtures correctly
- Handle generator fixtures
- Support `@pytest.fixture(params=[...])`
- Generate test variants for each param
- Handle fixture param ids
- Support indirect parametrization
- Support fixture param marks
- Add `--fixtures` flag to show available fixtures
- Add `--fixture-graph` to visualize dependencies
Ref: "The Rust resolver calculates the module's fully qualified name based on its file path relative to the nearest `__init__.py` or namespace root" — Python Monorepo Zygote Tree Design
- Show fixture scope and autouse status
- Indicate where fixtures are defined
- Export fixture graph as DOT/Mermaid
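For the dependency-graph items above (setup order, teardown in reverse order, circular dependency detection), a minimal sketch using the standard-library `graphlib` follows. The mapping format, fixture name to the set of fixtures it depends on, and both function names are assumptions made for illustration.

```python
from graphlib import CycleError, TopologicalSorter


def fixture_orders(dependencies):
    """Return (setup_order, teardown_order) for a fixture dependency map."""
    # static_order() yields every fixture after the fixtures it depends on.
    setup = list(TopologicalSorter(dependencies).static_order())
    return setup, setup[::-1]


def has_cycle(dependencies):
    """Report whether the fixture graph contains a circular dependency."""
    try:
        fixture_orders(dependencies)
    except CycleError:
        return True
    return False
```

Teardown is simply the reversed setup order, which guarantees that a fixture is finalized before anything it depends on.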
Focus: Better error messages, debugging tools, and developer ergonomics.
Research Foundation: Integrates PEP 669 low-impact monitoring from Rust-CPython Execution Blueprint Research for observability.
- "employs PEP 669 (Low-Impact Monitoring) to achieve observability with negligible overhead" — Rust-CPython Execution Blueprint Research
- "the runner is a high-performance native binary—constructed in Rust—that acts as a hypervisor for the Python runtime" — Rust-CPython Execution Blueprint Research
The 0.5.x series focuses on making Tach a joy to use. Better error messages, powerful debugging tools, and smoother integration with development workflows.
Target: pytest-quality error output.
- Implement pytest-style short tracebacks
- Show only relevant frames (hide internal frames)
- Highlight the assertion line
- Support `--tb=short`, `--tb=long`, `--tb=native`
- Support `--tb=line` for one-line summaries
- Support `--tb=no` to disable tracebacks
- Capture local variables at assertion failure
Ref: "The evaluator inspects the f_code of the frame. It checks a high-performance Rust hash map to see if a mock has been registered" — Python Testing Engine Rust Breakthroughs
- Display variable values inline with traceback
- Truncate large values intelligently
- Support `--showlocals` flag
- Color-code variable types
- Parse assertion expressions
Ref: "The AST visitor walks the tree of a function. It serializes the nodes into a byte stream, deliberately excluding: Docstrings, Type hints, and Formatting" — Python Testing Engine Rust Breakthroughs
- Show sub-expression values
- Support comparison operators (`==`, `!=`, `<`, etc.)
- Handle complex expressions (`assert x in y`)
- Support `assert` with messages
- Show diffs for string comparisons
- Show diffs for dict comparisons
- Show diffs for list comparisons
- Color-code additions/deletions
- Support unified diff format
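To make the assertion-introspection items concrete, here is a rough sketch that parses an `assert` comparison with the standard `ast` module and evaluates each operand separately. The function name `explain_compare` is hypothetical. It only covers plain comparisons and relies on `eval`, so it illustrates the idea and is not a proposed implementation.

```python
import ast


def explain_compare(source, namespace):
    """Map each operand of an `assert a <op> b` statement to its value."""
    statement = ast.parse(source).body[0]
    if not isinstance(statement, ast.Assert) or not isinstance(statement.test, ast.Compare):
        raise ValueError("expected an assert statement with a comparison")
    operands = [statement.test.left, *statement.test.comparators]
    values = {}
    for node in operands:
        # Compile each sub-expression on its own so it can be evaluated alone.
        code = compile(ast.Expression(body=node), "<assert>", "eval")
        values[ast.unparse(node)] = eval(code, {}, dict(namespace))
    return values
```

For `assert total == len(items)` with `total = 3` and a two-element `items`, the result maps `total` to 3 and `len(items)` to 2, which is the information a failure report needs to show.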
Status: 🔨 IN PROGRESS
Target: Deep visibility into Tach internals.
- `--debug` flag for detailed logging
- Log syscall activity (userfaultfd, fork, etc.)
Ref: "The userfaultfd subsystem fundamentally alters the contract between the memory management unit (MMU) and the user-space application" — Python Memory Snapshotting with Userfaultfd
- Log worker lifecycle events
- Log memory snapshot timing
- Log IPC message flow
- Show worker status in real-time
- Display which test each worker is running
- Show queue depth and scheduling decisions
- Indicate safe vs toxic workers
Ref: "The result is a binary classification for every module in the monorepo: Safe or Toxic" — Rust Static Analysis for Toxic Python Modules
- Show worker memory usage
- Measure time in discovery, execution, reporting
- Show per-test timing breakdown
- Identify slow fixture setup
- Profile memory snapshot overhead
Ref: "If a 1GB heap is snapshotted, but the subsequent execution only touches 50KB, only those 50KB are physically copied and mapped" — Python Memory Snapshotting with Userfaultfd
- Generate flamegraphs
Target: Seamless debugger integration.
- `--pdb` flag to drop into debugger on failure
- Detect `breakpoint()` calls in tests
- Disable worker isolation when debugging
Ref: "The Supervisor sets the user's physical terminal to Raw Mode. It enters a loop where it reads bytes from the user's stdin and writes them directly to the worker's PTY master" — Project Tach Compatibility Layer Blueprint
- Support `--pdb-first` for first failure only
- Support custom debuggers (ipdb, pudb)
- Capture exception state for post-mortem
- Support `pytest.set_trace()` equivalent
- Handle debugger in forked workers
- Serialize debugger context if needed
- Document VS Code launch configurations
- Document PyCharm run configurations
- Support remote debugging
- Handle debugger attach to workers
- Support DAP (Debug Adapter Protocol)
Target: Flexible output formatting.
- Support `--color=auto/always/never`
- Support `--no-header` for minimal output
- Support `--quiet` for summary only
- Support `--verbose` levels (-v, -vv, -vvv)
- Support custom output templates
- Support different progress styles (bar, dots, verbose)
- Support `--no-progress` for CI
- Show ETA for test completion
- Show test rate (tests/second)
Target: Near-zero overhead coverage using SlipCover patterns.
Research Foundation: SlipCover achieves 5% overhead vs 218% for coverage.py via runtime de-instrumentation.
- Implement line-level de-instrumentation after first execution
Ref: "Periodically de-instrument covered lines. Overhead proportional to uncovered code" — SlipCover Paper
- Branch de-instrumentation for already-covered branches
- Hot-path detection to skip instrumentation entirely
- Incremental coverage mode (only instrument changed files)
- Use `sys.monitoring.DISABLE` return value for one-shot events
Ref: "Events can be disabled after first firing" — PEP 669
- Benchmark against coverage.py and SlipCover
- Target: <5% overhead for typical test suites
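The one-shot event idea can be sketched with the PEP 669 `sys.monitoring` API (Python 3.12+). The callback returns `sys.monitoring.DISABLE`, so each instrumented location reports at most once until `sys.monitoring.restart_events()` is called. The function name `count_line_events` and the tool name string are hypothetical, and the sketch assumes no other tool currently holds `COVERAGE_ID`.

```python
import sys
from collections import Counter


def count_line_events(func):
    """Run func() and count LINE events per line offset within func."""
    mon = sys.monitoring  # available on Python 3.12 and later
    tool_id = mon.COVERAGE_ID
    hits = Counter()

    def on_line(code, line_number):
        hits[line_number - code.co_firstlineno] += 1
        # Turn this location off for this tool after its first report.
        return mon.DISABLE

    mon.use_tool_id(tool_id, "tach-roadmap-sketch")
    mon.register_callback(tool_id, mon.events.LINE, on_line)
    # Instrument only func's own code object, not the whole process.
    mon.set_local_events(tool_id, func.__code__, mon.events.LINE)
    try:
        func()
    finally:
        mon.set_local_events(tool_id, func.__code__, mon.events.NO_EVENTS)
        mon.register_callback(tool_id, mon.events.LINE, None)
        mon.free_tool_id(tool_id)
    return hits
```

Running this against a function with a 50-iteration loop yields a handful of events rather than one per iteration, which is why overhead ends up proportional to uncovered code.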
External Refs:
Focus: Complete configuration system with pyproject.toml support.
Research Foundation: Enables "Zero-Copy" module loading configuration from Zero-Copy Python Module Loading.
- "architecture treats the Python interpreter not as a standalone application that discovers code, but as an embedded execution engine that is fed pre-validated code objects" — Zero-Copy Python Module Loading
- "This approach effectively shifts the computational costs of I/O, parsing, and compilation from the critical path of the Python process startup to a pre-computation phase" — Zero-Copy Python Module Loading
The 0.6.x series implements a full configuration system. Tach currently has limited configuration; this series adds comprehensive pyproject.toml support.
Target: Full configuration via pyproject.toml.
- Define complete `[tool.tach]` schema
Ref: "The Rust supervisor must pre-calculate the dependency graph of the modules and load them in Topological Order" — Zero-Copy Python Module Loading
- Document all configuration options
- Provide JSON schema for IDE completion
- Validate configuration on startup
- Support schema versioning
[tool.tach]
testpaths = ["tests"]
python_files = ["test_*.py", "*_test.py"]
python_classes = ["Test*"]
python_functions = ["test_*"]
norecursedirs = [".git", "node_modules", ".venv"]
[tool.tach.execution]
workers = "auto" # or integer
timeout = 60
exitfirst = false
maxfail = 0
Target: Fine-grained test behavior configuration.
- Support timeout in markers
- Support timeout in config by pattern
- Override global timeout per-test
- Handle timeout inheritance
- Support `tach.toml` in subdirectories
- Merge settings from parent directories
- Override parent settings locally
- Document precedence rules
- Configure behavior based on markers
Ref: "The visitor flags a module as Tier 3 if it encounters: Network I/O, Concurrency, System Mutation, or Global Locks" — Python Monorepo Zygote Tree Design
- Set default markers via config
- Filter tests by marker expression
- Support custom marker definitions
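Since precedence rules are still to be documented, the following is only one possible reading of the hierarchical merge: settings closer to the test override those from parent directories, and nested tables are merged key by key. Both function names are hypothetical.

```python
def merge_settings(parent, child):
    """Return parent overlaid with child, merging nested tables recursively."""
    merged = dict(parent)
    for key, value in child.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_settings(merged[key], value)
        else:
            # Scalars and lists from the child replace the parent's value.
            merged[key] = value
    return merged


def resolve_settings(layers):
    """Fold config layers ordered from repository root to innermost directory."""
    resolved = {}
    for layer in layers:
        resolved = merge_settings(resolved, layer)
    return resolved
```

With a root layer of `{"execution": {"timeout": 60, "workers": "auto"}}` and a subdirectory layer of `{"execution": {"timeout": 5}}`, the subdirectory keeps `workers = "auto"` but gets the shorter timeout.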
Target: Control test execution behavior.
- Random order: `--random-order`
- Dependency order: respect `@pytest.mark.dependency`
- Duration order: fastest first
Ref: "We profile packages and give more weight to those with slow module imports. We implement priority by replacing the 1's in the binary calls matrix with the weight values" — Forklift
- Reverse order: `--reverse`
- Alphabetical order
- Define env vars in config
- Support env var files (`.env`)
- Expand variables in values
- Protect sensitive values
- Support per-environment configs
- Full isolation (default)
Ref: "Namespaces provide complete, kernel-enforced isolation with acceptable overhead. Every syscall is isolated at kernel level" — Project Tach Compatibility Layer Blueprint
- Relaxed isolation (faster, less safe)
- No isolation (`--no-isolation`)
- Per-test isolation override
Target: Support different configurations for different scenarios.
- Define named profiles in config
- Switch profiles via `--profile` flag
- Support profile inheritance
- Document common profile patterns
- Auto-detect CI environment
- Apply CI-specific defaults
- Support environment-based profiles
- Handle Docker/container detection
Focus: Memory optimization, adaptive scheduling, and parallelism improvements.
Research Foundation: Implements microsecond-scale memory reset using userfaultfd from Python Memory Snapshotting with Userfaultfd and Userfaultfd and CPython Allocator Interaction.
- "By 'snapshotting' the virtual memory state of a process and lazily restoring it upon access, engineers can achieve reset times measured in microseconds rather than milliseconds" — Python Memory Snapshotting with Userfaultfd
- "If a 1GB heap is snapshotted, but the subsequent execution only touches 50KB, only those 50KB are physically copied and mapped. This O(N) cost... is the primary driver of UFFD's performance advantage" — Python Memory Snapshotting with Userfaultfd
- "leverages jemalloc's manual cache flushing capabilities to establish a stable, high-performance test runner" — Python Memory Snapshotting with Userfaultfd
The 0.7.x series focuses on performance at scale. As test suites grow to thousands of tests, we need smarter scheduling and better memory management.
Target: Reduce memory footprint and improve snapshot efficiency.
- Add `--memory-profile` flag
- Track memory usage per test
- Identify memory leaks
- Report peak memory usage
- Generate memory reports
- Reduce snapshot size via compression
- Implement incremental snapshots
Ref: "The kernel iterates over the Page Table Entries corresponding to the address range. It clears the 'Present' bit, effectively unmapping the physical pages" — Python Memory Snapshotting with Userfaultfd
- Skip unchanged memory regions
- Use copy-on-write more effectively
Ref: "workers inherit the parent's memory state without duplication, only copying physical pages when they are modified" — Cross-Platform Process Cloning Research
- Optimize page table handling
- Detect low memory conditions
- Reduce worker count under pressure
- Trigger garbage collection proactively
Ref: "If a snapshot is taken while the GC is traversing the object graph and modifying gc_refs, a subsequent restore will leave the GC in an inconsistent state" — Userfaultfd and CPython Allocator Interaction
- Fail gracefully on OOM
- Support memory limits
Target: Smart test scheduling based on historical data.
- Track test durations over time
Ref: "The significant skew in package popularity indicates that relatively few zygotes could provide substantial benefit. The top 15 packages alone account for more than 50% of the files" — Forklift
- Store duration data in cache file
- Predict duration for new tests
- Balance worker load based on predictions
- Handle duration variance
- Identify frequently-run tests
- Prioritize cold tests for early execution
- Cache compilation for hot tests
- Optimize discovery for hot paths
- Distribute tests evenly by predicted duration
- Handle stragglers (tests slower than predicted)
- Support test stealing between workers
- Minimize total wall-clock time
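Distributing tests by predicted duration could use the classic longest-processing-time-first heuristic: sort tests by descending prediction and always hand the next one to the least-loaded worker. This is a sketch of that well-known greedy technique, not a statement of what Tach's scheduler does; `balance_by_duration` is a hypothetical name.

```python
import heapq


def balance_by_duration(predicted, worker_count):
    """Assign test IDs to workers, longest predicted duration first."""
    # Heap entries are (current_load_seconds, worker_index).
    heap = [(0.0, index) for index in range(worker_count)]
    heapq.heapify(heap)
    assignments = [[] for _ in range(worker_count)]
    ordered = sorted(predicted.items(), key=lambda item: (-item[1], item[0]))
    for test_id, seconds in ordered:
        load, index = heapq.heappop(heap)  # least-loaded worker so far
        assignments[index].append(test_id)
        heapq.heappush(heap, (load + seconds, index))
    return assignments
```

Static assignment like this does not handle stragglers; the work-stealing item above would cover tests that run longer than predicted.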
Target: Reduce startup time for large codebases.
- Don't import modules until needed
Ref: "To speed up restart, zygotes are created lazily upon first use. Zygotes may be evicted under memory pressure" — Forklift
- Load test modules on-demand
- Share loaded modules between workers
- Support preloading via config
- Build module dependency graph
Ref: "Profiling data from large-scale deployments indicates that module initialization—specifically the parsing, compiling, and executing of top-level code in dependencies—accounts for 60% to 80% of cold start duration" — Python Monorepo Zygote Tree Design
- Identify shared dependencies
- Optimize import order
- Detect circular imports
- Compile bytecode lazily
- Cache compiled bytecode
Ref: "The runner maintains a content-addressable store of compiled bytecode. When a file is modified, the runner invokes a compilation step to generate the binary blob for direct injection" — Rust-CPython Execution Blueprint Research
- Use mmap for bytecode files
- Share bytecode between workers
Target: Speed up test collection for large codebases.
- Parallelize file scanning
Ref: "Rust, utilizing the rayon data parallelism library, can saturate all CPU cores to parse and analyze thousands of files per second" — Rust Static Analysis for Toxic Python Modules
- Parse test files in parallel
- Merge discovery results efficiently
- Handle discovery errors in parallel context
- Cache discovery results
- Detect file changes via mtime/hash
- Only re-discover changed files
- Support `--cache-clear` to reset
- Benchmark `rustpython-parser` vs `ruff_python_parser` for test discovery
`ruff_python_parser`: "capable of processing gigabytes of source code per second" — Rust-CPython Execution Blueprint
- Evaluate error recovery characteristics (important for incomplete files)
- Consider migration if >2x performance improvement observed
- Document parser selection rationale
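For the incremental discovery items earlier in this series (cache results, detect changes via mtime/hash, re-discover only changed files), a content-hash variant might look like the sketch below. The cache is a plain dict here; the on-disk format is left open, and the names `changed_files` and `demo` are hypothetical.

```python
import hashlib
import tempfile
from pathlib import Path


def changed_files(paths, cache):
    """Return paths whose content hash differs from cache, updating cache."""
    changed = []
    for path in paths:
        digest = hashlib.sha256(Path(path).read_bytes()).hexdigest()
        if cache.get(str(path)) != digest:
            changed.append(str(path))
            cache[str(path)] = digest
    return changed


def demo():
    """Count re-discovered files on first run, unchanged run, and after an edit."""
    with tempfile.TemporaryDirectory() as tmp:
        test_file = Path(tmp) / "test_sample.py"
        test_file.write_text("def test_ok():\n    assert True\n")
        cache = {}
        first = len(changed_files([test_file], cache))
        second = len(changed_files([test_file], cache))
        test_file.write_text("def test_ok():\n    assert 1 == 1\n")
        third = len(changed_files([test_file], cache))
        return first, second, third
```

Hashing reads every file, so a real implementation would likely check mtime and size first and hash only when those differ. Clearing the cache dict corresponds to what `--cache-clear` would do.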
External Refs:
Target: Investigate next-generation snapshot approaches from fuzzing research.
- Evaluate AFL-Snapshot-LKM approach for kernel-level snapshots
Ref: AFL-Snapshot-LKM achieves 20-360% speedup over fork-server
- Assess kernel module licensing and distribution implications
- Prototype kernel-assisted snapshot/restore cycle
- Benchmark against userfaultfd approach
- Study LibAFL snapshot executor architecture
Ref: LibAFL Book documents Rust fuzzing patterns
- Evaluate executor abstraction for Tach isolation modes
- Consider shared memory arena patterns from fuzzing
| Technique | Current Overhead | Target | Speedup vs Fork | Implementation Complexity |
|---|---|---|---|---|
| Fork (baseline) | ~500-1000 μs | N/A | 1x | Low |
| Fork server | ~100-200 μs | 0.1.x ✓ | 5x | Low |
| userfaultfd | ~10-50 μs | 0.7.x | 10-50x | Medium |
| Kernel snapshot (LKM) | ~1-5 μs | Future | 100-500x | High (GPL) |
Licensing Note: AFL-Snapshot-LKM is GPL-licensed. Distribution as kernel module has licensing implications for Tach's MIT license. Consider:
- Optional separate download for kernel module
- Benchmark-only usage (non-production)
- Alternative: Investigate kernel API stabilization for mainline support
Focus: First-class CI/CD support with templates and integrations.
Research Foundation: Enables future cross-platform support per Cross-Platform Process Cloning Research.
- "By leveraging undocumented kernel primitives—Mach virtual memory remapping on macOS and NT process cloning on Windows—it is theoretically possible to approximate the performance of Linux fork()" — Cross-Platform Process Cloning Research
- "The cornerstone of simulating Copy-on-Write on macOS without utilizing the standard fork() system call is mach_vm_remap" — Cross-Platform Process Cloning Research
The 0.8.x series makes Tach a first-class citizen in CI/CD pipelines. Better reporting, CI platform integrations, and artifact handling.
Target: Seamless GitHub Actions integration.
- Basic workflow template
- Matrix build template (multiple Python versions)
- Coverage workflow template
- Release workflow template
- Caching workflow template
- PR comment with test summary
- Status check reporting
- Annotation for test failures
- Problem matcher for error highlighting
- SARIF output for security findings
Target: Support for major CI platforms.
- `.gitlab-ci.yml` templates
- GitLab JUnit integration
- Coverage badge support
- GitLab Pages for reports
- CircleCI orb
- Jenkins pipeline library
- Azure DevOps tasks
- Travis CI examples
- Buildkite plugin
- Drone CI examples
Target: Better test result reporting.
- Add test properties to JUnit XML
- Support file attachments
- Include timing information
- Support test categories
- Handle multi-file output
- Generate standalone HTML reports
- Include failure details and tracebacks
- Show test duration charts
- Support filtering and search
- Export as static site
- Track test pass/fail history
- Identify tests with inconsistent results
Ref: "If the child process did not explicitly re-seed, both parent and child would generate identical sequences of 'random' numbers" — Fork Safety of Python C-Extensions
- Report flakiness percentage
- Suggest potential causes
- Support auto-retry for flaky tests
Target: Complete coverage workflow.
- Cobertura XML (default)
- LCOV format
- JSON format
- HTML report
- SonarQube format
- Coverage diff (new code only)
- Coverage thresholds (fail if below)
- Branch coverage
Ref: "employs PEP 669 (Low-Impact Monitoring) to achieve observability with negligible overhead" — Rust-CPython Execution Blueprint Research
- Missing lines report
- Coverage trending
Target: Alternative worker model using PEP 684 per-interpreter GIL instead of fork.
Research Foundation: PEP 684 enables true parallel Python execution within a single process. PEP 734 (Python 3.14) exposes this via `concurrent.interpreters`.
- "Each sub-interpreter can have its own GIL" — PEP 684
- V8 isolates demonstrate 5ms startup (Cloudflare Workers model)
- Prototype sub-interpreter-based worker using C-API
`Py_NewInterpreterFromConfig` with `PyInterpreterConfig_OWN_GIL`
- Implement channel-based communication between interpreters
No direct object sharing; use `interpreters.Queue` or shared memory
- Benchmark against fork-based workers
- Document extension module compatibility requirements
Many C extensions don't support sub-interpreters yet
- Use `concurrent.interpreters` when available
- Fall back to the C-API for Python 3.12-3.13
- Test with free-threaded Python builds
External Refs:
Focus: Production hardening, crash recovery, and resource management.
The 0.9.x series hardens Tach for production use. Crash recovery, resource cleanup, and stress testing ensure reliability.
Target: Graceful handling of crashes and errors.
- Detect and kill orphan workers
- Clean up shared memory on crash
- Handle SIGKILL correctly
- Recover from supervisor crash
- Clean up temp files
- Save test progress periodically
- Resume from last known state
- Report partial results on crash
- Support `--resume` flag
- Handle interrupted runs
Target: Proper signal handling throughout.
- SIGINT (Ctrl+C) - Graceful shutdown
- SIGTERM - Clean exit
- SIGHUP - Reload configuration
- SIGQUIT - Dump stack traces
- SIGUSR1 - Status dump
- Forward signals to workers
- Handle worker signal death
- Timeout on worker shutdown
- Force kill unresponsive workers
Target: Prevent resource leaks.
- Track file descriptor usage
- Detect FD leaks
- Track memory allocations
- Detect memory leaks
- Track thread creation
- Enforce FD limits per worker
- Enforce memory limits
- Enforce CPU time limits
- Report resource violations
- Support cgroups integration
Target: Verify stability under load.
- Large test suites (10k+ tests)
- Long-running tests (hours)
- High parallelism (100+ workers)
- Memory pressure scenarios
- Network failure scenarios
Focus: Feature freeze and stabilization.
- Feature freeze
- API stability review
- Complete documentation
- Migration guide draft
- Public beta announcement
- Bug fixes from beta 1 feedback
- Performance regression testing
- Compatibility testing
- Security audit
Focus: Final polish before 1.0.
- Address beta 1 feedback
- Final API changes
- Documentation updates
- Performance optimization
- Final bug fixes
- Release notes
- Upgrade path testing
- Community feedback
- Critical fixes only
- Final documentation
- Package verification
- Release preparation
Stable release with API guarantees.
- Complete user documentation
- API stability commitment (SemVer)
- Migration guide from pytest
- Long-term support policy
- Performance benchmarks published
- Security best practices documented
- Battle-tested on real-world projects
Focus: Maintenance and minor improvements.
- Bug fixes from 1.0.0 feedback
- Minor performance improvements
- Documentation updates
- Dependency updates
- Critical bug fixes
- Security patches
Focus: New features that didn't make 1.0.
- Features deferred from 1.0
- Community-requested features
- Plugin ecosystem improvements
- Additional database support
Consolidated external documentation and resources referenced throughout this roadmap.
- PEP 669 - Low Impact Monitoring - Coverage and debugging
- PEP 684 - Per-Interpreter GIL - Sub-interpreter isolation
- PEP 703 - Free Threading - GIL removal (experimental)
- PEP 734 - Multiple Interpreters - Python 3.14 interpreters module
- PEP 523 - Frame Evaluation API - Native mocking
- userfaultfd(2) - User-space page fault handling
- Landlock Docs - Filesystem/network sandboxing
- namespaces(7) - Process isolation
- OverlayFS - Copy-on-write filesystem
- PyO3 Guide - Rust-Python bindings
- PyO3 Parallelism - GIL management patterns
- jemalloc mallctl - Allocator control API
- rust-landlock - Landlock Rust bindings
- AFL-Snapshot-LKM - Kernel snapshot module
- LibAFL - Rust fuzzing framework
- SlipCover - Low-overhead coverage
- Maelstrom - Distributed test runner
- snob - Test impact analysis
- SlipCover Paper (ISSTA 2023) - De-instrumentation
- Forklift Paper (WoSC 2024) - Zygote trees