Phase 3: Monorepo Structure Organization
Parent Epic: #116
Depends On: #118 (DI Integration)
Target: Week 3 (5-7 days)
Risk Level: Low
Organize CodeWeaver into a clean monorepo structure with 9 independently buildable packages. This is now straightforward because DI broke the circular dependencies.
Goals
- Organize code into packages/ with uv workspace
- All packages build independently
- Remaining violations: ~30-40 (down from 164)
- Clean dependency graph
- Documentation and migration guides
- Zero functional changes
Why This Phase is Now Easy
Thanks to Phase 2 (DI Integration):
- ✅ Circular dependencies broken (75% eliminated)
- ✅ Services don't import across packages
- ✅ Clean dependency flow established
- ✅ Just need to organize files into packages
Original estimate: 3-4 weeks to refactor dependencies
New reality: 5-7 days to organize structure (dependencies already fixed!)
Target Monorepo Structure
packages/
  codeweaver-core/
    - Core types, exceptions
    - DI infrastructure
    - search_types (moved in Phase 1)
    - No external dependencies (except stdlib)
  codeweaver-tokenizers/ ✅ (Extracted in Phase 1)
    - Tokenizer implementations
    - Tree-sitter integrations
  codeweaver-daemon/ ✅ (Extracted in Phase 1)
    - Background daemon logic
    - Process management
  codeweaver-utils/
    - Common utilities
    - git, logging, procs
    - Depends: core
  codeweaver-semantic/
    - Semantic chunking
    - AST analysis
    - Depends: core, utils, tokenizers
  codeweaver-telemetry/
    - Telemetry client (DI-enabled)
    - Analytics
    - Depends: core
  codeweaver-providers/
    - All provider implementations
    - Embedding, vector store, reranking
    - Provider factories (DI)
    - Depends: core, telemetry
  codeweaver-engine/
    - Indexer, search services
    - Config, registry (simplified)
    - Depends: core, utils, semantic, providers
  codeweaver/
    - CLI, server, MCP
    - agent_api orchestration
    - Depends: engine, all other packages
Implementation Checklist
Part A: Package Structure Setup (Days 1-2)
Create package directories:
- Create packages/ directory structure
- Create pyproject.toml for each package
- Set up uv workspace configuration
- Define inter-package dependencies
uv workspace configuration:
# Root pyproject.toml
[tool.uv.workspace]
members = [
    "packages/codeweaver-core",
    "packages/codeweaver-tokenizers",
    "packages/codeweaver-daemon",
    "packages/codeweaver-utils",
    "packages/codeweaver-semantic",
    "packages/codeweaver-telemetry",
    "packages/codeweaver-providers",
    "packages/codeweaver-engine",
    "packages/codeweaver",
]
Part B: Move Code to Packages (Days 3-4)
Priority 1: Foundation packages (already extracted)
- codeweaver-tokenizers ✅ (Phase 1)
- codeweaver-daemon ✅ (Phase 1)
- codeweaver-core
  - Move DI infrastructure
  - Include search_types
  - Core exceptions
Priority 2: Utility packages
- codeweaver-utils
  - Move common/utils (except registry)
  - Update imports
  - Validate independence
- codeweaver-telemetry
  - Move common/telemetry
  - DI-enabled client
  - Update imports
Priority 3: Semantic and providers
- codeweaver-semantic
  - Move semantic package
  - Update imports
  - Validate against utils
- codeweaver-providers
  - Move all provider implementations
  - Include provider factories
  - Update imports
Priority 4: Engine and app
- codeweaver-engine
  - Move engine, config
  - Simplified registry (thin layer)
  - Services using DI
- codeweaver (main package)
  - CLI, server, MCP
  - agent_api
  - Orchestration layer
Part C: Build System (Days 5-6)
Package build configuration:
- Configure build backend for each package
- Set version management strategy
- Define dependency ranges
- Set up development dependencies
Example package pyproject.toml:
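[project]
name = "codeweaver-providers"
# PEP 440 normalizes "0.2.0-alpha" to 0.2.0a0, which sorts before 0.2.0
version = "0.2.0-alpha"
dependencies = [
    # Lower bounds include the alpha so sibling pre-releases satisfy them
    "codeweaver-core>=0.2.0a0",
    "codeweaver-telemetry>=0.2.0a0",
]

# Resolve sibling packages from the uv workspace rather than a package index
[tool.uv.sources]
codeweaver-core = { workspace = true }
codeweaver-telemetry = { workspace = true }

[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

Build validation: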
- Build each package independently
- Verify import paths
- Test inter-package dependencies
- Validate circular dependency elimination
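The build validation steps above could be driven from a small script. This is a sketch under assumptions: `build_all` and `run_uv_build` are hypothetical names, packages are assumed to live under `packages/<name>`, and the command is the same `uv build` invocation this plan uses elsewhere, run from inside each package directory. The runner is injectable so the loop can be exercised without invoking uv.

```python
import subprocess
from pathlib import Path
from typing import Callable, Sequence

# A runner takes a command and a working directory and returns an exit code.
Runner = Callable[[Sequence[str], Path], int]


def run_uv_build(cmd: Sequence[str], cwd: Path) -> int:
    """Default runner: execute the command in cwd and return its exit code."""
    return subprocess.run(list(cmd), cwd=cwd, check=False).returncode


def build_all(
    root: Path,
    build_order: Sequence[str],
    runner: Runner = run_uv_build,
) -> dict[str, bool]:
    """Build each package on its own and record whether it succeeded.

    Every package is attempted even if an earlier one fails, so a single
    run reports all packages that do not build independently.
    """
    results: dict[str, bool] = {}
    for name in build_order:
        pkg_dir = root / "packages" / name
        results[name] = runner(["uv", "build"], pkg_dir) == 0
    return results
```

Passing the packages in dependency order (foundation first, `codeweaver` last) keeps failures easy to read: the first failing entry is usually the one to fix.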
Part D: Testing & Validation (Day 7)
Dependency validation:
- Run: python scripts/validate_proposed_structure.py
- Target: < 50 violations (down from 164)
- Verify no circular dependencies between packages
- Check dependency graph is acyclic
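To make concrete what a dependency violation means here, the sketch below flags imports of workspace packages that a package is not declared to depend on. It is an illustration only and not the contents of scripts/validate_proposed_structure.py, which this plan does not show. The `ALLOWED` map covers only the packages with an explicit "Depends:" line in the target structure, and the underscore module names are assumed by analogy with `codeweaver_engine` in the migration example.

```python
import ast

# Declared dependencies per package, using assumed importable module names.
ALLOWED: dict[str, set[str]] = {
    "codeweaver_core": set(),
    "codeweaver_utils": {"codeweaver_core"},
    "codeweaver_telemetry": {"codeweaver_core"},
    "codeweaver_semantic": {"codeweaver_core", "codeweaver_utils", "codeweaver_tokenizers"},
    "codeweaver_providers": {"codeweaver_core", "codeweaver_telemetry"},
    "codeweaver_engine": {
        "codeweaver_core",
        "codeweaver_utils",
        "codeweaver_semantic",
        "codeweaver_providers",
    },
}

# Every module name that counts as "inside the workspace".
WORKSPACE: set[str] = set(ALLOWED).union(*ALLOWED.values())


def imported_top_level(source: str) -> set[str]:
    """Return the top-level module names imported by a piece of source code."""
    found: set[str] = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            found.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.level == 0 and node.module:
            found.add(node.module.split(".")[0])
    return found


def find_violations(package: str, source: str) -> set[str]:
    """Workspace modules imported by `package` that it does not declare."""
    imported = imported_top_level(source) & WORKSPACE
    return imported - ALLOWED[package] - {package}
```

For example, a file in `codeweaver_core` containing `from codeweaver_utils.git import x` would be reported as a core → utils violation, while the same line in `codeweaver_semantic` would pass.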
Build testing:
- Build all packages in dependency order
- Run tests for each package
- Integration test full system
- Performance validation
Expected results:
# Should succeed for all packages
cd packages/codeweaver-core && uv build
cd packages/codeweaver-utils && uv build
cd packages/codeweaver-providers && uv build
# ... etc
Package Dependency Graph
Clean dependency flow (acyclic):
codeweaver-core (foundation)
↑
├── codeweaver-tokenizers
├── codeweaver-daemon
├── codeweaver-utils
├── codeweaver-telemetry
↑
├── codeweaver-semantic (depends on: core, utils, tokenizers)
├── codeweaver-providers (depends on: core, telemetry)
↑
├── codeweaver-engine (depends on: core, utils, semantic, providers)
↑
└── codeweaver (depends on: ALL)
No circular dependencies between packages!
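The acyclicity claim can be checked mechanically with the standard library's `graphlib`. The sketch below encodes the graph above; the edges for codeweaver-tokenizers and codeweaver-daemon are an assumption read off the diagram, since the target structure lists no "Depends:" line for them. `build_order` is a hypothetical helper name.

```python
from graphlib import CycleError, TopologicalSorter

# Each package maps to the set of packages it depends on.
DEPENDS_ON: dict[str, set[str]] = {
    "codeweaver-core": set(),
    "codeweaver-tokenizers": {"codeweaver-core"},  # assumption from the diagram
    "codeweaver-daemon": {"codeweaver-core"},  # assumption from the diagram
    "codeweaver-utils": {"codeweaver-core"},
    "codeweaver-telemetry": {"codeweaver-core"},
    "codeweaver-semantic": {"codeweaver-core", "codeweaver-utils", "codeweaver-tokenizers"},
    "codeweaver-providers": {"codeweaver-core", "codeweaver-telemetry"},
    "codeweaver-engine": {
        "codeweaver-core",
        "codeweaver-utils",
        "codeweaver-semantic",
        "codeweaver-providers",
    },
}
# The main package depends on all of the others.
DEPENDS_ON["codeweaver"] = set(DEPENDS_ON)


def build_order(graph: dict[str, set[str]]) -> list[str]:
    """Return packages with dependencies first; raises CycleError on a cycle."""
    return list(TopologicalSorter(graph).static_order())


if __name__ == "__main__":
    try:
        print(" -> ".join(build_order(DEPENDS_ON)))
    except CycleError as err:
        print(f"Circular dependency detected: {err.args[1]}")
```

The resulting order doubles as a valid sequence for building all packages in dependency order.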
Remaining Violations (~30-40)
What's left after DI broke 75%:
Type movements (10-15 violations)
- Some types still in wrong packages
- Easy to fix: just move files
- No logic changes needed
Utility dependencies (9 violations)
- core → utils (core imports from utils, but core must have no internal dependencies)
- Solution: Move the utilities that core needs into the core package
Semantic utilities (4 violations)
- semantic → utils
- Solution: Move or minimize shared utilities
Minor couplings (10-15 violations)
- Various small import adjustments
- Lazy imports where needed
- Protocol usage for abstract dependencies
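The last two techniques can be illustrated briefly. In this sketch every name is hypothetical (`TelemetryClient`, `IndexingService`, `render_report`), and `json` merely stands in for a heavier module from another package. The service depends on a structural `Protocol` rather than on a concrete class from a sibling package, and the helper defers its import until first use.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class TelemetryClient(Protocol):
    """Abstract dependency: anything with a matching capture() qualifies."""

    def capture(self, event: str) -> None: ...


class IndexingService:
    """Hypothetical service that receives its telemetry client via DI.

    It imports only the Protocol, so it needs no import from the package
    that provides the concrete client.
    """

    def __init__(self, telemetry: TelemetryClient) -> None:
        self._telemetry = telemetry

    def index(self, path: str) -> int:
        self._telemetry.capture(f"indexed:{path}")
        return 1


def render_report(data: dict) -> str:
    """Lazy import: the module is loaded on first call, not at import time."""
    import json  # stand-in for a heavier cross-package module

    return json.dumps(data, sort_keys=True)
```

If the Protocol lives in the foundation package, both the service and the concrete client can depend on it without depending on each other.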
Acceptance Criteria
- All 9 packages created with pyproject.toml
- Code organized into packages
- uv workspace builds successfully
- Dependency violations < 50 (down from 164)
- Zero circular dependencies between packages
- All packages build independently
- All tests pass
- Type checking passes
- Performance within 5% of baseline
- Documentation complete
Migration Guide
Document for users:
- Import path changes
- Package installation instructions
- Development setup with workspace
- Contribution guidelines per package
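Import path changes of this kind could be applied mechanically in downstream code. The sketch below rewrites only the two prefixes that appear in the before/after example in this section; `RENAMES` and `rewrite_imports` are hypothetical, and the mapping for any other package would need to be confirmed against the final layout.

```python
import re

# Old dotted prefix -> new top-level module, taken from the migration example.
RENAMES: dict[str, str] = {
    "codeweaver.engine": "codeweaver_engine",
    "codeweaver.providers": "codeweaver_providers",
}

# Match "from <old>" or "import <old>" at the start of a line, where <old>
# is followed by a dot, whitespace, or the end of the line.
_PATTERN = re.compile(
    r"^([ \t]*(?:from|import)[ \t]+)("
    + "|".join(re.escape(old) for old in sorted(RENAMES, key=len, reverse=True))
    + r")(?=[.\s]|$)",
    re.MULTILINE,
)


def rewrite_imports(source: str) -> str:
    """Rewrite old monolith import prefixes to the new package names."""
    return _PATTERN.sub(lambda m: m.group(1) + RENAMES[m.group(2)], source)
```

This only touches import statements. Attribute-style references such as `codeweaver.engine.indexer.Indexer` in a function body are left alone and would need a separate pass.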
Example migration:
# Before (monolith)
from codeweaver.engine.indexer import Indexer
from codeweaver.providers.embedding.fastembed import FastembedProvider
# After (monorepo)
from codeweaver_engine.indexer import Indexer
from codeweaver_providers.embedding.fastembed import FastembedProvider
Benefits After This Phase
For developers:
- ✅ Work on individual packages without full codebase
- ✅ Clear package boundaries and responsibilities
- ✅ Independent versioning possible
- ✅ Faster builds (only rebuild changed packages)
For users:
- ✅ Install only needed packages
- ✅ Lighter dependencies for specific use cases
- ✅ Clear module structure
For maintainers:
- ✅ Easier to review (package-scoped changes)
- ✅ Independent package releases
- ✅ Clear ownership boundaries
Success Metrics
Structural:
- 9 packages building independently
- < 50 dependency violations
- 0 circular dependencies between packages
- Clean dependency graph
Quality:
- All tests pass
- Type checking passes
- Documentation complete
- Migration guide available
Next Steps After This Phase
Phase 4 (#119): pydantic-ai Integration
- Add agent DI support
- Integrate data providers
- Build on clean package structure
Phase 5 (#120): Advanced Features
- Health checks, telemetry
- Plugin system
- Performance optimization
Phase 6 (#121): Cleanup
- Deprecate old patterns
- Eliminate/simplify registry
- Final documentation
Connection to Integrated Strategy
This phase implements Week 3 of the integrated strategy:
- Organize into packages → Natural structure emerges
- Build system → All packages independent
- Final validation → Proof of clean architecture
Why This is Low Risk Now
Thanks to Phases 1-2:
- ✅ Circular dependencies already broken (DI work)
- ✅ Services already don't import across boundaries
- ✅ Just moving files, not changing logic
- ✅ Can validate continuously during organization
Original risk: HIGH (dependencies blocking)
Current risk: LOW (just file organization)
Reference
- Planning: INTEGRATED_DI_MONOREPO_STRATEGY.md (Week 3)
- Visualization: DI_IMPACT_VISUALIZATION.md
- Original analysis: REFINED_MONOREPO_ANALYSIS.md
- Validation script: scripts/validate_proposed_structure.py