Enhancement: Implement Project-Specific Configuration Overrides
Overview
The overrides field in project configurations is currently parsed and stored but not actively used in the application logic. This feature should allow projects to override global configuration settings on a per-project basis.
Problem Statement
Currently, all projects use the same global configuration settings (chunking, embedding, file conversion, etc.). The multi-project architecture includes an overrides field in ProjectConfig that is designed to allow project-specific customization, but this functionality is not implemented.
Current behavior:
- overrides field is parsed from config.yaml
- Overrides are merged with global config and stored in ProjectConfig.overrides
- The merged overrides are never applied during processing
- All projects use identical global settings
Expected behavior:
- Projects should be able to override global settings like chunk_size, embedding model, etc.
- Processing pipeline should use project-specific settings when available
- Fallback to global settings when no project overrides exist
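The expected behavior above amounts to a recursive merge in which project values win and everything else falls through to the global settings. A minimal sketch follows, assuming both configurations are available as plain nested dicts. The `deep_merge` helper is hypothetical and not part of qdrant-loader, and replacing lists wholesale rather than concatenating them is an assumption, not a decision made in this issue.

```python
from copy import deepcopy
from typing import Any, Mapping


def deep_merge(base: Mapping[str, Any], overrides: Mapping[str, Any]) -> dict[str, Any]:
    """Return a new dict in which values from `overrides` win over `base`.

    Nested mappings are merged key by key; any other value (including lists)
    is replaced wholesale. Neither input is modified.
    """
    merged = deepcopy(dict(base))
    for key, value in overrides.items():
        if isinstance(value, Mapping) and isinstance(merged.get(key), dict):
            # Both sides are mappings: merge recursively so sibling keys survive.
            merged[key] = deep_merge(merged[key], value)
        else:
            # Scalars, lists, and type mismatches: the override replaces the base.
            merged[key] = deepcopy(value)
    return merged


# Values taken from the example use case in this issue.
global_config = {
    "chunking": {"chunk_size": 1000, "chunk_overlap": 200},
    "embedding": {"model": "text-embedding-3-small"},
}
technical_docs_overrides = {"chunking": {"chunk_size": 2000}}

effective = deep_merge(global_config, technical_docs_overrides)
```

Here `effective` carries the overridden `chunk_size` while `chunk_overlap` and the embedding model are inherited from the global section, and `global_config` itself is left untouched so other projects are not affected.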
Technical Details
Current Implementation Status
- ✅ Parsing: Overrides are correctly parsed in MultiProjectConfigParser._parse_project_config()
- ✅ Storage: Merged overrides stored in ProjectConfig.overrides field
- ✅ Validation: ConfigValidator validates overrides structure
- ❌ Application: Overrides are not applied during document processing
Files Involved
- packages/qdrant-loader/src/qdrant_loader/config/models.py - ProjectConfig.overrides field
- packages/qdrant-loader/src/qdrant_loader/config/parser.py - Override parsing and merging
- packages/qdrant-loader/src/qdrant_loader/core/project_manager.py - ProjectContext management
- Processing pipeline components that should use project-specific config
Example Use Case
global:
  chunking:
    chunk_size: 1000
    chunk_overlap: 200
  embedding:
    model: "text-embedding-3-small"
projects:
  technical-docs:
    display_name: "Technical Documentation"
    sources:
      git:
        docs-repo:
          # ... git config
    overrides:
      chunking:
        chunk_size: 2000  # Larger chunks for technical content
        chunk_overlap: 400
      embedding:
        model: "text-embedding-3-large"  # Better model for technical content
  marketing-content:
    display_name: "Marketing Content"
    sources:
      confluence:
        marketing-space:
          # ... confluence config
    overrides:
      chunking:
        chunk_size: 500  # Smaller chunks for marketing content
        chunk_overlap: 100
Proposed Solution
1. Create Configuration Resolution Service
class ProjectConfigResolver:
    """Resolves effective configuration by applying project overrides."""

    def get_effective_config(
        self,
        project_context: ProjectContext,
        global_config: GlobalConfig,
    ) -> GlobalConfig:
        """Apply project overrides to global configuration."""
        if not project_context.config_overrides:
            return global_config
        # Deep merge project overrides with global config
        effective_config_dict = self._deep_merge(
            global_config.to_dict(),
            project_context.config_overrides,
        )
        # Create new GlobalConfig instance with merged settings
        return GlobalConfig(**effective_config_dict)
2. Update Processing Components
- Modify chunking strategies to accept project-specific config
- Update embedding components to use project-specific models
- Ensure file conversion uses project-specific settings
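To illustrate what "accept project-specific config" could look like for one component, here is a rough sketch of a chunker that takes its settings as an argument instead of reading global state. `ChunkingSettings` and `chunk_text` are hypothetical names, and the fixed-size character windowing is a stand-in for illustration only, not the chunking strategy qdrant-loader actually uses.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChunkingSettings:
    """Hypothetical stand-in for the chunking section of an effective config."""

    chunk_size: int = 1000
    chunk_overlap: int = 200

    def __post_init__(self) -> None:
        if self.chunk_size <= 0:
            raise ValueError("chunk_size must be positive")
        if not 0 <= self.chunk_overlap < self.chunk_size:
            raise ValueError("chunk_overlap must be in [0, chunk_size)")


def chunk_text(text: str, settings: ChunkingSettings) -> list[str]:
    """Split text into character windows of chunk_size that overlap by chunk_overlap."""
    step = settings.chunk_size - settings.chunk_overlap
    chunks: list[str] = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + settings.chunk_size])
        # Stop once a window reaches the end, so no trailing chunk is
        # fully contained in the previous one.
        if start + settings.chunk_size >= len(text):
            break
        start += step
    return chunks


document = "x" * 5000
# Global defaults versus the technical-docs overrides from the example use case.
global_chunks = chunk_text(document, ChunkingSettings())
technical_chunks = chunk_text(
    document, ChunkingSettings(chunk_size=2000, chunk_overlap=400)
)
```

Because the settings are passed in explicitly, the same function serves every project, and the larger technical-docs chunk size yields fewer chunks for the same document.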
3. Update Project Manager
- Add method to get effective configuration for a project
- Ensure ProjectContext includes resolved configuration
4. Update Pipeline Orchestrator
- Pass project-specific configuration to processing components
- Ensure proper fallback to global config when no project specified
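Steps 3 and 4 could be served by a small lookup that resolves each project's effective configuration once and falls back to the global settings otherwise. The sketch below is one possible shape under stated assumptions: `EffectiveConfigLookup` is a hypothetical class, configs are plain dicts, and the merge function is injected. The one-line `shallow_section_merge` is only a placeholder for the deep merge this issue proposes, and treating an unknown project ID like "no overrides" is an assumption (raising an error may be preferable).

```python
from typing import Any, Callable, Optional

ConfigDict = dict[str, Any]


class EffectiveConfigLookup:
    """Hypothetical helper: resolves and caches one effective config per project."""

    def __init__(
        self,
        global_config: ConfigDict,
        project_overrides: dict[str, ConfigDict],
        merge: Callable[[ConfigDict, ConfigDict], ConfigDict],
    ) -> None:
        self._global = global_config
        self._overrides = project_overrides
        self._merge = merge
        self._cache: dict[str, ConfigDict] = {}

    def get_effective_config(self, project_id: Optional[str] = None) -> ConfigDict:
        # No project given, unknown project, or empty overrides:
        # fall back to the global settings (assumed behavior).
        overrides = self._overrides.get(project_id) if project_id else None
        if not overrides:
            return self._global
        # Merge once per project and reuse the result on later calls.
        if project_id not in self._cache:
            self._cache[project_id] = self._merge(self._global, overrides)
        return self._cache[project_id]


def shallow_section_merge(base: ConfigDict, overrides: ConfigDict) -> ConfigDict:
    # Placeholder: an overridden top-level section replaces the global one entirely.
    return {**base, **overrides}


lookup = EffectiveConfigLookup(
    global_config={"chunking": {"chunk_size": 1000, "chunk_overlap": 200}},
    project_overrides={
        "technical-docs": {"chunking": {"chunk_size": 2000, "chunk_overlap": 400}},
        "plain-project": {},
    },
    merge=shallow_section_merge,
)
technical = lookup.get_effective_config("technical-docs")
default = lookup.get_effective_config()
```

An orchestrator would call `get_effective_config(project_id)` once per run and hand the result to each processing component, so components never need to know whether a value came from the project or the global section.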
Implementation Tasks
Phase 1: Core Infrastructure
- Create ProjectConfigResolver service
- Add get_effective_config() method to ProjectManager
- Update ProjectContext to include effective configuration
- Add unit tests for configuration resolution
Phase 2: Component Integration
- Update chunking strategies to accept project-specific config
- Modify embedding components for project-specific models
- Update file conversion to use project-specific settings
- Update text processing components
Phase 3: Pipeline Integration
- Modify PipelineOrchestrator to use project-specific config
- Update connector instantiation to use effective config
- Ensure proper config propagation through processing pipeline
Phase 4: Testing & Documentation
- Add integration tests for project-specific overrides
- Update configuration documentation
- Add examples to config template
- Update CLI help text and examples
Testing Strategy
Unit Tests
- Configuration resolution with various override scenarios
- Deep merging of nested configuration objects
- Fallback behavior when no overrides specified
Integration Tests
- End-to-end processing with project-specific settings
- Multiple projects with different configurations
- Validation of effective settings in processing components
Example Test Cases
def test_project_specific_chunking():
    """Test that project overrides affect chunking behavior."""
    # Project with larger chunk size should produce fewer chunks

def test_project_specific_embedding():
    """Test that project uses specified embedding model."""
    # Verify correct model is used for embedding generation

def test_fallback_to_global_config():
    """Test fallback when no project overrides specified."""
    # Should use global settings when overrides are empty
Documentation Updates
Configuration Reference
- Document override syntax and available options
- Provide examples for common override scenarios
- Explain inheritance and merging behavior
User Guide
- Add section on project-specific customization
- Include best practices for using overrides
- Troubleshooting guide for configuration issues
Related Issues
- Multi-project architecture implementation
- Configuration validation improvements
- Performance optimization for different content types
Future Enhancements
- Runtime configuration updates
- Configuration profiles/presets
- Project-specific connector settings
- Dynamic configuration based on content analysis
Priority: Medium
Effort: Medium (2-3 days)
Impact: High - Enables flexible multi-project configurations
This enhancement will complete the multi-project architecture by making the override functionality fully operational, allowing users to optimize settings for different types of content and use cases.