Releases: starkbaknet/project-vectorizer
Project Vectorizer Release v0.1.4
[0.1.4] - 2025-10-13
Fixed
- **Hardcoded value** – Replaced hardcoded configuration with dynamic variable lookup
Notes
- This is a minor bugfix release with no API or CLI changes.
Project Vectorizer Release v0.1.3
[0.1.3] - 2025-10-13
Fixed
- Hardcoded value in work module – Replaced hardcoded configuration with dynamic variable lookup
- Prevents unexpected behavior when running with custom configs
Notes
- This is a minor bugfix release with no API or CLI changes.
Project Vectorizer Release v0.1.2
[0.1.2] - 2025-10-13
Added
Performance & Optimization
- Optimized Config Generation - New `Config.create_optimized()` method that auto-detects optimal settings based on system resources
  - Auto-detects CPU cores and sets optimal `max_workers`
  - Calculates safe batch sizes based on available RAM
  - Dynamically adjusts memory thresholds and GC intervals
  - Use with `pv init --optimize` for permanent optimization
- Max Resources Flag - New `--max-resources` flag for `index` and `index-git` commands
  - Temporarily overrides config with optimized settings
  - Perfect for one-time performance boosts
  - Example: `pv index . --max-resources`
- psutil Integration - Added `psutil` library for system resource detection
  - Automatic CPU core detection
  - Memory availability monitoring
  - Smart resource allocation
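The resource detection described above can be sketched roughly as follows. This is a minimal illustration, not the project's implementation: `suggest_settings` and `detect_resources` are hypothetical names, and the multipliers (2 workers per core, 25 batch items per GB) are assumptions chosen only because they give 16 workers and a batch size of 400 for 8 cores and 16 GB. The formula actually used by `Config.create_optimized()` is not stated in these notes.

```python
import os


def suggest_settings(cpu_cores, available_ram_gb,
                     workers_per_core=2, batch_per_gb=25):
    """Derive hypothetical worker and batch settings from system resources.

    The multipliers are illustrative assumptions, not project constants.
    """
    max_workers = max(1, cpu_cores * workers_per_core)
    batch_size = max(1, int(available_ram_gb * batch_per_gb))
    return {"max_workers": max_workers, "batch_size": batch_size}


def detect_resources():
    """Return (cpu_cores, available_ram_gb), using psutil when installed."""
    cores = os.cpu_count() or 1
    try:
        import psutil  # third-party; optional in this sketch
        ram_gb = psutil.virtual_memory().available / 1024 ** 3
    except ImportError:
        ram_gb = 4.0  # assumed conservative fallback when psutil is missing
    return cores, ram_gb


if __name__ == "__main__":
    cores, ram_gb = detect_resources()
    print(suggest_settings(cores, ram_gb))
```

Keeping the calculation in a pure function separate from the detection step makes the heuristic easy to test without depending on the machine it runs on.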
Progress & UI Improvements
- Unified Progress Tracking - Clean, single-line progress bar for all indexing operations
  - Replaces cluttered per-file logging
  - Real-time progress updates with file names
  - Shows file status tags: [New], [Modified], [Deleted]
  - Callback-based architecture for flexible progress reporting
- Library Progress Bar Suppression - Suppressed sentence-transformers batch progress bars
  - Added `show_progress_bar=False` parameter to embedding generation
  - Eliminates "Batches: 0%|..." clutter during indexing
- Timing Information - All indexing operations now display elapsed time
  - Shows duration in seconds (< 1 min) or minutes and seconds (≥ 1 min)
  - Displayed in completion panels
  - Example: "Time taken: 2m 17s"
- Clean Terminal Output - Professional, easy-to-read output
  - Automatic log suppression when progress callback is active
  - Warnings and errors still shown
  - Verbose mode available with `--verbose` flag
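A callback-based progress reporter with status tags and elapsed-time formatting might look like the sketch below. All names here (`format_elapsed`, `render_line`, `index_files`) are hypothetical and not taken from the project. The sub-minute format (`42s`) is an assumption; only the `2m 17s` style is shown in these notes.

```python
def format_elapsed(seconds):
    """Format a duration as 'Ns' below one minute, else 'Mm Ss'."""
    seconds = int(round(seconds))
    if seconds < 60:
        return f"{seconds}s"  # assumed format for short durations
    minutes, secs = divmod(seconds, 60)
    return f"{minutes}m {secs}s"


def render_line(done, total, path, status):
    """Build one progress line with a status tag such as [New]."""
    return f"[{done}/{total}] [{status}] {path}"


def index_files(files, on_progress=None):
    """Walk (path, status) pairs and report each one through a callback.

    The indexing work itself is omitted; only the reporting flow is shown.
    """
    total = len(files)
    for done, (path, status) in enumerate(files, start=1):
        if on_progress is not None:
            on_progress(done, total, path, status)


if __name__ == "__main__":
    pending = [("a.py", "New"), ("b.py", "Modified")]
    index_files(pending, lambda d, t, p, s: print(render_line(d, t, p, s)))
    print("Time taken:", format_elapsed(137))
```

Because the indexer only calls `on_progress`, the same loop can drive a progress bar, a log file, or nothing at all.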
Documentation
- Comprehensive Documentation Index - New `docs/README.md` with organized navigation
  - Links to all feature guides
  - Quick reference section
  - CLI command reference
  - Configuration reference
  - Troubleshooting guide
- Performance Guides - Three detailed guides added:
  - `MAX_RESOURCES_GUIDE.md` - Using maximum system resources
  - `OPTIMIZED_CONFIG.md` - Auto-optimization features
  - `CLEAN_PROGRESS_OUTPUT.md` - Progress tracking system
- Changelog - This file! Track all version changes
Changed
Configuration
- Chunk Size Enforcement - Engine now enforces max 128-token chunks (line 35 in engine.py)
  - Improved precision for searches
  - Better granularity for code matching
  - Configurable chunk size still respected for larger values
- Default Chunk Overlap - Optimized overlap settings
  - Better balance between precision and performance
  - Reduced from 32 to 16 tokens for 128-token chunk configs
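How a 128-token chunk size and a 16-token overlap interact can be sketched with a sliding window. This is a minimal illustration, not the code in engine.py: `chunk_tokens` is a hypothetical helper, and the tokens are plain list items rather than output from the project's tokenizer.

```python
def chunk_tokens(tokens, chunk_size=128, overlap=16):
    """Split a token list into windows that share `overlap` tokens."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be in [0, chunk_size)")

    step = chunk_size - overlap  # each window starts this far after the last
    chunks = []
    for start in range(0, len(tokens), step):
        chunks.append(tokens[start:start + chunk_size])
        if start + chunk_size >= len(tokens):
            break  # the last window already reaches the end of the input
    return chunks


if __name__ == "__main__":
    chunks = chunk_tokens(list(range(300)))
    print([len(c) for c in chunks])
```

With these defaults each window advances by 112 tokens, so a smaller overlap yields fewer chunks for the same input at the cost of less shared context between neighbours.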
Performance
- Batch Processing - Memory-based batch size validation
  - Safe batch sizes calculated based on available RAM
  - Prevents out-of-memory errors on large projects
- Parallel Processing - Improved worker management
  - Auto-detection of optimal worker count
  - Better CPU utilization
  - Configurable via `max_workers` setting
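The combination of bounded batches and a configurable worker count can be sketched with the standard library. This is an assumption-laden illustration rather than the project's code: `batched` and `process_in_batches` are hypothetical helpers, and a thread pool is used here only for simplicity.

```python
from concurrent.futures import ThreadPoolExecutor


def batched(items, batch_size):
    """Yield consecutive slices of at most `batch_size` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def process_in_batches(items, handle_batch, batch_size, max_workers):
    """Run `handle_batch` on each batch in a pool and flatten the results.

    `handle_batch` takes a list and returns a list; input order is kept
    because Executor.map yields results in submission order.
    """
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        results = pool.map(handle_batch, batched(items, batch_size))
        return [value for batch in results for value in batch]


if __name__ == "__main__":
    squares = process_in_batches(
        list(range(10)),
        lambda batch: [x * x for x in batch],
        batch_size=4,
        max_workers=2,
    )
    print(squares)
```

Capping the batch size limits how much data each worker holds at once, which is the general idea behind sizing batches from available RAM.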
CLI
- Progress Output - All index commands now use unified progress bar
  - `pv index` - Shows clean progress with timing
  - `pv index-git` - Shows git reference and timing
  - `pv sync --watch` - Real-time progress updates
Fixed
- Progress Bar Clutter - Fixed multiple progress bars appearing during indexing
- Sentence-Transformers Output - Suppressed library-internal batch progress bars
- Memory Usage - Better memory management with GC intervals and monitoring
- Log Suppression - Informational logs now properly suppressed when progress callback is active
Performance
Benchmarks (with --max-resources)
- Indexing Speed: ~2m 16s for 48 files (9,222 chunks) on 8-core system
- Chunk Size Impact: Virtually identical performance between 128 and 512 token chunks
- Optimal Settings: Auto-detected based on system (16 workers, 400 batch size on 8-core/16GB RAM)
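For a rough sense of scale, the indexing figure above can be converted into throughput. The arithmetic uses only the numbers quoted in this list; the variable names are illustrative.

```python
# Figures quoted in the benchmark: 2m 16s, 48 files, 9,222 chunks.
elapsed_s = 2 * 60 + 16
files = 48
chunks = 9222

chunks_per_second = chunks / elapsed_s  # about 67.8
seconds_per_file = elapsed_s / files    # about 2.8
chunks_per_file = chunks / files        # about 192.1

print(f"{chunks_per_second:.1f} chunks/s")
print(f"{seconds_per_file:.1f} s/file")
print(f"{chunks_per_file:.1f} chunks/file")
```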
Improvements
- Smart Incremental: 60-70% faster than full re-indexing
- Git-Aware: 80-90% faster for recent changes only
- Parallel Processing: Linear scaling with CPU cores
Project Vectorizer Release 0.1.0
Refactor GitHub Actions workflow for PyPI publishing
- Simplified the workflow by removing the TestPyPI publishing step and associated conditions.
- Streamlined the environment input options for manual deployment.
- Ensured consistent formatting and indentation across steps for better readability.
- Maintained the core functionality for building and publishing to PyPI.