1 | | -# Changelog |
2 | | - |
3 | | -All notable changes to the Reranking & Embedding Service will be documented in this file. |
4 | | - |
5 | | -The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), |
6 | | -and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). |
7 | | - |
8 | | -## [Unreleased] |
9 | | - |
10 | | -### Added |
11 | | -- GitHub Actions workflow for automated Docker image builds |
12 | | -- Automatic GitHub Release creation on version tags |
13 | | -- Build status badges in README |
14 | | - |
15 | | -### Changed |
16 | | -- CHANGELOG.md now uses actual commit dates from git tags |
17 | | - |
18 | | -## [1.2.0] - 2025-10-17 |
19 | | - |
20 | | -### Changed |
21 | | -- Refactored image build script to read version from `version.py` dynamically |
22 | | -- Updated build scripts to remove deprecated PCA down-projection method |
23 | | - |
24 | | -### Fixed |
25 | | -- README.md improvements with PCA down-projection examples |
26 | | - |
27 | | -## [1.1.0] - 2025-10-03 |
28 | | - |
29 | | -### Added |
30 | | -- Version management system with `version_manager.py` script |
31 | | -- Version display in API endpoints (`/v1/config` and `/v1/diagnostics`) |
32 | | -- Version information in startup banner |
33 | | - |
34 | | -### Changed |
35 | | -- Optimized PyTorch settings at application startup |
36 | | -- Refactored embedding service method signature for better type safety |
37 | | -- Enhanced cache handling in configuration |
38 | | - |
39 | | -### Fixed |
40 | | -- Memory leak in model management |
41 | | -- Improved memory management for long-running instances |
42 | | -- Bearer token regex validation |
43 | | -- Error messaging improvements in embedding routes |
44 | | - |
45 | | -## [1.0.0] - 2025-09-XX |
46 | | - |
47 | | -### Added |
48 | | -- Initial implementation of reranking service with FastAPI |
49 | | -- Cross-encoder and bi-encoder support for reranking |
50 | | -- Embedding generation via `/v1/encode` endpoint |
51 | | -- Cohere-compatible `/v1/rerank` API endpoint |
52 | | -- Direct transformers integration for advanced models (Qwen3-Embedding-8B) |
53 | | -- Automatic model detection (direct transformers vs sentence-transformers) |
54 | | -- Flash Attention support for 2-4x speedup on compatible hardware |
55 | | -- Matryoshka (MRL) dimensionality reduction |
56 | | -- Precision control (bfloat16/float16/float32) |
57 | | -- Extended context support (up to 32k tokens) |
58 | | -- API key authentication system |
59 | | -- CORS configuration |
60 | | -- Comprehensive logging and diagnostics endpoints |
61 | | -- Model warmup capability for faster inference |
62 | | -- Docker and Podman support |
63 | | -- YAML-based configuration system |
64 | | -- Health check endpoint (`/healthz`) |
65 | | -- Model management endpoints (`/v1/models`, `/v1/models/reload`) |
66 | | -- Hugging Face model prefetching |
67 | | -- Multi-GPU support with CUDA optimization |
68 | | - |
69 | | -### Documentation |
70 | | -- Comprehensive README with usage examples |
71 | | -- Configuration documentation |
72 | | -- API endpoint documentation |
73 | | -- Performance tuning guide |
74 | | -- Troubleshooting section |
75 | | - |
76 | | -[1.2.0]: https://github.com/cyberbobjr/simple-reranker/compare/v1.1.0...v1.2.0 |
77 | | -[1.1.0]: https://github.com/cyberbobjr/simple-reranker/compare/v1.0.0...v1.1.0 |
78 | | -[1.0.0]: https://github.com/cyberbobjr/simple-reranker/releases/tag/v1.0.0 |
| 1 | +# Changelog |
| 2 | + |
| 3 | +All notable changes to the Reranking & Embedding Service will be documented in this file. |
| 4 | + |
| 5 | +The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.0.0/), |
| 6 | +and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html). |
| 7 | + |
| 8 | +## [1.3.0] - 2025-10-17 |
| 9 | + |
| 10 | +### Added |
| 11 | +- Implement RerankService for document reranking with cross-encoder and bi-encoder support |
| 12 | + |
| 13 | +## [Unreleased] |
| 14 | + |
| 15 | +### Added |
| 16 | +- GitHub Actions workflow for automated Docker image builds |
| 17 | +- Automatic GitHub Release creation on version tags |
| 18 | +- Build status badges in README |
| 19 | + |
| 20 | +### Changed |
| 21 | +- CHANGELOG.md now uses actual commit dates from git tags |
| 22 | + |
| 23 | +## [1.2.0] - 2025-10-17 |
| 24 | + |
| 25 | +### Changed |
| 26 | +- Refactored image build script to read version from `version.py` dynamically |
| 27 | +- Updated build scripts to remove deprecated PCA down-projection method |
| 28 | + |
| 29 | +### Fixed |
| 30 | +- README.md improvements with PCA down-projection examples |
| 31 | + |
| 32 | +## [1.1.0] - 2025-10-03 |
| 33 | + |
| 34 | +### Added |
| 35 | +- Version management system with `version_manager.py` script |
| 36 | +- Version display in API endpoints (`/v1/config` and `/v1/diagnostics`) |
| 37 | +- Version information in startup banner |
| 38 | + |
| 39 | +### Changed |
| 40 | +- Optimized PyTorch settings at application startup |
| 41 | +- Refactored embedding service method signature for better type safety |
| 42 | +- Enhanced cache handling in configuration |
| 43 | + |
| 44 | +### Fixed |
| 45 | +- Memory leak in model management |
| 46 | +- Improved memory management for long-running instances |
| 47 | +- Bearer token regex validation |
| 48 | +- Error messaging improvements in embedding routes |
| 49 | + |
| 50 | +## [1.0.0] - 2025-09-XX |
| 51 | + |
| 52 | +### Added |
| 53 | +- Initial implementation of reranking service with FastAPI |
| 54 | +- Cross-encoder and bi-encoder support for reranking |
| 55 | +- Embedding generation via `/v1/encode` endpoint |
| 56 | +- Cohere-compatible `/v1/rerank` API endpoint |
| 57 | +- Direct transformers integration for advanced models (Qwen3-Embedding-8B) |
| 58 | +- Automatic model detection (direct transformers vs sentence-transformers) |
| 59 | +- Flash Attention support for 2-4x speedup on compatible hardware |
| 60 | +- Matryoshka (MRL) dimensionality reduction |
| 61 | +- Precision control (bfloat16/float16/float32) |
| 62 | +- Extended context support (up to 32k tokens) |
| 63 | +- API key authentication system |
| 64 | +- CORS configuration |
| 65 | +- Comprehensive logging and diagnostics endpoints |
| 66 | +- Model warmup capability for faster inference |
| 67 | +- Docker and Podman support |
| 68 | +- YAML-based configuration system |
| 69 | +- Health check endpoint (`/healthz`) |
| 70 | +- Model management endpoints (`/v1/models`, `/v1/models/reload`) |
| 71 | +- Hugging Face model prefetching |
| 72 | +- Multi-GPU support with CUDA optimization |
| 73 | + |
| 74 | +### Documentation |
| 75 | +- Comprehensive README with usage examples |
| 76 | +- Configuration documentation |
| 77 | +- API endpoint documentation |
| 78 | +- Performance tuning guide |
| 79 | +- Troubleshooting section |
| 80 | + |
| 81 | +[1.2.0]: https://github.com/cyberbobjr/simple-reranker/compare/v1.1.0...v1.2.0 |
| 82 | +[1.1.0]: https://github.com/cyberbobjr/simple-reranker/compare/v1.0.0...v1.1.0 |
| 83 | +[1.0.0]: https://github.com/cyberbobjr/simple-reranker/releases/tag/v1.0.0 |