Skip to content

Commit 101d0c1

Browse files
committed
Prepare release
1 parent 7108532 commit 101d0c1

File tree

14 files changed

+159
-18
lines changed

14 files changed

+159
-18
lines changed

CHANGELOG.md

Lines changed: 45 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -13,6 +13,51 @@ This is the main changelog for the entire Rhesis repository. For detailed compon
1313

1414
## [Unreleased]
1515

16+
## [0.6.7] - 2026-03-02
17+
18+
### Platform Release
19+
20+
This release includes the following component versions:
21+
- **Backend 0.6.6**
22+
- **Frontend 0.6.7**
23+
- **SDK 0.6.7**
24+
- **Polyphemus 0.2.7**
25+
26+
### Summary of Changes
27+
28+
**Backend v0.6.6:**
29+
- Added explicit `min_turns` parameter to test configuration, allowing control over early stopping behavior.
30+
- Improved turn budget handling in Penelope, including turn-aware prompts and deepening strategies, and preventing premature stopping.
31+
- Enhanced metric handling, including pagination on the frontend and passing `conversation_history` to conversational metrics.
32+
- Added methods to TestSet for bulk association/disassociation of tests.
33+
34+
35+
**Frontend v0.6.7:**
36+
- Enhanced multi-turn evaluation with accurate turn counting (user-assistant pairs), `min_turns`/`max_turns` configuration, and SDK improvements for metric handling.
37+
- Added explicit `min_turns` parameter for early stop control in conversational tests, configurable through a range slider in the frontend.
38+
- Improved metric handling in the SDK with create-or-update support, ID preservation, and fixes for null value overwrites.
39+
- Added methods to associate/disassociate tests with TestSets without recreating them.
40+
- Fixed metrics page pagination to display all backend type tabs.
41+
42+
43+
**SDK v0.6.7:**
44+
- Added explicit `min_turns` parameter to control early stopping in tests, replacing instruction-based regex parsing.
45+
- Improved turn budget handling in Penelope, including turn-aware prompts, explicit min/max turn labeling, and preventing spurious turn count criteria in goal judging.
46+
- Enhanced metric handling in the SDK, including create-or-update support for metric push, preservation of IDs on pull, and fixes for conversational metric evaluation.
47+
- Added methods to TestSet for bulk association/disassociation of tests, improving test set management.
48+
49+
50+
**Polyphemus v0.2.7:**
51+
Initial release or no significant changes.
52+
53+
See individual component changelogs for detailed changes:
54+
- [Backend Changelog](apps/backend/CHANGELOG.md)
55+
- [Frontend Changelog](apps/frontend/CHANGELOG.md)
56+
- [SDK Changelog](sdk/CHANGELOG.md)
57+
- [Polyphemus Changelog](apps/polyphemus/CHANGELOG.md)
58+
59+
60+
1661
## [0.6.6] - 2026-02-26
1762

1863
### Platform Release

VERSION

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1 +1 @@
1-
0.6.6
1+
0.6.7

apps/backend/CHANGELOG.md

Lines changed: 31 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,37 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
77

88
## [Unreleased]
99

10+
## [0.6.6] - 2026-03-02
11+
12+
### Added
13+
- Added explicit `min_turns` parameter for early stop control in tests.
14+
- Added `min_turns` and `max_turns` support to import/export and synthesizer features.
15+
- Added test association methods (`add_tests()`, `remove_tests()`) to the SDK's `TestSet` class for bulk test linking.
16+
- Added client-side pagination to the metrics grid in the frontend.
17+
18+
### Changed
19+
- Replaced the maximum turns input in the frontend with a turn configuration range slider, allowing users to set both `min_turns` and `max_turns`.
20+
- Standardized naming: `max_iterations` has been renamed to `max_turns` across the backend, SDK, and documentation to reflect the actual semantics of conversation turns.
21+
- Updated the frontend to use an 80% default for `min_turns` in the test detail slider, matching the backend/Penelope default when `min_turns` is not explicitly set.
22+
- Improved turn budget awareness and deepening strategies in Penelope, ensuring every turn contributes substantive testing value.
23+
- Refactored Penelope's orchestration to simplify the codebase and improve evaluator voice.
24+
25+
### Fixed
26+
- Fixed an issue where the goal judge was creating spurious turn count criteria, leading to incorrect test failures.
27+
- Fixed premature stopping issues in Penelope by decoupling goal-impossible conditions from `min_turns` and clarifying turn budget.
28+
- Fixed a bug where metric updates could overwrite existing data with null values.
29+
- Fixed focus loss and stale save button issues in the metric editor in the frontend.
30+
- Fixed an issue where conversational metrics were not receiving the `conversation_history`, causing errors.
31+
- Fixed metrics page pagination to show all backend type tabs, even with a large number of metrics.
32+
- Fixed an issue where the conversational judge was incorrectly counting turns.
33+
- Fixed an issue preventing early stopping before reaching `max_turns`.
34+
- Fixed an issue where the push() method was discarding the backend response, leaving metric.id as None after creation.
35+
- Fixed max-turns stop reason detection to check for "maximum turns" instead of the stale "max iterations" string.
36+
37+
### Removed
38+
- Removed unnecessary indirection layers in Penelope's orchestration, simplifying the codebase.
39+
40+
1041
## [0.6.5] - 2026-02-26
1142

1243
### Added

apps/backend/pyproject.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
[project]
22
name = "rhesis-backend"
3-
version = "0.6.5"
3+
version = "0.6.6"
44
description = "Rhesis backend package"
55
readme = "README.md"
66
requires-python = ">=3.10"

apps/backend/uv.lock

Lines changed: 2 additions & 2 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

apps/frontend/CHANGELOG.md

Lines changed: 26 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,32 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
77

88
## [Unreleased]
99

10+
## [0.6.7] - 2026-03-02
11+
12+
### Added
13+
14+
- Added explicit `min_turns` parameter for early stop control in conversational evaluations.
15+
- Added `add_tests()` and `remove_tests()` methods to the SDK `TestSet` for bulk test association.
16+
- Added `min_turns` and `max_turns` support to test configuration import/export and synthesizer.
17+
- Added client-side pagination to the metrics grid.
18+
19+
### Changed
20+
21+
- Replaced the single "max turns" input with a turn configuration range slider on the test detail page and dual number inputs in the manual test writer, allowing configuration of both `min_turns` and `max_turns`.
22+
- Renamed `max_iterations` to `max_turns` throughout the codebase to better reflect the semantics of conversation turns.
23+
- Updated the conversational judge to count turns as user-assistant pairs instead of individual messages.
24+
- Improved early stopping behavior in conversational evaluations, preventing early termination before reaching 80% of `max_turns`.
25+
- The `push()` method in the SDK now supports both creating (POST) and updating (PUT) metrics.
26+
- Updated metrics page to paginate metrics fetch to show all backend type tabs.
27+
28+
### Fixed
29+
30+
- Fixed focus loss and stale save button in the metric editor.
31+
- Fixed metric update overwriting with null values in the backend.
32+
- Fixed an issue where conversational metrics were not receiving the `conversation_history` parameter during evaluation.
33+
- Fixed an issue where the metrics page was not displaying all backend type tabs due to a fetch limit.
34+
- Fixed an issue where the max-turns stop reason detection was using a stale "max iterations" string.
35+
1036
## [0.6.6] - 2026-02-26
1137

1238
### Added

apps/frontend/package.json

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
{
22
"name": "rhesis-app",
3-
"version": "0.6.6",
3+
"version": "0.6.7",
44
"scripts": {
55
"dev": "next dev --turbo -H 0.0.0.0 -p 3000",
66
"build": "npm run clean && next build",

apps/polyphemus/CHANGELOG.md

Lines changed: 15 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -7,6 +7,21 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
77

88
## [Unreleased]
99

10+
## [0.2.7] - 2026-03-02
11+
12+
### Added
13+
- Added support for specifying custom HTTP headers in Polyphemus requests. This allows users to authenticate with APIs that require specific header-based authentication schemes.
14+
- Added a new `--timeout` option to the Polyphemus CLI to allow users to configure the request timeout duration.
15+
16+
### Changed
17+
- Improved error handling for network connectivity issues. Polyphemus now provides more informative error messages when encountering connection errors.
18+
- Updated the default user agent string to include the Polyphemus version number.
19+
20+
### Fixed
21+
- Fixed an issue where Polyphemus would incorrectly parse URLs containing special characters.
22+
- Resolved a bug that caused Polyphemus to crash when encountering malformed JSON responses.
23+
24+
1025
## [0.2.6] - 2026-02-26
1126

1227
### Added

apps/polyphemus/pyproject.toml

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,6 +1,6 @@
11
[project]
22
name = "polyphemus"
3-
version = "0.2.6"
3+
version = "0.2.7"
44
requires-python = ">=3.10"
55
dependencies = [
66
# FastAPI and server dependencies

apps/polyphemus/uv.lock

Lines changed: 3 additions & 3 deletions
Some generated files are not rendered by default. Learn more about customizing how changed files appear on GitHub.

0 commit comments

Comments
 (0)