Conversation
* 🎛️ feat: switch evalModel to evalModelSet for test evaluation. Replaces evalModel with evalModelSet, allowing multiple evaluation models.
* ✨ feat: add multi-model evaluation to metrics and compliance. Support evaluating metrics and compliance with multiple models via evalModelSet.
* ✨ Refine evaluation model handling and debug logging. Improved evalModelSet defaults, header levels, and debugging output.
* ✨ Enhance evalModelSet sourcing and logging in promptpex. Now supports sourcing evalModelSet from env var, adds validation, and logging.
* ✨ refactor test metric evaluation and overview model handling. Refined evalModelSet parsing and updated test metric iteration logic.
* ✨ feat: Add combined avg metric across eval models. Compute and store average metric score for all evaluation models used.
* ✨ Enhance promptpex test evaluation and script logic. Added separate eval-only/test-run modes, improved metric evaluations.
* ♻️ Rename evalModelSet to evalModel throughout codebase. Standardizes config and variable naming from evalModelSet to evalModel.
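The multi-model parsing and the combined average described in these commits could look roughly like the sketch below. It is not the code from this PR: the `;` separator, the `ModelScores` shape, and the function names `parseEvalModels` and `combinedAverage` are all assumptions made for illustration.

```typescript
// Hypothetical shape: one score per evaluation model for a single test metric.
type ModelScores = Record<string, number | undefined>;

// Split a delimiter-separated model list into clean names.
// The ";" separator is an assumption, not confirmed by the commit messages.
function parseEvalModels(raw: string | undefined, fallback: string[] = []): string[] {
    const models = (raw ?? "")
        .split(";")
        .map((m) => m.trim())
        .filter((m) => m.length > 0);
    return models.length > 0 ? models : fallback;
}

// Average the scores of the listed models, skipping missing or non-finite values.
// Returns undefined when no model produced a usable score.
function combinedAverage(scores: ModelScores, models: string[]): number | undefined {
    const values = models
        .map((m) => scores[m])
        .filter((v): v is number => typeof v === "number" && Number.isFinite(v));
    if (values.length === 0) return undefined;
    return values.reduce((sum, v) => sum + v, 0) / values.length;
}

// Example with placeholder model names: model-c has no score and is skipped.
const models = parseEvalModels("model-a; model-b;;model-c");
const avg = combinedAverage({ "model-a": 80, "model-b": 60 }, models);
console.log(models, avg);
```

Skipping missing scores rather than counting them as zero keeps one failed evaluation model from dragging the combined average down; whether the PR makes the same choice is not stated in the commit messages.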
…#141)
* ✨ Enhance test results saving and eval metrics workflow. Improved control of results file writing and evaluation metrics assignment.
* ✨ Add evals config flag to control evaluation execution. Introduces evals boolean for toggling evaluation of test results.
* ✨: Enable direct context-loading from JSON files. Refactored CLI to load PromptPexContext from JSON, updating file flow.
* ✨ Add scripts and logic for multi-stage sample evaluations. Introduces zx scripts for gen/run/eval sample tests and conditional test executions.
* 🔀 rename: Samples scripts renamed to .zx.mjs extensions. All run-samples-*.mjs scripts updated to .zx.mjs for zx compatibility.
* ♻️ refactor: Rename sample scripts to .zx.mjs extensions. Updated script names in package.json and renamed a sample file for zx compatibility.
Introduces groundtruth model option, result tracking, and output storage.
Extended PromptPexTest and PromptPexTestResult with groundtruth support.
Add lmstudio to settings, expand UI model suggestions, tidy runTests.
✨ Add support for groundtruth model and outputs
Create genai-issue-labeller
…dtruth documentation links
Expanded glossary and updated diagrams to standardize GTM terms.
* feat: add groundtruth option and related parameters for test generation
* feat: add model alias for groundtruth evaluation
* feat: add model_under_test alias and update related logic in prompt generation
* feat: update groundtruth model handling and rename constants for clarity
* Update src/genaisrc/src/testrun.mts
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
* ✨ Label tests with unique IDs and propagate testuid. Added unique testuid to each test and test result; updated logic to use it.
* ✨ add testuid to test run output and update indexing logic. Test run data now includes testuid; testuid index starts from 0.
* ✨: Unleash Unique IDs in PromptPex Tests with nanoid. Integrated nanoid for generating unique, consistent test UIDs.
* ✨ Fix testuid template and ensure strict equality in search. Corrected testuid generation format and used strict equality for lookup.
* Update src/genaisrc/src/testevalmetric.mts
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
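As a rough illustration of the testuid idea in these commits, the sketch below labels each test with an ID and looks results up with strict equality. The `test-<index>-<suffix>` format, the `TestCase` and `TestResult` shapes, and the helper names are hypothetical, and a `Math.random` suffix stands in for nanoid so the example has no dependencies.

```typescript
// Hypothetical minimal shapes; the real PromptPexTest and PromptPexTestResult have more fields.
interface TestCase {
    testuid: string;
    testinput: string;
}
interface TestResult {
    testuid: string;
    output: string;
}

// Assumed ID format: zero-based index plus a short random suffix.
// The commits use nanoid for the random part; Math.random is a stand-in here.
function makeTestUid(index: number): string {
    const suffix = Math.random().toString(36).slice(2, 10);
    return `test-${index}-${suffix}`;
}

// Attach a testuid to every input, with the index starting from 0.
function labelTests(inputs: string[]): TestCase[] {
    return inputs.map((testinput, index) => ({ testuid: makeTestUid(index), testinput }));
}

// Look up a result by testuid using strict equality, so there is no type coercion.
function findResult(results: TestResult[], testuid: string): TestResult | undefined {
    return results.find((r) => r.testuid === testuid);
}

const tests = labelTests(["first input", "second input"]);
const results: TestResult[] = tests.map((t) => ({
    testuid: t.testuid,
    output: `echo: ${t.testinput}`,
}));
const match = findResult(results, tests[1].testuid);
console.log(match?.output);
```

Because the index is part of the ID, two tests in one run cannot collide even if the random suffixes happen to match.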
* ✨ add groundtruth test output support to promptpex. Introduce groundtruth test results file loading and parsing support.
* ✏️ fix typo in PromptPexContext groundtruth comment. Corrected 'Groudtruth' to 'Groundtruth' in the documentation comment.
* Apply suggestions from code review
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
---------
Co-authored-by: Peli de Halleux <pelikhan@users.noreply.github.com>
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
- Upgraded genaiscript from version 1.142.15 to 2.0.8
- Updated openai from version 5.5.1 to 5.7.0
- Bumped @types/node from 24.0.3 to 24.0.4
- Upgraded prettier from 3.5.3 to 3.6.1
- Updated zx from 8.5.5 to 8.6.0
- Modified promptpex:demo:github script to include .github.env for environment variables
shraddhabarke
approved these changes
Jul 1, 2025
No description provided.