tests(docs): add component-level eval tracing and dataset IO tests #2433
Open
BloggerBust wants to merge 11 commits into confident-ai:main
BloggerBust (Contributor) commented on Jan 14, 2026
- Add component-level eval tests (shape snapshots and semantic assertions)
- Add dataset JSON/CSV loader tests for EvaluationDataset
- Refactor trace snapshot utilities into tests/utils/trace_assertions.py
- Re-export trace snapshot utilities from tests/test_integrations/utils.py
- Commit generated trace snapshot JSON fixtures
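
To illustrate the difference between a shape snapshot and a semantic assertion mentioned in the list above, here is a minimal sketch. It does not use the helpers in tests/utils/trace_assertions.py; `to_shape` and the sample `trace` dictionary are hypothetical names invented for this example, and the real fixtures may be structured differently.

```python
from typing import Any


def to_shape(value: Any) -> Any:
    """Keep dict/list structure but replace every leaf with its type name.

    Volatile values such as UUIDs and latencies then no longer affect
    the comparison against a stored snapshot.
    """
    if isinstance(value, dict):
        return {key: to_shape(item) for key, item in value.items()}
    if isinstance(value, list):
        return [to_shape(item) for item in value]
    return type(value).__name__


# Hypothetical trace payload, standing in for a generated JSON fixture.
trace = {
    "uuid": "a1b2c3",
    "name": "generator",
    "spans": [{"type": "llm", "latency": 0.12, "children": []}],
}

# Shape snapshot: only structure and value types are compared.
expected_shape = {
    "uuid": "str",
    "name": "str",
    "spans": [{"type": "str", "latency": "float", "children": []}],
}
assert to_shape(trace) == expected_shape

# Semantic assertion: a specific value that must hold regardless of shape.
assert trace["spans"][0]["type"] == "llm"
```

The shape check stays stable across runs, while the semantic check pins down the one value the test actually cares about.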
@BloggerBust is attempting to deploy a commit to the Confident AI Team on Vercel. A member of the Team first needs to authorize it.
Force-pushed from edfea09 to f4d420e
- add coverage for tool spans, agent spans, metrics scoping, update_current_span last-write-wins, and evals_iterator input mapping
- move span/trace helpers into shared test helpers
- add rooted app smoke test asserting agent/retriever/generator and metrics
- add rooted app trace fixture snapshot
…pans
Filter observe kwargs to ToolSpan model fields and drop colliding keys before constructing ToolSpan to avoid duplicate keyword errors. Also adds doc-driven component-level tracing tests covering tool spans and LLMTestCase tool call fields.
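
A minimal sketch of the filtering described in this commit message, under stated assumptions: `SketchToolSpan` is a hypothetical dataclass standing in for the real ToolSpan model, and `build_tool_span` is an invented helper, so the actual change may look different.

```python
from dataclasses import dataclass, fields
from typing import Any, Dict, Optional


@dataclass
class SketchToolSpan:
    """Hypothetical stand-in for the ToolSpan model."""
    name: str
    uuid: str
    description: Optional[str] = None


def build_tool_span(
    span_kwargs: Dict[str, Any], observe_kwargs: Dict[str, Any]
) -> SketchToolSpan:
    """Construct a span so that explicit span_kwargs always win."""
    allowed = {f.name for f in fields(SketchToolSpan)}
    # Keep only keys the model knows about, and drop any key that is
    # already set explicitly, so no keyword is passed twice.
    extra = {
        key: value
        for key, value in observe_kwargs.items()
        if key in allowed and key not in span_kwargs
    }
    return SketchToolSpan(**span_kwargs, **extra)


span = build_tool_span(
    span_kwargs={"name": "search_tool", "uuid": "span-1"},
    observe_kwargs={"name": "ignored", "description": "Web search", "metrics": []},
)
assert span.name == "search_tool"
assert span.description == "Web search"
```

Without the filter, `SketchToolSpan(**span_kwargs, **observe_kwargs)` would raise a `TypeError` because `name` would be supplied twice, which is the kind of duplicate keyword error the commit message refers to.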
Force-pushed from 22fd071 to 940041e
ToolSpan now filters observe_kwargs to model fields and drops any keys that would collide with explicit span_kwargs, so reserved fields always win.
- Rename tool span in component-level doc tests to avoid name collisions
- Assert observe(name) overrides function name, and update_current_span overrides observe
- Add checklist coverage for parent/child UUID relationships in nested spans
- Add regression test to ensure @observe(type="tool", name=...) does not crash due to a name collision
Force-pushed from 940041e to e7c7324
- add missing component-level test for span name precedence
- ensure update_current_span(name=...) wins over function-derived names
Force-pushed from c58fa9e to b368a4b