Evaluating Chatbot Responses

yangm2 edited this page Feb 16, 2026 · 2 revisions

Terminology & Concepts

| Term | Definition |
| --- | --- |
| dataset | TBD |
| evaluation | TBD |
| evaluator | TBD |
| experiment | TBD |
| LLM-as-a-judge | TBD |
| single-turn conversation | TBD |
| multi-turn conversation | TBD |
| trajectory evaluation | TBD |
| simulated user | TBD |

Technology

| Tech | Description |
| --- | --- |
| LangChain | TBD |
| LangSmith | TBD |

Experimental Flow

See EVALUATION.md in the repo for instructions on setting up and running experiments with LangSmith.

Understanding and Navigating Experimental Results in LangSmith

TBD