The Judge LLM in LLM Evaluations currently supports only cloud API providers (OpenAI, Anthropic, Google, xAI, Mistral, OpenRouter, etc.). Customers with internally hosted models (Ollama, vLLM, TGI, or any OpenAI-compatible inference server) cannot use them as the Judge LLM.
Today the model being evaluated already supports an endpointUrl field for custom/local endpoints, but the Judge LLM config accepts only provider, model, and apiKey, with no endpoint URL option. Users should be able to provide a custom endpoint URL for the Judge LLM, enabling self-hosted models to serve as the judge in evaluations.
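For illustration, the extended judge config could mirror the endpointUrl field the evaluated model already supports. This is a sketch only; the field names beyond provider, model, and apiKey, and the example values, are assumptions:

```python
# Hypothetical judge config after the change: endpointUrl is optional and,
# when present, points at any OpenAI-compatible inference server.
judge_llm_config = {
    "provider": "openai",          # reuse the OpenAI-compatible code path
    "model": "llama3:8b",          # whatever model name the local server exposes
    "apiKey": "ollama",            # many local servers accept a dummy key
    "endpointUrl": "http://localhost:11434/v1",  # e.g. Ollama's OpenAI-style API
}
```

Omitting endpointUrl would preserve today's behavior of calling the cloud provider directly.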
Files that need changes:
FE:
Clients/src/presentation/pages/EvalsDashboard/NewExperimentModal.tsx — Add an optional endpointUrl field to the judgeLlm config object and render an input for it
- Update the related TypeScript interfaces for the judge config
BE:
EvalServer/src/utils/run_evaluation.py (~lines 145-172) — Extract endpointUrl from the judge config and set the OPENAI_API_BASE environment variable when present
EvaluationModule/src/deepeval_engine/deepeval_evaluator.py — Update get_judge_llm() and CustomDeepEvalLLM to accept and pass base_url
EvaluationModule/src/deepeval_engine/model_runner.py — Update _setup_openai() to use base_url parameter when OPENAI_API_BASE is set
EvaluationModule/scorers/judge_runner.py — Pass base_url when creating OpenAI client for judge scoring
EvaluationModule/scorers/provider_registry.py — Allow custom models when a custom endpoint is provided (relax the hardcoded model allowlist)