research(orchestration): CoE collaborative entropy — uncertainty quantification for multi-LLM routing confidence scoring #2505
Description
Source
arXiv:2603.28360 — "CoE: Collaborative Entropy for Uncertainty Quantification in Agentic Multi-LLM Systems" (March 30, 2026)
Finding
CoE combines two entropy signals:
- Intra-model semantic entropy: token-probability variance across a single model's output
- Inter-model divergence: semantic disagreement between multiple models on the same query
The combined metric predicts routing confidence: high CoE → escalate to a stronger model; low CoE → accept the current model's output. It outperforms single-entropy baselines at detecting hallucination-prone responses by 15%.
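The note does not give the paper's exact formula for combining the two signals, so a minimal sketch, assuming a simple weighted sum of the normalized signals (`alpha` is an illustrative mixing weight, not from the source):

```rust
// Hypothetical combination of the two CoE signals into one score.
// A weighted sum is assumed here for illustration; the paper's actual
// combination rule may differ.
fn coe_score(intra_entropy: f32, inter_divergence: f32, alpha: f32) -> f32 {
    alpha * intra_entropy + (1.0 - alpha) * inter_divergence
}

fn main() {
    // High variance within one model and high disagreement between
    // models both push the score up, signalling "escalate".
    let score = coe_score(0.9, 0.6, 0.5);
    println!("{score}"); // 0.75
}
```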
Applicability to Zeph
The zeph-llm BaRP router uses Thompson sampling on reward signals (latency + quality). CoE adds a proactive uncertainty signal: instead of waiting for a bad outcome to update the bandit, CoE can flag uncertain responses before committing them, triggering a verification or escalation step.
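The pre-commit gate described above could be sketched as a three-way decision; `RouteDecision`, `gate`, and the two thresholds are illustrative names for this note, not actual zeph-llm API:

```rust
// Hypothetical pre-commit gate on the intra-model entropy signal.
#[derive(Debug, PartialEq)]
enum RouteDecision {
    Accept,   // low uncertainty: commit the response as-is
    Verify,   // moderate: re-run the prompt on a secondary provider
    Escalate, // high: hand off to a stronger model immediately
}

fn gate(intra_entropy: f32, verify_threshold: f32, escalate_threshold: f32) -> RouteDecision {
    if intra_entropy >= escalate_threshold {
        RouteDecision::Escalate
    } else if intra_entropy >= verify_threshold {
        RouteDecision::Verify
    } else {
        RouteDecision::Accept
    }
}

fn main() {
    assert_eq!(gate(0.2, 0.5, 0.9), RouteDecision::Accept);
    assert_eq!(gate(0.7, 0.5, 0.9), RouteDecision::Verify);
    assert_eq!(gate(0.95, 0.5, 0.9), RouteDecision::Escalate);
}
```

The point of the gate is that it fires before the response is committed, so the bandit never has to learn from an avoidable bad outcome.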
Implementation sketch:
- `LlmProvider::response_entropy() -> f32` — expose per-token log-prob variance (OpenAI and Claude support logprobs)
- After each response: compute intra-model entropy; if above threshold, run the same prompt on a secondary provider
- `inter_model_divergence(r1, r2) -> f32` via embedding cosine distance
- If divergence > threshold: prefer the higher-confidence response or escalate to the orchestrator
- Config:

  ```toml
  [llm.coe]
  enabled = false
  intra_threshold = 0.8
  inter_threshold = 0.3
  ```
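The two signals in the sketch might be implemented roughly as follows. This is a minimal sketch under assumptions: `response_entropy` is modelled as the variance of per-token log-probs, and `inter_model_divergence` as one minus cosine similarity between response embeddings. The signatures come from the sketch; the bodies are illustrative, not zeph-llm code:

```rust
/// Intra-model signal: variance of per-token log-probabilities.
/// A confident response has log-probs clustered near zero; an
/// uncertain one has widely spread log-probs.
fn response_entropy(token_logprobs: &[f32]) -> f32 {
    let n = token_logprobs.len() as f32;
    let mean = token_logprobs.iter().sum::<f32>() / n;
    token_logprobs.iter().map(|lp| (lp - mean).powi(2)).sum::<f32>() / n
}

/// Inter-model signal: 1 - cosine similarity between the embeddings
/// of two responses to the same query.
fn inter_model_divergence(e1: &[f32], e2: &[f32]) -> f32 {
    let dot: f32 = e1.iter().zip(e2).map(|(a, b)| a * b).sum();
    let norm = |v: &[f32]| v.iter().map(|x| x * x).sum::<f32>().sqrt();
    1.0 - dot / (norm(e1) * norm(e2))
}

fn main() {
    // Spread-out log-probs score higher than tightly clustered ones.
    let confident = [-0.1_f32, -0.1, -0.2];
    let uncertain = [-0.1_f32, -2.5, -4.0];
    assert!(response_entropy(&uncertain) > response_entropy(&confident));

    // Identical embeddings diverge by ~0; orthogonal ones by ~1.
    assert!(inter_model_divergence(&[1.0, 0.0], &[1.0, 0.0]).abs() < 1e-6);
    assert!((inter_model_divergence(&[1.0, 0.0], &[0.0, 1.0]) - 1.0).abs() < 1e-6);
}
```

In practice the embeddings would come from whatever embedding provider zeph-llm already has configured; cosine distance is just the cheapest divergence proxy consistent with the sketch.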
Priority rationale
P3: quality improvement for multi-model routing. Practical only with ≥2 providers configured. The response verification layer (PR #1862) already does post-hoc checking; CoE would make it proactive and cost-aware.