Atlas does not present uncertain information as fact. Every response carries an internal confidence assessment, and Atlas's behavior changes based on how confident it is in its answer. If it can't ground the answer, it says so — then either asks for more information or goes and finds it.
"I don't know" is always a better answer than a confident-sounding hallucination.
Every response generated by Atlas (Layers 1-3) gets an internal confidence score before it's delivered:
┌──────────────────────────────────────────────────────────────┐
│ Response Confidence Pipeline │
│ │
│ Response Generated │
│ │ │
│ ▼ │
│ ┌─────────────────┐ │
│ │ Confidence │ │
│ │ Assessor │ ← checks grounding sources │
│ └────────┬────────┘ │
│ │ │
│ ┌─────┴──────┬──────────┬──────────┐ │
│ ▼ ▼ ▼ ▼ │
│ HIGH (≥0.8) MED (0.5-0.8) LOW (0.2-0.5) NONE (<0.2) │
│ Deliver Hedge Ask/Search Refuse to guess │
│ directly language for more "I don't know" │
│ "likely", info first │
│ "I think" │
└──────────────────────────────────────────────────────────────┘
| Level | Score | Filler Phrase | Then... |
|---|---|---|---|
| HIGH | ≥ 0.8 | Normal sentiment filler ("Good question — ") | Deliver answer directly |
| MEDIUM | 0.5–0.8 | Uncertainty filler ("I think I know, but let me make sure — ") | Deliver with hedge language |
| LOW | 0.2–0.5 | Research filler ("Hmm, I'm not 100% on that — let me check... ") | Stream filler → grounding loop → deliver grounded answer |
| NONE | < 0.2 | Honest filler ("I genuinely don't know this one. ") | Ask user or offer to search |
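As a sketch, the band thresholds in the table map to a simple selector. The function names and the per-band filler lookup below are illustrative, not Atlas's actual code:

```python
def confidence_band(score: float) -> str:
    """Map a confidence score to the four bands from the table."""
    if score >= 0.8:
        return "HIGH"
    if score >= 0.5:
        return "MEDIUM"
    if score >= 0.2:
        return "LOW"
    return "NONE"

# Illustrative per-band fillers; the real choices come from the filler engine.
FILLERS = {
    "HIGH": "Good question — ",
    "MEDIUM": "I think I know, but let me make sure — ",
    "LOW": "Hmm, I'm not 100% on that — let me check... ",
    "NONE": "I genuinely don't know this one. ",
}

def pick_filler(score: float) -> str:
    return FILLERS[confidence_band(score)]
```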
Critical: the user always sees words immediately. The grounding loop runs in the background while the filler streams. The user never stares at a spinner.
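One way to get this behavior is ordinary async concurrency: start the grounding task first, then stream the filler while it runs. A minimal sketch, where `ground()` and the timings are placeholders for the real grounding loop:

```python
import asyncio

async def stream_filler(text: str) -> None:
    # Stream the filler word by word so the user sees output immediately.
    for word in text.split():
        print(word, end=" ", flush=True)
        await asyncio.sleep(0.05)  # simulated token/TTS pacing

async def ground() -> str:
    # Stand-in for the real grounding loop (memory, HA API, web search).
    await asyncio.sleep(0.5)
    return "WireGuard defaults to an MTU of 1420."

async def respond() -> str:
    # Start grounding first, then stream the filler; by the time the filler
    # finishes, the grounded answer is usually ready to deliver.
    grounding = asyncio.create_task(ground())
    await stream_filler("Hmm, I'm not 100% on that — give me a sec to look it up...")
    answer = await grounding
    return f"...okay, got it. {answer}"

print(asyncio.run(respond()))
```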
These work the same way as sentiment fillers — streamed instantly while Atlas works in the background. They layer on top of sentiment fillers (sentiment sets the emotional tone, confidence sets the certainty framing):
For HIGH confidence, just use the normal sentiment-based fillers:
"Good question — "
"Sure — "
"Yeah, "
"I'm fairly sure, but let me double-check... "
"If I remember right — "
"I think — "
"Pretty sure it's — "
"Let me make sure I've got this right... "
These buy time while the grounding loop runs (up to 3 seconds):
"Hmm, I'm not 100% on that — give me a sec to look it up... "
"Good question — I don't want to guess on this. Checking now... "
"Let me verify before I say something wrong... "
"I've got a rough idea, but let me make sure... "
"Not totally sure off the top of my head — pulling it up now... "
"I want to get this right — one moment... "
If the grounding loop succeeds (confidence boosted ≥ 0.5):
"...okay, got it. [grounded answer]"
"...yeah, so here's what I found: [answer]"
"...alright, confirmed — [answer]"
If the grounding loop fails (still < 0.5):
"...I couldn't verify this, so take it with a grain of salt: [best guess with caveats]"
"...I'm still not finding a solid answer. Want me to dig deeper, or do you have more context?"
"...honestly, I'm not confident enough to give you an answer on this. Can you point me in the right direction?"
"I genuinely don't know this one. "
"That's outside what I can answer confidently. "
"I'd rather not guess on this — "
"I have no idea, honestly. "
Followed by an offer:
"Want me to search for it?"
"Do you have any more details that might help?"
"I can try to look it up if you want."
User: "What's the default MTU for WireGuard?"
0ms → Confidence pre-check: LLM topic = networking config,
specific number → base confidence 0.35 (LOW)
Prior mistakes in networking? → 0 → no penalty → 0.35
2ms → Start streaming LOW confidence filler:
"Good question — I don't want to guess on this. Checking now... "
50ms → Grounding loop starts (background):
├── Memory HOT query: "wireguard MTU" → no hits
├── Web search: "wireguard default MTU"
│ → Result: 1420 (multiple sources agree)
└── Confidence re-scored: 0.85 (web consensus)
1200ms → Grounding complete. Transition to answer:
"...okay, got it. WireGuard defaults to an MTU of 1420.
That's lower than the standard 1500 to account for the
WireGuard overhead. You might need to tune it depending
on your network."
User perceives: continuous response from 2ms, with honest
uncertainty that resolves into a confident answer.
The two filler systems compose naturally:
| Sentiment | Confidence | Combined Filler |
|---|---|---|
| Question | HIGH | "Good question — [answer]" |
| Question | LOW | "Good question — I'm not sure off the top of my head. Checking... " |
| Frustrated | HIGH | "Yeah, that's annoying. [answer]" |
| Frustrated | LOW | "I hear you — let me look into this real quick... " |
| Greeting | any | "Hey! " (confidence doesn't apply to greetings) |
| Excited | MEDIUM | "Ooh, interesting one — I think I know but let me confirm... " |
The filler engine selects the sentiment filler first, then appends confidence framing if needed:
filler = select_sentiment_filler(sentiment, user_profile)
if confidence < 0.8:
    filler += select_confidence_filler(confidence)

Confidence itself is scored from grounding signals:

| Signal | Effect on Confidence | Notes |
|---|---|---|
| Answer from Layer 1 (date, math) | 0.95 base | Computed, not generated — always grounded |
| Answer from Layer 2 (HA command) | 0.90 base | Direct API result, verifiable |
| Memory hit (high relevance score) | 0.3–0.7 base | Depends on memory confidence + recency |
| Web search result (SearXNG) | 0.3–0.6 base | Depends on source quality + consensus |
| LLM generation (no tools used) | 0.2–0.5 base | Pure generation = lowest baseline confidence |
| Multiple sources agree | +0.2 bonus | Corroboration boosts confidence |
| User previously confirmed this | +0.3 bonus | "You told me X last week" is grounded |
| Topic is fast-changing (versions, dates) | -0.2 penalty | High hallucination risk for version numbers |
| Specific numbers/dates/URLs | -0.2 penalty | LLMs fabricate these frequently |
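A rough sketch of how these signals could combine into one score. The base values take the midpoints of the table's ranges, and the simple additive aggregation is an assumption, not Atlas's actual formula:

```python
# Base confidence by answer source (midpoints of the table's ranges).
SOURCE_BASE = {
    "layer1_computed": 0.95,   # date, math: computed, not generated
    "layer2_api": 0.90,        # direct HA API result
    "memory_hit": 0.50,        # midpoint of 0.3–0.7
    "web_search": 0.45,        # midpoint of 0.3–0.6
    "llm_generation": 0.35,    # midpoint of 0.2–0.5
}

def score_confidence(source, corroborated=False, user_confirmed=False,
                     fast_changing_topic=False, has_specific_values=False):
    score = SOURCE_BASE[source]
    if corroborated:
        score += 0.2   # multiple sources agree
    if user_confirmed:
        score += 0.3   # user previously confirmed this fact
    if fast_changing_topic:
        score -= 0.2   # versions, dates: high hallucination risk
    if has_specific_values:
        score -= 0.2   # specific numbers/dates/URLs
    return max(0.0, min(1.0, score))
```

Under this sketch, a pure LLM answer containing a specific number lands well below 0.5, which forces the grounding loop.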
When confidence is LOW, Atlas doesn't just say "I don't know" — it actively tries to ground the answer before responding:
User: "What port does Zigbee2MQTT run on by default?"
Step 1: LLM generates "8080" (confidence: 0.35 — uncertain, specific number)
Step 2: Grounding loop triggers (confidence < 0.5)
├── Check memory: any prior interactions about Zigbee2MQTT? → No
├── Check HA: is zigbee2mqtt running? → query API
│ └── Found: zigbee2mqtt addon running on port 8485 (frontend)
├── Web search: "zigbee2mqtt default port" → confirms 8080 (backend), 8485 (frontend)
└── Consensus: multiple sources, confidence now 0.85
Step 3: Deliver grounded answer:
"Zigbee2MQTT's backend API runs on port 8080 by default,
and the web frontend is on 8485. Looks like yours is
running on 8485 based on your HA setup."
1. Computed / deterministic (date, math, system state) → highest trust
2. Direct observation (HA API, docker inspect, etc.) → very high
3. User's own prior statements (from memory system) → high
4. Web search consensus (multiple sources agree) → medium-high
5. Single web source (one result) → medium
6. LLM knowledge (training data) → medium-low
7. LLM inference (reasoning, no source) → low
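This ordering can be captured as an ordered enum so sources compare directly. The names here are illustrative:

```python
from enum import IntEnum

class SourceTrust(IntEnum):
    # Higher value = more trusted, following the hierarchy above.
    LLM_INFERENCE      = 1  # reasoning, no source
    LLM_KNOWLEDGE      = 2  # training data
    SINGLE_WEB_SOURCE  = 3  # one search result
    WEB_CONSENSUS      = 4  # multiple sources agree
    USER_STATEMENT     = 5  # user's own prior statements (memory)
    DIRECT_OBSERVATION = 6  # HA API, docker inspect, etc.
    COMPUTED           = 7  # date, math, system state

def most_trusted(sources):
    """Given several sources backing a claim, keep the strongest one."""
    return max(sources)
```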
┌─────────────────────────────────────────────────────────┐
│ Grounding Loop │
│ │
│ Input: LLM draft response + confidence score │
│ │
│ if confidence ≥ 0.8: │
│ → deliver as-is │
│ │
│ if confidence < 0.8: │
│ → extract claims (specific facts, numbers, dates) │
│ → for each claim: │
│ 1. Check memory (HOT path) for corroboration │
│ 2. Check live sources (HA API, system state) │
│ 3. Web search if still uncertain │
│ 4. Re-score confidence per claim │
│ → if overall confidence now ≥ 0.8: │
│ → deliver with grounded info │
│ → if still < 0.5: │
│ → ask user for clarification │
│ → OR say "I'm not confident in this answer" │
│ │
│ Max grounding attempts: 2 (avoid infinite loops) │
│ Grounding budget: 3 seconds max │
└─────────────────────────────────────────────────────────┘
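The control flow above, sketched in Python. `extract_claims` and `verify_claim` stand in for the real claim extractor and the memory/HA/web checks:

```python
def grounding_loop(draft, confidence, extract_claims, verify_claim,
                   max_attempts=2):
    """Sketch of the grounding loop. Returns (response, confidence, action)."""
    if confidence >= 0.8:
        return draft, confidence, "deliver"
    for _ in range(max_attempts):  # max 2 attempts: avoid infinite loops
        scores = []
        for claim in extract_claims(draft):
            # Check memory, live sources, then web; get a corrected claim
            # plus a per-claim confidence score.
            corrected, claim_conf = verify_claim(claim)
            draft = draft.replace(claim, corrected)
            scores.append(claim_conf)
        if scores:
            confidence = sum(scores) / len(scores)
        if confidence >= 0.8:
            return draft, confidence, "deliver"
    if confidence < 0.5:
        return draft, confidence, "ask_user"
    return draft, confidence, "deliver_hedged"
```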
The grounding loop doesn't check the entire response — it identifies specific verifiable claims:
Response: "Docker Compose v2.24 added the 'include' directive,
which lets you merge multiple compose files."
Extracted claims:
1. "Docker Compose v2.24 added 'include'" → specific version = HIGH hallucination risk
2. "'include' merges compose files" → functional claim = MEDIUM risk
Grounding actions:
Claim 1: web search "docker compose include directive version"
→ actually v2.20 → claim was wrong; correct the version
Claim 2: consistent with known Docker knowledge → pass
Corrected response: "Docker Compose v2.20 added the 'include' directive..."
These patterns trigger automatic grounding checks:
| Pattern | Risk | Example |
|---|---|---|
| Version numbers | Very high | "Python 3.12 added..." |
| Specific dates | High | "Released on March 15, 2025" |
| URLs | Very high | "See docs at https://..." |
| Exact config values | High | "Set max_connections to 1024" |
| API endpoints/params | High | "POST to /api/v2/users" |
| Numerical limits | Medium | "Supports up to 10,000 connections" |
| Attribution | High | "According to the Linux Foundation..." |
| "Always" / "Never" absolutes | Medium | "Docker always restarts containers..." |
When Atlas gets something wrong, it doesn't just correct itself — it logs the mistake so the system learns.
Mistakes are detected via:
- User correction: "That's wrong, it's actually X"
- Grounding loop contradiction: web search contradicts initial answer
- Self-correction: Atlas realizes mid-response it was wrong
- Follow-up failure: user reports the advice didn't work
CREATE TABLE mistake_log (
id INTEGER PRIMARY KEY AUTOINCREMENT,
interaction_id INTEGER, -- which interaction contained the mistake
user_id TEXT,
claim_text TEXT NOT NULL, -- what Atlas said that was wrong
correction_text TEXT, -- what the right answer is
detection_method TEXT NOT NULL, -- 'user_correction' | 'grounding' | 'self_correction' | 'follow_up'
mistake_category TEXT, -- 'version_number' | 'factual' | 'config' | 'attribution' | 'logic' | 'outdated'
confidence_at_time REAL, -- how confident Atlas was when it said it
topic_tags TEXT DEFAULT '[]', -- JSON: ["docker", "python", "networking"]
root_cause TEXT, -- 'training_data_outdated' | 'hallucination' | 'misunderstood_question' | 'missing_context'
resolved BOOLEAN DEFAULT FALSE, -- has this been addressed in memory/patterns?
created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP,
FOREIGN KEY (interaction_id) REFERENCES interactions(id)
);
CREATE INDEX idx_mistakes_category ON mistake_log(mistake_category);
CREATE INDEX idx_mistakes_topic ON mistake_log(topic_tags);
CREATE INDEX idx_mistakes_unresolved ON mistake_log(resolved) WHERE resolved = FALSE;

The nightly evolution job processes unresolved mistakes:
Nightly Mistake Review:
1. Query: SELECT * FROM mistake_log WHERE resolved = FALSE
2. For each mistake:
a. Store correction in memory (COLD path):
"Atlas incorrectly stated X. The correct answer is Y."
type: 'correction', confidence: 0.95, source: 'mistake_review'
b. If pattern detected (e.g., Atlas keeps getting Docker versions wrong):
→ Add to high-risk topics list
→ Lower default confidence for that topic by 0.1
→ Store: "Atlas tends to hallucinate Docker version numbers.
Always verify via web search."
c. Mark as resolved: UPDATE mistake_log SET resolved = TRUE
3. Generate mistake report:
"Today: 3 mistakes corrected. Recurring issue: version numbers
in Docker ecosystem. Added auto-grounding trigger for Docker
version claims."
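Steps 1–3 could look roughly like this against the `mistake_log` schema above. `store_memory` stands in for the COLD-path memory writer, and the pattern threshold is an assumed knob:

```python
import json
import sqlite3

def nightly_mistake_review(db: sqlite3.Connection, store_memory,
                           pattern_threshold: int = 3):
    """Process unresolved mistakes: store corrections, detect recurring
    topics, and mark rows resolved."""
    rows = db.execute(
        "SELECT id, claim_text, correction_text, topic_tags "
        "FROM mistake_log WHERE resolved = 0").fetchall()
    tag_counts: dict = {}
    for mistake_id, claim, correction, tags_json in rows:
        # (a) store the correction as a high-confidence memory
        store_memory(f"Atlas incorrectly stated {claim!r}. "
                     f"The correct answer is {correction!r}.",
                     type="correction", confidence=0.95,
                     source="mistake_review")
        for tag in json.loads(tags_json or "[]"):
            tag_counts[tag] = tag_counts.get(tag, 0) + 1
        # (c) mark as resolved
        db.execute("UPDATE mistake_log SET resolved = 1 WHERE id = ?",
                   (mistake_id,))
    db.commit()
    # (b) recurring topics become high-risk: lower their default confidence
    high_risk = [t for t, n in tag_counts.items() if n >= pattern_threshold]
    return len(rows), high_risk
```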
If Atlas has been wrong about a topic before, it knows to be more careful:
def adjust_confidence(base_confidence, topic_tags, user_id):
    # Count recent mistakes tagged with any of this topic's tags
    prior_mistakes = 0
    for tag in topic_tags:
        prior_mistakes += query("""
            SELECT COUNT(*) FROM mistake_log
            WHERE topic_tags LIKE ?
              AND created_at > datetime('now', '-30 days')
        """, f'%"{tag}"%')
    # Each recent mistake in this topic lowers confidence
    penalty = min(prior_mistakes * 0.1, 0.3)  # max 0.3 penalty
    return max(base_confidence - penalty, 0.1)

Example: Atlas hallucinated a Docker Compose version once. The next time someone asks about Docker Compose versions, the base confidence drops from 0.5 to 0.4, triggering the grounding loop automatically.
Atlas communicates its confidence naturally:
"Piper TTS runs on port 10200 in your setup."
"I believe Zigbee2MQTT defaults to port 8080 for the API,
but let me verify that... [checks] yeah, 8080 for the API
and 8485 for the web frontend."
"I'm not sure about the exact firmware version your Zigbee
coordinator needs. Want me to look that up, or do you have
the model number?"
"I genuinely don't know the answer to that, and I don't want
to make something up. Let me search for it."
"Actually, wait — I said Python 3.12 but I think it was 3.11.
Let me check... [searches] it was 3.11. Sorry about that."
- Corrections stored as high-confidence memories
- Prior mistakes retrieved during confidence assessment
- "You asked about this before and I got it wrong — here's the corrected info"
- Honesty system (personality.md) defines the tone of uncertainty
- Grounding system defines the mechanics of when to be uncertain
- Together: Atlas is direct ("I don't know") not mealy-mouthed ("I'm not entirely sure but perhaps maybe...")
- Mistake review is a nightly job task
- Pattern detection across mistakes (recurring topics)
- Confidence calibration adjusts per-topic baselines over time
[GROUNDING RULES]
You MUST follow these rules about uncertainty:
- If you are not confident in a specific fact (version, date, URL, number), say so.
- Never present uncertain information as definitive fact.
- If asked something you don't know, say "I don't know" and offer to look it up.
- When you catch yourself being wrong, correct immediately. Don't double down.
- Prefer "I'll check" over guessing. Prefer "I don't know" over fabricating.
- If multiple answers are possible, present them as options, not as THE answer.
[/GROUNDING RULES]
[MISTAKE HISTORY]
Topics where you've been wrong before (verify before answering):
{{MISTAKE_TOPICS}}
[/MISTAKE HISTORY]
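Filling the `{{MISTAKE_TOPICS}}` slot might be as simple as the sketch below; the real template renderer is not shown in this document:

```python
def render_mistake_history(topics: list[str]) -> str:
    """Fill the {{MISTAKE_TOPICS}} slot from recent mistake_log tags."""
    body = ", ".join(topics) if topics else "(none)"
    return ("[MISTAKE HISTORY]\n"
            "Topics where you've been wrong before "
            "(verify before answering):\n"
            f"{body}\n"
            "[/MISTAKE HISTORY]")
```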
Tracked in memory_metrics for observability:
-- Grounding loop triggers per day
SELECT DATE(ts), COUNT(*) FROM memory_metrics
WHERE operation = 'grounding_loop' GROUP BY DATE(ts);
-- Mistake rate trend
SELECT DATE(created_at), COUNT(*) FROM mistake_log GROUP BY DATE(created_at);
-- Confidence calibration: are we under/over-confident?
SELECT
CASE WHEN confidence_at_time >= 0.8 THEN 'high'
WHEN confidence_at_time >= 0.5 THEN 'medium'
ELSE 'low' END as confidence_band,
COUNT(*) as total_mistakes
FROM mistake_log
GROUP BY confidence_band;
-- If most mistakes are in the "high" band, we're overconfident
-- If most are in "low", the grounding loop is working