Commit ee32616

chore: docs update
1 parent 44ba7f6 commit ee32616

4 files changed

+206
-6
lines changed

CHANGELOG.md

Lines changed: 2 additions & 0 deletions
Original file line number | Diff line number | Diff line change
@@ -96,6 +96,8 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
9696
- `quint_test` now accepts L2 hypotheses for evidence refresh (L2 + PASS stays L2 with fresh evidence).
9797
- Freshness report now shows individual evidence IDs (not just counts) for actionable output.
9898
- Implements WLNK principle: one expired evidence item = entire holon is STALE.
99+
- Natural language support: users can say "waive the benchmark until February" and the agent handles ID resolution.
100+
- New documentation: `docs/evidence-freshness.md` — practical guide to managing stale evidence.
99101
- Updated command documentation: `q-decay.md` and `q3-validate.md`.
100102

101103
- **CI/CD Pipeline**:

docs/architecture.md

Lines changed: 11 additions & 3 deletions
@@ -80,10 +80,18 @@ External evidence (documentation, benchmarks, research) is only valuable if it i
8080

8181
Quint's assurance calculator applies a **Congruence Penalty** based on the CL, reducing the effective reliability of evidence that isn't a perfect match for your context.
8282

83-
### Validity (Evidence Decay)
83+
### Validity (Evidence Freshness)
8484

8585
**FPF Pattern:** B.3.4 Evidence Decay & Epistemic Debt
8686

87-
Evidence is perishable. A performance benchmark from two years ago is less trustworthy than one from last week because the context (libraries, hardware, compilers) has likely changed.
87+
Evidence expires. That benchmark you ran six months ago? The library has been updated twice since then. Your numbers might not be accurate anymore.
8888

89-
Every piece of evidence in Quint has a `valid_until` date. The `/q-decay` command scans for expired evidence, and the assurance calculator automatically penalizes the reliability of claims that depend on it. This system makes the "staleness" of knowledge visible and manageable, preventing you from making critical decisions based on outdated information.
89+
Every piece of evidence has a `valid_until` date. When evidence expires, the decision it supports becomes **questionable** — not necessarily wrong, just unverified. The `/q-decay` command shows you what's stale and lets you:
90+
91+
- **Refresh** — Re-run tests to get fresh proof
92+
- **Deprecate** — Downgrade the hypothesis if the decision needs rethinking
93+
- **Waive** — Accept the risk temporarily with documented rationale
94+
95+
This makes hidden risk visible. You know exactly which decisions are operating on outdated assumptions.
96+
97+
See [Evidence Freshness](evidence-freshness.md) for the full guide.

docs/evidence-freshness.md

Lines changed: 181 additions & 0 deletions
@@ -0,0 +1,181 @@
1+
# Evidence Freshness
2+
3+
Evidence has an expiration date. This guide explains why that matters and what to do about it.
4+
5+
## Why Evidence Expires
6+
7+
Imagine you benchmarked Redis vs Memcached six months ago. Redis won. You made the decision, recorded the DRR, moved on.
8+
9+
Now it's six months later. The Memcached team shipped a major performance update. Your Node.js version changed. The benchmark numbers you relied on? They might not be accurate anymore.
10+
11+
**The decision isn't necessarily wrong — it's just unverified.**
12+
13+
This is what FPF calls **Evidence Decay**. Every piece of evidence has a `valid_until` date. When that date passes, the evidence is "stale" and the decisions built on it become questionable.
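
The expiry rule can be sketched in a few lines. This is a minimal illustration, not Quint's implementation: the `Evidence` class, the `days_overdue` helper, and the example ID are hypothetical, and only the `valid_until` field comes from the text. It assumes evidence stays valid through its `valid_until` date and becomes stale the day after.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Evidence:
    # Hypothetical record; only `valid_until` is named in the guide.
    evidence_id: str
    valid_until: date


def days_overdue(ev: Evidence, today: date) -> int:
    """Days past valid_until, or 0 if the evidence is still fresh."""
    return max((today - ev.valid_until).days, 0)


def is_stale(ev: Evidence, today: date) -> bool:
    # Assumption: evidence is valid through its valid_until date.
    return days_overdue(ev, today) > 0


example = Evidence("ev-example", valid_until=date(2024, 12, 1))
print(is_stale(example, date(2024, 12, 1)))       # False
print(days_overdue(example, date(2024, 12, 11)))  # 10
```

Passing `today` explicitly, rather than reading the clock inside the function, keeps the check deterministic and easy to test.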
14+
15+
## The Problem with Stale Evidence
16+
17+
Stale evidence creates hidden risk. You're operating on assumptions that haven't been re-checked. Maybe they're still true. Maybe they're not. You don't know.
18+
19+
Quint Code makes this visible instead of hiding it.
20+
21+
## Checking Your Evidence
22+
23+
Run `/q-decay` to see what's stale:
24+
25+
```
26+
/q-decay
27+
```
28+
29+
You'll get a freshness report showing which holons have expired evidence:
30+
31+
```
32+
## Evidence Freshness Report
33+
34+
### STALE (1 holon requires action)
35+
36+
#### Use Redis for Caching (L2)
37+
| Evidence ID | Type | Status | Details |
38+
|-------------|------|--------|---------|
39+
| ev-benchmark-2024-06-15 | validation | EXPIRED | 180 days overdue |
40+
| ev-load-test-2024-06-20 | validation | EXPIRED | 175 days overdue |
41+
42+
### FRESH (3 holons)
43+
- Use TypeScript for API (L2)
44+
- PostgreSQL for primary storage (L2)
45+
- JWT for auth tokens (L2)
46+
```
47+
48+
## Three Ways to Handle Stale Evidence
49+
50+
When you see stale evidence, you have three options:
51+
52+
### 1. Refresh — Re-run the test
53+
54+
**When:** The decision is still correct; you just need fresh proof.
55+
56+
**How:** Run `/q3-validate` on the hypothesis to gather new evidence.
57+
58+
```
59+
User: The Redis benchmark is old but Redis is still the right choice.
60+
Refresh the evidence.
61+
62+
Agent: Running validation for hypothesis-redis-caching...
63+
[executes benchmark]
64+
Evidence refreshed. Valid until 2025-06-21.
65+
```
66+
67+
### 2. Deprecate — Reconsider the decision
68+
69+
**When:** The world has changed. The decision itself is questionable.
70+
71+
**How:** Tell the agent to deprecate the hypothesis. It gets downgraded one level (L2→L1 or L1→L0).
72+
73+
```
74+
User: AWS announced they're deprecating this service.
75+
Deprecate that decision.
76+
77+
Agent: Deprecated: hypothesis-aws-service L2 → L1
78+
This decision now needs re-evaluation.
79+
Run /q1-hypothesize to explore alternatives.
80+
```
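
The one-level downgrade can be sketched as a lookup over the ordered levels. This is an illustration under assumptions, not the tool's code: the `deprecate` function name is hypothetical, and the guide does not say what happens when an L0 hypothesis is deprecated, so this sketch raises an error in that case.

```python
# Assurance levels named in the guide, lowest to highest.
LEVELS = ["L0", "L1", "L2"]


def deprecate(level: str) -> str:
    """Return the level one step below `level` (L2 -> L1, L1 -> L0)."""
    index = LEVELS.index(level)  # raises ValueError for unknown levels
    if index == 0:
        # Assumption: the guide does not define a downgrade from L0.
        raise ValueError("L0 cannot be downgraded further")
    return LEVELS[index - 1]


print(deprecate("L2"))  # L1
print(deprecate("L1"))  # L0
```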
81+
82+
### 3. Waive — Accept the risk temporarily
83+
84+
**When:** You know it's stale, but you're shipping Tuesday and can't re-test everything right now.
85+
86+
**How:** Tell the agent to waive the evidence with a deadline and reason.
87+
88+
```
89+
User: We're launching next week. Waive the benchmark until January 15th,
90+
we have a full test cycle scheduled for the 20th.
91+
92+
Agent: Waiver recorded:
93+
- Evidence: ev-benchmark-2024-06-15
94+
- Waived until: 2025-01-15
95+
- Rationale: Launch deadline. Full test cycle Jan 20.
96+
97+
Warning: Returns to EXPIRED after 2025-01-15.
98+
```
99+
100+
**A waiver is not ignoring the problem.** It's explicitly documenting that you know about the risk and accept it until a specific date. The waiver goes in the audit log — who waived what, why, and until when.
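
One way to picture how a waiver interacts with expiry is a small status function. This is a sketch under assumptions: the function name and the `WAIVED` label are hypothetical (the report above only shows `EXPIRED` for evidence), and it assumes a waiver covers its until-date and lapses the day after, matching "Returns to EXPIRED after 2025-01-15".

```python
from datetime import date
from typing import Optional


def evidence_status(
    valid_until: date,
    today: date,
    waived_until: Optional[date] = None,
) -> str:
    """Return FRESH, WAIVED, or EXPIRED for one evidence item."""
    if today <= valid_until:
        return "FRESH"
    # Assumption: a waiver covers its until-date and lapses the day after.
    if waived_until is not None and today <= waived_until:
        return "WAIVED"
    return "EXPIRED"


expired_on = date(2024, 12, 1)  # hypothetical valid_until
waiver_end = date(2025, 1, 15)  # "Waived until" from the example above
print(evidence_status(expired_on, date(2025, 1, 10), waiver_end))  # WAIVED
print(evidence_status(expired_on, date(2025, 1, 16), waiver_end))  # EXPIRED
```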
101+
102+
## Natural Language Usage
103+
104+
You don't need to memorize evidence IDs or parameters. Just describe what you want.
105+
106+
The agent sees the freshness report and understands context. When you say "waive the benchmark until February," it finds the right evidence ID and calls the tool for you.
107+
108+
**These all work:**
109+
110+
```
111+
"Waive everything until January 15th, we're launching"
112+
113+
"The load test is only 2 weeks overdue, refresh it"
114+
115+
"That API is being deprecated, deprecate our decision to use it"
116+
117+
"Waive the security audit until the 15th with rationale: re-audit scheduled"
118+
```
119+
120+
If you want to be explicit, you can:
121+
122+
```
123+
/q-decay --waive ev-benchmark-2024-06-15 --until 2025-02-01 --rationale "Migration pending"
124+
```
125+
126+
But natural language works fine.
127+
128+
## The WLNK Principle
129+
130+
A holon is **STALE** if *any* of its evidence is expired (and not waived).
131+
132+
This is the Weakest Link (WLNK) principle. If you have three pieces of evidence and one is stale, the whole decision is questionable. You don't get to average it out.
133+
134+
Think of it like a chain. Three strong links and one rusted link? The chain breaks at the rust.
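
The weakest-link rule reduces to a single `any` check. The sketch below is illustrative only: `EvidenceState` and `holon_status` are hypothetical names, and treating a holon with no evidence as FRESH is an assumption the guide does not address.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class EvidenceState:
    # Hypothetical flags; the guide only describes the rule itself.
    expired: bool
    waived: bool = False


def holon_status(evidence: Iterable[EvidenceState]) -> str:
    """STALE if any item is expired and not waived, otherwise FRESH."""
    if any(e.expired and not e.waived for e in evidence):
        return "STALE"
    return "FRESH"


chain = [EvidenceState(False), EvidenceState(False), EvidenceState(True)]
print(holon_status(chain))  # STALE

chain[2] = EvidenceState(True, waived=True)
print(holon_status(chain))  # FRESH
```

There is no averaging step: two fresh items do not offset one expired item, which is the point of the chain analogy.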
135+
136+
## Practical Workflows
137+
138+
### Weekly Maintenance
139+
140+
```
141+
/q-decay # What's stale?
142+
# For each item: refresh, deprecate, or waive
143+
```
144+
145+
### Before a Release
146+
147+
```
148+
/q-decay # Check for stale decisions
149+
# Either refresh evidence or explicitly waive with rationale
150+
# Waivers become part of release documentation
151+
```
152+
153+
### After Major Changes
154+
155+
Dependency update? API change? Security advisory?
156+
157+
```
158+
/q-decay # What's affected?
159+
# Deprecate obsolete decisions
160+
# Start new hypothesis cycle for replacements
161+
```
162+
163+
## Audit Trail
164+
165+
Deprecations and waivers are logged:
166+
167+
| Action | What's Recorded |
168+
|--------|----------------|
169+
| Deprecate | from_layer, to_layer, who, when |
170+
| Waive | evidence_id, until_date, rationale, who, when |
171+
172+
You can always answer: "Who waived what and why?"
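
As a rough picture of what such a log entry could look like, the sketch below models the "Waive" row of the table as a record and filters a log by evidence ID. The field names come from the table; the `WaiveRecord` class, the `waivers_for` helper, the storage as a plain list, and the user name `alice` are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class WaiveRecord:
    # Fields from the "Waive" row of the table above.
    evidence_id: str
    until_date: date
    rationale: str
    who: str
    when: datetime


def waivers_for(log: list[WaiveRecord], evidence_id: str) -> list[str]:
    """Answer "who waived what and why?" for one evidence ID."""
    return [
        f"{r.who} waived {r.evidence_id} until {r.until_date}: {r.rationale}"
        for r in log
        if r.evidence_id == evidence_id
    ]


log = [
    WaiveRecord(
        evidence_id="ev-benchmark-2024-06-15",
        until_date=date(2025, 1, 15),
        rationale="Launch deadline. Full test cycle Jan 20.",
        who="alice",  # hypothetical user
        when=datetime(2025, 1, 8, 9, 30),
    ),
]
print(waivers_for(log, "ev-benchmark-2024-06-15"))
```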
173+
174+
## Summary
175+
176+
- Evidence expires. This is normal.
177+
- `/q-decay` shows you what's stale.
178+
- **Refresh** if the decision is still right and you just need new proof.
179+
- **Deprecate** if the decision needs rethinking.
180+
- **Waive** if you accept the risk temporarily (with documented rationale).
181+
- Talk naturally — the agent handles the details.

docs/fpf-engine.md

Lines changed: 12 additions & 3 deletions
@@ -82,10 +82,19 @@ Compute trust scores using:
8282
| `/q-actualize` | Maintenance | Reconcile the knowledge base with recent code changes. |
8383
| `/q-reset` | Utility | Discard the current reasoning cycle. |
8484

85-
### New Maintenance Commands
85+
### Maintenance Commands
8686

87-
#### /q-decay (Evidence Decay)
88-
Over time, the evidence supporting your decisions can become stale. A benchmark from two years ago may not reflect the performance of a library today. This command implements the FPF principle of **Evidence Decay (B.3.4)**. It scans your evidence for expired `valid_until` dates and reports on the project's "Epistemic Debt"—the amount of risk you are carrying from outdated knowledge.
87+
#### /q-decay (Evidence Freshness)
88+
89+
Evidence expires. A benchmark from six months ago might not reflect current performance. `/q-decay` shows you what's stale and gives you three options:
90+
91+
- **Refresh** — Re-run tests to get fresh evidence
92+
- **Deprecate** — Downgrade the hypothesis if the decision needs rethinking
93+
- **Waive** — Accept the risk temporarily with documented rationale
94+
95+
You can speak naturally: "waive the benchmark until February, we'll re-test after launch."
96+
97+
See [Evidence Freshness](evidence-freshness.md) for the full guide.
8998

9099
#### /q-actualize (Knowledge Reconciliation)
91100
This command serves as the **Observe** phase of the FPF's **Canonical Evolution Loop (B.4)**. It reconciles your documented knowledge with the current state of the codebase by:
