
Commit 96fdce0

chore: tidy up readme and docs
1 parent 0fcad13 commit 96fdce0

4 files changed (+269, -2 lines)


README.md

Lines changed: 16 additions & 2 deletions
@@ -149,9 +149,22 @@ This creates:
/q1-hypothesize "Your problem..." # Generate hypotheses
```

-> **Pro tip:** For best results, see [Advanced Setup](docs/advanced.md#agent-configuration) to optimize your AI's understanding of the reasoning process.
+Here is a library of [workflow examples](docs/workflow_example/) that can help you get started.

-## How It Works
+That said, the best way to learn is to dive straight in and get a feel for the flow. Slash commands have a numeric prefix for your convenience.
+
+### Recommended: Add FPF Context to Your Agent Rules
+
+For best results, we highly recommend using the [`CLAUDE.md`](CLAUDE.md) from this repository as a reference for your own project's agent instructions. It's optimized for software engineering work with FPF.
+
+At minimum, copy the **FPF Glossary** section to your:
+- `CLAUDE.md` (Claude Code)
+- `.cursorrules` or `AGENTS.md` (Cursor)
+- Agent system prompts (other tools)
+
+This helps the AI understand FPF concepts like L0/L1/L2 layers, WLNK, R_eff, and the Transformer Mandate without re-explaining them each session.
+
+## How Quint Code Works

Quint Code implements the **[First Principles Framework (FPF)](https://github.com/ailev/FPF)** by Anatoly Levenchuk — a methodology for rigorous, auditable reasoning. The killer feature is turning the black box of AI reasoning into a transparent, evidence-backed audit trail.

@@ -184,6 +197,7 @@ See [docs/fpf-engine.md](docs/fpf-engine.md) for the full breakdown.

## Documentation

+- [Workflow Examples](docs/workflow_example/) — Step-by-step walkthroughs
- [Quick Reference](docs/fpf-engine.md) — Commands and workflow
- [Advanced: FPF Deep Dive](docs/advanced.md) — Theory, glossary, tuning
- [Architecture](docs/architecture.md) — How it works under the hood
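
To make the new "Recommended: Add FPF Context to Your Agent Rules" step above concrete, here is a minimal sketch of the copy step. The clone path and target file names are assumptions; adjust them to wherever this repository lives and to whichever rules file your agent reads.

```bash
# Minimal sketch: seed your agent rules with this repo's CLAUDE.md.
# "path/to/quint-code" is a placeholder for wherever you cloned this repository.
cat path/to/quint-code/CLAUDE.md >> CLAUDE.md      # Claude Code
cat path/to/quint-code/CLAUDE.md >> AGENTS.md      # Cursor / other agents
```

Appending the whole file is the simplest option; copying only the **FPF Glossary** section by hand keeps your rules file smaller.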

docs/workflow_example/README.md

Lines changed: 9 additions & 0 deletions
@@ -0,0 +1,9 @@
# Workflow Examples

Step-by-step walkthroughs showing Quint in real scenarios.

| Example | Scenario |
|---------|----------|
| [Payment Webhooks](payment-webhooks.md) | Handling unreliable external events |
| [CI/CD Strategy](cicd-strategy.md) | Choosing deployment infrastructure |
| [LLM Pipeline Debugging](llm-pipeline-debugging.md) | Improving ML/AI accuracy with empirical testing |

docs/workflow_example/cicd-strategy.md

Lines changed: 136 additions & 0 deletions
@@ -0,0 +1,136 @@
# Example: Choosing a CI/CD Strategy

Your legacy deployments work — SSH into server, git pull, build in place.
But every deployment is a prayer. No rollbacks, no consistency, no audit trail.

You're building a new service and want to do it right this time.

## The Problem

Current state:
- Git clone via SSH directly to EC2
- Build happens on the server
- No rollback mechanism
- "Works on my machine" is the deployment strategy

Requirements:
- Build/release idempotency
- Cost-effective (no Kubernetes)
- Must scale to other services later
- Private repos, AWS infrastructure

## With Quint

```bash
$ /q1-hypothesize "CICD strategy for new service - no k8s, cost-effective, idempotent"
```

AI generates competing approaches:

| # | Approach | Complexity | Cost | Rollback |
|---|----------|------------|------|----------|
| H1 | GitHub Actions + SSH deploy | Low | Free | Manual |
| H2 | Docker Swarm + ECR | Medium | ~$5/mo registry | Built-in |
| H3 | ECS Fargate | Medium-High | ~$30+/mo | Built-in |
| H4 | Kamal (DHH's tool) | Medium | Free | Built-in |

```bash
$ /q2-verify
```

AI checks constraints:
- **H1 fails:** "No idempotency — same problem as current setup"
- **H3 partial:** "Overkill for B2B fintech with <100 RPS"
- **H4 partial:** "Requires Ruby runtime on deploy machine"
- **H2 passes:** "Native Docker, no external dependencies, handles multi-service"

```bash
$ /q1-add "concerned about blast radius — all services on one Swarm node?"
```

AI generates sub-hypotheses:

| # | Approach | Blast Radius | Cost |
|---|----------|--------------|------|
| H2a | Single Swarm node, all services | High | Low (~$50/mo) |
| H2b | Swarm node per service | Low | Higher (~$150/mo) |
| H2c | Hybrid: Swarm for new, legacy stays | Medium | Medium |

```bash
$ /q3-validate
```

AI researches evidence:

```
H2 — Docker Swarm + ECR

Evidence collected:
1. [Docker docs] Swarm handles 1000s of containers per node — CL2
2. [AWS pricing] ECR ~$0.10/GB, t3.medium ~$30/mo — CL3 (our context)
3. [HN thread] "Swarm dead?" — maintainers confirm active, just feature-complete — CL1

R_eff = 0.68 (limited by HN thread uncertainty)
```

```bash
$ /q4-audit
```

```
Bias check: Initial preference was Kamal (H4) — rejected on Ruby dependency

Trust scores:
- H2b (Swarm per service): R_eff = 0.68

Cost analysis:
- Staging: Single Swarm node acceptable (blast radius OK for non-prod)
- Production: Swarm per service for isolation

Recommendation: H2b for prod, H2a for staging. Revisit if RPS > 1000.
```

```bash
$ /q5-decide
```

```
Decision: Docker Swarm + ECR

Architecture:
- Staging: Single t3.medium, all services in Swarm
- Production: t3.medium per service, Swarm mode
- Registry: AWS ECR (private, ~$5/mo)
- CI: GitHub Actions (free tier sufficient)

Rationale:
- Idempotent deploys via image tags
- Rolling updates with automatic rollback
- No Ruby/external dependencies
- Migration path for legacy services

Review trigger: If any service needs >1000 RPS or HA across AZs
```
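
To ground "idempotent deploys via image tags" and "rolling updates with automatic rollback", here is a rough sketch of what a single deploy step could look like under H2b. The account ID, region, image and service names, and the `GIT_SHA` variable are placeholders, and running this as a script rather than a GitHub Actions job is an assumption.

```bash
# Hypothetical deploy step for one service (all names and IDs are placeholders).
set -euo pipefail

REGISTRY="123456789012.dkr.ecr.us-east-1.amazonaws.com"   # placeholder account/region
IMAGE="${REGISTRY}/my-service:${GIT_SHA}"                 # GIT_SHA assumed to be set by CI

# Authenticate Docker against the private ECR registry.
aws ecr get-login-password --region us-east-1 \
  | docker login --username AWS --password-stdin "$REGISTRY"

# Build and push an image tagged by commit: re-running the deploy for the
# same commit produces the same release, which is the idempotency requirement.
docker build -t "$IMAGE" .
docker push "$IMAGE"

# Rolling update on the Swarm manager; Swarm rolls back automatically on failure.
docker service update \
  --image "$IMAGE" \
  --update-failure-action rollback \
  --with-registry-auth \
  my-service
```

Because the tag is the commit SHA rather than `latest`, any node can pull the exact release that was decided on, and a failed rolling update reverts to the previous tag on its own.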
When a new developer asks: *"Why aren't we using Kubernetes? Everyone uses it."*

```bash
$ /q-query "CICD architecture decision"
```

```
Decision: 2024-02-15 — Docker Swarm selected over ECS/K8s

Key evidence:
- B2B fintech, <100 RPS typical load
- K8s operational overhead not justified
- Swarm provides 90% of benefits at 10% complexity
- Cost: ~$80/mo vs ~$300/mo for ECS

Recommendation: Revisit only if:
- Multi-AZ HA required
- RPS exceeds 1000 sustained
- Team grows beyond 5 engineers
```

**The decision survives team changes. No tribal knowledge required.**

docs/workflow_example/payment-webhooks.md

Lines changed: 108 additions & 0 deletions
@@ -0,0 +1,108 @@
# Example: Handling Payment Confirmations

Your checkout works. Stripe charges the card.
But three weeks later, finance finds $12,000 in "ghost payments" —
customers charged but never got access.

The webhook endpoint returned 200. Logs look clean.
What went wrong?

## Without Quint

Your AI suggests: *"Just add a webhook endpoint that activates the subscription"*

You ship it. It works in testing. Production looks fine.

Until it doesn't. Webhooks fail silently. Your endpoint timed out during a DB hiccup. Stripe retried, and you processed the event twice. A network blip ate three webhooks completely.

Now you're debugging production with no record of why you built it this way.

## With Quint

```bash
$ /q1-hypothesize "handle stripe payment confirmation"
```

AI generates competing approaches:

| # | Approach | Risk | Recovery |
|---|----------|------|----------|
| H1 | Webhook-only | Silent failures, no detection | None without manual audit |
| H2 | Webhook + sync processing | Timeout = lost event, retries = duplicates | Stripe retry (3 days) |
| H3 | Webhook → Queue + Polling backup | Complex, two code paths | Self-healing |

```bash
$ /q2-verify
```

AI checks each hypothesis for logical consistency:
- **H1 fails:** "No mechanism detects missed webhooks"
- **H2 partial:** "Idempotency key needed, still misses network failures"
- **H3 passes:** "Polling catches what webhooks miss, queue handles spikes"

```bash
$ /q3-validate
```

AI gathers evidence for surviving hypotheses:

```
H3 — Webhook → Queue + Polling backup

Evidence collected:
1. [Stripe docs] "Webhooks are not guaranteed" — CL2 (external docs)
2. [Internal test] Queue handles 10k events/sec — CL3 (same context)
3. [Case study] Company X lost $50k before adding polling — CL1 (different context)

R_eff = 0.72 (limited by CL1 case study)
```

```bash
$ /q4-audit
```

```
Bias check: No anchoring detected (H1 was initial suggestion, rejected)

Trust scores:
- H3: R_eff = 0.72 (WLNK: external case study)

Recommendation: H3 is viable. Consider refreshing the case study
with internal metrics after 30 days in production.
```

```bash
$ /q5-decide
```

```
Decision: H3 — Async queue + 15-min polling reconciliation

Rationale:
- Webhook acknowledges immediately (200 in <100ms)
- Background job processes with idempotency check
- Polling job catches silent failures
- Accepted tradeoff: 15-min max delay for edge cases

Evidence: Stripe docs recommend polling backup.
Review trigger: If webhook success rate drops below 99%
```
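
As an illustration of the 15-minute reconciliation job (not part of the recorded decision above), here is a rough bash sketch. It assumes a PaymentIntent-based Stripe integration and two hypothetical internal helpers, `check_access` and `enqueue_activation`; the endpoint and auth style follow Stripe's public REST API.

```bash
# Hypothetical reconciliation poll, run every 15 minutes from cron.
# Requires curl and jq; STRIPE_SECRET_KEY must be set in the environment.
set -euo pipefail

SINCE=$(date -d '-20 minutes' +%s)   # small overlap with the previous run (GNU date)

curl -sG https://api.stripe.com/v1/payment_intents \
  -u "${STRIPE_SECRET_KEY}:" \
  -d "created[gte]=${SINCE}" \
  -d limit=100 \
| jq -r '.data[] | select(.status == "succeeded") | .id' \
| while read -r payment_intent_id; do
    # check_access / enqueue_activation are hypothetical internal scripts:
    # anything the webhook path missed gets queued for activation again.
    ./check_access "$payment_intent_id" || ./enqueue_activation "$payment_intent_id"
  done
```

The deliberate overlap between runs means some succeeded payments are checked twice; the idempotency check on the activation side handles the duplicates.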
## 3 weeks later

Finance asks: *"Why do we poll every 15 minutes? Can we remove it?"*

```bash
$ /q-query "payment confirmation architecture"
```

```
Decision: 2024-01-15 — H3 selected over webhook-only

Key evidence:
- Stripe admits webhook delivery "not guaranteed"
- Polling catches ~0.3% of transactions (measured)
- Removing polling = ~$400/month in silent failures

Recommendation: Keep polling. Document in runbook.
```
