Commit 49eb67f

Merge pull request #14 from nirukk52/backend-testing-infrastructure
Backend testing infrastructure
2 parents 713b328 + 177664f commit 49eb67f

25 files changed

+2530
-649
lines changed

Lines changed: 305 additions & 0 deletions
@@ -0,0 +1,305 @@
# Encore MCP Testing Workflow 🔥

**Date**: 2025-11-09
**Status**: ✅ Correct Approach

## The Right Way to Test Backends

### ❌ OLD WAY (Wrong)
```bash
# Start encore run in terminal 1
cd backend && encore run

# Run test in terminal 2
cd backend && encore test

# Check dashboard manually
# Open http://localhost:9400
```

**Problems:**
- Requires two terminals
- `encore run` has to stay up just for debugging
- Can't inspect the `encore test` runtime
- The dashboard shows the `encore run` environment, not the one the tests ran in

### ✅ NEW WAY (Correct)
```bash
# Just run the test
cd .cursor && task backend:integration:metrics

# Use Encore MCP to debug
# (Claude/AI agents use MCP tools directly)
```

**Benefits:**
- Single command
- No `encore run` required
- Inspect `encore test` runtime directly
- Programmatic debugging

## Encore MCP Tools for Testing

### 1. Query Test Database

```typescript
mcp_encore-mcp_query_database({
  queries: [{
    database: "db",
    query: `
      SELECT run_id, status, stop_reason, created_at
      FROM runs
      ORDER BY created_at DESC
      LIMIT 5
    `
  }]
})
```

**Use when:** Need to see recent test runs and their outcomes

### 2. Get Run Events

```typescript
mcp_encore-mcp_query_database({
  queries: [{
    database: "db",
    query: `
      SELECT event_type, sequence_number, payload
      FROM run_events
      WHERE run_id = '01K9KEVM6VFB6M5AB7AE04Y1RW'
      ORDER BY sequence_number
    `
  }]
})
```

**Use when:** Need to see full event timeline for a failed run
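
The run ID in the query above is pasted straight into the SQL text. A minimal sketch of a builder that checks the ID first, assuming run IDs are ULIDs (26 Crockford base32 characters), as the sample `01K9KEVM6VFB6M5AB7AE04Y1RW` suggests. `isUlid`, `runEventsQuery`, and `QueryRequest` are hypothetical names, not part of Encore or the MCP server; the returned object only mirrors the argument shape used in the examples above.

```typescript
// Hypothetical helpers, not part of Encore or the MCP server.
// Assumption: run IDs are ULIDs (26 chars, Crockford base32: no I, L, O, U).
const ULID_PATTERN = /^[0-9A-HJKMNP-TV-Z]{26}$/;

function isUlid(value: string): boolean {
  return ULID_PATTERN.test(value);
}

interface QueryRequest {
  queries: { database: string; query: string }[];
}

// Builds the argument object for the run-events query, rejecting any
// run ID that is not a well-formed ULID before it reaches the SQL text.
function runEventsQuery(runId: string, database = "db"): QueryRequest {
  if (!isUlid(runId)) {
    throw new Error(`Not a valid ULID run ID: ${runId}`);
  }
  return {
    queries: [{
      database,
      query: `
        SELECT event_type, sequence_number, payload
        FROM run_events
        WHERE run_id = '${runId}'
        ORDER BY sequence_number
      `,
    }],
  };
}
```

Validating before interpolation keeps a mistyped or hostile ID out of the query string, which matters if the MCP tool accepts only raw SQL text (an assumption here).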

### 3. Check Agent State

```typescript
mcp_encore-mcp_query_database({
  queries: [{
    database: "db",
    query: `
      SELECT
        snapshot->>'nodeName' as node_name,
        snapshot->>'status' as status,
        snapshot->>'stopReason' as stop_reason,
        created_at
      FROM agent_state_snapshots
      WHERE run_id = '01K9KEVM6VFB6M5AB7AE04Y1RW'
      ORDER BY created_at DESC
      LIMIT 1
    `
  }]
})
```

**Use when:** Need to see where agent got stuck

### 4. Count Discovered Screens

```typescript
mcp_encore-mcp_query_database({
  queries: [{
    database: "db",
    query: `
      SELECT COUNT(DISTINCT screen_id) as screen_count
      FROM graph_persistence_outcomes
      WHERE run_id = '01K9KEVM6VFB6M5AB7AE04Y1RW'
        AND upsert_kind = 'discovered'
    `
  }]
})
```

**Use when:** Validating metrics for deterministic testing
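
For a deterministic test, the count returned above can be compared against a fixed expectation. The following is a sketch only: `ScreenCountRow` and `assertScreenCount` are hypothetical names, and the row shape is an assumption about how the query result is handed back.

```typescript
// Hypothetical row shape for the COUNT query above. COUNT(...) is a bigint
// in PostgreSQL, so it may arrive as a string depending on the client.
interface ScreenCountRow {
  screen_count: number | string;
}

// Throws with a descriptive message when the discovered-screen count
// differs from the value the deterministic test expects.
function assertScreenCount(row: ScreenCountRow, expected: number): void {
  const actual = Number(row.screen_count);
  if (!Number.isInteger(actual)) {
    throw new Error(`screen_count is not an integer: ${row.screen_count}`);
  }
  if (actual !== expected) {
    throw new Error(`Expected ${expected} discovered screens, got ${actual}`);
  }
}
```

Coercing with `Number(...)` lets the same check accept both `3` and `"3"`, so the assertion does not depend on how the count is serialized.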

### 5. Get Service Metadata

```typescript
mcp_encore-mcp_get_services({
  services: ["run", "agent"],
  include_endpoints: true,
  include_schemas: true
})
```

**Use when:** Need to understand endpoint signatures or service structure

### 6. Inspect PubSub Topics

```typescript
mcp_encore-mcp_get_pubsub()
```

**Use when:** Debugging subscription issues or message flows

### 7. Get Database Schema

```typescript
mcp_encore-mcp_get_databases({
  include_tables: true
})
```

**Use when:** Writing new queries or understanding table relationships

## Complete Debugging Workflow

### Step 1: Run Test
```bash
cd .cursor && task backend:integration:metrics
```

### Step 2: Get Latest Run ID
```typescript
mcp_encore-mcp_query_database({
  queries: [{
    database: "db",
    query: "SELECT run_id, status, stop_reason FROM runs ORDER BY created_at DESC LIMIT 1"
  }]
})
```

**Output:**
```json
{
  "run_id": "01K9KEVM6VFB6M5AB7AE04Y1RW",
  "status": "failed",
  "stop_reason": "failed"
}
```

### Step 3: Get Run Events
```typescript
mcp_encore-mcp_query_database({
  queries: [{
    database: "db",
    query: `
      SELECT event_type, payload
      FROM run_events
      WHERE run_id = '01K9KEVM6VFB6M5AB7AE04Y1RW'
      ORDER BY sequence_number
    `
  }]
})
```

**Analyze:** Which node did it fail at? What was the last event?
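
The two questions above can be answered mechanically once the rows are in hand. This is a rough sketch assuming the rows arrive already ordered by `sequence_number`, as the query requests; `RunEventRow`, `TimelineSummary`, and `summarizeTimeline` are hypothetical names, and the only event type taken from this document is `agent.run.finished`.

```typescript
// Hypothetical row shape matching the columns selected in Step 3.
interface RunEventRow {
  event_type: string;
  payload: unknown;
}

interface TimelineSummary {
  eventTypes: string[];              // event types in emission order
  lastEvent: RunEventRow | undefined; // undefined when the run has no events
  hasFinishedEvent: boolean;         // true if 'agent.run.finished' was emitted
}

// Summarizes an ordered event list: what happened, what happened last,
// and whether the run ever emitted its finished event.
function summarizeTimeline(rows: RunEventRow[]): TimelineSummary {
  const eventTypes = rows.map((row) => row.event_type);
  return {
    eventTypes,
    lastEvent: rows.length > 0 ? rows[rows.length - 1] : undefined,
    hasFinishedEvent: eventTypes.indexOf("agent.run.finished") !== -1,
  };
}
```

When `hasFinishedEvent` is false, `lastEvent` is the natural place to start looking for the point of failure.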

### Step 4: Get Agent State
```typescript
mcp_encore-mcp_query_database({
  queries: [{
    database: "db",
    query: `
      SELECT snapshot
      FROM agent_state_snapshots
      WHERE run_id = '01K9KEVM6VFB6M5AB7AE04Y1RW'
      ORDER BY created_at DESC
      LIMIT 1
    `
  }]
})
```

**Analyze:** What node was executing? What counters/budgets were set?

### Step 5: Check Metrics
```typescript
mcp_encore-mcp_query_database({
  queries: [{
    database: "db",
    query: `
      SELECT payload->>'metrics' as metrics
      FROM run_events
      WHERE run_id = '01K9KEVM6VFB6M5AB7AE04Y1RW'
        AND event_type = 'agent.run.finished'
    `
  }]
})
```

**Analyze:** Were metrics captured before failure?
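
Because `->>` returns text, the `metrics` column comes back as a JSON string (or NULL) rather than an object. Below is a minimal sketch of decoding it defensively; `MetricsRow` and `parseMetrics` are hypothetical names, and the row shape is an assumption about the query result.

```typescript
// Hypothetical row shape for the Step 5 query. The ->> operator returns
// text, so a JSON object under 'metrics' arrives as a string, or null when
// the key is missing.
interface MetricsRow {
  metrics: string | null;
}

// Returns the parsed metrics object, or null when no finished event
// carried metrics (no rows, NULL value, or text that is not a JSON object).
function parseMetrics(rows: MetricsRow[]): Record<string, unknown> | null {
  const raw = rows[0]?.metrics ?? null;
  if (raw === null) return null;
  try {
    const parsed: unknown = JSON.parse(raw);
    return typeof parsed === "object" && parsed !== null && !Array.isArray(parsed)
      ? (parsed as Record<string, unknown>)
      : null;
  } catch {
    return null;
  }
}
```

A `null` result answers the question above directly: the run ended without a finished event that carried metrics.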

## Common Queries for Testing

### Find All Failed Runs
```sql
SELECT run_id, stop_reason, created_at
FROM runs
WHERE status = 'failed'
ORDER BY created_at DESC
LIMIT 10
```

### Find Runs That Got Stuck at a Specific Node
```sql
SELECT r.run_id, a.snapshot->>'nodeName' as stuck_node
FROM runs r
JOIN agent_state_snapshots a ON r.run_id = a.run_id
WHERE r.status = 'failed'
  AND a.created_at = (
    SELECT MAX(created_at)
    FROM agent_state_snapshots
    WHERE run_id = r.run_id
  )
ORDER BY r.created_at DESC
```

### Compare Successful vs Failed Runs
```sql
-- Successful run event count
SELECT COUNT(*) FROM run_events WHERE run_id = 'successful_run_id';

-- Failed run event count
SELECT COUNT(*) FROM run_events WHERE run_id = 'failed_run_id';

-- Missing events in failed run
SELECT DISTINCT event_type
FROM run_events
WHERE run_id = 'successful_run_id'
  AND event_type NOT IN (
    SELECT event_type FROM run_events WHERE run_id = 'failed_run_id'
  );
```
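
If both event lists have already been fetched, the same "missing events" comparison can be done in memory. This is an illustrative sketch; `missingEventTypes` is a hypothetical helper that takes plain arrays of `event_type` strings.

```typescript
// Returns the distinct event types that occur in the successful run but
// never in the failed run, preserving first-seen order.
function missingEventTypes(successful: string[], failed: string[]): string[] {
  const seenInFailed = new Set(failed);
  const missing: string[] = [];
  for (const eventType of successful) {
    if (!seenInFailed.has(eventType) && missing.indexOf(eventType) === -1) {
      missing.push(eventType);
    }
  }
  return missing;
}
```

Keeping first-seen order means the first entry is the earliest step the failed run never reached, which is usually the most informative one.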

## Integration with backend-testing_skill

The `backend-testing_skill` now includes Encore MCP examples for all testing patterns:

- Pattern 1: Unit Test (direct endpoint calls)
- Pattern 2: Integration Test Database (query results)
- Pattern 3: Integration Test PubSub (verify message flow)
- Pattern 4: Metrics Test (validate deterministic behavior)

**All patterns**: Use Encore MCP for debugging instead of manual inspection!

## Key Benefits

1. **No Manual Dashboard Checks** - Query programmatically
2. **Isolated Test Runtime** - `encore test` creates clean environment
3. **Repeatable** - Same queries work every time
4. **AI-Friendly** - Claude/agents can use MCP tools directly
5. **Version Controlled** - Queries can be saved in test helpers
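
As one way to read the last point, saved queries could live in a small helper module that is checked in next to the tests. The sketch below is hypothetical: `QUERIES`, `QueryName`, and `savedQuery` are illustrative names, and the SQL is copied from the queries in this document.

```typescript
// Hypothetical test-helper module: keeps the SQL from this document in one
// place so the same text can be reviewed and reused across debugging sessions.
const QUERIES = {
  latestRun:
    "SELECT run_id, status, stop_reason FROM runs ORDER BY created_at DESC LIMIT 1",
  failedRuns:
    "SELECT run_id, stop_reason, created_at FROM runs WHERE status = 'failed' ORDER BY created_at DESC LIMIT 10",
} as const;

type QueryName = keyof typeof QUERIES;

// Wraps a saved query in the argument shape used by the examples above.
function savedQuery(name: QueryName, database = "db") {
  return { queries: [{ database, query: QUERIES[name] }] };
}
```

Restricting `name` to `QueryName` means a typo in a query name fails at compile time instead of sending an empty query.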

## Update Your Workflow

### Before (Manual)
```bash
cd backend && encore run    # Terminal 1
cd backend && encore test   # Terminal 2
# Open dashboard, click around, find run, check events manually
```

### After (Automated)
```bash
cd .cursor && task backend:integration:metrics
# Use Encore MCP tools to query results programmatically
```

**This is the way.** 🚀
