Commit 592f5bd

docs: rewrite for end users and scrub org-specific references
- Rewrote README, getting-started, overlay-guide, troubleshooting, and installation docs for clear onboarding
- Added "How it works" section explaining Pi SDK integration
- Removed all internal org references from docs and smoke test
- Deleted internal design docs (000-009), kept user-facing docs only
- Renamed 007-installation.md to installation.md
1 parent 8ee3a76 commit 592f5bd

18 files changed: 444 additions, 1535 deletions

README.md

Lines changed: 97 additions & 77 deletions
@@ -1,112 +1,132 @@
11
# pi-sre-mode
22

3-
A Pi-native incident investigation package for Pi, with support for private overlays.
3+
An incident investigation mode for [Pi](https://github.com/mariozechner/pi-coding-agent). Open a terminal, start an incident, investigate with real tools, write a report — all without leaving Pi.
44

5-
## What this is
5+
## Why
66

7-
`pi-sre-mode` aims to distill the best parts of the llmduck idea into a Pi package:
7+
During an incident you're juggling metrics dashboards, log viewers, SSH sessions, and a dozen browser tabs. `pi-sre-mode` puts the investigation loop inside Pi so you can query metrics, grep logs, check service health, and build a timeline in one place.
88

9-
- guided incident workflow inside Pi
10-
- reusable SRE skills and prompts
11-
- read-only safety guardrails
12-
- connector / environment preflight checks
13-
- support for private organization overlays without forking the public package
9+
It ships read-only guardrails by default so you don't accidentally `rm` or `systemctl restart` something mid-investigation.
1410

15-
This repo started docs-first and now includes the initial working scaffold:
11+
## What you get
1612

17-
- `extensions/incident-mode.ts` — main public extension
18-
- `src/` — overlay types, state, checks, report helpers, template catalog
19-
- `skills/` — generic SRE skills
20-
- `prompts/` — generic incident prompt templates
21-
- `examples/local-overlay/` — sample overlay package for local testing
13+
- **`/incident`** — set up investigation context: pick a template (5xx spike, high latency, OOM, etc.), name the service, set a time window. That context follows every subsequent prompt automatically.
14+
- **`/check-connectors`** — preflight check that your CLIs, auth, and environment are ready before you start digging.
15+
- **`/report`** — turn the investigation into a markdown report.
16+
- **`/sudo`** / **`/sudo-off`** — bypass or re-enable the read-only guardrails when you need to.
17+
- **Built-in investigation skills** — SRE methodology and a generic investigation playbook that guide Pi's reasoning.
18+
- **7 incident templates** — 5xx spike, high latency, OOM/crash loop, broker issues, service down, deploy regression, resource exhaustion, plus a blank "custom" template.
2219

23-
## Documentation map
20+
## Quick start
2421

25-
- `docs/README.md` — doc index
26-
- `docs/000-overview.md` — project thesis, goals, and non-goals
27-
- `docs/001-product-shape.md` — what the public package should feel like
28-
- `docs/002-overlay-model.md` — how private overlays layer on top
29-
- `docs/003-public-package-architecture.md` — public package structure and runtime behavior
30-
- `docs/004-private-overlay-architecture.md` — private overlay package model
31-
- `docs/005-mvp.md` — initial build scope
32-
- `docs/006-build-plan.md` — phased implementation plan
33-
- `docs/007-installation.md` — installation patterns for global public package + project overlay
34-
- `docs/008-ecosystem-notes.md` — notes from the Pi package ecosystem
35-
- `docs/009-release-checklist.md` — first public release checklist
22+
Install globally:
3623

37-
## Installation
38-
39-
### Recommended real usage
24+
```bash
25+
pi install npm:pi-sre-mode
26+
```
4027

41-
Install the public package globally in `~/.pi/agent/settings.json`:
28+
Or add to `~/.pi/agent/settings.json`:
4229

4330
```json
4431
{
45-
"packages": [
46-
"npm:pi-sre-mode"
47-
]
32+
"packages": ["npm:pi-sre-mode"]
4833
}
4934
```
5035

51-
Install a private overlay project-locally in `.pi/settings.json`:
36+
Then in Pi:
5237

53-
```json
54-
{
55-
"packages": [
56-
"git:git@github.com:your-org/pi-sre-overlay-zerodha.git"
57-
]
58-
}
38+
```
39+
/check-connectors # verify your environment
40+
/incident # set up context — pick a template, name the service
41+
investigate elevated p99 for payments-api, start with the timeline
42+
/report # generate a markdown report
5943
```
6044

61-
### Local development
45+
## You don't always need `/incident`
6246

63-
```json
64-
{
65-
"packages": [
66-
"/path/to/pi-sre-mode",
67-
"/path/to/pi-sre-overlay-zerodha"
68-
]
69-
}
47+
Use plain Pi for quick questions:
48+
49+
- "check p99 latency for payments-api over the last 2h"
50+
- "compare error rates before and after the last deploy"
51+
- "summarize the Nomad allocation restarts today"
52+
53+
Use `/incident` when you want persistent context, a structured template, guardrails, and a report at the end.
54+
55+
## Private overlays
56+
57+
The public package is generic on purpose. Your team's topology, runbooks, and internal tooling live in a **private overlay** — a separate Pi package that layers org-specific templates, skills, prompts, connector checks, and report paths on top.
58+
59+
Install an overlay per-project:
60+
61+
```bash
62+
pi install -l git:git@github.com:your-org/pi-sre-overlay.git
7063
```
7164

72-
More detailed installation examples are in `docs/007-installation.md`.
65+
See the [overlay guide](./docs/overlay-guide.md) for how to build one.
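
The overlay mechanism described above (an overlay announces itself with an event, and the public package listens for it) could be sketched roughly as follows. This is a minimal illustration only: `EventBus`, `OverlayRegistration`, and the payload fields are hypothetical stand-ins, not Pi's actual extension API. The event name `incident-mode:register-overlay` is taken from the previous README's feature list and is assumed to still apply.

```typescript
// Hypothetical sketch: EventBus and OverlayRegistration are illustrative
// stand-ins, not types exported by Pi or pi-sre-mode.
type OverlayRegistration = { name: string; templates: string[] };
type Listener = (payload: OverlayRegistration) => void;

class EventBus {
  private listeners = new Map<string, Listener[]>();

  // Register a listener for a named event.
  on(event: string, listener: Listener): void {
    const existing = this.listeners.get(event) ?? [];
    this.listeners.set(event, [...existing, listener]);
  }

  // Deliver the payload to every listener registered for the event.
  emit(event: string, payload: OverlayRegistration): void {
    for (const listener of this.listeners.get(event) ?? []) {
      listener(payload);
    }
  }
}

// Event name as listed in the earlier README; assumed unchanged.
const REGISTER_EVENT = "incident-mode:register-overlay";

const bus = new EventBus();
const overlays: OverlayRegistration[] = [];

// Public package side: collect whatever overlays announce themselves.
bus.on(REGISTER_EVENT, (overlay) => {
  overlays.push(overlay);
});

// Overlay side: announce itself without importing the public package.
bus.emit(REGISTER_EVENT, {
  name: "example-overlay",
  templates: ["queue-backlog"],
});
```

Neither side imports the other; they share only the event name and the payload shape, which is the loose coupling the README describes.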
7366

74-
## Status
67+
## Read-only by default
7568

76-
MVP scaffold implemented and validated with both public-only and overlay smoke tests.
69+
During an active incident, `pi-sre-mode` blocks:
7770

78-
## Smoke test
71+
- file writes and edits
72+
- `rm`, `mv`, `sudo`, `kill`, `chmod`, `chown`
73+
- `systemctl restart/stop`, `nomad job run/stop`
74+
- mutating AWS CLI commands (create, delete, terminate, etc.)
75+
- shell trampolines (`bash -c`, `eval`, subshells)
7976

80-
A local smoke test is included at:
77+
Use `/sudo` to temporarily disable these guardrails. `/sudo-off` re-enables them.
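
As a rough sketch of how a blocklist like the one above might be checked, the function below matches a shell command against a few patterns derived from that list. The names (`checkCommand`, `BLOCKED_PATTERNS`, `Verdict`) and the regular expressions are assumptions made for illustration; the package's real rules are not shown in this diff and are likely more thorough.

```typescript
// Hypothetical sketch: names and patterns are illustrative, not the
// package's actual implementation.
type Verdict = { allowed: boolean; reason?: string };

// Coarse patterns based on the README's blocked list; deliberately incomplete.
const BLOCKED_PATTERNS: Array<{ pattern: RegExp; reason: string }> = [
  {
    pattern: /(^|[\s;&|])(rm|mv|sudo|kill|chmod|chown)(\s|$)/,
    reason: "destructive or privileged command",
  },
  { pattern: /\bsystemctl\s+(restart|stop)\b/, reason: "service mutation" },
  { pattern: /\bnomad\s+job\s+(run|stop)\b/, reason: "Nomad job mutation" },
  {
    pattern: /\baws\s+.*\b(create|delete|terminate)[\w-]*/,
    reason: "mutating AWS CLI call",
  },
  { pattern: /\b(bash|sh)\s+-c\b|\beval\b|\$\(|`/, reason: "shell trampoline" },
];

// Returns a verdict for one command. With sudoMode on, everything is allowed,
// mirroring the /sudo bypass described in the README.
function checkCommand(command: string, sudoMode: boolean): Verdict {
  if (sudoMode) return { allowed: true };
  for (const { pattern, reason } of BLOCKED_PATTERNS) {
    if (pattern.test(command)) return { allowed: false, reason };
  }
  return { allowed: true };
}
```

Pattern matching of this kind is coarse: it can flag harmless commands (for example, an AWS read whose arguments happen to contain `create`), which is one reason a bypass such as `/sudo` is useful.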
8178

82-
- `examples/smoke-test/smoke-test.mjs`
79+
## Commands
8380

84-
Example:
81+
| Command | Purpose |
82+
|---|---|
83+
| `/incident` | Start or update investigation context |
84+
| `/incident-reset` | Clear incident context |
85+
| `/check-connectors` | Run environment preflight checks |
86+
| `/report` | Generate a markdown investigation report |
87+
| `/sudo` | Bypass read-only guardrails |
88+
| `/sudo-off` | Re-enable read-only guardrails |
8589

86-
```bash
87-
cd /path/to/pi-sre-mode
90+
## How it works
91+
92+
`pi-sre-mode` is built entirely on Pi's extension API — no external server, no separate UI, no agent framework. Everything runs inside your Pi session.
93+
94+
- **Prompt injection** — when incident mode is active, `before_agent_start` automatically prepends the incident context (template, service, time window, guardrails) to every prompt. Pi investigates with full awareness of what you're looking at.
95+
- **Tool interception** — `tool_call` hooks inspect every command before execution and block dangerous ones. This is how read-only guardrails work without a custom sandbox.
96+
- **Session state** — incident context is persisted in Pi's session entries, so it survives reloads, branches, and forks. Navigate the session tree and your incident follows.
97+
- **Interactive UI** — `/incident` uses Pi's built-in `select`, `input`, and `confirm` primitives for the setup wizard. Status line and widget show the active incident at a glance.
98+
- **Inter-extension events** — overlays register themselves by emitting events that the public package listens for. No tight coupling, no imports between packages.
99+
- **Skills and prompts** — shipped as standard Pi skills/prompts in the package manifest. Pi discovers them automatically.
100+
101+
This means the extension is thin orchestration. The real value is in the skills, prompts, and templates — content that's easy to write and easy to override.
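
To make the prompt-injection step above concrete, here is a minimal sketch of prepending incident context to a user prompt. `IncidentContext` and `withIncidentContext` are hypothetical names, and the header layout is an assumption; the real `before_agent_start` handler and the exact text it injects are not shown in this diff.

```typescript
// Hypothetical sketch: the shape of the context and the header format are
// assumptions, not the package's actual data model.
interface IncidentContext {
  template: string;
  service: string;
  window: string;
  readOnly: boolean;
}

// Prepend a short context header when an incident is active; otherwise
// return the prompt unchanged.
function withIncidentContext(
  prompt: string,
  ctx: IncidentContext | null,
): string {
  if (!ctx) return prompt;
  const header = [
    "[incident context]",
    `template: ${ctx.template}`,
    `service: ${ctx.service}`,
    `time window: ${ctx.window}`,
    `guardrails: ${ctx.readOnly ? "read-only" : "bypassed (/sudo)"}`,
  ].join("\n");
  return `${header}\n\n${prompt}`;
}
```

Keeping this as a pure function of the prompt and the stored context means the same incident state can be reapplied after a reload or in a forked session without extra bookkeeping.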
102+
103+
## Docs
104+
105+
- [Getting started](./docs/getting-started.md)
106+
- [Building an overlay](./docs/overlay-guide.md)
107+
- [Installation patterns](./docs/installation.md)
108+
- [Troubleshooting](./docs/troubleshooting.md)
109+
110+
## Examples
111+
112+
- [`examples/local-overlay/`](./examples/local-overlay/) — minimal overlay for testing
113+
- [`examples/smoke-test/`](./examples/smoke-test/) — automated smoke test via Pi RPC
114+
115+
## Local development
88116

89-
# Public package only
117+
```json
118+
{
119+
"packages": [
120+
"/path/to/pi-sre-mode",
121+
"/path/to/your-overlay"
122+
]
123+
}
124+
```
125+
126+
```bash
127+
# public package only
90128
bun run smoke-test -- --public-only
91129

92-
# Public package + overlay
130+
# with an overlay
93131
bun run smoke-test -- --overlay /path/to/private-overlay
94132
```
95-
96-
Current commands in the public extension:
97-
- `/incident`
98-
- `/incident-reset`
99-
- `/sudo`
100-
- `/sudo-off`
101-
- `/check-connectors`
102-
- `/report`
103-
104-
Current features:
105-
- persisted incident-mode session state
106-
- prompt injection via `before_agent_start`
107-
- read-only blocking for `write` / `edit` and unsafe bash patterns while incident guardrails are active
108-
- separate `/sudo` mode to bypass incident permission checks when needed
109-
- connector preflight checks
110-
- markdown report generation
111-
- overlay registration via `incident-mode:register-overlay`
112-
- RPC-based smoke test for package + overlay integration

docs/000-overview.md

Lines changed: 0 additions & 82 deletions
This file was deleted.
