Commit 2c7950c

docs(readme): add progressive-disclosure docs set
1 parent 77c645f commit 2c7950c

7 files changed

+588
-0
lines changed

AGENTS.md

Lines changed: 128 additions & 0 deletions
@@ -0,0 +1,128 @@
1+
# sotto Agent Guide
2+
3+
## Scope
4+
5+
These rules apply to the entire `sotto/` repository.
6+
7+
## Mission
8+
9+
Ship a production-grade, local-first ASR CLI with:
10+
11+
- a single Go binary
12+
- no background daemon
13+
- strong crash/cleanup guarantees
14+
- clear modular boundaries for safe refactoring
15+
- reproducible packaging and CI
16+
17+
## First 60 Seconds (Progressive Disclosure)
18+
19+
1. Read this file fully.
20+
2. Read `README.md` for user-facing behavior.
21+
3. Read `PLAN.md` (current milestones/checklist) and `SESSION.md` (what actually ran).
22+
4. Run `just --list` to see current task entrypoints.
23+
5. Open only the component(s) you are changing (map below).
24+
25+
---
26+
27+
## Project Map (What to read by task)
28+
29+
| Area | Primary paths | Notes |
30+
| --- | --- | --- |
31+
| CLI contract + dispatch | `apps/sotto/internal/cli/`, `apps/sotto/internal/app/` | Commands, flags, top-level flow |
32+
| Session state machine | `apps/sotto/internal/session/`, `apps/sotto/internal/fsm/`, `apps/sotto/internal/ipc/` | Toggle/stop/cancel semantics, single-instance behavior |
33+
| Audio capture + device selection | `apps/sotto/internal/audio/` | PipeWire/Pulse capture, device fallback/mute handling |
34+
| Riva streaming ASR | `apps/sotto/internal/riva/`, `apps/sotto/internal/pipeline/` | gRPC stream config, segment assembly inputs |
35+
| Transcript assembly | `apps/sotto/internal/transcript/` | Whitespace normalization + trailing-space behavior |
36+
| Output dispatch | `apps/sotto/internal/output/`, `apps/sotto/internal/hypr/` | Clipboard + paste behavior |
37+
| Indicator + cues | `apps/sotto/internal/indicator/` | Visual notify + audio cue lifecycle |
38+
| Config grammar/defaults | `apps/sotto/internal/config/` | Any new key must update parser/defaults/tests/docs |
39+
| Packaging + tooling | `justfile`, `flake.nix`, `.github/workflows/` | CI/tooling changes |
40+
| Protobuf contracts | `apps/sotto/proto/third_party/`, `apps/sotto/proto/gen/go/` | Run codegen when proto inputs change |
41+
42+
---
43+
44+
## Engineering Workflow Rules
45+
46+
1. Read target files before editing.
47+
2. Keep changes aligned to `PLAN.md` milestones; avoid drive-by refactors.
48+
3. Keep `PLAN.md` checkboxes accurate (only mark executed + verified work).
49+
4. Log key decisions/trade-offs/blockers/commands in `SESSION.md`.
50+
5. Prefer additive changes with regression tests.
51+
6. Never claim runtime integrations (Riva, PipeWire, Hyprland) were verified unless actually exercised.
52+
53+
### Design principles (repo-wide)
54+
55+
- Prefer boring, explicit code over clever code.
56+
- Fail fast at boundaries (config parse, startup checks, I/O preconditions).
57+
- Keep business/state logic separate from transport/I/O adapters.
58+
- Use guard clauses to reduce nesting.
59+
- Keep files top-down readable (public entrypoints first, private helpers below).
60+
61+
### Dependency + architecture rules
62+
63+
- Prefer manual constructor-based dependency injection.
64+
- Keep package responsibilities narrow; avoid utility dumping grounds.
65+
- Do not couple domain/state transitions directly to shell command details.
66+
67+
---
68+
69+
## Go Conventions
70+
71+
- Use table-driven tests for branch-heavy logic.
72+
- Wrap errors with context (`fmt.Errorf` with `%w`) and match them with `errors.Is` / `errors.As`.
73+
- Keep I/O timeouts explicit.
74+
- Use `testing` + `testify` (`require`/`assert`) for expressive assertions when useful.
75+
- Add focused regression tests for bug fixes whenever feasible.
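
The first two conventions can be sketched together as follows. `ErrDeviceNotFound` and `selectDevice` are hypothetical names used only for this example. In the repository the case table would live in a `_test.go` file and run under `testing`; here it runs from `main` so the snippet is self-contained.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrDeviceNotFound is a hypothetical sentinel error for this sketch.
var ErrDeviceNotFound = errors.New("audio device not found")

// selectDevice returns the requested device name if it is available.
func selectDevice(available []string, want string) (string, error) {
	for _, name := range available {
		if name == want {
			return name, nil
		}
	}
	// %w keeps the sentinel in the chain so callers can match it with errors.Is.
	return "", fmt.Errorf("select device %q: %w", want, ErrDeviceNotFound)
}

func main() {
	// Table-driven cases: one row per branch.
	cases := []struct {
		name      string
		available []string
		want      string
		wantErr   error
	}{
		{name: "present", available: []string{"mic0", "mic1"}, want: "mic1"},
		{name: "missing", available: []string{"mic0"}, want: "usb", wantErr: ErrDeviceNotFound},
		{name: "empty list", want: "mic0", wantErr: ErrDeviceNotFound},
	}
	for _, tc := range cases {
		_, err := selectDevice(tc.available, tc.want)
		// errors.Is(nil, nil) is true, so the success row passes as well.
		fmt.Printf("%s: matched=%v\n", tc.name, errors.Is(err, tc.wantErr))
	}
}
```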
76+
77+
### Testing boundaries (repo policy)
78+
79+
- Prefer real interfaces/adapters and real resources (temp files, unix sockets, `httptest`, PATH fixtures).
80+
- Do **not** introduce mocking frameworks or expectation-driven mock suites.
81+
- Riva runtime/model inference remains a local-manual smoke concern (non-CI); use lightweight protocol/contract tests in CI.
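
As an illustration of testing against a real resource instead of a mock, the sketch below exercises an actual unix socket in a temporary directory. The `roundTrip` helper and the one-line command format are assumptions made for this example, not sotto's IPC protocol. In a real test, `t.TempDir()` would replace the manual temp-dir handling.

```go
package main

import (
	"bufio"
	"errors"
	"fmt"
	"net"
	"os"
	"path/filepath"
	"strings"
	"time"
)

// roundTrip listens on a real unix socket, sends one command line to it,
// and returns what the server side received.
func roundTrip(command string) (string, error) {
	dir, err := os.MkdirTemp("", "sotto-sketch-")
	if err != nil {
		return "", fmt.Errorf("create temp dir: %w", err)
	}
	defer os.RemoveAll(dir)

	sockPath := filepath.Join(dir, "test.sock")
	ln, err := net.Listen("unix", sockPath)
	if err != nil {
		return "", fmt.Errorf("listen on %s: %w", sockPath, err)
	}
	defer ln.Close()

	received := make(chan string, 1)
	go func() {
		conn, err := ln.Accept()
		if err != nil {
			received <- ""
			return
		}
		defer conn.Close()
		line, _ := bufio.NewReader(conn).ReadString('\n')
		received <- strings.TrimSpace(line)
	}()

	// Explicit timeouts keep a broken socket from hanging the test.
	conn, err := net.DialTimeout("unix", sockPath, time.Second)
	if err != nil {
		return "", fmt.Errorf("dial %s: %w", sockPath, err)
	}
	defer conn.Close()
	if _, err := fmt.Fprintf(conn, "%s\n", command); err != nil {
		return "", fmt.Errorf("send command: %w", err)
	}

	select {
	case line := <-received:
		return line, nil
	case <-time.After(2 * time.Second):
		return "", errors.New("timed out waiting for server")
	}
}

func main() {
	got, err := roundTrip("stop")
	fmt.Println(got, err)
}
```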
82+
83+
### File size / readability guardrails
84+
85+
- Handwritten files should target `<= 250` LOC where practical.
86+
- Files above `~350` LOC require extraction-plan notes in `PLAN.md` before refactor work.
87+
- Exclude vendored and generated code from these thresholds: `apps/sotto/vendor/**`, `apps/sotto/proto/gen/**`.
88+
89+
---
90+
91+
## Config Change Contract (Mandatory)
92+
93+
When adding or changing a config key, update all of:
94+
95+
1. `apps/sotto/internal/config/types.go`
96+
2. `apps/sotto/internal/config/defaults.go`
97+
3. `apps/sotto/internal/config/parser.go`
98+
4. validation if required (`validate.go`)
99+
5. parser/validation tests
100+
6. `README.md` config example + notes
101+
7. any deployed default config in consuming repos (when in scope)
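
To show why the contract touches several files at once, here is a compressed sketch of steps 1 to 3 for a single key. Everything specific in it is an assumption: the `trailing_space` key name, the `key = value` line grammar, and the `Config`, `Default`, and `Parse` names are hypothetical and may not match sotto's actual config package.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// Config stands in for types.go. The field is hypothetical.
type Config struct {
	TrailingSpace bool
}

// Default stands in for defaults.go: every key gets an explicit default.
func Default() Config {
	return Config{TrailingSpace: true}
}

// Parse stands in for parser.go, using an assumed "key = value" grammar.
// Unknown keys and malformed values fail fast.
func Parse(input string) (Config, error) {
	cfg := Default()
	for i, raw := range strings.Split(input, "\n") {
		line := strings.TrimSpace(raw)
		if line == "" || strings.HasPrefix(line, "#") {
			continue // skip blank lines and comments
		}
		key, value, ok := strings.Cut(line, "=")
		if !ok {
			return Config{}, fmt.Errorf("line %d: expected key = value", i+1)
		}
		key, value = strings.TrimSpace(key), strings.TrimSpace(value)
		switch key {
		case "trailing_space":
			b, err := strconv.ParseBool(value)
			if err != nil {
				return Config{}, fmt.Errorf("line %d: %s: %w", i+1, key, err)
			}
			cfg.TrailingSpace = b
		default:
			return Config{}, fmt.Errorf("line %d: unknown key %q", i+1, key)
		}
	}
	return cfg, nil
}

func main() {
	cfg, err := Parse("# comment\ntrailing_space = false\n")
	fmt.Println(cfg.TrailingSpace, err) // prints: false <nil>
}
```

A new key that skips any of the three pieces either has no default, cannot be set, or is rejected as unknown, which is the failure mode the checklist guards against.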
102+
103+
---
104+
105+
## Required Local Checks Before Hand-off
106+
107+
Run and report status for:
108+
109+
1. `just ci-check`
110+
2. `nix build 'path:.#sotto'`
111+
112+
If any check is skipped, state exactly what was skipped, why, and the exact command to run.
113+
114+
### Pre-commit Hooks (`prek`)
115+
116+
- Install hooks: `just precommit-install`
117+
- Run hooks manually: `just precommit-run`
118+
119+
Use hooks to catch formatting/lint drift before pushing.
120+
121+
---
122+
123+
## Safety
124+
125+
- Never store secrets in repo files.
126+
- Treat `NGC_API_KEY` and other credentials as supplied only through external environment variables or secret stores.
127+
- Avoid destructive shell operations unless explicitly requested.
128+
- Do not edit files outside `sotto/` unless explicitly requested.

README.md

Lines changed: 135 additions & 0 deletions
@@ -0,0 +1,135 @@
1+
# sotto
2+
3+
[![CI](https://github.com/rbright/sotto/actions/workflows/ci.yml/badge.svg)](https://github.com/rbright/sotto/actions/workflows/ci.yml)
4+
5+
Local-first CLI for automatic speech recognition (ASR).
6+
7+
`sotto` captures microphone audio, streams it to a local ASR backend, assembles transcript text, and can optionally route output to clipboard/paste adapters.
8+
9+
- No daemon/background service.
10+
- Single binary runtime.
11+
- Built for local-first automation workflows.
12+
13+
## Features
14+
15+
- `toggle` / `stop` / `cancel` command flow for dictation sessions
16+
- single-instance runtime coordination via unix socket
17+
- local audio capture via PipeWire/Pulse
18+
- streaming ASR via local NVIDIA Riva gRPC endpoint
19+
- transcript normalization with optional trailing space
20+
- output adapters: clipboard + optional paste dispatch
21+
- environment/config/tool readiness diagnostics via `sotto doctor`
22+
- JSONL session logs for observability/debugging
23+
24+
## Requirements
25+
26+
Core runtime dependencies:
27+
28+
- local ASR service endpoint (default: NVIDIA Riva)
29+
- local audio backend compatible with PipeWire/Pulse
30+
31+
Adapter/tool dependencies (when enabled by config):
32+
33+
- clipboard command (default: `wl-copy`)
34+
- paste/notification adapter commands (for your environment)
35+
36+
> `sotto` is local-first by design: no cloud ASR endpoint is required.
37+
38+
## Installation
39+
40+
### Nix (recommended)
41+
42+
```bash
43+
nix build 'path:.#sotto'
44+
nix run 'path:.#sotto' -- --help
45+
```
46+
47+
### From source
48+
49+
```bash
50+
just tools
51+
go test ./apps/sotto/...
52+
go build ./apps/sotto/cmd/sotto
53+
```
54+
55+
## Quickstart
56+
57+
```bash
58+
sotto doctor
59+
sotto toggle # start
60+
sotto toggle # stop + commit
61+
```
62+
63+
Core commands:
64+
65+
```bash
66+
sotto toggle
67+
sotto stop
68+
sotto cancel
69+
sotto status
70+
sotto devices
71+
sotto doctor
72+
sotto version
73+
```
74+
75+
## Architecture
76+
77+
High-level architecture docs and diagrams:
78+
79+
- [Architecture overview](./docs/architecture.md)
80+
- [Modularity review + refactor slices](./docs/modularity.md)
81+
82+
```mermaid
83+
flowchart LR
84+
Trigger["Trigger source\n(shell / hotkey / script)"] --> CLI["sotto CLI"]
85+
CLI --> IPC["Unix socket\n$XDG_RUNTIME_DIR/sotto.sock"]
86+
IPC --> Session["Session controller"]
87+
Session --> Audio["Audio capture"]
88+
Audio --> ASR["ASR stream"]
89+
ASR --> Transcript["Transcript assembly"]
90+
Transcript --> Output["Output adapters"]
91+
```
92+
93+
## Configuration
94+
95+
Path resolution order:
96+
97+
1. `--config <path>`
98+
2. `$XDG_CONFIG_HOME/sotto/config.conf`
99+
3. `~/.config/sotto/config.conf`
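
The resolution order above can be sketched as a small pure function. `resolveConfigPath` is a hypothetical name, and the sketch assumes the first non-empty source wins without checking whether the file exists; sotto's actual behavior may differ.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// resolveConfigPath applies the documented order: explicit --config value,
// then $XDG_CONFIG_HOME, then ~/.config. Inputs are passed in rather than
// read from the environment so the function is easy to test.
func resolveConfigPath(flagPath, xdgConfigHome, homeDir string) string {
	if flagPath != "" {
		return flagPath
	}
	if xdgConfigHome != "" {
		return filepath.Join(xdgConfigHome, "sotto", "config.conf")
	}
	return filepath.Join(homeDir, ".config", "sotto", "config.conf")
}

func main() {
	// On Linux this prints /home/alice/.config/sotto/config.conf
	fmt.Println(resolveConfigPath("", "", "/home/alice"))
}
```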
100+
101+
Detailed reference:
102+
103+
- [Configuration reference](./docs/configuration.md)
104+
105+
## Verification
106+
107+
Build/test gate:
108+
109+
```bash
110+
just ci-check
111+
nix build 'path:.#sotto'
112+
```
113+
114+
Optional integration-tag tests (local machine resources):
115+
116+
```bash
117+
just test-integration
118+
```
119+
120+
Local runtime smoke helpers:
121+
122+
```bash
123+
just smoke-riva-doctor
124+
just smoke-riva-manual
125+
```
126+
127+
Full checklist:
128+
129+
- [Verification guide](./docs/verification.md)
130+
131+
## References
132+
133+
- PipeWire: https://pipewire.org/
134+
- NVIDIA Riva: https://developer.nvidia.com/riva
135+
- NVIDIA Parakeet models on Hugging Face: https://huggingface.co/models?search=nvidia%20parakeet

docs/README.md

Lines changed: 16 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,16 @@
1+
# sotto Documentation
2+
3+
## Start here
4+
5+
- [Architecture overview](./architecture.md)
6+
- [Configuration reference](./configuration.md)
7+
- [Verification checklist](./verification.md)
8+
- [Modularity review and refactor slices](./modularity.md)
9+
10+
## Documentation style
11+
12+
This docs set uses progressive disclosure:
13+
14+
1. README: project overview + install + quick links
15+
2. Architecture/config/verification docs for implementation detail
16+
3. Modularity doc for cleanup/refactor planning

docs/architecture.md

Lines changed: 92 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,92 @@
1+
# sotto Architecture
2+
3+
`sotto` is a local-first ASR CLI with explicit component boundaries so behavior can be tested mostly in-process.
4+
5+
## 1) High-level component map
6+
7+
```mermaid
8+
flowchart LR
9+
Trigger["Trigger source\n(shell / hotkey / script)"] --> CLI["sotto CLI"]
10+
CLI --> IPC["IPC socket\n$XDG_RUNTIME_DIR/sotto.sock"]
11+
IPC --> Session["Session Controller\n(FSM + lifecycle)"]
12+
13+
Session --> Audio["Audio capture\nPipeWire/Pulse"]
14+
Audio --> Riva["ASR stream\nNVIDIA Riva gRPC"]
15+
Riva --> Transcript["Transcript assembly\nnormalize + trailing space"]
16+
Transcript --> Output["Output adapters\nclipboard + optional paste"]
17+
18+
Session --> Indicator["Indicator adapters\nnotify + audio cues"]
19+
Session --> Logs["JSONL logging\n$XDG_STATE_HOME/sotto/log.jsonl"]
20+
```
21+
22+
## 2) Package responsibilities
23+
24+
| Package | Responsibility |
25+
| --- | --- |
26+
| `internal/cli` | Parse command/flag contract |
27+
| `internal/app` | Top-level execution and dispatch wiring |
28+
| `internal/ipc` | Single-instance socket ownership + command forwarding |
29+
| `internal/session` | Dictation lifecycle orchestration and FSM transitions |
30+
| `internal/audio` | Device discovery/selection + capture chunk stream |
31+
| `internal/riva` | gRPC stream setup + ASR response accumulation |
32+
| `internal/pipeline` | Bridge audio capture to ASR + debug artifact handling |
33+
| `internal/transcript` | Segment assembly/whitespace normalization |
34+
| `internal/output` | Clipboard and paste adapters |
35+
| `internal/indicator` | Notification and cue adapters |
36+
| `internal/doctor` | Environment/config/tool/readiness diagnostics |
37+
| `internal/logging` | Runtime JSONL log setup |
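
As one concrete illustration of the table, the `internal/transcript` responsibility (segment assembly plus whitespace normalization, with an optional trailing space) could look roughly like this. The `assemble` function is a hypothetical sketch, not the package's actual API.

```go
package main

import (
	"fmt"
	"strings"
)

// assemble joins ASR segments, collapses whitespace runs to single spaces,
// and optionally appends one trailing space.
func assemble(segments []string, trailingSpace bool) string {
	// strings.Fields splits on any whitespace run and drops empty fields.
	words := strings.Fields(strings.Join(segments, " "))
	if len(words) == 0 {
		return "" // guard clause: never emit a lone trailing space
	}
	text := strings.Join(words, " ")
	if trailingSpace {
		text += " "
	}
	return text
}

func main() {
	fmt.Printf("%q\n", assemble([]string{"  hello ", "world\n"}, true)) // prints "hello world "
}
```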
38+
39+
## 3) Runtime flow (toggle -> stop)
40+
41+
```mermaid
42+
sequenceDiagram
43+
participant T as Trigger
44+
participant C as CLI (app.Runner)
45+
participant I as IPC server
46+
participant S as Session controller
47+
participant A as Audio capture
48+
participant R as Riva stream
49+
participant O as Output committer
50+
51+
T->>C: sotto toggle
52+
C->>I: acquire socket / become owner
53+
C->>S: Run()
54+
S->>A: Start capture
55+
S->>R: Dial stream + send config
56+
A-->>R: audio chunks (20ms)
57+
58+
T->>C: sotto toggle (stop)
59+
C->>I: forward stop action
60+
I->>S: actionStop
61+
S->>R: close stream + collect transcript
62+
S->>O: Commit(transcript)
63+
O->>O: set clipboard
64+
O->>O: optional paste adapter
65+
S-->>C: Result (transcript/metrics/errors)
66+
```
67+
68+
## 4) Session state model
69+
70+
```mermaid
71+
stateDiagram-v2
72+
[*] --> idle
73+
idle --> recording: start
74+
recording --> transcribing: stop
75+
recording --> idle: cancel
76+
transcribing --> idle: transcribed
77+
transcribing --> error: stop/commit failure
78+
recording --> error: start failure
79+
error --> idle: reset
80+
```
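
The diagram translates naturally into a transition table. The sketch below is one possible encoding; the `State` and `Event` names, and the single `Fail` event standing in for both failure edges, are assumptions and not the contents of `internal/fsm`.

```go
package main

import "fmt"

type State string
type Event string

const (
	Idle         State = "idle"
	Recording    State = "recording"
	Transcribing State = "transcribing"
	Error        State = "error"
)

const (
	Start       Event = "start"
	Stop        Event = "stop"
	Cancel      Event = "cancel"
	Transcribed Event = "transcribed"
	Fail        Event = "fail" // covers "start failure" and "stop/commit failure"
	Reset       Event = "reset"
)

// transitions lists every edge in the state diagram; anything else is invalid.
var transitions = map[State]map[Event]State{
	Idle:         {Start: Recording},
	Recording:    {Stop: Transcribing, Cancel: Idle, Fail: Error},
	Transcribing: {Transcribed: Idle, Fail: Error},
	Error:        {Reset: Idle},
}

// next returns the state reached from current on event, or an error
// (leaving the state unchanged) when the diagram has no such edge.
func next(current State, event Event) (State, error) {
	to, ok := transitions[current][event]
	if !ok {
		return current, fmt.Errorf("invalid transition: %q on event %q", current, event)
	}
	return to, nil
}

func main() {
	state := Idle
	for _, ev := range []Event{Start, Stop, Transcribed} {
		var err error
		if state, err = next(state, ev); err != nil {
			panic(err)
		}
		fmt.Println(ev, "->", state)
	}
}
```

Keeping the table as plain data makes it straightforward to cover every edge, and every missing edge, with a table-driven test.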
81+
82+
## 5) External dependencies
83+
84+
- [PipeWire](https://pipewire.org/) for local audio capture backend
85+
- [NVIDIA Riva](https://developer.nvidia.com/riva) for local ASR serving
86+
- [NVIDIA Parakeet model family on Hugging Face](https://huggingface.co/models?search=nvidia%20parakeet)
87+
88+
## 6) Testing boundary policy
89+
90+
- Prefer real adapters/resources (temp files, unix sockets, `httptest`, PATH fixtures).
91+
- Avoid mock frameworks for in-repo behavior.
92+
- Full model inference remains local-manual verification (not CI).
