Commit 61d117e

comp615 and ampcode-com authored
feat: add Pi-Agent detection via process name and breadcrumb scanning (#21)
_This PR description was generated by AMP._ 🤖

## Summary

Adds detection for three new AI agents and fixes a false-positive matching issue for short process names.

## Changes

### New agents (`src/agent.rs`)

**Pi-Agent** (process + breadcrumbs):
- Process: `pi` (exact match only, see below)
- Breadcrumbs: `~/.pi/agent/sessions/*.jsonl`
- Email: `Pi <noreply@pi.dev>`
- ℹ️ Pi is a Node.js CLI: `process.name()` returns `"node"`, not `"pi"`. Detection works via `basename(argv[0])`, which is `"pi"` (npm bin shim). [Verified on a live Pi session](#21).

**Copilot CLI** (process + breadcrumbs):
- Process: `copilot` (distinct from `copilot-agent` VS Code extension)
- Breadcrumbs: `~/.copilot/session-state/{session-id}/events.jsonl`
- Email: `Copilot <223556219+Copilot@users.noreply.github.com>` ([verified](github/copilot-cli#975))
- ⚠️ Placed after `copilot-agent`: `find_by_name` uses `contains()`, so the more specific entry must match first

**OpenCode** (process only):
- Process: `opencode`
- Email: `opencode <noreply@opencode.ai>` ([verified](anomalyco/opencode#786))
- No breadcrumbs: sessions are stored in SQLite (`~/.local/share/opencode/opencode.db`), TODO for future

### Bug fix: `exact_process_match` flag

Added `exact_process_match: bool` to the `Agent` struct. When `true`, `find_by_name` requires the basename to equal the process name exactly instead of using `contains()`.

**Problem**: The 2-character process name `pi` caused false positives, e.g., bash's `argv[1]` containing `pipefail` would match `"pipefail".contains("pi")`.

**Fix**: Pi-Agent is the only agent with `exact_process_match: true`. All other agents retain the default `false` (backward compatible).

Also added `Agent::default()` const fn so entries in `KNOWN_AGENTS` only need to specify fields that differ from defaults, keeping the list concise.

### `README.md`

- Updated breadcrumb examples and limitations to include Pi, Copilot CLI, and OpenCode

## Testing

```bash
just run --debug  # Verify Pi no longer false-positive matches on "pipefail" etc.
just run          # Only detects actually running agents
```

---------

Co-authored-by: Amp <amp@ampcode.com>
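The false positive described in the PR text can be reproduced in isolation. The following is a minimal sketch, not the project's code: `argv0_basename` and `name_matches` are hypothetical helper names, and the assumption is that the scanner lowercases a path basename before comparing it with a configured process name.

```rust
use std::path::Path;

/// Return the lowercased final path component of argv[0], if there is one.
/// Hypothetical helper: the real scanner may obtain the basename differently.
fn argv0_basename(argv: &[&str]) -> Option<String> {
    let first = *argv.first()?;
    Path::new(first)
        .file_name()
        .and_then(|name| name.to_str())
        .map(|name| name.to_lowercase())
}

/// Compare a lowercased basename against one configured process name.
/// `exact` plays the role of the `exact_process_match` flag from the PR.
fn name_matches(basename_lower: &str, process_name: &str, exact: bool) -> bool {
    if exact {
        basename_lower == process_name
    } else {
        basename_lower.contains(process_name)
    }
}
```

With substring matching, `name_matches("pipefail", "pi", false)` is true, which is the reported false positive. With `exact` set, only a basename equal to `pi` matches, for example the one derived from an npm bin shim path such as `/usr/local/bin/pi` (an illustrative path, not one taken from the PR).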
1 parent 19527a7 commit 61d117e

2 files changed: +61 -32 lines changed

README.md

Lines changed: 2 additions & 2 deletions
@@ -7,7 +7,7 @@ It finds agents in four ways:
 1. It checks for agent-specific environment variables.
 2. It walks its own process ancestry, under the assumption that the git commit was initiated by an agent.
 3. It walks up the process tree and checks all descendants of siblings at each level, looking for agents working in the same repository.
-4. It checks agent-specific state files ("breadcrumbs") to determine if an agent was recently active in this repo (e.g. `~/.claude/projects/`, `~/.codex/sessions/`).
+4. It checks agent-specific state files ("breadcrumbs") to determine if an agent was recently active in this repo (e.g. `~/.claude/projects/`, `~/.codex/sessions/`, `~/.pi/agent/sessions/`).

 Multiple agents can be attributed in a single commit. Results are deduplicated by email address.

@@ -62,7 +62,7 @@ ln -s /usr/local/bin/aittributor .git/hooks/prepare-commit-msg

 ## Known limitations

-**Process detection is not always possible.** Agents may exit before the commit runs, or use process names that don't match (e.g. Electron-based desktop apps). When process scanning fails, aittributor falls back to agent session history, checking state files for recent activity in the same repo. This fallback only works for agents that write state files (currently Claude and Codex), and it cannot distinguish between an agent that wrote the code being committed and one that was only used for research. The result is a bias toward over-attribution, which is a deliberate tradeoff as undercounting real AI usage is harder to correct after the fact than occasional overcounting.
+**Process detection is not always possible.** Agents may exit before the commit runs, or use process names that don't match (e.g. Electron-based desktop apps). When process scanning fails, aittributor falls back to agent session history, checking state files for recent activity in the same repo. This fallback only works for agents that write scannable state files (currently Claude, Codex, Copilot CLI, and Pi). Some agents like OpenCode store sessions in SQLite, which is not yet supported by the breadcrumb scanner, and it cannot distinguish between an agent that wrote the code being committed and one that was only used for research. The result is a bias toward over-attribution, which is a deliberate tradeoff as undercounting real AI usage is harder to correct after the fact than occasional overcounting.

 **Agent-initiated commits are the most reliable.** Attribution is most accurate when the agent itself runs `git commit`. Manual commits while an agent session is open (or recently closed) are the main source of attribution that may not reflect actual code contribution.
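The README states that results are deduplicated by email address. A minimal sketch of that step follows, assuming attributions are plain `"Name <addr>"` strings. The body of `extract_email_addr` is a guess (the diff only shows its signature and doc comment), and `dedup_by_email` is a hypothetical name.

```rust
use std::collections::HashSet;

/// Extract the bare address from a "Name <addr>" string.
/// Assumption: when no angle brackets are present, the trimmed input is returned.
fn extract_email_addr(email: &str) -> &str {
    match (email.find('<'), email.rfind('>')) {
        (Some(start), Some(end)) if start < end => &email[start + 1..end],
        _ => email.trim(),
    }
}

/// Keep the first attribution seen for each distinct address, preserving order.
/// Hypothetical helper, not taken from the project.
fn dedup_by_email<'a>(attributions: &[&'a str]) -> Vec<&'a str> {
    let mut seen: HashSet<&'a str> = HashSet::new();
    attributions
        .iter()
        .copied()
        // `insert` returns false for an address that was already recorded.
        .filter(|&entry| seen.insert(extract_email_addr(entry)))
        .collect()
}
```

Two entries that share an address but differ in display name collapse to the first one, which is one plausible reading of "deduplicated by email address".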

src/agent.rs

Lines changed: 59 additions & 30 deletions
@@ -6,89 +6,111 @@ pub struct Agent {
     pub email: &'static str,
     pub breadcrumb_dir: Option<&'static str>,
     pub breadcrumb_ext: Option<&'static str>,
+    /// When true, process_names must match the basename exactly (not as a substring).
+    /// Use for short names like "pi" that would otherwise false-positive on "pipefail" etc.
+    pub exact_process_match: bool,
 }

 pub const KNOWN_AGENTS: &[Agent] = &[
     Agent {
         process_names: &["claude"],
-        env_vars: &[],
         email: "Claude Code <noreply@anthropic.com>",
         breadcrumb_dir: Some(".claude/projects"),
         breadcrumb_ext: Some("jsonl"),
+        ..Agent::default()
     },
     Agent {
         process_names: &["goose"],
-        env_vars: &[],
         email: "Goose <opensource@block.xyz>",
-        breadcrumb_dir: None,
-        breadcrumb_ext: None,
+        ..Agent::default()
     },
     Agent {
         process_names: &["cursor", "cursor-agent"],
-        env_vars: &[],
         email: "Cursor <cursoragent@cursor.com>",
-        breadcrumb_dir: None,
-        breadcrumb_ext: None,
+        ..Agent::default()
     },
     Agent {
         process_names: &["aider"],
-        env_vars: &[],
         email: "Aider <noreply@aider.chat>",
-        breadcrumb_dir: None,
-        breadcrumb_ext: None,
+        ..Agent::default()
     },
     Agent {
         process_names: &["windsurf"],
-        env_vars: &[],
         email: "Windsurf <noreply@codeium.com>",
-        breadcrumb_dir: None,
-        breadcrumb_ext: None,
+        ..Agent::default()
     },
     Agent {
         process_names: &["codex"],
-        env_vars: &[],
         email: "Codex <noreply@openai.com>",
         breadcrumb_dir: Some(".codex/sessions"),
         breadcrumb_ext: Some("jsonl"),
+        ..Agent::default()
     },
     Agent {
         process_names: &["copilot-agent"],
-        env_vars: &[],
         email: "GitHub Copilot <noreply@github.com>",
-        breadcrumb_dir: None,
-        breadcrumb_ext: None,
+        ..Agent::default()
+    },
+    // Copilot CLI is a separate terminal agent from the VS Code extension (copilot-agent above).
+    // Must appear after copilot-agent since find_by_name uses contains() and "copilot" would
+    // otherwise shadow the more specific "copilot-agent" match.
+    Agent {
+        process_names: &["copilot"],
+        email: "Copilot <223556219+Copilot@users.noreply.github.com>",
+        // Sessions stored as JSONL event logs in ~/.copilot/session-state/{session-id}/events.jsonl
+        breadcrumb_dir: Some(".copilot/session-state"),
+        breadcrumb_ext: Some("jsonl"),
+        ..Agent::default()
     },
     Agent {
         process_names: &["amazon-q"],
-        env_vars: &[],
         email: "Amazon Q Developer <noreply@amazon.com>",
-        breadcrumb_dir: None,
-        breadcrumb_ext: None,
+        ..Agent::default()
     },
     Agent {
         process_names: &["amp"],
-        env_vars: &[],
         email: "Amp <amp@ampcode.com>",
-        breadcrumb_dir: None,
-        breadcrumb_ext: None,
+        ..Agent::default()
     },
     Agent {
-        process_names: &[],
         env_vars: &[("CLINE_ACTIVE", "true")],
         email: "Cline <noreply@cline.bot>",
-        breadcrumb_dir: None,
-        breadcrumb_ext: None,
+        ..Agent::default()
     },
     Agent {
         process_names: &["gemini"],
-        env_vars: &[],
         email: "Gemini CLI Agent <gemini-cli-agent@google.com>",
-        breadcrumb_dir: None,
-        breadcrumb_ext: None,
+        ..Agent::default()
+    },
+    Agent {
+        process_names: &["pi"],
+        email: "Pi <noreply@pi.dev>",
+        breadcrumb_dir: Some(".pi/agent/sessions"),
+        breadcrumb_ext: Some("jsonl"),
+        exact_process_match: true,
+        ..Agent::default()
+    },
+    // TODO: OpenCode sessions are stored in SQLite (~/.local/share/opencode/opencode.db),
+    // not flat files. Breadcrumb scanning would require a new SQLite-based strategy.
+    Agent {
+        process_names: &["opencode"],
+        email: "opencode <noreply@opencode.ai>",
+        ..Agent::default()
     },
 ];

 impl Agent {
+    const fn default() -> Self {
+        Agent {
+            process_names: &[],
+            env_vars: &[],
+            email: "",
+            breadcrumb_dir: None,
+            breadcrumb_ext: None,
+            exact_process_match: false,
+        }
+    }
+
     /// Extract the bare email address from a "Name <addr>" string.
     /// e.g. "Claude Code <noreply@anthropic.com>" → "noreply@anthropic.com"
     pub fn extract_email_addr(email: &str) -> &str {
@@ -105,7 +127,14 @@ impl Agent {
         let basename_lower = basename.to_lowercase();

         KNOWN_AGENTS.iter().find(|agent| {
-            !agent.process_names.is_empty() && agent.process_names.iter().any(|&pn| basename_lower.contains(pn))
+            !agent.process_names.is_empty()
+                && agent.process_names.iter().any(|&pn| {
+                    if agent.exact_process_match {
+                        basename_lower == pn
+                    } else {
+                        basename_lower.contains(pn)
+                    }
+                })
         })
     }
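The diff combines two ideas: a `const fn default()` used with struct-update syntax inside a `const` table, and first-match lookup where entry order matters. The sketch below is a trimmed, standalone approximation rather than the project's code. The struct keeps only three of the fields, `AGENTS` is a hypothetical three-entry table, and `find_by_name` is simplified from the version in the diff.

```rust
struct Agent {
    process_names: &'static [&'static str],
    email: &'static str,
    exact_process_match: bool,
}

impl Agent {
    /// A `const fn`, so it can serve as the base of `..Agent::default()` in a const item.
    const fn default() -> Self {
        Agent { process_names: &[], email: "", exact_process_match: false }
    }
}

// Hypothetical reduced table; entries are checked in order and the first match wins.
const AGENTS: &[Agent] = &[
    Agent {
        process_names: &["copilot-agent"],
        email: "GitHub Copilot <noreply@github.com>",
        ..Agent::default()
    },
    Agent {
        process_names: &["copilot"],
        email: "Copilot <223556219+Copilot@users.noreply.github.com>",
        ..Agent::default()
    },
    Agent {
        process_names: &["pi"],
        email: "Pi <noreply@pi.dev>",
        exact_process_match: true,
        ..Agent::default()
    },
];

/// Return the first agent whose process name matches the lowercased basename.
fn find_by_name(basename: &str) -> Option<&'static Agent> {
    let basename_lower = basename.to_lowercase();
    AGENTS.iter().find(|agent| {
        agent.process_names.iter().any(|&pn| {
            if agent.exact_process_match {
                basename_lower == pn
            } else {
                basename_lower.contains(pn)
            }
        })
    })
}
```

Swapping the first two entries would make a `copilot-agent` process resolve to the Copilot CLI entry, because `"copilot-agent".contains("copilot")` is true. That is the ordering constraint the code comment in the diff warns about.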
