feat: learn from user label actions as AI classification hints by gentlemandev · Pull Request #1993 · elie222/inbox-zero

gentlemandev · 2026-03-22T20:23:48Z

Summary

Capture user label add/remove events from Gmail webhooks as sender classification feedback, injected as context hints into the AI rule-selection prompt. When a user drags an email into a folder in their mail client, that classification signal improves future AI decisions for that sender.

TLDR: The webhook pipeline already receives label events but drops non-SPAM ones. This PR records them and shows them to the AI as advisory context (not hard rules), enabling the system to learn from user behavior — especially for split senders like Amazon that send both receipts and marketing.

- New SenderClassification model stores individual events with threadId/messageId for auditability
- Subjects are fetched at prompt time via the batch Gmail API; no email content is stored in the DB (privacy)
- Self-labeling filter: skips events where Inbox Zero applied the same label, records user reclassifications
- Conversation-tracking rules (To Reply, Awaiting Reply, FYI, Actioned) are excluded via the existing shouldLearn config
- Unique constraint prevents duplicate rows from webhook retries
- Extracted a shared GMAIL_SYSTEM_LABELS constant and findRuleByLabelId helper to reduce duplication
- Shared fetchSenderFromMessage helper deduplicates ~30 lines of identical error handling
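The storage behaviour described above (one row per event, lowercase sender, retries deduplicated by a unique key) can be sketched in memory. This is a minimal illustration, not the PR's Prisma code: `InMemoryClassificationStore`, `LabelEventRow`, and the `LABEL_ADDED` name are assumptions made for the example, and the key mirrors the unique constraint quoted later in this PR (emailAccountId, sender, ruleId, messageId, eventType).

```typescript
// Hypothetical in-memory stand-in for the SenderClassification table.
// LABEL_ADDED is an assumed name; only LABEL_REMOVED appears in the PR text.
type LabelEventType = "LABEL_ADDED" | "LABEL_REMOVED";

interface LabelEventRow {
  emailAccountId: string;
  sender: string;
  ruleId: string;
  threadId: string;
  messageId: string;
  eventType: LabelEventType;
}

class InMemoryClassificationStore {
  private rows: Record<string, LabelEventRow> = {};

  // Returns true if a new row was created, false if the event was a duplicate
  // (for example a webhook retry for the same message, rule, and event type).
  save(event: LabelEventRow): boolean {
    const row: LabelEventRow = {
      ...event,
      sender: event.sender.trim().toLowerCase(),
    };
    // Same columns as the unique constraint; threadId is stored but not keyed.
    const key = JSON.stringify([
      row.emailAccountId,
      row.sender,
      row.ruleId,
      row.messageId,
      row.eventType,
    ]);
    const isNew = !Object.prototype.hasOwnProperty.call(this.rows, key);
    this.rows[key] = row;
    return isNew;
  }

  // All recorded events for one sender within one account.
  forSender(emailAccountId: string, sender: string): LabelEventRow[] {
    const normalized = sender.trim().toLowerCase();
    return Object.keys(this.rows)
      .map((key) => this.rows[key])
      .filter(
        (row) =>
          row.emailAccountId === emailAccountId && row.sender === normalized,
      );
  }
}
```

Saving the same event twice leaves one row, while an add and a later removal on the same message are kept as two separate rows, which is what keeps the history auditable rather than aggregated.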

Test plan

- Verify label-add webhook for non-SPAM labels creates SenderClassification rows
- Verify SPAM label-add still triggers cold email learning (unchanged behavior)
- Verify self-labeling filter: system-applied labels are not recorded as classification feedback
- Verify label-removal records LABEL_REMOVED classification alongside existing exclusion GroupItem
- Verify AI prompt includes sender classifications when they exist for the sender
- Verify AI prompt is unmodified when no classifications exist
- Verify conversation-tracking rules (TO_REPLY, FYI, etc.) are excluded from feedback
- Verify webhook retry deduplication via unique constraint

🤖 Generated with Claude Code

Capture Gmail label add/remove events as sender classification feedback and inject them as context hints into the AI rule-selection prompt. When a user drags an email to a folder in their mail client, that classification signal is now recorded and used to improve future AI decisions for that sender.

- New SenderClassification table stores individual events with threadId/messageId for auditability (not aggregated counts)
- Subjects fetched at prompt time via batch Gmail API (no email content stored in DB for privacy)
- Self-labeling filter prevents feedback loop from system-applied labels
- Conversation-tracking rules (To Reply, FYI, etc.) excluded via existing shouldLearn config
- Unique constraint prevents duplicate rows from webhook retries
- Shared GMAIL_SYSTEM_LABELS constant and findRuleByLabelId helper extracted to reduce duplication across webhook handlers

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

vercel · 2026-03-22T20:23:54Z

The latest updates on your projects. Learn more about Vercel for GitHub.

1 Skipped Deployment

Project	Deployment	Actions	Updated (UTC)
inbox-zero	Ignored	Preview	Mar 23, 2026 10:15am

cubic-dev-ai

4 issues found across 10 files

Prompt for AI agents (unresolved issues)


Check if these issues are valid — if so, understand the root cause of each and fix them. If appropriate, use sub-agents to investigate and fix each issue separately.


<file name="apps/web/prisma/schema.prisma">

<violation number="1" location="apps/web/prisma/schema.prisma:690">
P3: Remove the redundant `@@index([emailAccountId, sender])` because it is already covered by the composite unique index.

(Based on your team's feedback about avoiding redundant indexes when a unique constraint already provides the same indexing.) [FEEDBACK_USED]</violation>
</file>

<file name="apps/web/utils/rule/sender-classification.ts">

<violation number="1" location="apps/web/utils/rule/sender-classification.ts:130">
P2: Escape subject and rule name before injecting them into the XML-like prompt block.</violation>

<violation number="2" location="apps/web/utils/rule/sender-classification.ts:164">
P1: Avoid ambiguous `findFirst` for label-to-rule mapping; handle multiple matches explicitly to prevent misclassification.</violation>
</file>

<file name="apps/web/app/api/google/webhook/process-label-added-event.ts">

<violation number="1" location="apps/web/app/api/google/webhook/process-label-added-event.ts:180">
P1: The self-labeling check is too broad: any past system-applied label on the message causes future user re-adds of that label to be skipped.</violation>
</file>

Reply with feedback, questions, or to request a fix. Tag @cubic-dev-ai to re-run a review.

cubic-dev-ai · 2026-03-22T20:30:19Z

apps/web/utils/rule/classification-feedback.ts

+  labelId: string;
+  emailAccountId: string;
+}) {
+  return prisma.rule.findFirst({


P1: Avoid ambiguous findFirst for label-to-rule mapping; handle multiple matches explicitly to prevent misclassification.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At apps/web/utils/rule/sender-classification.ts, line 164: <comment>Avoid ambiguous `findFirst` for label-to-rule mapping; handle multiple matches explicitly to prevent misclassification.</comment> <file context> @@ -0,0 +1,176 @@ + labelId: string; + emailAccountId: string; +}) { + return prisma.rule.findFirst({ + where: { + emailAccountId, </file context>

gentlemandev · 2026-03-22

This is consistent with the existing codebase pattern: process-label-removed-event.ts uses the same findFirst query for the same label-to-rule lookup. In practice, a labelId maps to a single rule because each rule creates and owns its label. If multiple rules ever share a label, both the existing and the new code would need updating.

cubic-dev-ai · 2026-03-22

Thanks for the feedback! I've saved this as a new learning to improve future reviews.
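For reference, the reviewer's suggestion of handling multiple matches explicitly could look roughly like the sketch below. It is not what the PR does (the PR keeps `findFirst`, as explained in the reply), and it runs over an in-memory array instead of Prisma. `RuleLabelRef`, `RuleLookup`, and `resolveRuleForLabel` are hypothetical names introduced for the example.

```typescript
// Hypothetical, simplified view of a rule and the label IDs its actions apply.
interface RuleLabelRef {
  id: string;
  name: string;
  emailAccountId: string;
  labelIds: string[];
}

type RuleLookup =
  | { kind: "none" }
  | { kind: "match"; rule: RuleLabelRef }
  | { kind: "ambiguous"; rules: RuleLabelRef[] };

// Collects every rule that owns the label instead of taking the first hit,
// so the caller can decide what to do when the mapping is not one-to-one.
function resolveRuleForLabel(
  rules: RuleLabelRef[],
  labelId: string,
  emailAccountId: string,
): RuleLookup {
  const matches = rules.filter(
    (rule) =>
      rule.emailAccountId === emailAccountId &&
      rule.labelIds.indexOf(labelId) !== -1,
  );
  if (matches.length === 0) return { kind: "none" };
  if (matches.length === 1) return { kind: "match", rule: matches[0] };
  return { kind: "ambiguous", rules: matches };
}
```

A caller could skip recording feedback on `ambiguous` and log the rule IDs. Under the author's stated invariant that each rule owns its label, that branch would never be taken.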

cubic-dev-ai · 2026-03-22T20:30:19Z

apps/web/app/api/google/webhook/process-label-added-event.ts

+  if (!isEligibleForClassificationFeedback(rule.systemType)) return;
+
+  // Self-labeling filter: skip if Inbox Zero already applied this label
+  const systemApplied = await wasLabelAppliedBySystem({


P1: The self-labeling check is too broad: any past system-applied label on the message causes future user re-adds of that label to be skipped.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At apps/web/app/api/google/webhook/process-label-added-event.ts, line 180: <comment>The self-labeling check is too broad: any past system-applied label on the message causes future user re-adds of that label to be skipped.</comment> <file context> @@ -152,3 +154,73 @@ export async function handleLabelAddedEvent( + if (!isEligibleForClassificationFeedback(rule.systemType)) return; + + // Self-labeling filter: skip if Inbox Zero already applied this label + const systemApplied = await wasLabelAppliedBySystem({ + messageId, + emailAccountId, </file context>

gentlemandev · 2026-03-22

This is an intentional trade-off. The scenario (system applies label → user removes → user re-adds same label) is rare, and the LABEL_REMOVED event from step 2 already captures the user's disagreement. Without the filter, every system-applied label would produce a feedback row, and the AI's own decisions would inflate the classification counts in a self-reinforcing loop, which is much worse.

cubic-dev-ai · 2026-03-22

Thanks for the feedback! I've saved this as a new learning to improve future reviews.
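The trade-off discussed in this thread can be made concrete with a small decision function. This is a simplified, synchronous sketch under assumptions: `decideLabelAddedFeedback`, `LabelAddedContext`, and the contents of `ASSUMED_SYSTEM_LABELS` are illustrative stand-ins, and the real handler performs asynchronous lookups (`isEligibleForClassificationFeedback`, `wasLabelAppliedBySystem`) whose implementations are not shown in this PR page.

```typescript
type FeedbackDecision =
  | "record"
  | "skip_system_label"
  | "skip_ineligible_rule"
  | "skip_self_applied";

interface LabelAddedContext {
  labelId: string;
  messageId: string;
  // Result of a shouldLearn-style eligibility check on the matched rule.
  ruleEligible: boolean;
  // messageId -> label IDs that Inbox Zero itself applied at any point.
  appliedBySystem: Record<string, string[]>;
}

// Assumed subset of Gmail system label IDs; the PR's GMAIL_SYSTEM_LABELS may differ.
const ASSUMED_SYSTEM_LABELS = ["INBOX", "SPAM", "TRASH", "UNREAD", "STARRED"];

function decideLabelAddedFeedback(ctx: LabelAddedContext): FeedbackDecision {
  if (ASSUMED_SYSTEM_LABELS.indexOf(ctx.labelId) !== -1) return "skip_system_label";
  if (!ctx.ruleEligible) return "skip_ineligible_rule";
  const applied = ctx.appliedBySystem[ctx.messageId] || [];
  // Broad by design: any past system application of this label on this
  // message suppresses the event, including a later user re-add.
  if (applied.indexOf(ctx.labelId) !== -1) return "skip_self_applied";
  return "record";
}
```

With this shape, the reviewer's scenario (system applies, user removes, user re-adds) ends in `skip_self_applied`, which is the behaviour the author accepts in exchange for never counting the system's own labels as user feedback.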

cubic-dev-ai · 2026-03-22T20:30:19Z

apps/web/utils/rule/sender-classification.ts

+  const lines: string[] = [];
+
+  for (const classification of classifications) {
+    const ruleName = classification.rule?.name ?? "Unknown";


P2: Escape subject and rule name before injecting them into the XML-like prompt block.

Prompt for AI agents

Check if this issue is valid — if so, understand the root cause and fix it. At apps/web/utils/rule/sender-classification.ts, line 130: <comment>Escape subject and rule name before injecting them into the XML-like prompt block.</comment> <file context> @@ -0,0 +1,176 @@ + const lines: string[] = []; + + for (const classification of classifications) { + const ruleName = classification.rule?.name ?? "Unknown"; + const subject = subjects.get(classification.messageId); + </file context>

gentlemandev · 2026-03-22

The existing codebase already injects raw email subjects and body content into LLM prompts without escaping (see stringifyEmail in ai-choose-rule.ts). The PROMPT_SECURITY_INSTRUCTIONS handle prompt injection concerns at the model level. Adding escaping here would be inconsistent with the rest of the prompt construction and wouldn't provide additional security, since these XML-like tags are LLM prompt structure and are not parsed by an XML processor.

cubic-dev-ai · 2026-03-22

Thanks for the feedback! I've updated an existing learning with this new information.
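For readers weighing the two positions, the escaping the reviewer asked for would amount to something like the sketch below. The PR does not adopt it. `escapePromptText`, `renderFeedbackItem`, and the `<item>` tag are hypothetical names chosen for illustration, not identifiers from the codebase.

```typescript
// Replaces the characters that could open or close a tag or attribute value.
// "&" is handled first so already-produced entities are not double-escaped.
function escapePromptText(value: string): string {
  return value
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Renders one feedback entry with both user-controlled strings escaped.
function renderFeedbackItem(ruleName: string, subject: string): string {
  return `<item rule="${escapePromptText(ruleName)}">${escapePromptText(subject)}</item>`;
}
```

Escaping only stops a subject from closing the surrounding tag early. It does not prevent instructions inside the subject from being read by the model, which is the part the author delegates to PROMPT_SECURITY_INSTRUCTIONS.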

apps/web/prisma/schema.prisma

The @@unique([emailAccountId, sender, ruleId, messageId, eventType]) already serves as a prefix index for (emailAccountId, sender) queries in PostgreSQL.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

- Update mocks for GMAIL_SYSTEM_LABELS, fetchSenderFromMessage, findRuleByLabelId, and sender-classification dependencies
- Fix "should skip non-SPAM labels" test to reflect new behavior (system labels skip, non-system labels now record classification)
- Update label-removed tests to use findRuleByLabelId mock

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Unit tests (sender-classification.test.ts):
- saveSenderClassification: lowercase normalization, upsert dedup
- getSenderClassificationsForPrompt: null when empty, subject formatting, LABEL_REMOVED formatting, deleted messages, batch fetch failure
- findRuleByLabelId: match and no-match cases

Eval tests (sender-classification-hint.test.ts):
- Split sender (Amazon): receipt vs marketing with classification history; the AI correctly uses hints to distinguish
- Split sender (Google): calendar vs notification
- Correction signal: label removal steers away from wrong rule
- Strong consistent history reinforces correct classification
- Hint does NOT override clear email content (personal email stays Conversations despite notification history)
- Baseline: same email without hint still works correctly

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Add 5 new eval cases for sender classification hints:
- Move-to-label pattern: remove+add from user moving email between folders (Notion changelog: Newsletter → Notification)
- Weak signal: single data point should not override clear content (Cal.com booking is Calendar despite 1 Newsletter classification)
- Contradictory history: user went back and forth, AI relies on content (Figma Config event → Marketing)
- Hint should NOT override content: real receipt from sender with mostly marketing history (Uber trip receipt)
- Split SaaS sender: Stripe sends receipts, notifications, and marketing (payout notification with mixed history)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

The table stores feedback from user label actions, not classifications of senders. The new name better describes what the data is for (improving classification) rather than what it's about (a sender).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

aiChooseRule now accepts ClassificationFeedbackItem[] and formats the prompt internally. Tests pass structured data instead of hardcoded prompt strings, so changing the prompt format doesn't break tests.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
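The refactor described in this commit, where callers pass structured items and the prompt text is produced in one place, might look like the following sketch. The field names on `ClassificationFeedbackItem`, the `LABEL_ADDED` value, the `<classification_feedback>` tag, and the line wording are all assumptions. Only the type name, the LABEL_REMOVED event, the null-when-empty behaviour, and the handling of deleted messages are mentioned in this PR.

```typescript
// Assumed shape; the PR names the type but does not show its fields here.
interface ClassificationFeedbackItem {
  ruleName: string;
  eventType: "LABEL_ADDED" | "LABEL_REMOVED";
  // Missing when the message was deleted or the batch subject fetch failed.
  subject?: string;
}

// Returns null for an empty history so the caller can leave the prompt unmodified.
function formatClassificationFeedback(
  items: ClassificationFeedbackItem[],
): string | null {
  if (items.length === 0) return null;
  const lines = items.map((item) => {
    const subject = item.subject ? `"${item.subject}"` : "(subject unavailable)";
    const verb = item.eventType === "LABEL_ADDED" ? "added to" : "removed from";
    return `- ${subject} was ${verb} ${item.ruleName}`;
  });
  return ["<classification_feedback>", ...lines, "</classification_feedback>"].join("\n");
}
```

Because tests construct `ClassificationFeedbackItem` values rather than prompt strings, the wording inside this function can change without touching the eval cases.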

cubic-dev-ai bot reviewed Mar 22, 2026

View reviewed changes

elie222 and others added 6 commits March 22, 2026 21:11

chore: remove redundant index covered by unique constraint

9f5f1d1


gentlemandev wants to merge 7 commits into main from feat/sender-classification-feedback
