Commit ba215da

Add docs support
1 parent b225f0d commit ba215da

24 files changed: +3596 -17 lines changed
Lines changed: 160 additions & 0 deletions
@@ -0,0 +1,160 @@
# help-scout-mcp-server: Caching Strategy

> Source: https://github.com/drewburchfield/help-scout-mcp-server
> Analyzed: 2026-02-25

## Overview

In-process LRU cache with SHA-256 key generation, endpoint-aware TTLs, and no PII-aware eviction. Raw API responses are cached before any redaction is applied.

---

## Architecture

```
HelpScout API → HelpScoutClient.get() → cache.set(raw) → tool handler → redact → MCP response

cache.get(raw) → tool handler → redact → MCP response (cache hit)
```

- **Library**: `lru-cache` (npm)
- **Storage**: In-process memory. No disk persistence. Lost on restart.
- **Singleton**: One `Cache` instance exported from `src/utils/cache.ts`, shared across all tool/resource handlers.

---

## Configuration

| Setting | Env var | Default | Unit |
|---------|---------|---------|------|
| TTL | `CACHE_TTL_SECONDS` | 300 | seconds |
| Max entries | `MAX_CACHE_SIZE` | 10,000 | items |

Both configurable via env vars. Applied at startup, no runtime changes.

---

## Key generation

```typescript
generateKey(prefix: string, data?: unknown): string {
  const hash = crypto.createHash('sha256');
  hash.update(JSON.stringify({ prefix, data }));
  return hash.digest('hex');
}
```

- `prefix`: API endpoint path (e.g., `/conversations/123`); the read path passes it as `GET:${endpoint}`
- `data`: query params object
- SHA-256 of `JSON.stringify({prefix, data})` → hex string
- Deterministic: same endpoint + same params = same cache key
- No namespace isolation between tools/resources
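
One caveat to the determinism point above: `JSON.stringify` serializes object properties in insertion order, so two params objects with the same contents but different key order hash to different cache keys. The sketch below reproduces the documented `generateKey` and adds a `canonicalize` helper that is purely hypothetical (it is not part of the analyzed repo, and whether the repo normalizes key order elsewhere was not checked).

```typescript
import { createHash } from "node:crypto";

// Mirrors the documented generateKey: SHA-256 over JSON.stringify({ prefix, data }).
function generateKey(prefix: string, data?: unknown): string {
  return createHash("sha256").update(JSON.stringify({ prefix, data })).digest("hex");
}

// Hypothetical helper (an assumption, not from the repo): recursively sort
// object keys so that key order no longer affects the serialized form.
function canonicalize(value: unknown): unknown {
  if (Array.isArray(value)) return value.map(canonicalize);
  if (value !== null && typeof value === "object") {
    const source = value as Record<string, unknown>;
    const sorted: Record<string, unknown> = {};
    for (const key of Object.keys(source).sort()) {
      sorted[key] = canonicalize(source[key]);
    }
    return sorted;
  }
  return value;
}

const prefix = "GET:/conversations";
const sameOrderA = generateKey(prefix, { status: "active", page: 1 });
const sameOrderB = generateKey(prefix, { status: "active", page: 1 });
const swappedOrder = generateKey(prefix, { page: 1, status: "active" });
const canonicalA = generateKey(prefix, canonicalize({ status: "active", page: 1 }));
const canonicalB = generateKey(prefix, canonicalize({ page: 1, status: "active" }));

console.log(sameOrderA === sameOrderB);   // true: identical input, identical key
console.log(sameOrderA === swappedOrder); // false: key order changes the hash
console.log(canonicalA === canonicalB);   // true: sorted keys restore determinism
```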

---

## TTL strategy (endpoint-aware)

The `HelpScoutClient.get()` method selects TTL based on endpoint pattern:

| Endpoint pattern | TTL | Rationale |
|-----------------|-----|-----------|
| `/mailboxes*` | 1440s (noted as 24 hours, but 1440 s is 24 minutes; 24 hours would be 86,400 s) | Mailbox config changes rarely |
| `/conversations*` | 300s (5 min) | Conversations update frequently |
| `/conversations/*/threads*` | 300s (5 min) | Threads update frequently |
| Everything else | 300s (5 min) | Default |

Custom TTL can be passed per-call via `cacheOptions.ttl` parameter, though no tool currently overrides the defaults.
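
A minimal sketch of how such endpoint-aware selection could look, assuming a first-match rule list and a per-call override that takes precedence. The names `TTL_RULES`, `DEFAULT_TTL_SECONDS`, and `selectTtl` are illustrative and not taken from the repo; the mailbox value is copied from the table above as listed. Because threads live under `/conversations/`, the conversations rule also covers them.

```typescript
// Hypothetical rule table mirroring the documented patterns; values in seconds.
const TTL_RULES: Array<{ pattern: RegExp; ttlSeconds: number }> = [
  { pattern: /^\/mailboxes/, ttlSeconds: 1440 },    // value as listed in the table
  { pattern: /^\/conversations/, ttlSeconds: 300 }, // also matches .../threads
];
const DEFAULT_TTL_SECONDS = 300;

// Returns the per-call override when given, else the first matching rule,
// else the default.
function selectTtl(endpoint: string, override?: number): number {
  if (override !== undefined) return override;
  const rule = TTL_RULES.find((candidate) => candidate.pattern.test(endpoint));
  return rule ? rule.ttlSeconds : DEFAULT_TTL_SECONDS;
}

console.log(selectTtl("/mailboxes"));                 // 1440
console.log(selectTtl("/conversations/123/threads")); // 300
console.log(selectTtl("/users"));                     // 300
console.log(selectTtl("/mailboxes", 60));             // 60
```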

---

## Cache lifecycle

### Read path
```typescript
async get<T>(endpoint, params, cacheOptions): Promise<T> {
  const cacheKey = `GET:${endpoint}`;
  const cachedResult = cache.get<T>(cacheKey, params);
  if (cachedResult) {
    logger.debug(`Cache hit: ${endpoint}`);
    return cachedResult; // raw API data, no redaction
  }
  // ... fetch from API
}
```

### Write path
```typescript
cache.set(cacheKey, params, response.data, { ttl: determinedTTL });
```

### Eviction
- LRU eviction when max size (10,000) reached
- TTL-based expiry per entry
- `cache.clear()` method exists but is never called in application code
- No manual invalidation on write operations (e.g., replying to a thread doesn't invalidate cached thread list)
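
The eviction rules above can be illustrated with a small hand-rolled cache. This is a stand-in for illustration only: the analyzed server uses the `lru-cache` package, and `TinyLruTtlCache`, its injectable clock, and its `delete` method are assumptions of this sketch. The `delete` call shows the kind of explicit invalidation a write operation could trigger but, per the list above, currently does not.

```typescript
// Simplified LRU + TTL cache built on Map insertion order (illustrative only).
class TinyLruTtlCache<V> {
  private readonly entries = new Map<string, { value: V; expiresAt: number }>();
  private readonly maxEntries: number;
  private readonly now: () => number;

  constructor(maxEntries: number, now: () => number = () => Date.now()) {
    this.maxEntries = maxEntries;
    this.now = now;
  }

  set(key: string, value: V, ttlSeconds: number): void {
    this.entries.delete(key); // re-insert so the key becomes most recently used
    this.entries.set(key, { value, expiresAt: this.now() + ttlSeconds * 1000 });
    if (this.entries.size > this.maxEntries) {
      // Map iterates in insertion order, so the first key is least recently used.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  get(key: string): V | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (entry.expiresAt <= this.now()) {
      this.entries.delete(key); // TTL expiry
      return undefined;
    }
    this.entries.delete(key); // refresh recency on read
    this.entries.set(key, entry);
    return entry.value;
  }

  // Explicit invalidation, e.g. after a write to the underlying resource.
  delete(key: string): boolean {
    return this.entries.delete(key);
  }
}

let clock = 0; // fake clock in milliseconds
const demoCache = new TinyLruTtlCache<string>(2, () => clock);
demoCache.set("a", "A", 300);
demoCache.set("b", "B", 300);
demoCache.get("a");                  // touch "a", so "b" is now least recently used
demoCache.set("c", "C", 300);        // over capacity: evicts "b"
const evicted = demoCache.get("b");  // undefined
const kept = demoCache.get("a");     // "A"
clock = 300000;                      // 300 s later
const expired = demoCache.get("a");  // undefined (TTL elapsed)
```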

---

## PII implications

### Raw data always cached

Redaction happens in the tool layer *after* the cache returns data. The cache itself stores complete, unredacted API responses including:

- Customer names and emails
- Agent names and emails
- Full message bodies
- CC/BCC recipients
- Thread metadata

### No PII-aware features

| Feature | Present? | Impact |
|---------|----------|--------|
| Cache-level redaction | No | Raw PII in memory for TTL duration |
| PII-aware eviction | No | No way to flush customer data on demand |
| Per-user cache isolation | No | All MCP consumers share same cache |
| Cache encryption | No | Plain objects in process memory |
| Audit logging of cache access | No | No trail of what PII was cached/served |
| Config-change cache invalidation | No | Changing `REDACT_MESSAGE_CONTENT` doesn't flush cache |

### Stale data risk

No write-through invalidation. If a conversation is updated via the HelpScout UI (customer edits, agent replies), the cache serves stale data for up to 5 minutes. This is a correctness issue, not a security one, but worth noting.

---

## Retry & connection pooling

Not caching per se, but related to the client layer:

### Retry logic (`executeWithRetry`)
- Max attempts: 3 (configurable)
- Base delay: 1000ms
- Max delay: 10000ms
- Jitter: 10% random to avoid thundering herd
- 429 (rate limit): uses `Retry-After` header
- 401 (auth failure): clears token, re-authenticates, retries
- 5xx (server errors): exponential backoff
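
The delay parameters above can be combined as in the following sketch. It assumes doubling per attempt, a cap applied before jitter, and additive jitter of up to 10%; the repo's exact formula was not verified. `backoffDelayMs` and `retryAfterMs` are hypothetical names, and `retryAfterMs` handles only the delay-in-seconds form of `Retry-After`, not the HTTP-date form.

```typescript
interface RetryConfig {
  maxAttempts: number;
  baseDelayMs: number;
  maxDelayMs: number;
  jitterRatio: number;
}

// Defaults taken from the list above.
const RETRY_DEFAULTS: RetryConfig = {
  maxAttempts: 3,
  baseDelayMs: 1000,
  maxDelayMs: 10000,
  jitterRatio: 0.1,
};

// attempt is zero-based: 0 is the wait before the first retry.
function backoffDelayMs(
  attempt: number,
  config: RetryConfig = RETRY_DEFAULTS,
  random: () => number = Math.random,
): number {
  const exponential = Math.min(config.baseDelayMs * Math.pow(2, attempt), config.maxDelayMs);
  const jitter = exponential * config.jitterRatio * random();
  return Math.round(exponential + jitter);
}

// Converts a Retry-After value given in seconds to milliseconds; anything
// missing or non-numeric falls back to the supplied default.
function retryAfterMs(headerValue: string | undefined, fallbackMs: number): number {
  if (headerValue === undefined || headerValue.trim() === "") return fallbackMs;
  const seconds = Number(headerValue);
  return Number.isFinite(seconds) && seconds >= 0 ? seconds * 1000 : fallbackMs;
}

console.log(backoffDelayMs(0, RETRY_DEFAULTS, () => 0));  // 1000
console.log(backoffDelayMs(2, RETRY_DEFAULTS, () => 0));  // 4000
console.log(backoffDelayMs(10, RETRY_DEFAULTS, () => 0)); // 10000 (capped)
console.log(backoffDelayMs(0, RETRY_DEFAULTS, () => 1));  // 1100 (max jitter)
console.log(retryAfterMs("2", 500));                      // 2000
```

Passing the random source as a parameter keeps the function deterministic under test while defaulting to `Math.random` in normal use.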

### Connection pool
- Max sockets: 50
- Max free sockets: 10
- Timeout: 30,000ms
- Keep-alive: enabled (1000ms interval)

### OAuth2 token caching
- Access token stored in `accessToken` property
- Expiry tracked with 60-second buffer (`tokenExpiresAt`)
- Concurrent auth requests deduplicated via shared promise (`authenticationPromise`)
- Token cleared on 401 to force re-auth
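
The shared-promise deduplication can be sketched as below. The property names `accessToken`, `tokenExpiresAt`, and `authenticationPromise` follow the list above; the `TokenManager` class, the injected `fetchToken` function, and its return shape are assumptions made so the example runs without a network call.

```typescript
type TokenResponse = { token: string; expiresInSeconds: number };

class TokenManager {
  private accessToken: string | null = null;
  private tokenExpiresAt = 0;
  private authenticationPromise: Promise<string> | null = null;
  private readonly fetchToken: () => Promise<TokenResponse>;

  constructor(fetchToken: () => Promise<TokenResponse>) {
    this.fetchToken = fetchToken;
  }

  async getToken(): Promise<string> {
    if (this.accessToken !== null && Date.now() < this.tokenExpiresAt) {
      return this.accessToken; // still valid, no request needed
    }
    if (this.authenticationPromise === null) {
      // First caller starts the request; later callers await the same promise.
      this.authenticationPromise = this.authenticate().finally(() => {
        this.authenticationPromise = null;
      });
    }
    return this.authenticationPromise;
  }

  // Called on a 401 to force re-authentication on the next getToken().
  clearToken(): void {
    this.accessToken = null;
    this.tokenExpiresAt = 0;
  }

  private async authenticate(): Promise<string> {
    const { token, expiresInSeconds } = await this.fetchToken();
    this.accessToken = token;
    // Treat the token as expired 60 seconds early.
    this.tokenExpiresAt = Date.now() + (expiresInSeconds - 60) * 1000;
    return token;
  }
}

let authCalls = 0;
const tokenManager = new TokenManager(async () => {
  authCalls += 1;
  return { token: `token-${authCalls}`, expiresInSeconds: 3600 };
});

async function demo(): Promise<string[]> {
  const [first, second] = await Promise.all([tokenManager.getToken(), tokenManager.getToken()]);
  tokenManager.clearToken(); // simulate a 401
  const third = await tokenManager.getToken();
  return [first, second, third]; // ["token-1", "token-1", "token-2"]
}
```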

---

## Relevance to hs-cli

Our CLI is stateless (no persistent process), so in-process caching doesn't apply. However, these patterns inform our design:

1. **If we ever build an MCP server wrapper**: cache should store anonymized data, not raw, to prevent bypass via cache reads.
2. **Endpoint-aware TTLs**: reasonable approach if we add caching. Mailboxes are stable (long TTL), conversations/threads change often (short TTL).
3. **Token management**: our `internal/auth/` already handles OAuth2 with keyring storage, a different approach (persistent credentials) from their in-process token lifecycle.
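
Point 1 comes down to ordering: redact first, then cache. A minimal sketch under stated assumptions follows; the `Thread` shape, `redactThread`, and `cacheThread` are hypothetical, and TypeScript is used only to match the other blocks in this document.

```typescript
type Thread = { id: number; body: string; customer: { email: string } };

// Hypothetical redaction step: replaces the body and the customer email.
function redactThread(thread: Thread): Thread {
  return { ...thread, body: "[redacted]", customer: { email: "[redacted]" } };
}

const threadCache = new Map<string, Thread>();

// Redact before storing, so a cache read can never return raw PII.
function cacheThread(key: string, raw: Thread): Thread {
  const safe = redactThread(raw);
  threadCache.set(key, safe);
  return safe;
}

cacheThread("thread:1", { id: 1, body: "My card number is ...", customer: { email: "jane@example.org" } });
const storedThread = threadCache.get("thread:1");
console.log(storedThread?.customer.email); // "[redacted]"
```

The trade-off is that a redacted cache cannot serve unredacted output later, so any change to the anonymization setting would need to flush the cache.
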
Lines changed: 119 additions & 0 deletions
@@ -0,0 +1,119 @@
# help-scout-mcp-server: PII & Anonymization Model

> Source: https://github.com/drewburchfield/help-scout-mcp-server
> Analyzed: 2026-02-25

## Overview

A thin MCP proxy over the HelpScout API with one optional content-body gate. Despite README claims of "enterprise-grade security" and "SOC2 compliant options", the actual PII model is minimal: message body redaction on 2 of 9 tools. Customer identity (name, email) is always exposed.

---

## Config: Two env vars, one derived flag

```typescript
// src/utils/config.ts
security: {
  allowPii: process.env.REDACT_MESSAGE_CONTENT !== 'true'
    || process.env.ALLOW_PII === 'true',
}
```

| `REDACT_MESSAGE_CONTENT` | `ALLOW_PII` | `allowPii` | Effect |
|---|---|---|---|
| unset / `false` | unset | `true` | All content shown (default) |
| `true` | unset | `false` | Body text redacted in 2 tools |
| `true` | `true` | `true` | Override: all content shown |
| `false` | `true` | `true` | All content shown |

- Loaded once at startup via `dotenv.config()`. No runtime reload.
- `ALLOW_PII=true` silently overrides `REDACT_MESSAGE_CONTENT=true`. No warning emitted.
- No per-request override. LLM cannot change the setting mid-session.
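
The truth table can be checked with a small function that applies the same expression to an explicit env object. `deriveAllowPii` is a name chosen for this sketch, not one from the repo. One consequence of the strict string comparison is that only the exact lowercase value `true` enables redaction, so `TRUE` or `1` leave all content visible.

```typescript
type Env = Record<string, string | undefined>;

// Same expression as the config excerpt above, applied to an explicit env object.
function deriveAllowPii(env: Env): boolean {
  return env["REDACT_MESSAGE_CONTENT"] !== "true" || env["ALLOW_PII"] === "true";
}

const defaultCase = deriveAllowPii({});                                                  // true
const redacted = deriveAllowPii({ REDACT_MESSAGE_CONTENT: "true" });                     // false
const overridden = deriveAllowPii({ REDACT_MESSAGE_CONTENT: "true", ALLOW_PII: "true" }); // true
const explicitOff = deriveAllowPii({ REDACT_MESSAGE_CONTENT: "false", ALLOW_PII: "true" }); // true
const uppercase = deriveAllowPii({ REDACT_MESSAGE_CONTENT: "TRUE" });                    // true: not the exact string "true"
```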

---

## Tool-by-tool redaction map

9 tools exposed. Only 2 apply any redaction, and only to the `body` field:

| Tool | Bodies redacted? | Customer name/email exposed? | Notes |
|------|---|---|---|
| `searchConversations` | No | Yes | Returns full `Conversation` objects |
| `advancedConversationSearch` | No | Yes | Also accepts `customerEmail` as search param |
| `comprehensiveConversationSearch` | No | Yes | Results grouped by status |
| `structuredConversationFilter` | No | Yes | Accepts `customerIds[]`, which allows enumeration by customer |
| **`getConversationSummary`** | **Yes** | Yes | `body` → placeholder; `customer.*` untouched |
| **`getThreads`** | **Yes** | Yes | `body` → placeholder; `createdBy.*` untouched |
| `listAllInboxes` | N/A | N/A | Inbox metadata only |
| `searchInboxes` | N/A | N/A | Inbox metadata only |
| `getServerTime` | N/A | N/A | Timestamp only |

### Redaction implementation

```typescript
// getConversationSummary
firstCustomerMessage: {
  body: config.security.allowPii
    ? firstCustomerMessage.body
    : '[Content hidden - set REDACT_MESSAGE_CONTENT=false to view]',
  customer: firstCustomerMessage.customer, // NOT redacted
}

// getThreads
const processedThreads = threads.map(thread => ({
  ...thread, // spreads customer, createdBy, assignedTo (all PII)
  body: config.security.allowPii
    ? thread.body
    : '[Content hidden - set REDACT_MESSAGE_CONTENT=false to view]',
}));
```

Only `body` is swapped. Everything else passes through.

Note: the placeholder text says "set REDACT_MESSAGE_CONTENT=false to view". This is misleading rather than wrong: content is shown whenever the variable is anything other than `true`, including unset, and `ALLOW_PII=true` also reveals content. A more accurate message would be "set REDACT_MESSAGE_CONTENT to false or unset it".

---

## Resource layer: complete bypass

MCP resources (`helpscout://conversations`, `helpscout://threads`, `helpscout://inboxes`) are exposed via `src/resources/index.ts`. This file **does not import the config module**. Zero redaction logic. An MCP client reading resources instead of calling tools gets full unredacted data regardless of env vars.

---

## Fields never protected (regardless of settings)

| Field | Where it appears | Risk |
|-------|-----------------|------|
| `customer.firstName` | Every conversation object | Customer identity |
| `customer.lastName` | Every conversation object | Customer identity |
| `customer.email` | Every conversation + thread | Customer contact |
| `assignee.firstName/lastName/email` | Every assigned conversation | Staff identity |
| `thread.createdBy.email/first/last` | Every thread | Message author |
| `thread.customer.email` | Every thread | Customer contact |
| `thread.assignedTo.email` | Draft threads | Staff identity |
| `conversation.subject` | Every conversation | Customers type PII into subjects |
| CC/BCC recipients | Thread objects | Third-party contacts |

---

## Permission model

None. No per-tool allowlist, no per-inbox scoping enforcement, no rate limiting per tool.

`HELPSCOUT_DEFAULT_INBOX_ID` is a soft default the LLM can override by passing its own `inboxId`. Not a security boundary.

`HelpScoutAPIConstraints` in `src/utils/api-constraints.ts` validates input format (numeric IDs, non-empty search terms). These are correctness guards, not security controls.

---

## Gap analysis vs our approach

| Dimension | help-scout-mcp-server | hs-cli (planned) |
|-----------|----------------------|------------------|
| What's anonymized | Message body only (2 tools) | All person fields: name, email, phone |
| Identity correlation | None; just hides content | Deterministic fake identities (same person = same fake) |
| Scope | Binary on/off | Three levels: `off`, `customers`, `all` |
| Format coverage | Tool responses only | All formats: table, csv, json, json-full |
| Bypass paths | Resource layer completely unprotected | No bypass; anonymization applies at output layer |
| Config | Env vars only, no persistence | Config file + env var + `config set` |
| Customer identity | Always exposed | Anonymized when enabled |
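
The "same person = same fake" property in the table can be obtained by deriving the fake identity from a hash of the real one. The sketch below is an illustration under stated assumptions, not the planned hs-cli implementation: `fakeIdentity`, the salt parameter, and the output formats are all hypothetical, and TypeScript is used only to match the other blocks in this document.

```typescript
import { createHash } from "node:crypto";

interface FakeIdentity {
  name: string;
  email: string;
}

// Derives a stable pseudonym from a salted hash of the normalized email.
function fakeIdentity(realEmail: string, salt: string): FakeIdentity {
  const normalized = realEmail.trim().toLowerCase();
  const hex = createHash("sha256").update(`${salt}:${normalized}`).digest("hex");
  return {
    name: `Customer ${hex.slice(0, 6)}`,
    email: `customer-${hex.slice(0, 12)}@example.com`,
  };
}

const firstSeen = fakeIdentity("Jane.Doe@example.org", "demo-salt");
const seenAgain = fakeIdentity(" jane.doe@example.org ", "demo-salt");
const someoneElse = fakeIdentity("john.roe@example.org", "demo-salt");

console.log(firstSeen.email === seenAgain.email);   // true: same person, same fake
console.log(firstSeen.email === someoneElse.email); // false
```

A secret salt matters here: without it, anyone who can guess an email address can recompute its pseudonym and confirm the match.
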
Lines changed: 86 additions & 0 deletions
@@ -0,0 +1,86 @@
# Docs API Implementation Checklist

## Infrastructure
- [x] Config: `DocsAPIKey`, `DocsPermissions` fields + env vars
- [x] Auth store: `StoreDocsAPIKey`, `LoadDocsAPIKey`, `DeleteDocsAPIKey`
- [x] DocsClient: HTTP Basic Auth, rate limiter, multipart uploads
- [x] DocsClientAPI interface
- [x] Docs pagination: `DocsPaginateAll`, `ExtractDocsItems`
- [x] DocsPageInfo type

## Root/MCP wiring
- [x] `docsClient` var in root.go
- [x] `isUnderSubtree()` helper
- [x] Docs client init in PersistentPreRunE (env > keyring > config)
- [x] Docs permission check path
- [x] MCP catalog discovers both inbox + docs trees
- [x] MCP description no longer hardcoded "Inbox"

## Auth (3 commands)
- [x] `hs docs auth login`
- [x] `hs docs auth status`
- [x] `hs docs auth logout`

## Collections (5 commands)
- [x] `hs docs collections list`
- [x] `hs docs collections get <id>`
- [x] `hs docs collections create`
- [x] `hs docs collections update <id>`
- [x] `hs docs collections delete <id>`

## Categories (6 commands)
- [x] `hs docs categories list <collection-id>`
- [x] `hs docs categories get <id>`
- [x] `hs docs categories create`
- [x] `hs docs categories update <id>`
- [x] `hs docs categories reorder <collection-id>`
- [x] `hs docs categories delete <id>`

## Articles (13 commands)
- [x] `hs docs articles list` (--collection or --category)
- [x] `hs docs articles search --query ...`
- [x] `hs docs articles get <id>` (--draft)
- [x] `hs docs articles related <id>`
- [x] `hs docs articles create`
- [x] `hs docs articles update <id>`
- [x] `hs docs articles delete <id>`
- [x] `hs docs articles upload <id> --file ...`
- [x] `hs docs articles views <id> --count ...`
- [x] `hs docs articles draft save <id> --text ...`
- [x] `hs docs articles draft delete <id>`
- [x] `hs docs articles revisions list <id>`
- [x] `hs docs articles revisions get <id> <rev-id>`

## Sites (7 commands)
- [x] `hs docs sites list`
- [x] `hs docs sites get <id>`
- [x] `hs docs sites create`
- [x] `hs docs sites update <id>`
- [x] `hs docs sites delete <id>`
- [x] `hs docs sites restrictions get <id>`
- [x] `hs docs sites restrictions update <id>`

## Redirects (6 commands)
- [x] `hs docs redirects list <site-id>`
- [x] `hs docs redirects get <id>`
- [x] `hs docs redirects find --site ... --url ...`
- [x] `hs docs redirects create`
- [x] `hs docs redirects update <id>`
- [x] `hs docs redirects delete <id>`

## Assets (2 commands)
- [x] `hs docs assets article upload --file ...`
- [x] `hs docs assets settings upload --file ...`

## Output
- [x] JSON clean passthrough (docsCleanMinimal)
- [x] Table output per resource
- [x] `jsonStr()` helper for generic JSON→table

## Verification
- [x] `go build ./cmd/hs`: clean
- [x] `go vet ./...`: clean
- [x] `go test ./...`: no new failures (1 pre-existing env-dependent failure)
- [x] MCP tools/list: 39 `helpscout_docs_*` tools discovered
- [x] Auth gating: unauthenticated → clear error message
- [x] All 85 inbox MCP tools still present
