Problem
Operators who route LocalAI traffic to a model whose backend sits outside
their trust boundary (a remote hosted endpoint, a partner's service, a
sidecar that logs) have no server-side mechanism to scrub and restore
PII inside LocalAI's request/response path. Today they rewrite it
outside LocalAI or simply don't route sensitive content at all.
Proposal
Add an opt-in, per-model middleware that performs bijective token
substitution:
- Scans outgoing request messages for regex-detectable PII
- Replaces matches with deterministic tokens (Peter Müller → PERSON_001)
- Forwards the transformed request to the model
- Reverses substitutions in the response (stream + non-stream) before
the client sees it
Framing: this is PII replacement, not a compliance tool. Operators
remain responsible for legal/retention/audit concerns. No claims about
GDPR/HIPAA are made or implied by this feature.
Config (per-model YAML)
pseudonymization:
  enabled: true
  detectors: []            # default empty; operator lists explicitly
  strategy: deterministic  # same input → same token within request
  reverse_in_response: true
Detector names available in Phase 1:
- regex_email
- regex_phone_de
- regex_phone_intl
- regex_hrb (German Handelsregister)
- regex_iban
Scope: middleware activates only when enabled: true on that model.
No global trigger. No "outgoing request" heuristic. Operator picks models
explicitly (e.g. only claude-opus-4-7.yaml, not others).
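For illustration, a complete per-model file enabling the middleware might look like this (the model name and the non-pseudonymization fields are hypothetical, not taken from an actual LocalAI config):

```yaml
# models/claude-opus-4-7.yaml (illustrative)
name: claude-opus-4-7
pseudonymization:
  enabled: true
  strategy: deterministic
  reverse_in_response: true
  detectors:
    - regex_email
    - regex_phone_de
    - regex_iban
```

Models without a pseudonymization block are untouched by the middleware.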
Flow
Request side (after SetOpenAIRequest):
- Run enabled detectors in declared order.
- Build bijective in-memory map (scope: one request).
- Replace in message content.
- Attach map to Echo context (defer-released).
- Forward to model.
Response side (Echo response-writer wrapper):
- Read map from Echo context.
- Wrap response writer; intercept SSE chunks + final body.
- Reverse token substitutions with sliding-window buffering for
multi-chunk tokens (e.g. PERSON_001 split across SSE frames).
- On client-disconnect, defer releases the map; no leak.
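A minimal sketch of the sliding-window idea, with hypothetical names (ReverseBuffer, Write, Flush): hold back at most len(longest token)-1 trailing bytes of each chunk, so a token split across SSE frames is completed by the next chunk before substitution:

```go
package main

import "strings"

// ReverseBuffer un-substitutes tokens in a streamed response, tolerating
// tokens split across chunk boundaries (e.g. "PERSON_0" | "01").
type ReverseBuffer struct {
	reverse  map[string]string // token -> original value
	maxToken int               // length of the longest token
	tail     string            // held-back suffix of the stream
}

func NewReverseBuffer(reverse map[string]string) *ReverseBuffer {
	longest := 0
	for t := range reverse {
		if len(t) > longest {
			longest = len(t)
		}
	}
	return &ReverseBuffer{reverse: reverse, maxToken: longest}
}

// Write consumes one chunk and returns the text that is safe to emit
// now; up to maxToken-1 trailing bytes stay buffered because they could
// be the start of a token continued in the next chunk.
func (b *ReverseBuffer) Write(chunk string) string {
	s := b.tail + chunk
	for tok, orig := range b.reverse {
		s = strings.ReplaceAll(s, tok, orig)
	}
	keep := b.maxToken - 1
	if keep > len(s) {
		keep = len(s)
	}
	if keep < 0 {
		keep = 0
	}
	b.tail = s[len(s)-keep:]
	return s[:len(s)-keep]
}

// Flush emits whatever is still buffered at end-of-stream.
func (b *ReverseBuffer) Flush() string {
	out := b.tail
	b.tail = ""
	return out
}
```

A production version would also need to respect SSE framing and UTF-8 boundaries; this only shows the buffering invariant the overflow metric below would diagnose.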
Architecture — why response-writer wrapper, not inline edit
An earlier draft considered modifying core/http/endpoints/openai/chat.go
inside the streaming loop where ev.Choices[0].Delta.Content is set
(around L682-689). We rejected that: it couples this feature to that
specific streaming implementation, and any upstream change there
would force a merge-conflict on our middleware.
An Echo response-writer wrapper (same pattern Echo's BodyDump uses,
but with a transformer instead of a sink) keeps the seam at the
middleware layer. Willing to go the other direction if maintainers
prefer — this is worth discussing early, hence the RFC.
Metadata
Response adds:
{
  "usage": {
    "pseudonymization_meta": {
      "detectors_fired": ["regex_email", "regex_phone_de"],
      "tokens_replaced": 7,
      "request_round_trip": true
    }
  }
}
No salt, no map, no user-content hash leaves the server.
Non-goals (Phase 1)
- No NER. spaCy / LLM-as-NER path is a Phase 2 PR. Keeps this one
small and review-focused on the regex + middleware seam.
- No UI management (YAML only).
- No cross-request mapping persistence (Redis/TTL is a Phase 2 feature).
- Not a DLP tool. Not an anonymization tool (round-trip is bijective
by design).
Prometheus metrics (proposal)
localai_pii_detections_total{model, detector}
localai_pii_replacements_total{model}
localai_pii_stream_buffer_overflow_total{model} # diagnostic for tuning
Implementation outline
- PseudonymizationConfig in core/config/model_config.go:32 (mirrors
  FunctionsConfig, ReasoningConfig, MCP at L61-92).
- New pkg/pii/ package:
  detector.go: interface + registry
  regex.go: detector implementations
  engine.go: bijective map + apply/reverse
- Request middleware at core/http/middleware/pseudonymization_request.go.
- Response middleware at core/http/middleware/pseudonymization_response.go
  (Echo writer-wrapper with sliding buffer).
- Both appended to chatMiddleware at core/http/routes/openai.go:35-51.
  MCP endpoint picks them up via existing
  core/http/endpoints/localai/mcp.go:61 delegation.
- Ginkgo tests inline. Docs at docs/content/features/pseudonymization.md
  with a security-caveat banner at the top.
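A sketch of what detector.go's interface and registry could look like; the interface shape, names, and the two sample patterns are illustrative, not the proposed final API:

```go
package main

import (
	"fmt"
	"regexp"
)

// Detector is one named PII detector; Phase 1 detectors are all regex-backed.
type Detector interface {
	Name() string
	Find(content string) []string // matched PII substrings
}

type regexDetector struct {
	name string
	re   *regexp.Regexp
}

func (d regexDetector) Name() string { return d.name }
func (d regexDetector) Find(content string) []string {
	return d.re.FindAllString(content, -1)
}

// registry maps YAML detector names to implementations. The patterns
// here are simplified stand-ins, not the production regexes.
var registry = map[string]Detector{
	"regex_email": regexDetector{"regex_email", regexp.MustCompile(`[\w.+-]+@[\w-]+\.[\w.]+`)},
	"regex_iban":  regexDetector{"regex_iban", regexp.MustCompile(`\b[A-Z]{2}\d{2}[A-Z0-9]{11,30}\b`)},
}

// Resolve turns the YAML detectors list into implementations, failing
// on unknown names (the error-at-startup behavior we lean toward).
func Resolve(names []string) ([]Detector, error) {
	var out []Detector
	for _, n := range names {
		d, ok := registry[n]
		if !ok {
			return nil, fmt.Errorf("unknown detector %q", n)
		}
		out = append(out, d)
	}
	return out, nil
}
```

Resolve would run once at config load, so a typo in a model YAML surfaces at startup rather than silently skipping a detector at request time.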
Open questions for mudler
- Response-writer wrapper vs. inline edit — preference? We lean
  wrapper for maintenance durability; willing to go inline if you
  see a cleaner path.
- pkg/pii/ placement — new top-level package, or internal/pii/?
  Regex detectors are narrow enough that reuse across LocalAI is
  unlikely, but we'd like to keep it swappable.
- Detector naming convention — regex_email vs email_regex vs
  pii.email.regex? Whatever matches your house style for future
  detector adds.
- Metric naming prefix — localai_pii_* or share a prefix with the
  compression PR's localai_compression_*? Should all middleware
  metrics live under localai_middleware_*?
- Default behavior of unknown detector names in YAML — error at
  startup, warn-and-skip, or accept (future detector names)? We lean
  toward error-at-startup so operator typos are caught early.
Prior art
walcz.de has run this pattern in production in Python (prompt-optimizer)
for 4 months across 17 agents. The 195-LOC pseudonymizer.py and the
regex portions of pii_detector.py (349 LOC total, regex ≈150 LOC) are
the basis, stress-tested on German business correspondence (Müller,
HRB, IBAN DE89). Happy to share the code for reference.
Next step
If this design lands, PR submission plan:
- Commit 1: pkg/pii/ regex detector + bijective engine + tests
  (can merge as a utility library, independent of middleware)
- Commit 2: Request + response middleware + config + tests + docs
Approx 4-6 days total.
Assisted-by: Claude:claude-opus-4-7