Merged
1 change: 1 addition & 0 deletions 1.0/en/0x10-C07-Model-Behavior.md
@@ -47,6 +47,7 @@ Technical controls to detect and scrub bad content before it is shown to the use
| **7.3.6** | **Verify that** the system requires a human approval step or re-authentication if the model generates high-risk content. | 3 |
| **7.3.7** | **Verify that** output filters detect and block responses that reproduce verbatim segments of system prompt content. | 2 |
| **7.3.8** | **Verify that** LLM client applications prevent model-generated output from triggering automatic outbound requests (e.g., auto-rendered images, iframes, or link prefetching) to attacker-controlled endpoints, for example by disabling automatic external resource loading or restricting it to explicitly allowlisted origins as appropriate. | 2 |
| **7.3.9** | **Verify that** generated outputs are analyzed for statistical steganographic covert channels (e.g., biased token-choice patterns or output distribution anomalies) that could encode hidden data across the model's valid output space, and that detections are flagged for review. | 3 |
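
One way to read 7.3.9 is as a goodness-of-fit check: under honest sampling, the surprisal (negative log-probability) of each emitted token should, on average, match the entropy of the model's own distribution at that step. The sketch below is illustrative only and not part of the requirement. It assumes the serving stack can expose the full per-step probability vector, and the names `surprisal_z_score`, `flag_for_review`, and the threshold of 4.0 are hypothetical choices.

```python
import math
from typing import Sequence


def surprisal_z_score(step_probs: Sequence[Sequence[float]],
                      chosen: Sequence[int]) -> float:
    """z-score of observed total surprisal against its expectation
    if every token had been sampled honestly from step_probs."""
    observed = expected = variance = 0.0
    for probs, idx in zip(step_probs, chosen):
        if probs[idx] <= 0.0:
            # A token the model considered impossible is maximally anomalous.
            return math.inf
        # (surprisal, probability) for each candidate token at this step.
        pairs = [(-math.log(p), p) for p in probs if p > 0.0]
        mean = sum(p * s for s, p in pairs)  # entropy of this step
        var = sum(p * (s - mean) ** 2 for s, p in pairs)
        observed += -math.log(probs[idx])
        expected += mean
        variance += var
    if variance == 0.0:
        # Uniform or deterministic steps carry no signal for this test.
        return 0.0
    return (observed - expected) / math.sqrt(variance)


def flag_for_review(z: float, threshold: float = 4.0) -> bool:
    """Hypothetical policy: flag outputs whose |z| exceeds the threshold."""
    return abs(z) > threshold


# Usage: 200 steps with the same toy distribution over three tokens.
dist = [0.5, 0.3, 0.2]
steps = [dist] * 200
typical = [0] * 100 + [1] * 60 + [2] * 40  # matches the distribution
biased = [2] * 200                         # always the least likely token
print(flag_for_review(surprisal_z_score(steps, typical)))
print(flag_for_review(surprisal_z_score(steps, biased)))
```

A check like this only catches channels that shift aggregate surprisal. Encodings designed to preserve the output distribution would need other detectors, which is why the requirement asks for flagging for review instead of automatic blocking.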

---

1 change: 1 addition & 0 deletions 1.0/en/0x93-Appendix-D_AI_Security_Controls_Inventory.md
@@ -206,6 +206,7 @@ Constrain, filter, and validate model outputs before they reach users or downstr
| Explicit / non-consensual content filters | 7.7.1 |
| Citation and attribution validation | 5.4.2 |
| MCP error response sanitization (no stack traces, tokens, internal paths) | 10.4.6 |
| Statistical steganographic covert channel detection in generated outputs | 7.3.9 |
| RAG attribution derived from retrieval metadata, not model-generated | 7.8.3 |

**Common pitfalls:** redacting PII in text but not in structured data fields; not enforcing stop sequences on streaming outputs; leaking internal architecture through error messages.
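
The first pitfall, redacting PII in text but not in structured data fields, can be avoided by walking the whole output structure instead of only top-level strings. The following is a minimal sketch under stated assumptions: `EMAIL_RE` is a deliberately simple, hypothetical pattern standing in for a real PII detector, and `redact` is an illustrative helper name.

```python
import re
from typing import Any

# Simplified email pattern for illustration; real PII detection needs more.
EMAIL_RE = re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}")


def redact(value: Any) -> Any:
    """Recursively redact email-like strings in nested dicts, lists, and tuples."""
    if isinstance(value, str):
        return EMAIL_RE.sub("[REDACTED]", value)
    if isinstance(value, dict):
        # Keys can leak PII too, so string keys are redacted as well.
        return {
            (redact(k) if isinstance(k, str) else k): redact(v)
            for k, v in value.items()
        }
    if isinstance(value, list):
        return [redact(v) for v in value]
    if isinstance(value, tuple):
        return tuple(redact(v) for v in value)
    # Numbers, booleans, None, and other types pass through unchanged.
    return value


payload = {"msg": "mail a@b.co", "meta": {"contacts": ["x@y.org", 3]}}
print(redact(payload))
```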