diff --git a/1.0/en/0x10-C07-Model-Behavior.md b/1.0/en/0x10-C07-Model-Behavior.md
index c8aa3ea..64735a2 100644
--- a/1.0/en/0x10-C07-Model-Behavior.md
+++ b/1.0/en/0x10-C07-Model-Behavior.md
@@ -47,6 +47,7 @@ Technical controls to detect and scrub bad content before it is shown to the use
 | **7.3.6** | **Verify that** the system requires a human approval step or re-authentication if the model generates high-risk content. | 3 |
 | **7.3.7** | **Verify that** output filters detect and block responses that reproduce verbatim segments of system prompt content. | 2 |
 | **7.3.8** | **Verify that** LLM client applications prevent model-generated output from triggering automatic outbound requests (e.g., auto-rendered images, iframes, or link prefetching) to attacker-controlled endpoints, for example by disabling automatic external resource loading or restricting it to explicitly allowlisted origins as appropriate. | 2 |
+| **7.3.9** | **Verify that** generated outputs are analyzed for statistical steganographic covert channels (e.g., biased token-choice patterns or output distribution anomalies) that could encode hidden data across the model's valid output space, and that detections are flagged for review. | 3 |
 
 ---
 
diff --git a/1.0/en/0x93-Appendix-D_AI_Security_Controls_Inventory.md b/1.0/en/0x93-Appendix-D_AI_Security_Controls_Inventory.md
index ffdc1a9..cd9de2e 100644
--- a/1.0/en/0x93-Appendix-D_AI_Security_Controls_Inventory.md
+++ b/1.0/en/0x93-Appendix-D_AI_Security_Controls_Inventory.md
@@ -206,6 +206,7 @@ Constrain, filter, and validate model outputs before they reach users or downstr
 | Explicit / non-consensual content filters | 7.7.1 |
 | Citation and attribution validation | 5.4.2 |
 | MCP error response sanitization (no stack traces, tokens, internal paths) | 10.4.6 |
+| Statistical steganographic covert channel detection in generated outputs | 7.3.9 |
 | RAG attribution derived from retrieval metadata, not model-generated | 7.8.3 |
 
 **Common pitfalls:** redacting PII in text but not in structured data fields; not enforcing stop sequences on streaming outputs; leaking internal architecture through error messages.