| 1 | +--- |
| 2 | +title: Document Embedding in Prompts |
| 3 | +description: Learn how to embed documents in prompts for Azure OpenAI, including JSON escaping and indirect attack detection. |
| 4 | +author: PatrickFarley |
| 5 | +manager: nitinme |
| 6 | +ms.service: azure-ai-services |
| 7 | +ms.topic: conceptual |
| 8 | +ms.date: 05/07/2025 |
| 9 | +ms.author: pafarley |
| 10 | +--- |
| 11 | + |
| 12 | +# Document embedding in prompts |
| 13 | + |
| 14 | +A key aspect of Azure OpenAI's Responsible AI measures is the content safety system. This system runs alongside the core GPT model to monitor irregularities in the model input and output. It performs better when it can differentiate between the elements of your prompt, such as system input, user input, and the AI assistant's output. |
| 15 | + |
| 16 | +For enhanced detection capabilities, prompts should be formatted according to the following recommended methods. |
| 17 | + |
| 18 | +## Chat Completions API |
| 19 | + |
| 20 | +The Chat Completions API is structured by definition. It consists of a list of messages, each with an assigned role. |
| 21 | + |
| 22 | +The safety system parses this structured format and applies the following behavior: |
| 23 | +- On the latest “user” content, the following categories of RAI Risks will be detected: |
| 24 | + - Hate |
| 25 | + - Sexual |
| 26 | + - Violence |
| 27 | + - Self-Harm |
| 28 | + - Prompt shields (optional) |
| 29 | + |
| 30 | +This is an example message array: |
| 31 | + |
| 32 | +```json |
| 33 | +{"role": "system", "content": "Provide some context and/or instructions to the model."}, |
| 34 | +{"role": "user", "content": "Example question goes here."}, |
| 35 | +{"role": "assistant", "content": "Example answer goes here."}, |
| 36 | +{"role": "user", "content": "First question/message for the model to actually respond to."} |
| 37 | +``` |
| 38 | + |
| 39 | +## Embedding documents in your prompt |
| 40 | + |
| 41 | +In addition to detection on the latest user content, Azure OpenAI also supports the detection of specific risks inside context documents via Prompt Shields – Indirect Prompt Attack Detection. You should identify parts of the input that are a document (for example, a retrieved website or an email) with the following document delimiter. |
| 42 | + |
| 43 | +``` |
| 44 | +\"\"\" <documents> *insert your document content here* </documents> \"\"\" |
| 45 | +``` |
| 46 | + |
| 47 | +When you do so, the following options are available for detection on tagged documents: |
| 48 | +- On each tagged “document” content, detect the following categories: |
| 49 | + - Indirect attacks (optional) |
| 50 | + |
| 51 | +Here's an example chat completion messages array: |
| 52 | + |
| 53 | +```json |
| 54 | +{"role": "system", "content": "Provide some context and/or instructions to the model."}, |
| 55 | + |
| 56 | +{"role": "user", "content": "First question/message for the model to actually respond to, including document context. \"\"\" <documents>\n*insert your document content here*\n</documents> \"\"\""} |
| 57 | +``` |
| 58 | + |
| 59 | +### JSON escaping |
| 60 | + |
| 61 | +When you tag unvetted documents for detection, the document content should be JSON-escaped to ensure successful parsing by the Azure OpenAI safety system. |
| 62 | + |
| 63 | +For example, see the following email body: |
| 64 | + |
| 65 | +``` |
| 66 | +Hello José, |
| 67 | + |
| 68 | +I hope this email finds you well today. |
| 69 | +``` |
| 70 | + |
| 71 | +With JSON escaping, it would read: |
| 72 | + |
| 73 | +``` |
| 74 | +Hello Jos\u00E9,\nI hope this email finds you well today. |
| 75 | +``` |
| 76 | + |
| 77 | +The escaped text in a chat completion context would read: |
| 78 | + |
| 79 | +```json |
| 80 | +{"role": "system", "content": "Provide some context and/or instructions to the model, including document context. \"\"\" <documents>\n Hello Jos\\u00E9,\\nI hope this email finds you well today. \n</documents> \"\"\""}, |
| 81 | + |
| 82 | +{"role": "user", "content": "First question/message for the model to actually respond to."} |
| 83 | +``` |
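
The escaping and tagging steps above can be sketched in Python. This is a minimal illustration, not an official helper: the function names `escape_document`, `wrap_document`, and `build_messages` are hypothetical, and it assumes that the `\"\"\"` in the delimiter is simply the JSON-escaped form of three double quotes, as the JSON examples above suggest. Note that Python's `json.dumps` emits lowercase hex digits (`\u00e9`), which is equivalent to `\u00E9` under JSON rules.

```python
import json


def escape_document(text: str) -> str:
    """JSON-escape raw document text and drop the surrounding quotes."""
    # json.dumps escapes quotes, backslashes, and control characters, and
    # (with the default ensure_ascii=True) non-ASCII characters as \uXXXX.
    return json.dumps(text)[1:-1]


def wrap_document(text: str) -> str:
    """Place the escaped document text inside the document delimiter."""
    # Assumption: the raw delimiter is three double quotes around the
    # <documents> tags; it becomes \"\"\" once the request is serialized.
    return f'""" <documents>\n{escape_document(text)}\n</documents> """'


def build_messages(system_prompt: str, question: str, document: str) -> list[dict[str, str]]:
    """Build a chat messages array with the tagged document in the user turn."""
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"{question} {wrap_document(document)}"},
    ]


if __name__ == "__main__":
    email_body = "Hello José,\nI hope this email finds you well today."
    messages = build_messages(
        "Provide some context and/or instructions to the model.",
        "Summarize the following email.",
        email_body,
    )
    # Serializing the array shows the form that is sent in the request body.
    print(json.dumps(messages, indent=2))
```

Escaping the document before wrapping it keeps unvetted content from closing the delimiter early or breaking the surrounding JSON, since any quotes or newlines inside the document arrive as escape sequences.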