---
title: Planning red teaming for large language models (LLMs) and their applications
titleSuffix: Azure OpenAI Service
description: Learn about how red teaming and adversarial testing are an essential practice in the responsible development of systems and features using large language models (LLMs)
ms.service: azure-ai-openai
ms.topic: conceptual
ms.date: 11/03/2023
Here is how you can get started and plan your process of red teaming LLMs.
**Assemble a diverse group of red teamers**

Determine the ideal composition of red teamers in terms of people’s experience, demographics, and expertise across disciplines (for example, experts in AI, social sciences, security) for your product’s domain. For example, if you’re designing a chatbot to help health care providers, medical experts can help identify risks in that domain.

**Recruit red teamers with both benign and adversarial mindsets**

Having red teamers with an adversarial mindset and security-testing experience is essential for understanding security risks, but red teamers who are ordinary users of your application system and haven’t been involved in its development can bring valuable perspectives on harms that regular users might encounter.

**Assign red teamers to harms and/or product features**

- Assign RAI red teamers with specific expertise to probe for specific types of harms (for example, security subject matter experts can probe for jailbreaks, meta prompt extraction, and content related to cyberattacks).

- For multiple rounds of testing, decide whether to switch red teamer assignments in each round to get diverse perspectives on each harm and maintain creativity. If switching assignments, allow time for red teamers to get up to speed on the instructions for their newly assigned harm.

- In later stages, when the application and its UI are developed, you might want to assign red teamers to specific parts of the application (that is, features) to ensure coverage of the entire application.

- Consider how much time and effort each red teamer should dedicate (for example, those testing for benign scenarios might need less time than those testing for adversarial scenarios).

It can be helpful to provide red teamers with:

- Clear instructions that could include:

  - An introduction describing the purpose and goal of the given round of red teaming; the product and features that will be tested and how to access them; what kinds of issues to test for; red teamers’ focus areas, if the testing is more targeted; how much time and effort each red teamer should spend on testing; how to record results; and who to contact with questions.

- A file or location for recording their examples and findings, including information such as:

  - The date an example was surfaced; a unique identifier for the input/output pair if available, for reproducibility purposes; the input prompt; a description or screenshot of the output.
### Plan: What to test
When reporting results, make clear which endpoints were used for testing.
### Plan: How to test

**Conduct open-ended testing to uncover a wide range of harms.**

Having RAI red teamers explore and document any problematic content (rather than asking them to find examples of specific harms) enables them to creatively explore a wide range of issues, uncovering blind spots in your understanding of the risk surface.

**Create a list of harms from the open-ended testing.**

- Consider creating a list of harms, with definitions and examples of the harms.
- Provide this list as a guideline to red teamers in later rounds of testing.

**Conduct guided red teaming and iterate: Continue probing for harms in the list; identify new harms that surface.**

Use a list of harms if available and continue testing for known harms and the effectiveness of their mitigations. In the process, you will likely identify new harms. Integrate these into the list and be open to shifting measurement and mitigation priorities to address the newly identified harms.

Plan which harms to prioritize for iterative testing. Several factors can inform your prioritization, including, but not limited to, the severity of the harms and the context in which they are more likely to surface.
### Plan: How to record data

**Decide what data you need to collect and what data is optional.**

- Decide what data the red teamers will need to record (for example, the input they used; the output of the system; a unique ID, if available, to reproduce the example in the future; and other notes).

- Be strategic with what data you are collecting to avoid overwhelming red teamers, while not missing out on critical information.

**Create a structure for data collection**

A shared Excel spreadsheet is often the simplest method for collecting red teaming data. A benefit of this shared file is that red teamers can review each other’s examples to gain creative ideas for their own testing and avoid duplication of data.
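As one illustration of such a structure, the fields described above can be captured in a small record schema and exported to CSV for the shared spreadsheet. This is a minimal sketch; the field names and the `RedTeamFinding`/`findings_to_csv` names are hypothetical, not part of any prescribed template:

```python
import csv
import io
from dataclasses import asdict, dataclass, fields


@dataclass
class RedTeamFinding:
    """One row of red-teaming data; field names are illustrative only."""
    date_surfaced: str    # when the example was found
    example_id: str       # unique ID for the input/output pair, for reproducibility
    input_prompt: str     # the input the red teamer used
    output_summary: str   # description (or screenshot path) of the system output
    harm_category: str    # entry from your list of harms, if one exists
    notes: str = ""       # optional free-form observations


def findings_to_csv(findings: list[RedTeamFinding]) -> str:
    """Serialize findings to CSV, which imports cleanly into a shared spreadsheet."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=[f.name for f in fields(RedTeamFinding)])
    writer.writeheader()
    for finding in findings:
        writer.writerow(asdict(finding))
    return buffer.getvalue()


example = RedTeamFinding(
    date_surfaced="2023-11-03",
    example_id="rt-0001",
    input_prompt="(prompt text here)",
    output_summary="Model produced ungrounded medical advice.",
    harm_category="ungrounded content",
)
print(findings_to_csv([example]).splitlines()[0])  # prints the agreed column headers
```

Agreeing on the columns up front, whatever tool you use, keeps entries comparable across red teamers and makes later triage easier.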
## During testing

**Plan to be on active standby while red teaming is ongoing**

- Be prepared to assist red teamers with instructions and access issues.
- Monitor progress on the spreadsheet and send timely reminders to red teamers.
Additionally, if the report contains problematic content and examples, consider including a content warning.

The guidance in this document is not intended to be, and should not be construed as providing, legal advice. The jurisdiction in which you're operating may have various regulatory or legal requirements that apply to your AI system. Be aware that not all of these recommendations are appropriate for every scenario and, conversely, these recommendations may be insufficient for some scenarios.