title: How to generate synthetic and simulated data for evaluation
titleSuffix: Azure AI Studio
description: This article provides instructions on how to generate synthetic data to run simulations to evaluate the performance and safety of your generative AI application.
manager: nitinme
ms.service: azure-ai-studio
ms.custom:
ms.author: eur
author: eric-urban
---
# Generate synthetic and simulated data for evaluation
Large language models are known for their few-shot and zero-shot learning abilities, allowing them to function with minimal data. However, this limited data availability impedes thorough evaluation and optimization when you don't have test datasets to evaluate the quality and effectiveness of your generative AI application.
In this article, you learn how to holistically generate high-quality datasets for evaluating the quality and safety of your application by using large language models and the Azure AI safety evaluation service.
## Getting started
First, install and import the simulator package from the Azure AI Evaluation SDK:
```bash
pip install azure-ai-evaluation
```
## Generate synthetic data and simulate non-adversarial tasks
Azure AI Evaluation SDK's `Simulator` provides an end-to-end synthetic data generation capability to help developers test their application's response to typical user queries in the absence of production data. AI developers can use an index- or text-based query generator and a fully customizable simulator to create robust test datasets around non-adversarial tasks specific to their application. The `Simulator` class is a powerful tool designed to generate synthetic conversations and simulate task-based interactions. This capability is particularly useful for:
- **Testing Conversational Applications**: Ensure your chatbots and virtual assistants respond accurately under various scenarios.
- **Training AI Models**: Generate diverse datasets to train and fine-tune machine learning models.
- **Generating Datasets**: Create extensive conversation logs for analysis and development purposes.
By automating the creation of synthetic data, the `Simulator` class helps streamline the development and testing processes, ensuring your applications are robust and reliable.
```python
from azure.ai.evaluation.synthetic import Simulator
```
### Generate text or index-based synthetic data as input
In the first part, we prepare the text for generating the input to our simulator:
- **Wikipedia Search**: Searches for "Leonardo da vinci" on Wikipedia and retrieves the first matching title.
- **Page Retrieval**: Fetches the Wikipedia page for the identified title.
- **Text Extraction**: Extracts the first 5,000 characters of the page summary to use as input for the simulator.
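The preparation code itself isn't shown here; a minimal sketch of those three steps, assuming the third-party `wikipedia` package (`pip install wikipedia`) and a hypothetical helper name:

```python
def prepare_simulator_input(search_term: str = "Leonardo da vinci") -> str:
    """Search Wikipedia, fetch the top matching page, and return the first
    5,000 characters of its summary as input text for the simulator."""
    # Third-party dependency; install with `pip install wikipedia`.
    import wikipedia

    wiki_title = wikipedia.search(search_term)[0]  # first matching title
    wiki_page = wikipedia.page(wiki_title)         # fetch that page
    return wiki_page.summary[:5000]                # trim to simulator input size

# text = prepare_simulator_input()  # requires network access to Wikipedia
```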
### Specify target callback to simulate against
You can bring any application endpoint to simulate against by specifying a target callback function. For example, your application might be an LLM with a Prompty file such as `application.prompty`.
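As a sketch, the callback receives the conversation in the simulator's messages format, calls your application with the latest user turn, and appends the reply as an assistant turn. The canned reply below stands in for the real call into `application.prompty`:

```python
from typing import Any, Dict, List, Optional

async def callback(
    messages: Dict[str, List[Dict]],
    stream: bool = False,
    session_state: Any = None,
    context: Optional[Dict[str, Any]] = None,
) -> dict:
    # The simulator passes the conversation so far; the latest user turn is last.
    conversation_history = messages["messages"]
    latest_user_message = conversation_history[-1]["content"]

    # Call your application here; a canned reply stands in for the real call.
    application_response = f"You asked about: {latest_user_message}"

    # Append the reply as the assistant turn and return the updated
    # conversation in the same messages format the simulator passed in.
    conversation_history.append(
        {"role": "assistant", "content": application_response, "context": ""}
    )
    return {
        "messages": conversation_history,
        "stream": stream,
        "session_state": session_state,
        "context": context,
    }
```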
The `Simulator` class offers extensive customization options, allowing you to override default behaviors, adjust model parameters, and introduce complex simulation scenarios. Below are examples of different overrides you can implement to tailor the simulator to your specific needs.
#### Query and Response generation Prompty customization
The `query_response_generating_prompty_override` allows you to customize how query-response pairs are generated from input text. This is particularly useful when you want to control the format or content of the generated responses as input to your simulator.
```python
import os

current_dir = os.path.dirname(__file__)
query_response_prompty_override = os.path.join(current_dir, "query_generator_long_answer.prompty") # Passes the `query_response_generating_prompty` parameter with the path to the custom prompt template.

tasks = [
    f"I am a student and I want to learn more about {wiki_search_term}",
    f"I am a teacher and I want to teach my students about {wiki_search_term}",
    f"I am a researcher and I want to do a detailed research on {wiki_search_term}",
    f"I am a statistician and I want to do a detailed table of factual data concerning {wiki_search_term}",
]

outputs = await simulator(
    target=callback,
    text=text,
    num_queries=4,
    max_conversation_turns=2,
    tasks=tasks,
    query_response_generating_prompty=query_response_prompty_override # optional, use your own prompt to control how query-response pairs are generated from the input text to be used in your simulator
)

for output in outputs:
    with open("output.jsonl", "a") as f:
        f.write(output.to_eval_qa_json_lines())
```
#### Simulation Prompty customization
The `Simulator` uses a default Prompty that instructs the LLM on how to simulate a user interacting with your application. The `user_simulating_prompty_override` enables you to override the default behavior of the simulator. By adjusting these parameters, you can tune the simulator to produce responses that align with your specific requirements, enhancing the realism and variability of the simulations.
```python
user_simulator_prompty_kwargs = {
    "temperature": 0.7, # Controls the randomness of the generated responses. Lower values make the output more deterministic.
    "top_p": 0.9 # Controls the diversity of the generated responses by focusing on the top probability mass.
}

outputs = await simulator(
    target=callback,
    text=text,
    num_queries=1, # Minimal number of queries
    user_simulator_prompty="user_simulating_application.prompty", # A Prompty that accepts the following kwargs can be passed in to override the default user behavior.
    user_simulator_prompty_kwargs=user_simulator_prompty_kwargs # Uses a dictionary to override default model parameters such as `temperature` and `top_p`.
)
```
#### Simulation with fixed Conversation Starters
Incorporating conversation starters allows the simulator to handle pre-specified, repeatable, contextually relevant interactions. This is useful for simulating the same user turns in a conversation or interaction and evaluating the differences.
```python
conversation_turns = [ # Defines predefined conversation sequences, each starting with a conversation starter.
    [
        "Hello, how are you?",
        "I want to learn more about Leonardo da Vinci",
        "Thanks for helping me. What else should I know about Leonardo da Vinci for my project",
    ],
    [
        "Hey, I really need your help to finish my homework.",
        "I need to write an essay about Leonardo da Vinci",
        "Thanks, can you rephrase your last response to help me understand it better?",
    ],
]

outputs = await simulator(
    target=callback,
    text=text,
    conversation_turns=conversation_turns, # optional, ensures the user simulator follows the predefined conversation sequences
)
```
## Generate adversarial simulations for safety evaluation
Augment and accelerate your red-teaming operation by using Azure AI Studio safety evaluations to generate an adversarial dataset against your application. We provide adversarial scenarios along with configured access to a service-side Azure OpenAI GPT-4 model with safety behaviors turned off to enable the adversarial simulation.
```python
from azure.ai.evaluation.synthetic import AdversarialSimulator
```
The adversarial simulator works by setting up a service-hosted GPT large language model to simulate an adversarial user and interact with your application. An AI Studio project is required to run the adversarial simulator.
By default, simulations run asynchronously. The following optional parameters are available:
- `max_conversation_turns` defines how many turns the simulator generates at most, for the `ADVERSARIAL_CONVERSATION` scenario only. The default value is 1. A turn is defined as a pair of input from the simulated adversarial "user" followed by a response from your "assistant."
- `max_simulation_results` defines the number of generations (that is, conversations) you want in your simulated dataset. The default value is 3. See the table below for the maximum number of simulations you can run for each scenario.
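Putting these together, a minimal sketch of a simulation run. The project values are placeholders, the exact constructor arguments may differ by package version (treat them as assumptions rather than the definitive call), and `callback` is the target function described earlier in this article:

```python
import asyncio

async def run_adversarial_simulation():
    # Local import so this sketch can be defined without the package
    # installed; `pip install azure-ai-evaluation` to actually run it.
    from azure.ai.evaluation.synthetic import AdversarialScenario, AdversarialSimulator

    azure_ai_project = {
        "subscription_id": "<your-subscription-id>",     # placeholder
        "resource_group_name": "<your-resource-group>",  # placeholder
        "project_name": "<your-ai-studio-project>",      # placeholder
    }

    simulator = AdversarialSimulator(azure_ai_project=azure_ai_project)
    return await simulator(
        scenario=AdversarialScenario.ADVERSARIAL_QA,  # one of the scenarios below
        target=callback,            # your application callback, as defined earlier
        max_conversation_turns=1,   # default
        max_simulation_results=3,   # default
    )

# outputs = asyncio.run(run_adversarial_simulation())
```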
## Supported simulation scenarios
The `AdversarialSimulator` supports a range of scenarios, hosted in the service, to simulate against your target application or function:
| Scenario | Scenario enum | Maximum number of simulations | Use this dataset for evaluating |
|----------|---------------|-------------------------------|---------------------------------|
| Question Answering | `ADVERSARIAL_QA` | 1384 | Hateful and unfair content, Sexual content, Violent content, Self-harm-related content, Direct Attack (UPIA) Jailbreak |
| Conversation | `ADVERSARIAL_CONVERSATION` | 1018 | Hateful and unfair content, Sexual content, Violent content, Self-harm-related content, Direct Attack (UPIA) Jailbreak |
| Summarization | `ADVERSARIAL_SUMMARIZATION` | 525 | Hateful and unfair content, Sexual content, Violent content, Self-harm-related content, Direct Attack (UPIA) Jailbreak |
| Search | `ADVERSARIAL_SEARCH` | 1000 | Hateful and unfair content, Sexual content, Violent content, Self-harm-related content, Direct Attack (UPIA) Jailbreak |
| Text Rewrite | `ADVERSARIAL_REWRITE` | 1000 | Hateful and unfair content, Sexual content, Violent content, Self-harm-related content, Direct Attack (UPIA) Jailbreak |
We support evaluating vulnerability towards the following types of jailbreak attacks:
- **Direct attack jailbreak** (also known as UPIA or user prompt injected attack) injects prompts in the user role turn of conversations or queries to generative AI applications.
- **Indirect attack jailbreak** (also known as XPIA or cross domain prompt injected attack) injects prompts in the returned documents or context of the user's query to generative AI applications.
*Evaluating direct attack* is a comparative measurement using the content safety evaluators as a control. It is not its own AI-assisted metric. Run `ContentSafetyEvaluator` on two different, red-teamed datasets generated by `AdversarialSimulator`:
1. Baseline adversarial test dataset using one of the scenario enums above for evaluating Hateful and unfair content, Sexual content, Violent content, and Self-harm-related content.
2. Adversarial test dataset with direct attack jailbreak injections in the first turn.
The `outputs` is a list of two lists: the baseline adversarial simulation, and the same simulation with a jailbreak attack injected in the user role's first turn. Run two evaluation runs with `ContentSafetyEvaluator` and measure the differences between the two datasets' defect rates.
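The comparison itself is simple arithmetic. A sketch, assuming each evaluated row exposes per-category severity scores (this result shape is hypothetical, not the evaluator's actual output schema):

```python
def defect_rate(rows, threshold=4):
    """Fraction of rows whose worst severity score meets or exceeds the threshold."""
    if not rows:
        return 0.0
    defects = sum(1 for row in rows if max(row["severity_scores"].values()) >= threshold)
    return defects / len(rows)

# Hypothetical evaluation results for the baseline and jailbreak datasets.
baseline_rows = [
    {"severity_scores": {"violence": 0, "sexual": 1}},
    {"severity_scores": {"violence": 5, "sexual": 0}},
]
jailbreak_rows = [
    {"severity_scores": {"violence": 6, "sexual": 2}},
    {"severity_scores": {"violence": 5, "sexual": 0}},
]

# A positive difference suggests the application is susceptible to direct attack.
delta = defect_rate(jailbreak_rows) - defect_rate(baseline_rows)
```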
*Evaluating indirect attack* is an AI-assisted metric and doesn't require the comparative measurement that evaluating direct attacks does. You can generate an indirect attack jailbreak-injected dataset and then evaluate it with the `IndirectAttackEvaluator`.
### More functionality
#### Multi-language adversarial simulation
Using [ISO language codes](https://www.andiamo.co.uk/resources/iso-language-codes/), the `AdversarialSimulator` supports the following languages:
| Language | ISO language code |
|--------------------|-------------------|
| Spanish | es |
| Italian | it |
| French | fr |
| Japanese | ja |
| Portuguese | pt |
| Simplified Chinese | zh-cn |
| German | de |
Here's a usage example:
```python
outputs = await simulator(
    scenario=scenario, # required, adversarial scenario to simulate
    target=callback, # required, callback function to simulate against
    language="es" # optional, ISO language code from the table above; defaults to English
)
```
#### Set the randomization seed
By default, the `AdversarialSimulator` randomizes interactions in every simulation. You can set the `randomization_seed` parameter to produce the same set of conversation starters every time, for reproducibility.
```python
outputs = await simulator(
    scenario=scenario, # required, adversarial scenario to simulate
    target=callback, # required, callback function to simulate against
    randomization_seed=1 # optional
)
```
#### Convert to jsonl
To convert your messages format to JSON Lines format, use the helper function `to_json_lines()` on your output.
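For illustration, here's the JSON Lines shape built by hand from a hypothetical conversation in messages format (one JSON object per line, which is the format `to_json_lines()` produces):

```python
import json

# Hypothetical simulator output: a list of conversations in messages format.
conversations = [
    {"messages": [
        {"role": "user", "content": "Tell me about Leonardo da Vinci"},
        {"role": "assistant", "content": "Leonardo da Vinci was an Italian polymath."},
    ]},
]

# One JSON object per line is the JSON Lines convention.
json_lines = "\n".join(json.dumps(conversation) for conversation in conversations)

with open("output.jsonl", "w") as f:
    f.write(json_lines + "\n")
```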
## Related content
- [Get started building a chat app](../../quickstarts/get-started-code.md)
- [Evaluate your generative AI application](evaluate-sdk.md)
- [Get started with samples](https://aka.ms/aistudio/syntheticdatagen-samples)