Commit 409e812

committed
benchmarking updates
1 parent 7c71363 commit 409e812

1 file changed

+0
-7
lines changed

articles/ai-foundry/concepts/model-benchmarks.md

Lines changed: 0 additions & 7 deletions
@@ -62,13 +62,6 @@ See more details in accuracy scores:
 Accuracy scores are provided on a scale of zero to one. Higher values are better.
 
 
-## Safety benchmarks of language models
-
-Safety benchmarks use a standard metric Attack Success Rate to measure how vulnerable language models are to attacks in biosecurity, cybersecurity, and chemical security. Currently, the [Weapons of Mass Destruction Proxy (WMDP) benchmark](https://www.wmdp.ai/) is used to assess hazardous knowledge in language models. The lower the Attack Success Rate is, the safer is the model response.
-
-All model endpoints are benchmarked with the default Azure AI Content Safety filters on with a default configuration. These safety filters detect and block [content harm categories](../../ai-services/content-safety/concepts/harm-categories.md) in violence, self-harm, sexual, hate, and unfairness, but do not specifically cover categories in cybersecurity, biosecurity, chemical security.
-
-
 ## Performance benchmarks of language models
 
 Performance metrics are calculated as an aggregate over 14 days, based on 24 trails (two requests per trail) sent daily with a one-hour interval between every trail. The following default parameters are used for each request to the model endpoint:
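
For reviewers checking the sampling schedule in the unchanged performance paragraph, the arithmetic can be sketched as follows. This is only an illustration, not the benchmarking service's code: it reads "trails" as trials, and the constant names, the helper functions, and the start time are hypothetical.

```python
from datetime import datetime, timedelta

# Figures taken from the paragraph; the names are illustrative assumptions.
DAYS = 14                            # aggregation window
TRIALS_PER_DAY = 24                  # one trial per hour
REQUESTS_PER_TRIAL = 2
TRIAL_INTERVAL = timedelta(hours=1)


def trial_start_times(first_trial: datetime) -> list[datetime]:
    """Return the start time of every trial in the 14-day window."""
    total_trials = DAYS * TRIALS_PER_DAY
    return [first_trial + i * TRIAL_INTERVAL for i in range(total_trials)]


def total_requests() -> int:
    """Return the number of requests that feed one aggregate."""
    return DAYS * TRIALS_PER_DAY * REQUESTS_PER_TRIAL


if __name__ == "__main__":
    # Hypothetical start time, chosen only for the example.
    starts = trial_start_times(datetime(2025, 1, 1))
    print(len(starts), "trials,", total_requests(), "requests")
```

Under these assumptions each aggregate covers 336 trials and 672 requests per model endpoint.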
