Commit fb07dd3

Update usage-considerations.mdx (#20497)
* Update usage-considerations.mdx
  add info on expected latency from enabling guardrails
* Update usage-considerations.mdx
1 parent f3a618c commit fb07dd3

File tree

1 file changed

+5
-2
lines changed

src/content/docs/ai-gateway/guardrails/usage-considerations.mdx

Lines changed: 5 additions & 2 deletions
Original file line number | Diff line number | Diff line change
@@ -11,8 +11,11 @@ Since Guardrails runs on Workers AI, enabling it incurs usage on Workers AI. You
11 11

12 12
## Additional considerations
13 13

14-
- Model availability: If at least one hazard category is set to `block`, but AI Gateway is unable to receive a response from Workers AI, the request will be blocked. Conversely, if a hazard category is set to `flag` and AI Gateway cannot obtain a response from Workers AI, the request will proceed without evaluation. This approach prioritizes availability, allowing requests to continue even when content evaluation is not possible.
15-
- Latency impact: Enabling Guardrails adds some latency. Consider this when balancing safety and speed.
14+
- **Model availability**: If at least one hazard category is set to `block` and AI Gateway cannot obtain a response from Workers AI, the request is blocked. Conversely, if hazard categories are set only to `flag` and AI Gateway cannot obtain a response from Workers AI, the request proceeds without evaluation. The `flag` behavior prioritizes availability, allowing requests to continue even when content evaluation is not possible.
15+
- **Latency impact**: Enabling Guardrails adds latency to requests. Evaluations using Llama Guard 3 8B on Workers AI typically add approximately 500 milliseconds per request. Larger requests may take longer, though the increase is not linear. Consider this when balancing safety and performance.
16+
- **Handling long content**: When evaluating long prompts or responses, Guardrails automatically segments the content into smaller chunks, processing each through separate Guardrail requests. This approach ensures comprehensive moderation but may result in increased latency for longer inputs.
17+
- **Supported languages**: Llama Guard 3 8B supports content safety classification in the following languages: English, French, German, Hindi, Italian, Portuguese, Spanish, and Thai.
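
The model availability rule above can be read as a small decision function. The following is only an illustrative sketch of that rule, not AI Gateway's implementation: the function name, the category names, and the `"proceed"` return value are hypothetical, and only the `block` and `flag` settings come from the text.

```python
from typing import Mapping


def outcome_when_evaluation_unavailable(category_actions: Mapping[str, str]) -> str:
    """Decide what happens to a request when Workers AI returns no evaluation.

    `category_actions` maps a hazard category name (hypothetical names here)
    to its configured action, "block" or "flag".
    """
    # If any category is set to "block", the request is blocked,
    # because the content could not be checked.
    if any(action == "block" for action in category_actions.values()):
        return "block"
    # Otherwise (categories set to "flag"), the request proceeds
    # without evaluation.
    return "proceed"


# Hypothetical configurations for illustration only.
mixed = {"category_a": "block", "category_b": "flag"}
flag_only = {"category_a": "flag", "category_b": "flag"}

print(outcome_when_evaluation_unavailable(mixed))
print(outcome_when_evaluation_unavailable(flag_only))
```

Under these assumptions, a single `block` setting is enough to reject the request during an outage, while a configuration that only flags lets traffic through unevaluated.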
18+
16 19

17 20
:::note
18 21
