Commit b5d6a58

typo
1 parent cb2f1c4 commit b5d6a58

1 file changed (+2, -2 lines changed)

content/ai_exchange/content/docs/2_threats_through_use.md

Lines changed: 2 additions & 2 deletions
@@ -40,7 +40,7 @@ These are the controls for input threats in general - more specific controls are
 Monitor use: observe, correlate, and log model usage (date, time, user), inputs, outputs, and system behavior to identify events or patterns that may indicate a cybersecurity incident. This can be used to reconstruct incidents, and make it part of the existing incident detection process - extended with AI-specific methods, including:
 
 - Improper functioning of the model (see [#CONTINUOUS VALIDATION](/goto/continuousvalidation/), [#UNWANTED BIAS TESTING](/goto/unwantedbiastesting/))
-- Suspicious patterns of model use (e.g., high frequency - see [#RATE LIMIT](#ratelimit) and [#OVERSIGHT](/goto/oversight/)).
+- Suspicious patterns of model use (e.g., high frequency - see [#RATE LIMIT](/goto/ratelimit/) and [#OVERSIGHT](/goto/oversight/)).
 - Suspicious inputs or series of inputs (see [#ANOMALOUS INPUT HANDLING](/goto/anomalousinputhandling/), [#UNWANTED INPUT SERIES HANDLING](/goto/unwantedinputserieshandling/), [#EVASION INPUT HANDLING](/goto/evasioninputhandling/) and [#PROMPT INJECTION I/O handling](/goto/promptinjectioniohandling/)).
 
 By adding details to logs on the version of the model used and the output, troubleshooting becomes easier. This control provides centralized visibility into how AI systems are used over time and across actors, sessions, and models.
@@ -1578,7 +1578,7 @@ Unless the purpose of your AI project is intended to include or encourage this k
 The vagueness of words impact the effectiveness of language based guards. Unlike traditional engineering, the meaning of words are long known to drift through time in linguistics. Events will occur in real time and shift morality and ethics.
 To compensate for this weakness, lengthening the system prompt to increase precision is sometimes used ([#OVERSIGHT](/goto/oversight/)).
 
-However, emerging news or events that are against a certain countries’ national values cannot be effectively addressed in real time. In these cases, red teaming techniques (#promptinputinvalidation) can be used. They are preferably continuously updated with concerns according to your region of interest in order to reveal the weaknesses of your LLM and use guardrails ([#FILTER SENSITIVE MODEL OUTPUT](/goto/filtersensitivemodeloutput/), [#OVERSIGHT](/goto/oversight/)) to filter out the responses that are unwanted for additional protection.
+However, emerging news or events that are against a certain countries’ national values cannot be effectively addressed in real time. In these cases, red teaming techniques (see [Testing section](/goto/testing/)) can be used. They are preferably continuously updated with concerns according to your region of interest in order to reveal the weaknesses of your LLM and use guardrails (see [#OVERSIGHT](/goto/oversight/)) to filter out or alert to the responses that are unwanted for additional protection.
 
 ### Culture-aware explanation of output refusal
 