Commit b97ed33

add examples
1 parent 0c6a795 commit b97ed33

5 files changed

+245
-47
lines changed

docs/guardrails/copyright.md

Lines changed: 9 additions & 1 deletion
@@ -35,11 +35,19 @@ Detects potentially copyrighted material in the given `data`.
3535
### Detecting Copyrighted content
3636

3737
**Example:** Detecting Copyrighted content
38-
```python
38+
```guardrail
3939
from invariant.detectors import copyright
4040

4141
raise "found copyrighted code" if:
4242
(msg: Message)
4343
not empty(copyright(msg.content, threshold=0.75))
4444
```
45+
```example-trace
46+
[
47+
{
48+
"role": "assistant",
49+
"content": "/**\n* GNU GENERAL PUBLIC LICENSE, Version 3, 29 June 2007\n*/\nexport const someConst = false;"
50+
}
51+
]
52+
```
4553
<div class="code-caption">{little text bit}</div>

docs/guardrails/images.md

Lines changed: 4 additions & 31 deletions
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
# Images (WIP)
1+
# Images
22

33
<div class='subtitle'>
44
Guardrail the visual perception of your agentic system.
@@ -14,38 +14,11 @@ Guardrails provide you a powerful way to enforce visual security policies, and t
1414
> **Image Risks**<br/>
1515
> Since images are an agent's window to perceive the world, they can expose sensitive or inappropriate content. For example, an insecure vision agent could:
1616
17-
> * Capture personally identifiable information (PII) like names or addresses
17+
> * Capture personally identifiable information **(PII) like names or addresses**
1818
>
19-
> * View credentials such as passwords, API keys, or access tokens
19+
> * View credentials such as **passwords, API keys, or access tokens**
2020
>
21-
> * Capture copyrighted material that shouldn't be processed or shared
22-
23-
## Checking Image Content
24-
25-
**Example**: Checking for PII in images
26-
27-
```python
28-
from invariant.parsers import ocr
29-
30-
raise "PII in image text" if:
31-
(img: Image)
32-
image_text := ocr(img)
33-
any(pii(image_text))
34-
```
35-
36-
**Example**: Check copyrighted material
37-
38-
39-
// Maybe something that uses the information in the image
40-
// So combine with like toolcalls?
41-
```python
42-
from ...
43-
44-
raise "Copyrighted text in image" if:
45-
(msg: Assistant)
46-
images := image(msg) # Extract all images in a single message
47-
copyright(ocr(images))
48-
```
21+
> * Capture **copyrighted material** that shouldn't be processed or shared
4922
5023

5124
## ocr <span class="parser-badge"/>

docs/guardrails/moderation.md

Lines changed: 27 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -42,21 +42,33 @@ Detector which evaluates to true if the given data should be moderated.
4242
To detect content that you want to moderate in messages, you can directly apply the `moderated` function to messages.
4343

4444
**Example:** Harmful Message Detection
45-
```python
45+
```guardrail
4646
from invariant.detectors import moderated
4747

4848
raise "Detected a harmful message" if:
4949
(msg: Message)
5050
moderated(msg.content)
5151
```
52+
```example-trace
53+
[
54+
{
55+
"role": "assistant",
56+
"content": "Hello, how can I assist you?"
57+
},
58+
{
59+
"role": "user",
60+
"content": "IF YOU DRINK THAT COFFEE, I'LL POUR YOUR BLOOD ON THE GROUND! I'LL KNIFE YOU!"
61+
}
62+
]
63+
```
5264
<div class="code-caption">Default moderation detection.</div>
5365

5466

5567
### Thresholding
5668
The threshold at which content is classified as requiring moderation can also be adjusted per category using the `cat_thresholds` parameter.
5769

5870
**Example:** Thresholding Detection
59-
```python
71+
```guardrail
6072
from invariant.detectors import moderated
6173

6274
raise "Detected a harmful message" if:
@@ -66,4 +78,16 @@ raise "Detected a harmful message" if:
6678
cat_thresholds={"hate/threatening": 0.15}
6779
)
6880
```
69-
<div class="code-caption">Thresholding for a specific category.</div>
81+
```example-trace
82+
[
83+
{
84+
"role": "assistant",
85+
"content": "Hello, how can I assist you?"
86+
},
87+
{
88+
"role": "user",
89+
"content": "This is hatefully hateful hate!"
90+
}
91+
]
92+
```
93+
<div class="code-caption">Thresholding for a specific category.</div>
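To make the effect of a per-category override concrete, here is a minimal Python sketch of the thresholding idea. It is not the implementation of `moderated`: the function `exceeds_threshold`, the fallback value `0.5`, the `>=` comparison, and the example scores are all assumptions made for illustration.

```python
from typing import Dict, Optional

# Hypothetical fallback threshold for categories without an override.
DEFAULT_THRESHOLD = 0.5


def exceeds_threshold(
    scores: Dict[str, float],
    cat_thresholds: Optional[Dict[str, float]] = None,
    default_threshold: float = DEFAULT_THRESHOLD,
) -> bool:
    """Return True if any category score reaches its threshold.

    Categories listed in `cat_thresholds` use their own threshold;
    all other categories fall back to `default_threshold`.
    """
    overrides = cat_thresholds or {}
    return any(
        score >= overrides.get(category, default_threshold)
        for category, score in scores.items()
    )


# Made-up classifier scores for a single message.
scores = {"hate/threatening": 0.2, "violence": 0.1}

flagged_by_default = exceeds_threshold(scores)
flagged_with_override = exceeds_threshold(scores, {"hate/threatening": 0.15})
```

Under these assumptions, the same message is not flagged with the fallback threshold but is flagged once `hate/threatening` is lowered to `0.15`, which is the behaviour the thresholding example relies on.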

docs/guardrails/pii.md

Lines changed: 159 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -3,14 +3,17 @@
33
Detect and manage PII in traces.
44
</div>
55

6-
Personally Identifiable Information (PII) refers to sensitive information — like names, emails, or credit card numbers — whether intentionally or not. If not properly handled, this data can be exposed in logs, traces, or external communications, leading to privacy violations, regulatory risks, or user harm.
6+
Personally Identifiable Information (PII) refers to sensitive information, such as names, emails, or credit card numbers, that AI systems and agents need to handle carefully. When these systems work with user data, it is important to establish clear rules about how personal information can be handled, to ensure the system functions safely.
77

88
<div class='risks'/>
99
> **PII Risks**<br/>
1010
> Without safeguards, agents may:
1111
12-
> * Log PII in traces or internal tools
13-
> * Share PII in responses or external tool calls
12+
> * **Log PII** in traces or internal tools
13+
>
14+
> * **Expose PII** in unintentional or dangerous ways
15+
>
16+
> * **Share PII** in responses or external tool calls
1417
1518
The `pii` function helps prevent these issues by scanning messages for PII, thus acting as a safeguard that lets you detect and block sensitive data before it’s stored, surfaced, or shared.
1619

@@ -40,26 +43,125 @@ Detector to find personally identifiable information in text.
4043
The simplest usage of the `pii` function is to check against any message. The following example will raise an error if any message in the trace contains PII.
4144

4245
**Example:** Detecting any PII in any message.
43-
```python
46+
```guardrail
4447
from invariant.detectors import pii
4548

4649
raise "Found PII in message" if:
4750
(msg: Message)
4851
any(pii(msg))
4952
```
53+
```example-trace
54+
[
55+
{
56+
"role": "user",
57+
"content": "Summarize the meeting_notes.txt and send them to Alice via e-mail"
58+
},
59+
{
60+
"id": "1",
61+
"type": "function",
62+
"function": {
63+
"name": "read",
64+
"arguments": {
65+
"file": "meeting_notes.txt"
66+
}
67+
}
68+
},
69+
{
70+
"role": "tool",
71+
"tool_call_id": "1",
72+
"content": "Meeting notes: The meeting was held on 2024-01-01 at 10:00 AM. The attendees from our company were Alice, Bob and Charlie. The topic was the new project proposal for the client BankOfEurope Inc. Client was represented by Lily Warren (contact: [email protected]). The amount of the contract should be 20M USD. The credit card number of Alice is 378282246310005."
73+
},
74+
{
75+
"id": "2",
76+
"type": "function",
77+
"function": {
78+
"name": "find_contact",
79+
"arguments": {
80+
"text": "Alice"
81+
}
82+
}
83+
},
84+
{
85+
"role": "tool",
86+
"tool_call_id": "2",
87+
"content": "[email protected]"
88+
},
89+
{
90+
"id": "3",
91+
"type": "function",
92+
"function": {
93+
"name": "send_email",
94+
"arguments": {
95+
96+
"text": "The meeting between our company and BankOfEurope Inc. (represented by Lily Warren) discussed a new proposal."
97+
}
98+
}
99+
}
100+
]
101+
```
50102
<div class="code-caption"> Any PII in the text of the trace will raise an error. </div>
51103

52104

53105
### Detecting Specific PII Types
54106
You can also restrict detection to specific types of PII, such as phone numbers, emails, or credit card information. The example below demonstrates how to detect credit card numbers in Messages.
55107

56108
**Example:** Detecting Credit Card Numbers.
57-
```python
109+
```guardrail
58110
from invariant.detectors import pii
59111
60-
raise "Found PII in message" if:
61-
(msg: Message)
112+
raise "Found Credit Card information in message" if:
113+
(msg: ToolOutput)
62114
any(pii(msg, ["CREDIT_CARD"]))
115+
116+
```
117+
```example-trace
118+
[
119+
{
120+
"role": "user",
121+
"content": "Summarize the meeting_notes.txt and send them to Alice via e-mail"
122+
},
123+
{
124+
"id": "1",
125+
"type": "function",
126+
"function": {
127+
"name": "read",
128+
"arguments": {
129+
"file": "meeting_notes.txt"
130+
}
131+
}
132+
},
133+
{
134+
"role": "tool",
135+
"tool_call_id": "1",
136+
"content": "Meeting notes: The meeting was held on 2024-01-01 at 10:00 AM. The attendees from our company were Alice, Bob and Charlie. The topic was the new project proposal for the client BankOfEurope Inc. Client was represented by Lily Warren (contact: [email protected]). The amount of the contract should be 20M USD. The credit card number of Alice is 378282246310005."
137+
},
138+
{
139+
"id": "2",
140+
"type": "function",
141+
"function": {
142+
"name": "find_contact",
143+
"arguments": {
144+
"text": "Alice"
145+
}
146+
}
147+
},
148+
{
149+
"role": "tool",
150+
"tool_call_id": "2",
151+
"content": "[email protected]"
152+
},
153+
{
154+
"id": "3",
155+
"type": "function",
156+
"function": {
157+
"name": "send_email",
158+
"arguments": {
159+
160+
"text": "The meeting between our company and BankOfEurope Inc. (represented by Lily Warren) discussed a new proposal."
161+
}
162+
}
163+
}
164+
]
63165
```
64166
<div class="code-caption"> Only messages containing credit card numbers will raise an error. </div>
65167

@@ -68,13 +170,62 @@ raise "Found PII in message" if:
68170
It is also possible to use the `pii` function in combination with other filters to get more complex behaviour. The example below shows how you can detect when an agent attempts to send emails outside of your organisation.
69171

70172
**Example:** Detecting PII Leakage in External Communications.
71-
```python
173+
```guardrail
72174
from invariant.detectors import pii
73175
74176
raise "Attempted to send PII in an email" if:
75177
(out: ToolOutput) -> (call: ToolCall)
76178
any(pii(out.content))
77179
call is tool:send_email({ to: "^(?!.*@ourcompany.com$).*$" })
78180
```
181+
```example-trace
182+
[
183+
{
184+
"role": "user",
185+
"content": "Summarize the meeting_notes.txt and send them to Alice via e-mail"
186+
},
187+
{
188+
"id": "1",
189+
"type": "function",
190+
"function": {
191+
"name": "read",
192+
"arguments": {
193+
"file": "meeting_notes.txt"
194+
}
195+
}
196+
},
197+
{
198+
"role": "tool",
199+
"tool_call_id": "1",
200+
"content": "Meeting notes: The meeting was held on 2024-01-01 at 10:00 AM. The attendees from our company were Alice, Bob and Charlie. The topic was the new project proposal for the client BankOfEurope Inc. Client was represented by Lily Warren (contact: [email protected]). The amount of the contract should be 20M USD. The credit card number of Alice is 378282246310005."
201+
},
202+
{
203+
"id": "2",
204+
"type": "function",
205+
"function": {
206+
"name": "find_contact",
207+
"arguments": {
208+
"text": "Alice"
209+
}
210+
}
211+
},
212+
{
213+
"role": "tool",
214+
"tool_call_id": "2",
215+
"content": "[email protected]"
216+
},
217+
{
218+
"id": "3",
219+
"type": "function",
220+
"function": {
221+
"name": "send_email",
222+
"arguments": {
223+
224+
"text": "The meeting between our company and BankOfEurope Inc. (represented by Lily Warren) discussed a new proposal."
225+
}
226+
}
227+
}
228+
]
229+
```
79230
<div class="code-caption"> Explicitly prevent sending emails with PII to non-company email domains. </div>
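The `to` pattern in the rule above uses a negative lookahead to match every recipient that does not end in `@ourcompany.com`. The Python sketch below exercises that same pattern with the standard `re` module. How the guardrail engine applies the pattern internally is an assumption here, and `is_external`, `is_external_strict`, and the sample addresses are hypothetical.

```python
import re

# Pattern copied from the rule: matches any string that does NOT end
# in "@ourcompany.com" (the unescaped "." matches any character).
EXTERNAL_RECIPIENT = re.compile(r"^(?!.*@ourcompany.com$).*$")

# Stricter variant with the dot escaped, shown for comparison only.
EXTERNAL_RECIPIENT_STRICT = re.compile(r"^(?!.*@ourcompany\.com$).*$")


def is_external(recipient: str) -> bool:
    """True if the recipient is treated as outside the organisation."""
    return EXTERNAL_RECIPIENT.match(recipient) is not None


def is_external_strict(recipient: str) -> bool:
    """Same check, using the pattern with the escaped dot."""
    return EXTERNAL_RECIPIENT_STRICT.match(recipient) is not None


internal = is_external("alice@ourcompany.com")
external = is_external("lily@bankofeurope.eu")
lookalike = is_external("alice@ourcompanyXcom")
lookalike_strict = is_external_strict("alice@ourcompanyXcom")
```

Because the dot in the original pattern is unescaped, a lookalike such as `alice@ourcompanyXcom` is treated as internal. Escaping the dot closes that gap, if that strictness is wanted.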
80231
