
Commit 8786f53

style + content
1 parent cbe856c commit 8786f53

File tree

7 files changed

+101
-17
lines changed

docs/assets/info.svg

Lines changed: 1 addition & 0 deletions

docs/assets/invariant.css

Lines changed: 17 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -454,7 +454,7 @@ label.md-nav__title {
454454
flex-wrap: wrap;
455455
flex-direction: row;
456456
padding: 4pt;
457-
padding-left: 3pt;
457+
padding-left: 4pt;
458458
padding-top: 9pt;
459459
align-items: flex-start;
460460
justify-content: flex-start;
@@ -683,6 +683,22 @@ ul.md-nav__list {
683683
margin-top: -5pt;
684684
}
685685

686+
.info blockquote {
687+
background-color: rgb(243, 245, 254);
688+
border: 2pt solid #8766ff !important;
689+
}
690+
691+
.info blockquote>p>strong:first-child {
692+
margin-bottom: 10pt;
693+
display: inline-block;
694+
padding-left: 25pt;
695+
696+
background: url("../assets/info.svg") no-repeat 3pt 1pt;
697+
background-size: 1.2em;
698+
padding-top: -1pt;
699+
margin-top: -5pt;
700+
}
701+
686702
.box.secondary {
687703
position: relative;
688704
}

docs/guardrails/introduction.md

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,11 @@ Guardrailing agents can be a complex undertaking, as it involves understanding t
88

99
In this chapter, we will cover the fundamentals of guardrailing with Invariant, with a primary focus on how Invariant allows you to write strict and fuzzy rules that precisely constrain your agent's behavior.
1010

11+
<div class="info"/>
12+
> **Get Started Directly**<br/>
13+
> Just looking to get started quickly? Take a look at our concise [rule writing reference](./rules.md) to jump right into code. This document serves as a more general introduction to the concepts of how to write rules with Invariant.
14+
15+
1116
## Understanding Your Agent's Capabilities
1217

1318
Before securing an agent, it is important to understand its capabilities. This includes understanding the tools and functions that the agent can call, as well as the parameters that can be passed to these functions. For example: can it access private information or sensitive data, can it send emails, and can it take destructive actions like deleting files or making payments?
Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,4 @@
1+
2+
## `prompt_injection(content: str, threshold: number = 0.9)`
3+
4+
Checks for prompt injections in the provided piece of content.

docs/guardrails/rules.md

Lines changed: 39 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,42 @@
1-
# Reference Document for Rule Writing
1+
# Reference for Rule Writing
22

33
<div class="subtitle">
44
A concise reference for writing guardrailing rules with Invariant.
5-
</div>
5+
</div>
6+
7+
## Setting Up Your LLM Client
8+
9+
To get started with guardrailing, you have to set up your LLM client to use [Invariant Gateway](../gateway/index.md):
10+
11+
**Example:** Setting Up Your OpenAI client to use Guardrails
12+
```python hl_lines='8 9 10 16 17 18 19 20 21 22 23 24'
13+
import os
14+
from openai import OpenAI
15+
16+
# 1. Guardrailing Rules
17+
18+
guardrails = """
19+
raise "Rule 1: Do not talk about Fight Club" if:
20+
(msg: Message)
21+
"fight club" in msg.content
22+
"""
23+
24+
25+
# 2. Gateway Integration
26+
27+
client = OpenAI(
28+
default_headers={
29+
"Invariant-Authorization": "Bearer " + os.getenv("INVARIANT_API_KEY"),
30+
"Invariant-Guardrails": guardrails.encode("unicode_escape"),
31+
},
32+
base_url="https://explorer.invariantlabs.ai/api/v1/gateway/openai",
33+
)
34+
35+
# 3. Using the model
36+
client.chat.completions.create(
37+
messages=[{"role": "user", "content": "What do you know about Fight Club?"}],
38+
model="gpt-4o",
39+
)
40+
```
41+
42+
Before you run the example, make sure to export the relevant environment variables, including `INVARIANT_API_KEY` [(get one here)](https://explorer.invariantlabs.ai/settings), which you'll need to access Gateway and our low-latency Guardrailing API.
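
The `unicode_escape` step in the example above is what lets a multi-line rule travel in a single HTTP header value. The following is a minimal sketch of that step in isolation. `build_gateway_headers` is a hypothetical helper, not part of the Invariant or OpenAI SDKs; it only assembles the two headers shown above and fails early if no API key is available.

```python
import os
from typing import Optional


def build_gateway_headers(guardrails: str, api_key: Optional[str] = None) -> dict:
    """Hypothetical helper: assemble the two headers used in the example above."""
    api_key = api_key or os.getenv("INVARIANT_API_KEY")
    if not api_key:
        # Fail with a clear message instead of a TypeError on "Bearer " + None.
        raise RuntimeError("INVARIANT_API_KEY is not set")
    return {
        "Invariant-Authorization": "Bearer " + api_key,
        # unicode_escape replaces each real newline with the two characters
        # backslash + "n", so the rule becomes a single-line bytes value.
        "Invariant-Guardrails": guardrails.encode("unicode_escape"),
    }


rules = (
    'raise "Rule 1: Do not talk about Fight Club" if:\n'
    "    (msg: Message)\n"
    '    "fight club" in msg.content\n'
)
headers = build_gateway_headers(rules, api_key="test-key")
```

The resulting dictionary can be passed as `default_headers` when constructing the client, as in the example above.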

docs/guardrails/tool-calls.md

Lines changed: 8 additions & 8 deletions
Original file line numberDiff line numberDiff line change
@@ -33,7 +33,7 @@ To prevent tool calling related risks, Invariant offers a wide range of options
3333
To match a specific tool call in a guardrailing rule, you can use `call is tool:<tool_name>` expressions. This allows you to only match a specific tool call, and apply guardrailing rules to it.
3434

3535
**Example**: Matching all `send_email` tool calls
36-
```python
36+
```guardrail
3737
raise "Must not send any emails" if:
3838
(call: ToolCall)
3939
call is tool:send_email
@@ -46,7 +46,7 @@ This rule will trigger for all tool calls to function `send_email`, disregarding
4646
Tool calls can also be matched by their parameters. This allows you to match only tool calls with specific parameters, e.g. to block them or to restrict the tool interface exposed to the agent.
4747

4848
**Example**: Matching a `send_email` tool call with a specific recipient
49-
```python
49+
```guardrail
5050
raise "Must not send any emails to Alice" if:
5151
(call: ToolCall)
5252
call is tool:send_email({
@@ -59,7 +59,7 @@ raise "Must not send any emails to Alice" if:
5959
Similarly, you can use regex matching to match tool calls with specific parameters. This allows you to match specific tool calls with specific parameters, and apply guardrailing rules to them.
6060

6161
**Example**: Matching `send_email` calls with a specific recipient domain
62-
```python
62+
```guardrail
6363
raise "Must not send any emails to <anyone>@disallowed.com" if:
6464
(call: ToolCall)
6565
call is tool:send_email({
@@ -72,7 +72,7 @@ raise "Must not send any emails to <anyone>@disallowed.com" if:
7272
You can also use content matching to match tool arguments with certain properties, like whether they contain personally identifiable information (PII), or whether they are flagged as toxic or inappropriate. This lets you apply guardrailing rules based on what an argument contains rather than on its exact value.
7373

7474
**Example**: Prevent `send_email` calls with phone numbers in the message body.
75-
```python
75+
```guardrail
7676
raise "Must not send any emails containing phone numbers" if:
7777
(call: ToolCall)
7878
call is tool:send_email({
@@ -86,7 +86,7 @@ This type of content matching also works for other types of content, including `
8686

8787
Alternatively, you can also directly use `invariant.detectors.pii` on the tool call arguments like so:
8888

89-
```python
89+
```guardrail
9090
from invariant.detectors import pii
9191
9292
raise "Must not send any emails to <anyone>@disallowed.com" if:
@@ -102,7 +102,7 @@ raise "Must not send any emails to <anyone>@disallowed.com" if:
102102
Similar to tool calls, you can check and validate tool outputs.
103103

104104
**Example**: Raise an error if PII is detected in the tool output
105-
```python
105+
```guardrail
106106
raise "PII in tool output" if:
107107
(out: ToolOutput)
108108
len(pii(out.content)) > 0
@@ -113,7 +113,7 @@ raise "PII in tool output" if:
113113
You can also check only certain tool outputs, e.g. to only check the output of a specific tool call.
114114

115115
**Example**: Raise an error if moderated content is detected in the output of a specific tool
116-
```python
116+
```guardrail
117117
from invariant.detectors import moderated
118118
119119
raise "Moderated content in tool output" if:
@@ -130,7 +130,7 @@ Here, only if the `read_website` tool call returns moderated content, the rule w
130130
To limit your guardrailing rule to a list of different tools, you can also access a tool's name directly:
131131

132132
**Example**: Raise an error if any of the banned tools is used.
133-
```python
133+
```guardrail
134134
raise "Banned tool used" if:
135135
(call: ToolCall)
136136
call.function.name in ["send_email", "delete_file"]
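
To make the matching semantics above concrete, here is a plain-Python sketch of what the name-based and regex-based rules check. It is an illustration only, not how Invariant evaluates rules: the dictionary shape of a tool call and the `to` argument name are assumptions, and both helper functions are hypothetical.

```python
import re


def uses_banned_tool(call: dict, banned: set) -> bool:
    """Analogue of `call.function.name in [...]` for a dict-shaped tool call."""
    return call["function"]["name"] in banned


def emails_blocked_domain(call: dict, domain: str) -> bool:
    """Analogue of a regex match on the recipient of a `send_email` call.

    Assumes the recipient is stored under the (hypothetical) key "to".
    """
    if call["function"]["name"] != "send_email":
        return False
    recipient = call["function"]["arguments"].get("to", "")
    # Anchor at the end so only addresses in exactly this domain match.
    return re.search(r"@" + re.escape(domain) + r"$", recipient) is not None


call = {"function": {"name": "send_email", "arguments": {"to": "bob@disallowed.com"}}}
banned_tools = {"send_email", "delete_file"}
```

A guardrailing rule expresses the same checks declaratively, so no such helper code is needed in the agent itself.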

docs/index.md

Lines changed: 27 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -10,22 +10,43 @@ Integrate Invariant's contextual guardrailing for high-precision agent security,
1010

1111
Invariant is a **security layer to protect agentic AI systems**. It helps you prevent prompt injections and data leaks, steer your agent's behavior, and ensure compliance with your organization's policies.
1212

13-
Using a **highly-expressive and self-learning guardrailing system**, Invariant offers precise dataflow and steering capabilities, ensuring that your agents are secure and reliable.
14-
15-
You can **deploy Invariant within minutes**, using our hosted gateway, to ensure quick response to agent security incidents and to prevent prompt injections and data leaks.
13+
You can **deploy Invariant within minutes using our hosted gateway**, to ensure quick response to agent security incidents and to get your agent ready for production.
1614

1715
### How Invariant Works
1816

19-
Invariant acts as a transparent layer between your agent system and the LLM and tool providers. It intercepts all LLM calls and tool actions, and applies guardrailing rules according to a user-specified security policy, i.e. your guardrailing rules.
17+
Invariant acts as a transparent layer between your agent system and the LLM and tool providers. It intercepts all LLM calls and tool actions, and applies steering rules according to the provided guardrailing policies.
18+
19+
Policies are defined in terms of both [deterministic and fuzzy rules](./guardrails/). During operation, your agent is continuously evaluated against them to restrict its behavior and prevent malfunction and abuse.
2020

21-
It does not require any invasive code changes, and can be used with any agent system, framework and LLM.
21+
Invariant does not require invasive code changes, and can be used with any agent, framework and LLM.
2222

2323
<br/><br/>
2424
<img src="./assets/invariant-overview.svg" alt="Invariant Architecture" class="invariant-architecture" style="display: block; margin: 0 auto; width: 100%; max-width: 500pt;"/>
2525
<br/><br/>
2626

27+
In this setup, a simple Invariant rule that safeguards against data leakage flows in an agent looks like this:
28+
29+
```python
30+
raise "agent leaks internal data" if:
31+
# check all flows between tool calls
32+
(output: ToolOutput) -> (call: ToolCall)
33+
# detects sensitive data in the first output
34+
is_sensitive(output.content)
35+
# detects a potentially sensitive action like sending an email
36+
call is tool:send_email
37+
```
38+
39+
Many security rules like these ship out-of-the-box with Invariant, and you can easily define your own rules to suit your needs and policies.
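
As a rough mental model of the flow rule above, the following plain-Python sketch scans a trace for a sensitive tool output that is later followed by a `send_email` call. It is not Invariant's implementation: the event dictionaries, the `leaks_internal_data` function, and the keyword-based `is_sensitive` stand-in are all assumptions made for illustration, and `->` is approximated here as "occurs later in the trace".

```python
def is_sensitive(text: str) -> bool:
    """Placeholder for the rule's is_sensitive check (keyword match only)."""
    return "CONFIDENTIAL" in text


def leaks_internal_data(trace: list) -> bool:
    """Return True if a sensitive tool output precedes a send_email call."""
    seen_sensitive_output = False
    for event in trace:
        if event["type"] == "tool_output" and is_sensitive(event["content"]):
            seen_sensitive_output = True
        elif (
            event["type"] == "tool_call"
            and event["name"] == "send_email"
            and seen_sensitive_output
        ):
            return True
    return False


leaky_trace = [
    {"type": "tool_call", "name": "read_file"},
    {"type": "tool_output", "content": "CONFIDENTIAL: Q3 forecast"},
    {"type": "tool_call", "name": "send_email"},
]
safe_trace = [
    {"type": "tool_call", "name": "send_email"},
    {"type": "tool_output", "content": "CONFIDENTIAL: Q3 forecast"},
]
```

Note that the order of events matters: an email sent before any sensitive output appears does not count as a leak in this sketch.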
40+
2741
This documentation describes how to set up Invariant and the relevant guardrailing rules for your agent systems such that you can secure your agents and prevent them from engaging in malicious behavior.
2842

43+
<div class='tiles'>
44+
<a href="#getting-started-as-developer" class='tile primary'>
45+
<span class='tile-title'>Get Started As Developer →</span>
46+
<span class='tile-description'>Deploy your first guardrailing rules with Gateway</span>
47+
</a>
48+
</div>
49+
2950
## Why You Need A Security Layer for Agents
3051

3152
Invariant helps you make sure that your agents are safe from malicious actors and prevents fatal malfunction:
@@ -187,7 +208,7 @@ You can use each tool independently, or in combination with each other. The foll
187208
</div>
188209
<div class='offline'>
189210
<div class='title'>Trace Analysis</div>
190-
<a class='box fill' href='https://github.com/invariantlabs-ai/invariant?tab=readme-ov-file#analyzer'>
211+
<a class='box fill' href='./guardrails'>
191212
<p>Guardrails <i class='more'>↗ </i></p>
192213
<i>Steer and protect your agents</i>
193214
</a>
