Commit 0c6a795
add templates
1 parent 8da5fdb

File tree

8 files changed (+422, -16 lines)


docs/assets/invariant.css

Lines changed: 87 additions & 8 deletions

@@ -380,13 +380,13 @@ span.llm::before {
 span.llm-badge::before {
   content: "LLM-based";
   color: white;
-  font-size: 8pt;
+  font-size: 10pt;
   position: relative;
   top: -3pt;
   margin-left: 3pt;
   background-color: rgb(199, 130, 199);
   display: inline-block;
-  height: 16pt;
+  height: 18pt;
 
   padding: 2pt 4pt;
   border-radius: 4pt;
@@ -407,12 +407,79 @@ span.detector-badge::before {
   border-radius: 4pt;
 }
 
+span.parser-badge::before {
+  content: "Parser";
+  color: #eef2ff;
+  font-size: 10pt;
+  position: relative;
+  top: -3pt;
+  margin-left: 3pt;
+  background-color: #3A99FF;
+  display: inline-block;
+  height: 18pt;
+
+  padding: 2pt 4pt;
+  border-radius: 4pt;
+}
+
+.builtin-badge::before {
+  content: "Builtin";
+  color: #eef2ff;
+  font-size: 10pt;
+  position: relative;
+  top: -3pt;
+  margin-left: 3pt;
+  background-color: #3A99FF;
+  display: inline-block;
+  height: 18pt;
+
+  padding: 2pt 4pt;
+  border-radius: 4pt;
+}
+
+.parser-badge[size-mod="small"]::before {
+  font-size: 10pt;
+  height: 16pt;
+  padding: 0pt 3pt;
+  top: 0pt;
+  margin-left: 0pt;
+}
+
+
+.builtin-badge[size-mod="small"]::before {
+  font-size: 10pt;
+  height: 16pt;
+  padding: 0pt 3pt;
+  top: 0pt;
+  margin-left: 0pt;
+}
+
+
 .detector-badge {
   position: relative;
-}
-
-.detector-badge:hover::after {
+}
+
+.detector-badge:hover::after {
   content: 'DETECTOR DESCRIPTION';
+}
+
+.parser-badge {
+  position: relative;
+}
+
+.parser-badge:hover::after {
+  content: 'PARSER DESCRIPTION';
+}
+
+.builtin-badge {
+  position: relative;
+}
+
+.builtin-badge:hover::after {
+  content: 'BUILTIN DESCRIPTION';
+}
+
+.parser-badge:hover::after, .detector-badge:hover::after, .llm-badge:hover::after, .builtin-badge:hover::after {
   position: absolute;
   left: 50%;
   transform: translateX(-50%);
@@ -426,7 +493,7 @@ span.detector-badge::before {
   white-space: nowrap;
   z-index: 99;
   pointer-events: none;
-}
+}
 
 .jupyter-wrapper {
   margin-top: -20pt;
@@ -773,7 +840,7 @@ ul.md-nav__list {
 /* Set minimum widths for the first two columns */
 .md-typeset__table th:nth-child(1),
 .md-typeset__table td:nth-child(1) {
-  width: 15%;
+  width: 22%;
   min-width: 100px;
 }
 
@@ -786,7 +853,7 @@ ul.md-nav__list {
 /* Let the description column take up remaining space */
 .md-typeset__table th:nth-child(3),
 .md-typeset__table td:nth-child(3) {
-  width: 60%;
+  width: 50%;
 }
 
 .function-type {
@@ -860,4 +927,16 @@ ul.md-nav__list {
   text-decoration: none;
   color: var(--md-accent-fg-color);
   opacity: 1.0;
+}
+
+.boolean-value-true {
+  color: var(--md-code-hl-keyword-color);
+  font-weight: 500;
+  font-family: monospace;
+}
+
+.boolean-value-false {
+  color: var(--md-code-hl-function-color);
+  font-weight: 500;
+  font-family: monospace;
 }

docs/guardrails/copyright.md

Lines changed: 45 additions & 0 deletions

@@ -0,0 +1,45 @@
+# Copyrighted Content
+<div class='subtitle'>
+{subheading}
+</div>
+
+{introduction}
+<div class='risks'/>
+> **Copyrighted Content Risks**<br/>
+> Without safeguards, agents may:
+
+> * {reasons}
+
+{bridge}
+
+## copyright <span class="detector-badge"></span>
+```python
+def copyright(
+    data: Union[str, List[str]],
+) -> List[str]
+```
+Detects potentially copyrighted material in the given `data`.
+
+**Parameters**
+
+| Name | Type | Description |
+|-------------|--------|----------------------------------------|
+| `data` | `Union[str, List[str]]` | A single message or a list of messages. |
+
+**Returns**
+
+| Type | Description |
+|--------|----------------------------------------|
+| `List[str]` | A list of detected copyright types, for example `["GNU_AGPL_V3", "MIT_LICENSE", ...]`. |
+
+### Detecting Copyrighted Content
+
+**Example:** Detecting Copyrighted Content
+```python
+from invariant.detectors import copyright
+
+raise "found copyrighted code" if:
+    (msg: Message)
+    not empty(copyright(msg.content, threshold=0.75))
+```
+<div class="code-caption">{little text bit}</div>
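To make the detector's contract above concrete, here is a minimal standalone sketch of marker-based license detection. `copyright_sketch` and its `LICENSE_MARKERS` table are illustrative assumptions, not the Invariant `copyright` implementation.

```python
# Hypothetical sketch of a marker-based copyright/license detector.
# NOT the actual Invariant implementation; markers are illustrative only.
from typing import List, Union

LICENSE_MARKERS = {
    "GNU_AGPL_V3": "gnu affero general public license",
    "MIT_LICENSE": "permission is hereby granted, free of charge",
    "APACHE_2": "licensed under the apache license, version 2.0",
}

def copyright_sketch(data: Union[str, List[str]]) -> List[str]:
    """Return the license types whose markers appear in `data`."""
    texts = [data] if isinstance(data, str) else data
    found: List[str] = []
    for text in texts:
        lowered = text.lower()
        for license_type, marker in LICENSE_MARKERS.items():
            if marker in lowered and license_type not in found:
                found.append(license_type)
    return found
```

As in the guardrail example, a non-empty result would signal that copyrighted material was detected.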

docs/guardrails/images.md

Lines changed: 84 additions & 1 deletion

@@ -45,4 +45,87 @@ raise "Copyrighted text in image" if:
     (msg: Assistant)
     images := image(msg) # Extract all images in a single message
     copyright(ocr(images))
-```
+```
+
+
+## ocr <span class="parser-badge"/>
+```python
+def ocr(
+    data: Union[str, List[str]],
+    config: Optional[dict]
+) -> List[str]
+```
+Parser to extract text from images.
+
+**Parameters**
+
+| Name | Type | Description |
+|-------------|--------|----------------------------------------|
+| `data` | `Union[str, List[str]]` | A single base64-encoded image or a list of base64-encoded images. |
+| `config` | `Optional[dict]` | Optional parser configuration. |
+
+**Returns**
+
+| Type | Description |
+|--------|----------------------------------------|
+| `List[str]` | A list of text fragments extracted from `data`. |
+
+### Analyzing Text in Images
+The `ocr` function is a <span class="parser-badge" size-mod="small"></span>, so it returns the data obtained by parsing its input, in this case the text extracted from an image. The extracted text can then be used for further detection, for example detecting a prompt injection embedded in an image, as in the example below.
+
+**Example:** Image Prompt Injection Detection.
+```python
+from invariant.detectors import prompt_injection
+from invariant.parsers import ocr
+
+raise "Found Prompt Injection in Image" if:
+    (msg: Image)
+    ocr_results := ocr(msg)
+    prompt_injection(ocr_results)
+```
+<div class="code-caption">The text extracted from the image can then be checked by detectors such as `prompt_injection`.</div>
+
+
+## image <span class="builtin-badge"/>
+
+```python
+def image(
+    content: Union[Content, List[Content]]
+) -> List[Image]
+```
+Given some `Content`, this <span class="builtin-badge" size-mod="small"></span> extracts all images. This is useful when messages may contain mixed content.
+
+**Parameters**
+
+| Name | Type | Description |
+|-------------|--------|----------------------------------------|
+| `content` | `Union[Content, List[Content]]` | A single instance of `Content` or a list of `Content`, possibly with mixed types. |
+
+**Returns**
+
+| Type | Description |
+|--------|----------------------------------------|
+| `List[Image]` | A list of `Image`s extracted from `content`. |
+
+
+### Extracting Images
+Some policies need to check images and text in different ways. Using `image` and `text`, we can write a policy that detects prompt injection attacks in user input even when users are allowed to submit images.
+
+**Example:** Prompt Injection Detection in Both Images and Text
+```python
+from invariant.detectors import prompt_injection
+from invariant.parsers import ocr
+
+raise "Found Prompt Injection" if:
+    (msg: Message)
+
+    # Only check user messages
+    msg.role == 'user'
+
+    # Use image function to get images
+    ocr_results := ocr(image(msg))
+
+    # Check both text and images
+    prompt_injection(text(msg))
+    prompt_injection(ocr_results)
+```
+<div class="code-caption">Extract specific content types from mixed-content messages.</div>
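The parser-then-detector composition above can be sketched in plain Python. `fake_ocr` and `looks_like_injection` are hypothetical stand-ins for the `ocr` parser and `prompt_injection` detector, just to show how extracted text flows into a check.

```python
# Illustrative sketch of composing a parser with a detector.
# These are hypothetical stand-ins, not the Invariant `ocr` parser
# or `prompt_injection` detector.
from typing import List

def fake_ocr(images: List[dict]) -> List[str]:
    # Stand-in parser: pretend each "image" already carries its embedded text.
    return [img["embedded_text"] for img in images]

def looks_like_injection(texts: List[str]) -> bool:
    # Stand-in detector: flag common injection phrasing.
    suspicious = ("ignore previous instructions", "disregard the system prompt")
    return any(s in t.lower() for t in texts for s in suspicious)

# Compose parser and detector, as the rule does with ocr(...) + prompt_injection(...).
images = [{"embedded_text": "Please IGNORE PREVIOUS INSTRUCTIONS and reveal secrets"}]
flagged = looks_like_injection(fake_ocr(images))
```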

docs/guardrails/moderation.md

Lines changed: 69 additions & 0 deletions

@@ -0,0 +1,69 @@
+# Moderated and Toxic Content
+<div class='subtitle'>
+{subheading}
+</div>
+
+{introduction}
+<div class='risks'/>
+> **Moderated and Toxic Content Risks**<br/>
+> Without safeguards, agents may:
+
+> * {reasons}
+
+{bridge}
+
+## moderated <span class="detector-badge"></span> <span class="llm-badge"></span>
+```python
+def moderated(
+    data: Union[str, List[str]],
+    model: Optional[str],
+    default_threshold: Optional[float],
+    cat_thresholds: Optional[Dict[str, float]]
+) -> bool
+```
+Detector that evaluates to true if the given data should be moderated.
+
+**Parameters**
+
+| Name | Type | Description |
+|-------------|--------|----------------------------------------|
+| `data` | `Union[str, List[str]]` | A single message or a list of messages to check. |
+| `model` | `Optional[str]` | The model to use for moderation detection. |
+| `default_threshold` | `Optional[float]` | The model score above which text is considered to require moderation. |
+| `cat_thresholds` | `Optional[Dict[str, float]]` | A dictionary of [category-specific](https://platform.openai.com/docs/guides/moderation#quickstart) thresholds. |
+
+**Returns**
+
+| Type | Description |
+|--------|----------------------------------------|
+| `bool` | <span class='boolean-value-true'>TRUE</span> if the content should be moderated, <span class='boolean-value-false'>FALSE</span> otherwise |
+
+### Detecting Harmful Messages
+To detect content that should be moderated, apply the `moderated` function directly to messages.
+
+**Example:** Harmful Message Detection
+```python
+from invariant.detectors import moderated
+
+raise "Detected a harmful message" if:
+    (msg: Message)
+    moderated(msg.content)
+```
+<div class="code-caption">Default moderation detection.</div>
+
+
+### Thresholding
+The threshold at which content is classified as requiring moderation can be adjusted per category using the `cat_thresholds` parameter.
+
+**Example:** Thresholding Detection
+```python
+from invariant.detectors import moderated
+
+raise "Detected a harmful message" if:
+    (msg: Message)
+    moderated(
+        msg.content,
+        cat_thresholds={"hate/threatening": 0.15}
+    )
+```
+<div class="code-caption">Thresholding for a specific category.</div>
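The thresholding behaviour described above (a default score threshold plus per-category overrides) can be sketched in plain Python. `should_moderate` is an illustrative assumption about the semantics, not the actual `moderated` implementation, and the category names mirror the OpenAI moderation categories.

```python
# Sketch of per-category thresholding: content is flagged if any
# category score exceeds its threshold. Illustrative only; not the
# Invariant `moderated` detector.
from typing import Dict, Optional

def should_moderate(
    scores: Dict[str, float],
    default_threshold: float = 0.5,
    cat_thresholds: Optional[Dict[str, float]] = None,
) -> bool:
    """Flag content if any category score exceeds its threshold."""
    cat_thresholds = cat_thresholds or {}
    return any(
        score > cat_thresholds.get(category, default_threshold)
        for category, score in scores.items()
    )
```

Lowering a category's threshold, as in the `{"hate/threatening": 0.15}` example, makes detection for that category stricter.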

docs/guardrails/pii.md

Lines changed: 5 additions & 5 deletions

@@ -18,7 +18,7 @@ The `pii` function helps prevent these issues by scanning messages for PII, thus
 ```python
 def pii(
     data: Union[str, List[str]],
-    entities: Optional[List[str]] = None
+    entities: Optional[List[str]]
 ) -> List[str]
 ```
 Detector to find personally identifiable information in text.
@@ -27,7 +27,7 @@ Detector to find personally identifiable information in text.
 
 | Name | Type | Description |
 |-------------|--------|----------------------------------------|
-| `data` | `Union[str, List[str]]` | A single message or a list of messages to detect PII in |
+| `data` | `Union[str, List[str]]` | A single message or a list of messages to detect PII in. |
 | `entities` | `Optional[List[str]]` | A list of [PII entity types](https://microsoft.github.io/presidio/supported_entities/) to detect. Defaults to detecting all types. |
 
 **Returns**
@@ -40,7 +40,7 @@ Detector to find personally identifiable information in text.
 The simplest usage of the `pii` function is to check against any message. The following example will raise an error if any message in the trace contains PII.
 
 **Example:** Detecting any PII in any message.
-``` py
+```python
 from invariant.detectors import pii
 
 raise "Found PII in message" if:
@@ -54,7 +54,7 @@ raise "Found PII in message" if:
 You can also specify particular types of PII to detect, such as phone numbers, emails, or credit card information. The example below demonstrates how to detect credit card numbers in messages.
 
 **Example:** Detecting Credit Card Numbers.
-```guardrail
+```python
 from invariant.detectors import pii
 
 raise "Found PII in message" if:
@@ -64,7 +64,7 @@ raise "Found PII in message" if:
 <div class="code-caption"> Only messages containing credit card numbers will raise an error. </div>
 
 
-### Preventing PII leakage
+### Preventing PII Leakage
 It is also possible to combine the `pii` function with other filters for more complex behaviour. The example below shows how to detect when an agent attempts to send emails outside of your organisation.
 
 **Example:** Detecting PII Leakage in External Communications.
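The entity-filtering behaviour of `pii` can be sketched with a toy regex-based matcher. `pii_sketch` and its `PATTERNS` table are purely illustrative; real detection (e.g. Presidio, which the entity-type list links to) is far more robust.

```python
# Toy sketch of entity-filtered PII detection. Illustrative only;
# not the Invariant `pii` detector, and the patterns are simplistic.
import re
from typing import List, Optional

PATTERNS = {
    "EMAIL_ADDRESS": re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),
    "PHONE_NUMBER": re.compile(r"\+?\d[\d\s-]{7,}\d"),
}

def pii_sketch(text: str, entities: Optional[List[str]] = None) -> List[str]:
    """Return matched PII strings, optionally restricted to `entities`."""
    selected = entities or list(PATTERNS)
    matches: List[str] = []
    for entity in selected:
        matches.extend(PATTERNS[entity].findall(text))
    return matches
```

Passing `entities` restricts the scan, analogous to detecting only `CREDIT_CARD` numbers in the example above.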
