docs/guardrails/code-validation.md
72 additions & 89 deletions
@@ -5,18 +5,14 @@ Secure the code that your agent generates and executes.
5
5
</div>
6
6
7
7
Code validation is a critical component of any code-generating LLM system, as it helps ensure that the code the LLM generates is safe and secure. Guardrails provides a simple way to validate this code, using a set of integrations and code-parsing capabilities.
8
+
9
+
!!! danger "Code Validation Risks"
10
+
Code validation is a critical component of any code-generating LLM system. An insecure agent could:
8
11
9
-
<div class='risks'/>
10
-
> **Code Validation Risks**<br/>
11
-
> Code validation is a critical component of any code-generating LLM system. An insecure agent could:
12
-
13
-
> * Generate code that contains **security vulnerabilities**, such as SQL injection or cross-site scripting
14
-
15
-
> * Generate code that **contains bugs or errors**, causing the system to crash or behave unexpectedly
16
-
17
-
> * Produce code that escapes a **sandboxed execution environment**
18
-
19
-
> * Generate code that is **not well-formed or does not follow best practices**, causing the system to be difficult to maintain or understand
12
+
- Generate code that contains **security vulnerabilities**, such as SQL injection or cross-site scripting
13
+
- Generate code that **contains bugs or errors**, causing the system to crash or behave unexpectedly
14
+
- Produce code that escapes a **sandboxed execution environment**
15
+
- Generate code that is **not well-formed or does not follow best practices**, causing the system to be difficult to maintain or understand
20
16
21
17
To validate code, Invariant allows you to invoke external code-checking tools as part of the guardrailing process. That means with Invariant you can build code validation right into your LLM layer, without worrying about it on the agent side.
| `data` | `Union[str, List[str]]` | A single message or a list of messages to detect PII in. |
92
+
| `entities` | `Optional[List[str]]` | A list of [PII entity types](https://microsoft.github.io/presidio/supported_entities/) to detect. Defaults to detecting all types. |
93
+
94
+
**Returns**
90
95
91
-
### `def python_code(data: str | list | dict, ipython_mode=False)`
| `List[str]` | A list of all the detected PII in `data` | -->
99
+
100
+
## python_code <span class="detector-badge"/>
101
+
```python
102
+
def python_code(
103
+
data: Union[str, List[str]],
104
+
ipython_mode: bool = False
105
+
) -> List[str]
106
+
```
92
107
93
108
Parses the provided Python code and returns a `PythonDetectorResult` object containing the following fields:
109
+
## Static Code Analysis
94
110
95
-
**Parameters:**
111
+
Static code analysis allows for powerful pattern-based detection of vulnerabilities and insecure coding practices. Invariant integrates [Semgrep](https://semgrep.dev) directly into your guardrails, enabling deep analysis of assistant-generated code before it's executed.
96
112
97
-
- `data` (str | list | dict): The Python code to be parsed. This can be a string, a list of strings, or a dictionary.
113
+
!!! danger "Static Analysis Risks"
114
+
Without static analysis, an insecure agent may:
98
115
99
-
- `ipython_mode` (bool): If set to `True`, the code will be parsed in IPython mode. This is useful for parsing code that uses IPython-specific features or syntax.
116
+
* Use **insecure code constructs** like `os.system(input())`
117
+
* Execute **command injection attacks** via unsafe shell commands
118
+
* Introduce **hardcoded secrets** or credentials
119
+
* Violate internal **security or style policies**
100
120
121
+
You can use `semgrep` within a guardrail to scan code in Python, Bash, and other supported languages.
* `PythonDetectorResult.imports`: A list of the modules imported in the provided code. It is useful for identifying which libraries or modules the code uses.
131
+
Scans the given code using [Semgrep](http://semgrep.dev) and returns a list of `CodeIssue` objects.
105
132
106
-
* `PythonDetectorResult.builtins`: A list of built-in functions used in the provided code.
133
+
**Parameters**
107
134
108
-
* `PythonDetectorResult.syntax_error`: A boolean flag indicating whether the provided code has syntax errors.
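
To make the three fields listed above concrete, here is a minimal sketch of how similar information can be extracted with Python's standard-library `ast` module. It is not Invariant's implementation: `ParsedCode` and `parse_python` are hypothetical names introduced only for illustration, and the real detector may collect these fields differently.

```python
import ast
import builtins as py_builtins
from dataclasses import dataclass, field
from typing import List


@dataclass
class ParsedCode:
    # Hypothetical stand-in mirroring the three documented fields.
    imports: List[str] = field(default_factory=list)
    builtins: List[str] = field(default_factory=list)
    syntax_error: bool = False


def parse_python(code: str) -> ParsedCode:
    """Collect imported modules, referenced built-ins, and a syntax-error flag."""
    result = ParsedCode()
    try:
        tree = ast.parse(code)
    except SyntaxError:
        # Code that does not parse yields only the error flag.
        result.syntax_error = True
        return result

    for node in ast.walk(tree):
        if isinstance(node, ast.Import):
            # `import os, sys` contributes "os" and "sys".
            result.imports.extend(alias.name for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            # `from os import path` contributes "os".
            result.imports.append(node.module)
        elif isinstance(node, ast.Name) and hasattr(py_builtins, node.id):
            # Any bare name that matches a Python built-in, e.g. `input`.
            result.builtins.append(node.id)
    return result
```

A guardrail could then, for example, reject code when `syntax_error` is set or when `imports` contains a module that is not on an allow list.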
| `List[CodeIssue]` | List of issues, each with a description and severity |
113
145
114
-
### `def ipython_code(data: str | list | dict)`
115
146
116
-
Same as `python_code`, but for [IPython](https://ipython.org/) code. This function is useful for parsing code that uses IPython-specific features or syntax, e.g. code that runs in a Jupyter notebook.
147
+
### `CodeIssue` objects
117
148
149
+
A code issue is represented as a `CodeIssue` object with the following fields:
Use [`semgrep`](https://semgrep.dev) to perform deep static analysis and identify potential vulnerabilities, bad practices, or policy violations in code. It complements `python_code` by enabling more powerful pattern-based detection.
160
+
---
122
161
162
+
### Example Usage
123
163
124
-
**Example:** Preventing Dangerous Patterns in Python Code
Use Semgrep to enforce secure coding practices on any assistant-generated code _before_ execution. -->
234
+
- Custom security policies
229
235
230
-
**Parameters:**
231
-
232
-
- `data`: Code to scan. Can be a `str`, `list`, or `dict`.
233
-
- `lang`: Programming language (e.g., `'python'`, `'javascript'`).
234
-
235
-
**Returns:**
236
-
237
-
A list of `CodeIssue` objects:
238
-
```python
239
-
class CodeIssue(BaseModel):
240
-
description: str
241
-
severity: CodeSeverity # "HIGH", "MEDIUM", or "LOW"
242
-
```
243
-
244
-
Here, `description` is a string describing the issue, and `severity` is an enum indicating the severity level of the issue (e.g., "HIGH", "MEDIUM", or "LOW"). You can use these fields in your guardrails logic to raise exceptions or take other actions based on the detected issues.
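
As a rough illustration of acting on these two fields, the sketch below filters a list of issues by severity. The `CodeIssue` and `CodeSeverity` classes here are simplified stand-ins that only mirror the documented fields (the real `CodeIssue` is a pydantic `BaseModel`), and `blocking_issues` is a hypothetical helper, not part of Invariant.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List


class CodeSeverity(str, Enum):
    # Stand-in enum; the three values come from the comment in the class above.
    HIGH = "HIGH"
    MEDIUM = "MEDIUM"
    LOW = "LOW"


@dataclass
class CodeIssue:
    # Stand-in with the two documented fields.
    description: str
    severity: CodeSeverity


# Ordering used to compare severities (an assumption made for this sketch).
_RANK = {CodeSeverity.LOW: 0, CodeSeverity.MEDIUM: 1, CodeSeverity.HIGH: 2}


def blocking_issues(
    issues: List[CodeIssue],
    threshold: CodeSeverity = CodeSeverity.HIGH,
) -> List[CodeIssue]:
    """Return the issues whose severity is at or above `threshold`."""
    return [issue for issue in issues if _RANK[issue.severity] >= _RANK[threshold]]
```

With a helper like this, calling code can decide to raise or block whenever the filtered list is non-empty, and can lower `threshold` for stricter policies.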
> Since tools are an agent's interface to interact with the world, they can also be used to perform actions that are harmful or undesired. For example, an insecure agent could:
17
+
!!! danger "Tool Calling Risk"
18
+
Since tools are an agent's interface to interact with the world, they can also be used to perform actions that are harmful or undesired. For example, an insecure agent could:
20
19
21
-
> * Leak sensitive information, e.g. via a `send_email` function
20
+
* Leak sensitive information, e.g. via a `send_email` function
22
21
23
-
> * Delete an important file, via a `delete_file` or a `bash` command
22
+
* Delete an important file, via a `delete_file` or a `bash` command
24
23
25
-
> * Make a payment to an attacker
24
+
* Make a payment to an attacker
26
25
27
-
> * Send a message to a user with sensitive information
26
+
* Send a message to a user with sensitive information
28
27
29
28
To prevent tool calling related risks, Invariant offers a wide range of options to limit, constrain, validate and block tool calls. This chapter describes the different options available to you, and how to use them.