Commit c5c0e32

DrewDennison and claude committed
Add 18 new rules for MCP, agentic execution, config files, and hooks
New rule categories covering emerging AI security attack surfaces:

- MCP Server Security (6 rules): command injection, SSRF, tool poisoning, unsanitized returns, credential leaks, hardcoded config secrets
- Agent Config File Security (5 rules): hidden Unicode in AI config files, IDE executable path hijacking, bypass permissions, API URL overrides, auto-enable MCP servers
- Agentic Code Execution (3 rules): LLM output to eval/exec/subprocess, dangerous LangChain utilities, unbounded agent loops
- Extended Hooks (4 rules): unconditional permission allow, stop hook infinite loops, DNS exfiltration, wget-pipe-bash RCE

Informed by OWASP Agentic Top 10, MCP security research, Pillar Security Rules File Backdoor, CVE-2025-55284, IDEsaster, and Trail of Bits guidance.

58 rules, 102 sub-rules. All validated, 102/102 tests passing.

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
1 parent dd4e633 commit c5c0e32

39 files changed: +1065 -5 lines changed

AGENTS.md

Lines changed: 2 additions & 2 deletions
@@ -2,7 +2,7 @@

 ## Project Overview

-This repo contains Semgrep static analysis rules that detect trust and safety gaps in LLM-powered applications. Rules cover 6 providers (OpenAI, Anthropic, Google Gemini, Cohere, Mistral, Hugging Face) across Python, JS/TS, Go, Java, and Ruby.
+This repo contains Semgrep static analysis rules that detect trust and safety gaps in LLM-powered applications. Rules cover 6 providers (OpenAI, Anthropic, Google Gemini, Cohere, Mistral, Hugging Face), MCP servers, LangChain, and Claude Code/Cursor hooks across Python, JS/TS, Go, Java, Ruby, and Bash/Generic.

 ## Repo Structure

@@ -75,5 +75,5 @@ CWE mapping:

 ## Current Stats

-- 40 YAML rule files, 83 individual sub-rules
+- 58 YAML rule files, 102 individual sub-rules
 - All rules validated, all tests passing

README.md

Lines changed: 46 additions & 3 deletions
@@ -2,7 +2,7 @@

 Semgrep rules that catch common trust & safety mistakes in LLM-powered applications. Scan any codebase in seconds to find hardcoded API keys, missing safety checks, prompt injection risks, and unhandled errors across all major AI providers.

-**40 rules | 83 sub-rules | 6 providers + Claude Code & Cursor hooks | 6 languages**
+**58 rules | 102 sub-rules | 6 providers + MCP + Claude Code & Cursor hooks + LangChain | 7 languages**

 ## Quick Start

@@ -23,18 +23,24 @@ semgrep --config ai-best-practices/rules/ /path/to/your/project/
 | **No error handling** | API calls outside try/except blocks | WARNING |
 | **Missing moderation** | Chat completions without moderation checks | WARNING |
 | **Hooks security** | Unsafe input handling, path traversal, command injection in Claude Code and Cursor hooks | WARNING/ERROR |
+| **MCP server flaws** | Command injection, SSRF, tool poisoning, credential leaks in MCP servers | ERROR |
+| **Agentic code execution** | LLM output flowing to `eval`/`exec`/`subprocess`, dangerous LangChain utilities | ERROR |
+| **Config file attacks** | Hidden Unicode in AI config files, unsafe IDE/agent settings | ERROR |

 ## Providers & Languages

-| | Python | JS/TS | Go | Java | Ruby | Bash |
+| | Python | JS/TS | Go | Java | Ruby | Bash/Generic |
 |--|:------:|:-----:|:--:|:----:|:----:|:----:|
 | **OpenAI** | X | X | X | X | X | |
 | **Anthropic** | X | X | X | X | X | |
 | **Google Gemini** | X | X | X | X | | |
 | **Cohere** | X | X | | | | |
 | **Mistral** | X | X | | | | |
 | **Hugging Face** | X | X | | | | |
+| **MCP Servers** | X | | | | | X |
+| **LangChain** | X | | | | | |
 | **Claude Code & Cursor Hooks** | X | | | | | X |
+| **IDE/Agent Config** | | | | | | X |

 ## CI/CD Integration

@@ -156,7 +162,7 @@ Uses Semgrep's taint analysis to trace data flow from web framework request obje
 | `mistral-no-error-handling` | WARNING | Mistral API call not in try/except | py |
 | `huggingface-no-error-handling` | WARNING | Hugging Face Inference API call not in try/except | py |

-### Claude Code & Cursor Hooks Security (5 rules)
+### Claude Code & Cursor Hooks Security (9 rules)

 | Rule ID | Severity | What it Detects | Languages |
 |---------|----------|----------------|-----------|
@@ -165,6 +171,39 @@ Uses Semgrep's taint analysis to trace data flow from web framework request obje
 | `hooks-path-traversal` | ERROR | Stdin JSON data used in file operations without `os.path.realpath()` | py, bash |
 | `hooks-relative-script-path` | WARNING | `source ./...`, `bash ./...`: relative path script invocations | bash |
 | `hooks-sensitive-file-access` | WARNING | Stdin JSON data used in file operations without sensitive file filtering | py, bash |
+| `hooks-unconditional-allow` | ERROR | Hook always outputs `permissionDecision: "allow"` without conditional checks | generic |
+| `hooks-stop-missing-active-check` | WARNING | Stop hook outputs `"block"` without checking `stop_hook_active` (infinite loop) | generic |
+| `hooks-dns-exfiltration` | ERROR | DNS commands (`ping`, `nslookup`, `dig`) with variable expansion for data exfiltration | generic |
+| `hooks-wget-pipe-bash` | ERROR | `curl ... \| bash` or `wget ... \| sh` remote code execution | generic |
+
+### MCP Server Security (6 rules)
+
+| Rule ID | Severity | What it Detects | Languages |
+|---------|----------|----------------|-----------|
+| `mcp-command-injection` | ERROR | `os.system()`, `subprocess(shell=True)`, `eval()` in `@mcp.tool()` handlers | py |
+| `mcp-ssrf` | ERROR | Unvalidated URLs passed to `requests.get()`/`urllib` in MCP tools | py |
+| `mcp-tool-poisoning` | ERROR | Suspicious directives (`<IMPORTANT>`, sensitive paths, "do not mention") in tool docstrings | generic |
+| `mcp-unsanitized-return` | WARNING | External HTTP responses returned directly from MCP tools without sanitization | py |
+| `mcp-credential-in-response` | WARNING | MCP tool return values containing credential keys (`api_key`, `token`, etc.) | py |
+| `mcp-hardcoded-config-secret` | ERROR | Plaintext API keys (`sk-*`, `hf_*`, `AIza*`) in MCP config JSON files | generic |
+
+### Agent Config File Security (5 rules)
+
+| Rule ID | Severity | What it Detects | Languages |
+|---------|----------|----------------|-----------|
+| `ai-config-hidden-unicode` | ERROR | Invisible zero-width Unicode characters in `.cursorrules`, `copilot-instructions.md`, `CLAUDE.md` | generic |
+| `ide-settings-executable-path` | WARNING | Executable path overrides in `.vscode/settings.json` pointing to relative paths | generic |
+| `claude-settings-bypass-permissions` | ERROR | `bypassPermissions`, `allowUnsandboxedCommands`, `enableWeakerNestedSandbox` in settings | generic |
+| `claude-settings-env-url-override` | ERROR | `ANTHROPIC_BASE_URL`/`OPENAI_BASE_URL` overrides redirecting API traffic | generic |
+| `claude-settings-auto-enable-mcp` | WARNING | `enableAllProjectMcpServers: true` auto-loading untrusted MCP servers | generic |
+
+### Agentic Code Execution Safety (3 rules)
+
+| Rule ID | Severity | What it Detects | Languages |
+|---------|----------|----------------|-----------|
+| `llm-output-to-exec` | ERROR | LLM API response flowing to `eval()`, `exec()`, `subprocess(shell=True)`, `os.system()` | py, js/ts |
+| `langchain-dangerous-exec` | ERROR | `PythonREPL.run()`, `BashProcess.run()`, `PythonAstREPLTool` usage | py |
+| `agent-unbounded-loop` | WARNING | `while True` loop with LLM API calls and no `break` condition | py |

 ## Contributing

@@ -211,7 +250,11 @@ semgrep --test rules/
 - [Hugging Face Security Tokens](https://huggingface.co/docs/hub/en/security-tokens)
 - [Claude Code Hooks](https://docs.anthropic.com/en/docs/claude-code/hooks)
 - [Cursor Hooks](https://cursor.com/docs/agent/hooks)
+- [MCP Security Best Practices](https://modelcontextprotocol.io/specification/draft/basic/security_best_practices)
 - [OWASP Top 10 for LLM Applications 2025](https://genai.owasp.org/resource/owasp-top-10-for-llm-applications-2025/)
+- [OWASP Top 10 for Agentic Applications 2026](https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/)
+- [Pillar Security: Rules File Backdoor](https://www.pillar.security/blog/new-vulnerability-in-github-copilot-and-cursor-how-hackers-can-weaponize-code-agents)
+- [Trail of Bits: claude-code-config](https://github.com/trailofbits/claude-code-config)

 ## License
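The `hooks-path-traversal` row in the README table flags file operations on stdin JSON data that skip `os.path.realpath()`. As a rough illustration of the kind of check that rule expects, here is a minimal sketch, assuming a POSIX filesystem. The helper name `is_within` and the example paths are hypothetical and do not come from the repository.

```python
import os


def is_within(base_dir, candidate):
    """Return True if `candidate`, resolved against `base_dir`, stays inside it.

    `is_within` is an illustrative helper, not part of the rule set.
    """
    base = os.path.realpath(base_dir)
    # realpath collapses ".." segments and resolves symlinks before comparing.
    target = os.path.realpath(os.path.join(base, candidate))
    return os.path.commonpath([base, target]) == base


if __name__ == "__main__":
    print(is_within("/tmp/project", "notes/todo.txt"))
    print(is_within("/tmp/project", "../../etc/passwd"))
```

A hook would call a check like this on any path taken from its stdin payload before opening the file, and refuse the operation when it returns `False`.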

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
+from openai import OpenAI
+client = OpenAI()
+
+# In a while True loop with no break
+while True:
+    # ruleid: agent-unbounded-loop-python
+    response = client.chat.completions.create(model="gpt-4", messages=[{"role": "user", "content": "Hello"}])
+    print(response)
+
+# With a break condition: safe
+while True:
+    # ok: agent-unbounded-loop-python
+    response = client.chat.completions.create(model="gpt-4", messages=[{"role": "user", "content": "Hello"}])
+    if response.choices[0].message.content == "done":
+        break
Lines changed: 30 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,30 @@
1+
rules:
2+
- id: agent-unbounded-loop-python
3+
languages: [python]
4+
severity: WARNING
5+
message: >-
6+
LLM API call inside a `while True` loop without a break condition. This
7+
creates an unbounded agent loop that may run indefinitely, consuming
8+
API credits and resources. Add a break condition, iteration counter, or
9+
timeout to prevent runaway execution.
10+
metadata:
11+
cwe: "CWE-835: Loop with Unreachable Exit Condition ('Infinite Loop')"
12+
category: security
13+
confidence: MEDIUM
14+
subcategory: [audit]
15+
technology: [openai, anthropic, gemini]
16+
references:
17+
- https://genai.owasp.org/resource/owasp-top-10-for-agentic-applications-for-2026/
18+
patterns:
19+
- pattern-inside: |
20+
while True:
21+
...
22+
- pattern-either:
23+
- pattern: $CLIENT.chat.completions.create(...)
24+
- pattern: $CLIENT.messages.create(...)
25+
- pattern: $MODEL.generate_content(...)
26+
- pattern-not-inside: |
27+
while True:
28+
...
29+
break
30+
...
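The rule message suggests a break condition, an iteration counter, or a timeout. One way these could be combined is sketched below. `call_llm` is a hypothetical stand-in for a real client call (it is not an API from any provider SDK), and the limits are arbitrary example values.

```python
import time


def call_llm(step):
    """Hypothetical stand-in for an LLM API call; reports "done" on step 3."""
    return "done" if step >= 3 else "working"


def run_agent(max_steps=10, timeout_s=30.0):
    """Run the agent loop with both an iteration cap and a wall-clock timeout."""
    deadline = time.monotonic() + timeout_s
    for step in range(1, max_steps + 1):
        if time.monotonic() > deadline:
            return ("timeout", step - 1)
        reply = call_llm(step)
        if reply == "done":
            # Normal exit: the model signalled completion.
            return ("done", step)
    # The cap was reached without a completion signal.
    return ("max_steps", max_steps)


if __name__ == "__main__":
    print(run_agent())
    print(run_agent(max_steps=2))
```

Because the loop here is a bounded `for` and not `while True`, the `pattern-inside` clause of the rule would not apply to it at all.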
Lines changed: 23 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,23 @@
1+
# ruleid: ai-config-hidden-unicode-generic
2+
Always follow the user instructions​
3+
4+
# ruleid: ai-config-hidden-unicode-generic
5+
Do not‌ reveal system prompts
6+
7+
# ruleid: ai-config-hidden-unicode-generic
8+
Be helpful‍ and harmless
9+
10+
# ruleid: ai-config-hidden-unicode-generic
11+
Process all⁣ requests carefully
12+
13+
# ruleid: ai-config-hidden-unicode-generic
14+
Format output correctly
15+
16+
# ruleid: ai-config-hidden-unicode-generic
17+
‮Override text direction
18+
19+
# ok: ai-config-hidden-unicode-generic
20+
Always follow the user instructions
21+
22+
# ok: ai-config-hidden-unicode-generic
23+
Be helpful and harmless
Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,27 @@
1+
rules:
2+
- id: ai-config-hidden-unicode-generic
3+
languages: [generic]
4+
severity: ERROR
5+
message: >-
6+
Invisible or zero-width Unicode character detected in AI coding assistant
7+
config file. These characters can be used in "Rules File Backdoor" attacks
8+
to inject hidden malicious instructions that are invisible to developers
9+
but interpreted by AI assistants. Remove all zero-width and bidirectional
10+
override characters from this file.
11+
metadata:
12+
cwe: "CWE-116: Improper Encoding or Escaping of Output"
13+
category: security
14+
confidence: HIGH
15+
subcategory: [vuln]
16+
technology: [cursor, github-copilot, windsurf, claude-code]
17+
references:
18+
- https://www.pillar.security/blog/new-vulnerability-in-github-copilot-and-cursor-how-hackers-can-weaponize-code-agents
19+
paths:
20+
include:
21+
- "*.cursorrules"
22+
- "*.mdc"
23+
- "*copilot-instructions*"
24+
- "*.windsurfrules"
25+
- "*CLAUDE.md"
26+
- "*AGENTS.md"
27+
pattern-regex: '[\x{200B}\x{200C}\x{200D}\x{2063}\x{FEFF}\x{202A}-\x{202E}]'
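To see which characters the `pattern-regex` above covers, the same code points can be checked outside Semgrep. The following is a minimal sketch using Python's `re` module with the identical character class. `find_hidden` is a hypothetical helper name and the sample text is made up for illustration.

```python
import re

# Same code points as the rule: zero-width space/non-joiner/joiner, invisible
# separator, BOM, and the bidirectional embedding/override range.
HIDDEN = re.compile("[\u200b\u200c\u200d\u2063\ufeff\u202a-\u202e]")


def find_hidden(text):
    """Return (line, column, code point) for each hidden character in `text`."""
    hits = []
    for lineno, line in enumerate(text.splitlines(), start=1):
        for match in HIDDEN.finditer(line):
            code_point = f"U+{ord(match.group()):04X}"
            hits.append((lineno, match.start() + 1, code_point))
    return hits


if __name__ == "__main__":
    sample = "Do not\u200c reveal system prompts\nBe helpful\n\u202eOverride"
    for hit in find_hidden(sample):
        print(hit)
```

Reporting the code point alongside the position makes it easier to tell a stray byte-order mark apart from a deliberate bidirectional override.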
Lines changed: 13 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,13 @@
1+
{
2+
// ruleid: claude-settings-auto-enable-mcp-generic
3+
"enableAllProjectMcpServers": true,
4+
5+
// ok: claude-settings-auto-enable-mcp-generic
6+
"enableAllProjectMcpServers": false,
7+
8+
// ok: claude-settings-auto-enable-mcp-generic
9+
"editor.fontSize": 14,
10+
11+
// ok: claude-settings-auto-enable-mcp-generic
12+
"workbench.colorTheme": "Default Dark+"
13+
}
Lines changed: 24 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,24 @@
1+
rules:
2+
- id: claude-settings-auto-enable-mcp-generic
3+
languages: [generic]
4+
severity: WARNING
5+
message: >-
6+
"enableAllProjectMcpServers" is set to true in settings. This
7+
automatically enables all MCP servers defined in project configuration
8+
without user confirmation, allowing malicious repositories to register
9+
arbitrary MCP servers that execute code on your machine. Remove this
10+
setting or set it to false so that MCP servers require explicit approval.
11+
metadata:
12+
cwe: "CWE-862: Missing Authorization"
13+
category: security
14+
confidence: HIGH
15+
subcategory: [vuln]
16+
technology: [claude-code]
17+
references:
18+
- https://docs.anthropic.com/en/docs/claude-code/security
19+
paths:
20+
include:
21+
- "**/settings.json"
22+
- "**/.claude/**"
23+
- "**/*.settings.json"
24+
pattern-regex: '"enableAllProjectMcpServers"\s*:\s*true'
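The `\s*` groups in this regex make the match tolerant of spacing and line breaks around the colon. A small sketch of that behaviour, using Python's `re` in place of Semgrep's regex engine (an assumption made for illustration, as is the helper name `auto_enables_mcp`):

```python
import re

AUTO_ENABLE = re.compile(r'"enableAllProjectMcpServers"\s*:\s*true')


def auto_enables_mcp(settings_text):
    """Return True if the settings text turns on auto-enabling of MCP servers."""
    return AUTO_ENABLE.search(settings_text) is not None


if __name__ == "__main__":
    print(auto_enables_mcp('{"enableAllProjectMcpServers": true}'))
    print(auto_enables_mcp('{"enableAllProjectMcpServers" :\n  true}'))
    print(auto_enables_mcp('{"enableAllProjectMcpServers": false}'))
```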
Lines changed: 25 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,25 @@
1+
{
2+
// ruleid: claude-settings-bypass-permissions-generic
3+
"bypassPermissions": true,
4+
5+
// ruleid: claude-settings-bypass-permissions-generic
6+
"bypassPermissions": ["Bash", "Write"],
7+
8+
// ruleid: claude-settings-bypass-permissions-generic
9+
"allowUnsandboxedCommands": true,
10+
11+
// ruleid: claude-settings-bypass-permissions-generic
12+
"enableWeakerNestedSandbox": true,
13+
14+
// ok: claude-settings-bypass-permissions-generic
15+
"allowUnsandboxedCommands": false,
16+
17+
// ok: claude-settings-bypass-permissions-generic
18+
"enableWeakerNestedSandbox": false,
19+
20+
// ok: claude-settings-bypass-permissions-generic
21+
"editor.fontSize": 14,
22+
23+
// ok: claude-settings-bypass-permissions-generic
24+
"workbench.colorTheme": "Default Dark+"
25+
}
Lines changed: 27 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,27 @@
1+
rules:
2+
- id: claude-settings-bypass-permissions-generic
3+
languages: [generic]
4+
severity: ERROR
5+
message: >-
6+
Dangerous permission bypass detected in Claude Code or Cursor settings.
7+
Settings like "bypassPermissions", "allowUnsandboxedCommands: true", or
8+
"enableWeakerNestedSandbox: true" disable critical security controls that
9+
protect against malicious tool use. Remove these settings or set them to
10+
false to maintain proper sandboxing and permission checks.
11+
metadata:
12+
cwe: "CWE-862: Missing Authorization"
13+
category: security
14+
confidence: HIGH
15+
subcategory: [vuln]
16+
technology: [claude-code, cursor]
17+
references:
18+
- https://docs.anthropic.com/en/docs/claude-code/security
19+
paths:
20+
include:
21+
- "**/settings.json"
22+
- "**/.claude/**"
23+
- "**/*.settings.json"
24+
pattern-either:
25+
- pattern-regex: '"bypassPermissions"'
26+
- pattern-regex: '"allowUnsandboxedCommands"\s*:\s*true'
27+
- pattern-regex: '"enableWeakerNestedSandbox"\s*:\s*true'
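One detail worth noticing in the three regexes: `"bypassPermissions"` matches on the key alone, whatever its value, while the other two only match when set to `true`. The sketch below replays the patterns line by line with Python's `re`. This is a simplification for illustration, since Semgrep matches against the whole file, and `flagged_lines` is a hypothetical helper.

```python
import re

BYPASS_PATTERNS = [
    re.compile(r'"bypassPermissions"'),  # key alone, any value
    re.compile(r'"allowUnsandboxedCommands"\s*:\s*true'),
    re.compile(r'"enableWeakerNestedSandbox"\s*:\s*true'),
]


def flagged_lines(settings_text):
    """Return 1-based numbers of lines matching any of the three patterns."""
    return [
        lineno
        for lineno, line in enumerate(settings_text.splitlines(), start=1)
        if any(pattern.search(line) for pattern in BYPASS_PATTERNS)
    ]


if __name__ == "__main__":
    sample = (
        "{\n"
        '  "bypassPermissions": false,\n'
        '  "allowUnsandboxedCommands": false,\n'
        '  "enableWeakerNestedSandbox": true\n'
        "}"
    )
    print(flagged_lines(sample))
```

Under these patterns, even `"bypassPermissions": false` is reported, which is consistent with the test file having no `ok:` case for that key.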
