🤖 Awesome LLM Security

A curated list of awesome tools, documents, and projects about LLM Security.

📚 Table of Contents

🤖 Awesome LLM Security

🛠️ Tools

🧰 Multi-Purpose Model Scanners

promptfoo LLM red teaming and evaluation framework with CI/CD integration
Garak LLM vulnerability scanner
AI-Infra-Guard LLM vulnerability scanner with Web UI, REST APIs, and Dockerized
LLM Guard Security toolkit for LLM interactions
Agentic Security Security toolkit for AI agents
DeepTeam LLM red teaming framework (prompt injection, hallucination, data leaks, jailbreaks).
AI-Scanner AI model safety scanner built on NVIDIA garak
LLMmap Tool for mapping LLM vulnerabilities
LLaMator Framework for testing vulnerabilities of LLMs
Plexiglass Security toolbox for testing and safeguarding LLMs
Inkog AI agent security scanner (CLI + MCP server) detects prompt injection, SQLi via LLM.

🤖 MCP & Agent Scanners

AgentBench: Benchmark to evaluate LLMs as agents
Agentic Radar Open-source CLI security scanner for agentic workflows
MCP Scanner Scan MCP servers for potential threats & security findings
Awesome MCP Security Curated list of MCP security resources
MCP Shield Security scanner for MCP servers
Invariant Trace analysis tool for AI agents
MCP Safety Scanner Automated MCP safety auditing and remediation using Agents
Agent Security Scanner MCP MCP server for scanning code for web vulnerabilities, prompt injection, and AI-hallucinated package detection
Agent-threat-rules Open detection standard for AI agent threats. Like Sigma, but for prompt injection, tool poisoning, and MCP attacks
Tenuo Capability-based authorization for AI agents
Awesome LLM Agent Security LLM agent security resources, attacks, vulnerabilities
Ziran Security testing framework for AI agents
Cerberus Agentic AI runtime security platform
clawguard Firewall for AI agents
MCPs-audit OWASP Security Scanner for MCP Servers
Agent Guard Runtime governance firewall for AI agents, policy enforcement, MCP tool scanning

🧑‍💻 RAG Security

PoisonedRAG Poisoned RAG systems
RAG Attacks and Mitigations RAG attacks, mitigations, and defense strategies
Awesome Jailbreak on LLMs - RAG Attacks RAG-based LLM attack techniques

💣 Prompt Injection

Jailbreak LLMs: Real-world prompt jailbreak dataset (15k+ examples).
Awesome Jailbreak LLMs: Collection of jailbreak techniques, datasets, and defenses.
Jailbreaking LLMs (PAIR): Black-box jailbreak generation via automatic prompt refinement.
Prompt Fuzzer: Open-source tool to help you harden your GenAI applications.
Open Prompt Injection: Tool to evaluate prompt injection attacks and defenses on benchmark datasets.
LLMFuzzer: Fuzzing framework for LLM prompt generation.
Spikee: Prompt injection toolkit.
Jailbreak Evaluation: Python package for language model jailbreak evaluation.

🗡️ Autonomous Pentesting Frameworks

Shannon
Strix
PentAGI
PentestGPT
CAI
PentestAgent
Raptor
HackingBuddyGPT
Pentest-Copilot
Pentest-Swarm-AI Go-native agents to autonomously perform full-cycle pentests.
BreachSeek - PENA

🛡️ Defensive & Guardrail Tools

Guardrails: Add structured validation and policy enforcement for LLMs.
NeMo Guardrails: Protects against jailbreak and hallucinations with customizable rulesets
PurpleLlama: Tools to assess and improve LLM security from META.
PyRIT: Python Risk Identification Tool for generative AI
LLM-Guard: Tool for securing LLM interactions (replaced rebuff)
LangKit: Functions for jailbreak detection, prompt injection, and sensitive information detection
Prompt Injection Defenses: Practical and proposed defenses against prompt injection.
Vigil: Prompt injection detection toolkit and REST API for LLM security risk scoring.
Plexiglass: Security tool for LLM applications
Last Layer: Low-latency pre-filter for prompt injection prevention.
Veritensor: AI model scanner to detect Pickle/PyTorch malware, check licenses, and verify HF hashes.
ShellWard: AI Agent security middleware
Tenuo: Capability tokens for AI agents with task-scoped TTLs, offline verification, and proof-of-possession binding
TrustGate: Generative Application Firewall for GenAI Applications
LLM Confidentiality: Tool for ensuring confidentiality in LLMs
LocalMod: Self-hosted content moderation API with prompt injection detection, toxicity filtering, PII detection, and NSFW filtering
OpenClaw Security Suite: 11-tool defensive security suite for AI agent workspaces (prompt injection defense, integrity verification, secret scanning, supply chain analysis). Pure Python stdlib, zero dependencies, local-only execution.
Acgs-lite: Governance layer for AI agents that blocks unsafe actions before execution, enforces MACI separation of powers, and keeps tamper-evident audit trails.
Prompt Shield: GitHub Action for detecting indirect prompt injection in CI/CD pipelines. 4-layer defense architecture.
AIDEFEND: Practical knowledge base for AI security defenses
Aigis: Zero-dependency Python firewall for AI agents. 180+ patterns across OWASP LLM Top 10, StruQ-style structured prompts, goal-conditioned FSM, RAG context filter, MCP 3-stage scanning, MemoryGraft defence, judge-manipulation detection. Multi-layer: 4 walls + L4-L7 capability/AEP/safety/FSM.

🕵️ Benchmarks

JailbreakBench: Evaluating and analyzing jailbreak methods for LLMs
L1B3RT45: AI jailbreaking tools
Easy Jailbreak: Python framework to generate adversarial jailbreak prompts
Lakera PINT Benchmark: Benchmark for prompt injection detection
LLM Hacking Database: Attacks against LLMs
PALLMs (Payloads for Attacking Large Language Models)

🧩 Threat Modeling

ThreatModels: Repository for LLM threat models
Pangea Attack Taxonomy: Comprehensive taxonomy of AI/LLM attacks and vulnerabilities
AI Risk Taxonomy
AIR-Bench 2024

🧪 Playground

🧪 PoC & Study Resources

CipherChat: Secure communication tool for LLMs
LLMs Finetuning Safety: Safety for fine-tuning LLMs
Visual Adversarial Examples: Jailbreaking LLMs with visual adversarial examples
FigStep: Jailbreaking vision-language models via typographic visual prompts
OWASP Agentic AI: OWASP Top 10 for Agentic AI
BrokenHill: Automated attack tool for GCG attack
Weak-to-Strong Generalization: Eliciting strong capabilities with weak supervision
AnyDoor: Arbitrary backdoor instances in LLMs
Image Hijacks: Image-based hijacks of LLMs
Imperio: Robust prompt engineering for anchoring LLMs
LMSanitator: Defending LLMs against stealthy prompt injection
Virtual Prompt Injection: Tool for virtual prompt injection
CBA: Consciousness-Based Authentication for LLM Security
PromptWare: PromptWares for GenAI-powered applications
MuScleLoRA: Multi-scenario backdoor fine-tuning of LLMs
TrojText: Trojan attacks on text classifiers
BadActs: Backdoor attacks via activation steering
Backdoor Attacks on Fine-tuned LLaMA: Backdoor attacks on fine-tuned LLaMA

🎥 Courses

AI Security Explained: Short essential theoretical knowledge.
AI Agents for Pentest: Using agents for penetration testing.
Prompt Injection and Jailbreaking: Practical short lab studies.

🌟 Miscellaneous

WhistleBlower: Infer the system prompt of an AI agent based on its generated text outputs.
LLM Security startups
LLM Security Problems at DEFCON31 Quals: The world's top security competition
0din GenAI Bug Bounty from Mozilla: The 0Day Investigative Network is a bug bounty program focusing on flaws within GenAI models. Vulnerability classes include Prompt Injection, Training Data Poisoning, DoS, and more.
Adversarial Prompting: Documentation
OWASP Top 10 for LLMs: Official list of key LLM risks including prompt injection.

📰 Blogs and Social Media

🐦 X: @llm_sec
🐦 X: @SanderSchullhoff
📝 Blog: LLM Security (by @llm_sec)
📝 Blog: Embrace The Red
📝 Blog: Simon Willison
📰 Newsletter: AI safety takes
📰 Newsletter & Blog: Hackstery

🙏 Acknowledgements

This repository is actively maintained as a fork of the original project. It includes pending contributions, removes broken links, and separates academic papers from other resources for better organization.

Contributions are always welcome. Please read the Contribution Guidelines before contributing.

Alternative: Awesome LLMSecOps

Name		Name	Last commit message	Last commit date
Latest commit History 20 Commits
CONTRIBUTING.md		CONTRIBUTING.md
README.md		README.md
papers.md		papers.md

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

🤖 Awesome LLM Security

📚 Table of Contents

🛠️ Tools

🧰 Multi-Purpose Model Scanners

🤖 MCP & Agent Scanners

🧑‍💻 RAG Security

💣 Prompt Injection

🗡️ Autonomous Pentesting Frameworks

🛡️ Defensive & Guardrail Tools

🕵️ Benchmarks

🧩 Threat Modeling

🧪 Playground

🧪 PoC & Study Resources

🎥 Courses

🌟 Miscellaneous

📰 Blogs and Social Media

🙏 Acknowledgements

About

Uh oh!

Contributors

Uh oh!

Folders and files

Latest commit

History

Repository files navigation

🤖 Awesome LLM Security

📚 Table of Contents

🛠️ Tools

🧰 Multi-Purpose Model Scanners

🤖 MCP & Agent Scanners

🧑‍💻 RAG Security

💣 Prompt Injection

🗡️ Autonomous Pentesting Frameworks

🛡️ Defensive & Guardrail Tools

🕵️ Benchmarks

🧩 Threat Modeling

🧪 Playground

🧪 PoC & Study Resources

🎥 Courses

🌟 Miscellaneous

📰 Blogs and Social Media

🙏 Acknowledgements

About

Topics

Resources

Contributing

Uh oh!

Stars

Watchers

Forks

Contributors

Uh oh!