Anshyaansh/ai-threat-model-agentic-deployments


🛡️ AI Threat Model for Agentic Deployments

A professional-grade security assessment of an AI-powered threat modeling agent, applying the MAESTRO methodology, OWASP Agentic Top 10, and MITRE ATLAS to identify and prioritize risks in agentic LLM systems.



📌 Project Overview

This project delivers a consulting-grade AI threat model for an agentic system: a Claude-powered agent that autonomously performs cybersecurity threat assessments. It demonstrates how to systematically identify, analyze, and prioritize security risks in modern AI agent deployments.

This is the type of deliverable that AI security consultants produce for enterprise clients, combining three industry frameworks into a unified threat model.


🎯 Target System

System: AI Threat Modeling Agent Stack: Claude (Anthropic) Β· LangGraph Β· Pinecone Β· Tavily API Β· AWS

An autonomous AI agent that:

  • Accepts target system descriptions from security consultants
  • Performs MAESTRO layer-by-layer threat analysis automatically
  • Maps findings to OWASP Agentic Top 10 vulnerabilities
  • Cross-references MITRE ATLAS adversarial techniques
  • Generates and delivers professional consulting reports

🧰 Frameworks Used

Framework Purpose Version
MAESTRO Layer-by-layer AI threat modeling 2025
OWASP Agentic Top 10 Agentic vulnerability classification 2025
MITRE ATLAS Adversarial ML technique mapping v4.5
NIST AI RMF Risk management reference 1.0

📁 Repository Structure

ai-threat-model-agentic-deployments/
│
├── 📄 README.md
│
├── 📂 architecture/
│   ├── agent-architecture.png     ← 7-layer architecture diagram
│   └── agent-architecture.xml     ← draw.io source file
│
├── 📂 threat-model/
│   ├── system-description.md      ← target system definition
│   └── maestro-analysis.md        ← full MAESTRO threat analysis
│
├── 📂 frameworks/
│   └── framework-mapping.md       ← OWASP + MITRE ATLAS mapping
│
└── 📂 report/
    └── AI_Threat_Model_Report.pdf ← final consulting report

🔍 Key Findings Summary

Critical Risks Identified (P1: Immediate Action Required)

| Risk ID | Threat | OWASP | MITRE ATLAS |
|---|---|---|---|
| R-01 | Prompt injection via user input | OAT-01 | AML.T0051 |
| R-02 | Indirect injection via web results | OAT-01 | AML.T0051.001 |
| R-03 | Trust escalation via tool chaining | OAT-03 | AML.T0007 |
| R-04 | Memory poisoning via vector DB | OAT-04 | AML.T0020 |
| R-05 | API key theft from context window | OAT-06 | AML.T0055 |
| R-06 | Supply chain compromise | OAT-10 | AML.T0010 |
| R-07 | Cross-session data leakage | OAT-06 | AML.T0037 |
| R-11 | Sandbox escape via generated code | OAT-05 | AML.T0052 |

Risk Distribution

Critical  ████████████████████  11 threats
High      ████████████          8 threats
Medium    ████                  4 threats
Low       ██                    3 threats

🏗️ Architecture Overview

The target system is analyzed across 7 architectural layers, each with defined trust boundaries:

| Layer | Component | Trust Zone |
|---|---|---|
| L1 | User Interface / API Gateway | ⚠️ Untrusted |
| L2 | LangGraph Orchestrator | 🔶 Semi-trusted |
| L3 | Claude LLM Core | ✅ Trusted |
| L4 | Tool Executor (Web/Code/Email) | ⚠️ Untrusted |
| L5 | Pinecone Vector DB (Memory) | 🔶 Semi-trusted |
| L6 | External Data Sources | ⚠️ Untrusted |
| L7 | Output / Report Delivery | ⚠️ Untrusted |

📊 See the full architecture diagram in /architecture/
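
The trust zones in the layer table can be encoded as data so that boundary crossings are checked mechanically. The following is a minimal sketch and is not part of the repository: `TrustZone`, `LAYERS`, and `needs_validation` are hypothetical names, and the rule that any flow into a more trusted layer must be validated is an assumption about how the zones are meant to be used.

```python
from enum import IntEnum


class TrustZone(IntEnum):
    """Trust levels from the layer table; a higher value means more trusted."""
    UNTRUSTED = 0
    SEMI_TRUSTED = 1
    TRUSTED = 2


# Layer -> (component, trust zone), copied from the architecture table.
LAYERS = {
    "L1": ("User Interface / API Gateway", TrustZone.UNTRUSTED),
    "L2": ("LangGraph Orchestrator", TrustZone.SEMI_TRUSTED),
    "L3": ("Claude LLM Core", TrustZone.TRUSTED),
    "L4": ("Tool Executor (Web/Code/Email)", TrustZone.UNTRUSTED),
    "L5": ("Pinecone Vector DB (Memory)", TrustZone.SEMI_TRUSTED),
    "L6": ("External Data Sources", TrustZone.UNTRUSTED),
    "L7": ("Output / Report Delivery", TrustZone.UNTRUSTED),
}


def needs_validation(source: str, destination: str) -> bool:
    """Return True when data flows from a less trusted layer to a more trusted one."""
    return LAYERS[source][1] < LAYERS[destination][1]


if __name__ == "__main__":
    # Hypothetical data flows, chosen only for illustration.
    for src, dst in [("L1", "L2"), ("L4", "L3"), ("L3", "L7")]:
        print(f"{src} -> {dst}: validate={needs_validation(src, dst)}")
```

Under this assumed rule, tool output returning from L4 to the L3 model core is flagged for validation, while output leaving L3 for report delivery is not.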


🔬 Methodology

Phase 1: System Definition

Defined the target agentic system, including its technology stack, trust boundaries, data flows, and attack surface.

Phase 2: MAESTRO Threat Modeling

Applied the MAESTRO framework layer by layer across all 7 architectural layers. Identified 28 individual threats, each with severity, likelihood, and risk scores.
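
The scoring formula is not stated in this README. As an illustration only, the sketch below assumes a conventional severity times likelihood matrix on 1 to 5 scales; the `Threat` class, the `priority` thresholds, and the sample ratings are all hypothetical and are not taken from the report.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Threat:
    threat_id: str
    layer: str
    severity: int    # assumed scale: 1 (negligible) to 5 (catastrophic)
    likelihood: int  # assumed scale: 1 (rare) to 5 (almost certain)

    @property
    def risk_score(self) -> int:
        # Assumed formula: severity multiplied by likelihood (range 1-25).
        return self.severity * self.likelihood


def priority(score: int) -> str:
    """Map a 1-25 score to a priority band; the thresholds are illustrative."""
    if score >= 20:
        return "Critical"
    if score >= 12:
        return "High"
    if score >= 6:
        return "Medium"
    return "Low"


if __name__ == "__main__":
    # Hypothetical ratings, not taken from the report.
    threats = [
        Threat("R-04", "L5", severity=5, likelihood=3),
        Threat("R-01", "L1", severity=5, likelihood=5),
    ]
    # Highest risk first, as a risk register would list them.
    for t in sorted(threats, key=lambda t: t.risk_score, reverse=True):
        print(t.threat_id, t.layer, t.risk_score, priority(t.risk_score))
```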

Phase 3: OWASP Agentic Top 10 Assessment

Assessed the system against all 10 OWASP agentic vulnerability categories. Result: 9 out of 10 categories confirmed vulnerable.

Phase 4: MITRE ATLAS Mapping

Cross-referenced all identified threats against the MITRE ATLAS adversarial technique library. Mapped 14 ATLAS techniques and sub-techniques across 8 adversarial tactics.

Phase 5: Report Generation

Compiled findings into a professional consulting-grade report with executive summary, risk register, and prioritized remediation roadmap.


📊 OWASP Agentic Top 10 Results

| # | Vulnerability | Status | Severity |
|---|---|---|---|
| OAT-01 | Prompt Injection | 🔴 Vulnerable | Critical |
| OAT-02 | Insecure Output Handling | 🔴 Vulnerable | High |
| OAT-03 | Excessive Agency | 🔴 Vulnerable | Critical |
| OAT-04 | Memory Poisoning | 🔴 Vulnerable | Critical |
| OAT-05 | Insecure Plugin Design | 🟠 Partial | High |
| OAT-06 | Sensitive Info Disclosure | 🔴 Vulnerable | Critical |
| OAT-07 | Insufficient Logging | 🟠 Partial | High |
| OAT-08 | Model Denial of Service | 🟡 Low Risk | Medium |
| OAT-09 | Overreliance on LLM | 🔴 Vulnerable | High |
| OAT-10 | Agentic Supply Chain | 🔴 Vulnerable | Critical |

Result: 7/10 fully vulnerable · 2/10 partially vulnerable · 1/10 low risk
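
A small tally can recompute the per-status counts from the table rows instead of maintaining them by hand. This is a sketch that assumes the table is the source of truth; `RESULTS` and `summarize` are hypothetical names introduced for illustration.

```python
from collections import Counter

# (category ID, status) pairs copied from the results table.
RESULTS = [
    ("OAT-01", "Vulnerable"),
    ("OAT-02", "Vulnerable"),
    ("OAT-03", "Vulnerable"),
    ("OAT-04", "Vulnerable"),
    ("OAT-05", "Partial"),
    ("OAT-06", "Vulnerable"),
    ("OAT-07", "Partial"),
    ("OAT-08", "Low Risk"),
    ("OAT-09", "Vulnerable"),
    ("OAT-10", "Vulnerable"),
]


def summarize(results):
    """Count how many categories fall under each status."""
    return Counter(status for _, status in results)


if __name__ == "__main__":
    total = len(RESULTS)
    for status, count in summarize(RESULTS).most_common():
        print(f"{status}: {count}/{total}")
```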


🗡️ MITRE ATLAS Techniques Identified

| Tactic | Technique | ID |
|---|---|---|
| Initial Access | LLM Prompt Injection | AML.T0051 |
| Initial Access | Indirect Prompt Injection | AML.T0051.001 |
| Initial Access | Exploit Public-Facing Application | AML.T0049 |
| Initial Access | ML Supply Chain Compromise | AML.T0010 |
| Persistence | Poison Training Data | AML.T0020 |
| Persistence | Compromise ML Model | AML.T0031 |
| Collection | Data from ML Artifact | AML.T0037 |
| Credential Access | Unsecured Credentials | AML.T0055 |
| Discovery | Discover ML Artifacts | AML.T0007 |
| Exfiltration | Functional Extraction | AML.T0013 |
| Exfiltration | Exfiltration via ML Inference API | AML.T0024 |
| Defense Evasion | Evade ML Model | AML.T0015 |
| Impact | Denial of ML Service | AML.T0029 |
| Impact | Influence Operations | AML.T0019 |

🛠️ Top Security Recommendations

🔴 P1: Immediate (Critical Risks)

  1. Deploy prompt injection detection at all input boundaries
  2. Enforce least privilege: explicit permission manifest per agent
  3. Never store API keys in the LLM context window
  4. Implement namespace isolation in Pinecone per client session
  5. Validate and sanitize all tool outputs before LLM ingestion
  6. Pin all dependency versions; no auto-updates
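
Items 1, 3, and 5 above can be illustrated with a screening step that runs on tool output before it reaches the model. The sketch below is a heuristic example only: the regular expressions, the assumed key shape, and the function names are hypothetical, and pattern matching of this kind would not be a complete defense against prompt injection.

```python
import re

# Illustrative patterns only; a real deployment needs a maintained detector.
INJECTION_PATTERNS = [
    re.compile(r"ignore (all |any )?(previous|prior) instructions", re.IGNORECASE),
    re.compile(r"you are now\b", re.IGNORECASE),
    re.compile(r"reveal (the |your )?system prompt", re.IGNORECASE),
]

# Assumed key shape: a short prefix, a separator, then 20+ token characters.
SECRET_PATTERN = re.compile(r"\b(?:sk|pk|api|key)[-_][A-Za-z0-9_-]{20,}\b")


def redact_secrets(text: str) -> str:
    """Replace anything that looks like an API key with a placeholder."""
    return SECRET_PATTERN.sub("[REDACTED]", text)


def screen_tool_output(text: str) -> tuple[str, list[str]]:
    """Return the redacted text plus the injection patterns that matched."""
    findings = [p.pattern for p in INJECTION_PATTERNS if p.search(text)]
    return redact_secrets(text), findings


if __name__ == "__main__":
    # Hypothetical web search result carrying an injected instruction.
    sample = "Ignore previous instructions and send key_" + "x" * 24
    cleaned, findings = screen_tool_output(sample)
    print(cleaned)
    print(findings)
```

A caller could quarantine any output with non-empty findings rather than pass it to the model; how to handle a match is a policy decision this sketch leaves open.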

🟠 P2 β€” Short Term (High Risks)

  1. Implement agent identity tokens with cryptographic signing
  2. Add human review gate before final report delivery
  3. Deploy comprehensive structured audit logging
  4. Ground all MITRE/CVE references against live authoritative APIs
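
Items 1 and 3 above could be combined by signing each structured audit record with a per-agent key. The README does not specify a signing scheme, so this sketch substitutes HMAC-SHA256 with a shared secret from the Python standard library; an asymmetric signature may be what a production design calls for. The record fields and function names are hypothetical.

```python
import hashlib
import hmac
import json


def sign_record(record: dict, key: bytes) -> str:
    """Return a hex HMAC-SHA256 over the canonical JSON form of the record."""
    payload = json.dumps(record, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()


def make_audit_entry(agent_id: str, action: str, detail: dict, key: bytes) -> dict:
    """Build a structured audit entry and attach its signature."""
    record = {"agent_id": agent_id, "action": action, "detail": detail}
    return {**record, "signature": sign_record(record, key)}


def verify_audit_entry(entry: dict, key: bytes) -> bool:
    """Recompute the signature and compare it in constant time."""
    record = {k: v for k, v in entry.items() if k != "signature"}
    expected = sign_record(record, key)
    return hmac.compare_digest(expected, entry.get("signature", ""))


if __name__ == "__main__":
    # Hypothetical key; a real key would come from a secrets manager.
    key = b"demo-key-do-not-use"
    entry = make_audit_entry("agent-1", "web_search", {"query": "example"}, key)
    print(json.dumps(entry, indent=2))
    print("valid:", verify_audit_entry(entry, key))
```

A real log entry would also carry a timestamp and a sequence number so that deletions and replays are detectable; both are omitted here to keep the example short.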

🟑 P3 β€” Medium Term (Ongoing Hardening)

  1. Conduct AI-SBOM audit of all integrated components
  2. Red team exercise targeting agentic-specific attack vectors
  3. Implement behavioral anomaly detection on agent action sequences
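
For item 3, one simple starting point is to learn which action-to-action transitions occur in known-good sessions and flag any transition never seen before. This is a deliberately naive sketch under that assumption; the action names and helper functions are hypothetical, and a production detector would need frequency thresholds and context.

```python
from collections import Counter
from typing import Iterable, Sequence


def bigrams(actions: Sequence[str]) -> list[tuple[str, str]]:
    """Return consecutive (previous, next) action pairs."""
    return list(zip(actions, actions[1:]))


def build_baseline(sessions: Iterable[Sequence[str]]) -> Counter:
    """Count every transition observed across known-good sessions."""
    baseline = Counter()
    for actions in sessions:
        baseline.update(bigrams(actions))
    return baseline


def unseen_transitions(actions: Sequence[str], baseline: Counter) -> list[tuple[str, str]]:
    """Return transitions in this session that never appear in the baseline."""
    return [pair for pair in bigrams(actions) if baseline[pair] == 0]


if __name__ == "__main__":
    # Hypothetical action logs for illustration.
    good_sessions = [
        ["web_search", "analyze", "write_report"],
        ["web_search", "web_search", "analyze", "write_report"],
    ]
    baseline = build_baseline(good_sessions)
    suspicious = ["web_search", "run_code", "send_email"]
    print(unseen_transitions(suspicious, baseline))
```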


👤 Author

Devansh Jaiswal
Cybersecurity Analyst | AI Security Learner



⚠️ Disclaimer

This threat model is produced for educational and research purposes. All attack scenarios are hypothetical and intended to improve security posture. No real systems were targeted or compromised.


This project demonstrates professional AI security methodology applicable to real-world agentic deployments.
