OWASP Agentic AI Security Assessment -- AutoGPT #12393

@razashariff

OWASP Agentic AI Top 10 -- Security Assessment

Hi team,

We conducted an OWASP Agentic AI Top 10 (2025) assessment of 27 popular AI agent frameworks as part of ongoing agentic security research. This assessment was performed via static analysis of public source code only -- no systems were accessed or tested remotely.


Assessment Results -- AutoGPT

| Check | OWASP ID | Severity | Detail |
| --- | --- | --- | --- |
| Unsafe Execution | AA-03 | CRITICAL | subprocess.run(command_line, shell=True) in execute_code.py |
| Injection Pattern | AA-02 | CRITICAL | __import__() in BlockInstallationBlock allows arbitrary module loading |
| Excessive Permissions | AA-04 | MEDIUM | High-risk permissions: execute, admin |
| Inadequate Sandboxing | AA-09 | HIGH | Shell command denylist bypassable via path normalization |
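To illustrate the Unsafe Execution finding, here is a minimal sketch of the difference between passing a string to a shell and passing an argument list with shell=False. The helper names split_command and run_without_shell are hypothetical and are not taken from AutoGPT's code; the sketch only shows the general pattern, not a proposed patch.

```python
import shlex
import subprocess
import sys


def split_command(command_line: str) -> list[str]:
    """Tokenize a command line into an argv list without invoking a shell."""
    return shlex.split(command_line)


def run_without_shell(command_line: str) -> subprocess.CompletedProcess:
    """Run a command with shell=False so shell metacharacters stay literal."""
    argv = split_command(command_line)
    return subprocess.run(
        argv,
        shell=False,          # no /bin/sh in between, so ';' and '|' are plain text
        capture_output=True,
        text=True,
        timeout=10,
        check=False,
    )


# Hypothetical probe: a child Python process that prints the arguments it received.
probe = shlex.join([sys.executable, "-c", "import sys; print(sys.argv[1:])"])

# With shell=True, "; echo injected" would run as a second command.
# With an argv list, it is only more arguments to the first program.
result = run_without_shell(probe + " hello; echo injected")
print(result.stdout.strip())
```

Avoiding the shell removes one injection route but is not a sandbox: the program named in the first argument still runs with the agent's full permissions.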

Risk Score: 73/100 (FAIL)


Published CVEs Referenced

This is not a new disclosure. These are all previously published:

| CVE | Severity | Detail |
| --- | --- | --- |
| CVE-2024-6091 | CVSS 9.8 | RCE via code execution |
| CVE-2026-24780 | CRITICAL | Code execution bypass |
| CVE-2026-26020 | CRITICAL | Additional execution vector |
| CVE-2023-37274 | HIGH | Earlier RCE finding |
| CVE-2023-37273 | HIGH | Earlier RCE finding |

Related advisories: GHSA-r277-3xc5-c79v, GHSA-4crw-9p35-9x54, GHSA-5h38-mgp9-rj5f, GHSA-x5gj-2chr-4ch6

External: a Positive Security blog post confirmed RCE and a Docker escape.


Why This Matters

AutoGPT is one of the most widely deployed autonomous AI agents. The combination of subprocess.run() with shell=True, __import__() for dynamic module loading, and a bypassable command denylist creates a well-documented attack surface that has resulted in at least five published CVEs (listed above).
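As a rough illustration of why a command denylist can be bypassed through path variations, the sketch below compares a naive string check with a check that normalizes the executable path and then requires an allowlist match. The DENYLIST and ALLOWLIST contents and both function names are assumptions made for this example; this is not AutoGPT's actual denylist logic.

```python
import os
import shlex

DENYLIST = {"rm", "curl", "wget"}   # hypothetical blocked commands
ALLOWLIST = {"ls", "cat", "echo"}   # hypothetical permitted commands


def naive_denylist_allows(command_line: str) -> bool:
    """Compare the raw first token against the denylist (bypassable)."""
    tokens = command_line.split()
    if not tokens:
        return False
    return tokens[0] not in DENYLIST


def normalized_allowlist_allows(command_line: str) -> bool:
    """Normalize the executable path, then require an allowlist match."""
    argv = shlex.split(command_line)
    if not argv:
        return False
    # Collapse "a/../b" segments and keep only the final path component.
    name = os.path.basename(os.path.normpath(argv[0]))
    return name in ALLOWLIST


for cmd in ("rm -rf data", "/bin/rm -rf data", "/usr/../bin/rm -rf data", "ls -l"):
    print(
        f"{cmd!r}: naive={naive_denylist_allows(cmd)} "
        f"normalized={normalized_allowlist_allows(cmd)}"
    )
```

The naive check blocks the bare name rm but lets /bin/rm and /usr/../bin/rm through, because those strings are not in the denylist. Name-based filtering is still weak even after normalization (symlinks and renamed binaries defeat it), so it complements process isolation rather than replacing it.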

For users deploying AutoGPT in production or enterprise environments, these patterns warrant careful review against the OWASP Agentic AI Top 10 framework.
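When reviewing the dynamic module loading pattern, one common mitigation is to gate imports behind an explicit allowlist. The sketch below uses the standard library's importlib.import_module; ALLOWED_MODULES and load_allowed_module are hypothetical names for illustration and do not describe how BlockInstallationBlock works.

```python
import importlib
from types import ModuleType

ALLOWED_MODULES = frozenset({"json", "math"})  # hypothetical allowlist


def load_allowed_module(name: str) -> ModuleType:
    """Import a module only if its exact name is on the allowlist."""
    if name not in ALLOWED_MODULES:
        raise PermissionError(f"module {name!r} is not on the allowlist")
    return importlib.import_module(name)


# An allowlisted module loads normally.
math_module = load_allowed_module("math")
print(math_module.sqrt(16.0))

# Anything else is rejected before the import machinery runs.
try:
    load_allowed_module("subprocess")
except PermissionError as exc:
    print(f"blocked: {exc}")
```

An exact-match allowlist fails closed: a new module is unavailable until someone deliberately adds it, which is the opposite default from a denylist.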


Agent Security Gates

As part of this research, we have built an open agent security assessment at agentsign.dev where developers and security teams can:

  • Scan any AI agent against the OWASP Agentic AI Top 10 (free, no account required)
  • Get an identity and trust score for agents before deploying to production
  • Gate agent execution via API -- block agents that fail security checks

Out of 27 agents assessed, 17 passed and 10 failed. Full results available on the platform.


Context

We are not claiming to have discovered these vulnerabilities -- all CVEs referenced above were reported by their original researchers. This assessment maps existing known issues to the OWASP Agentic AI Top 10 framework to help the community understand the agentic security posture of popular tools.

Happy to discuss any of these findings.

Raza Sharif
Founder, CyberSecAI Ltd
agentsign.dev
