Commit ce43077

Update ExcessiveAgency based on feedback post-voting (#438)
Signed-off-by: Andy <[email protected]>
1 parent d1254f7 commit ce43077

1 file changed

+14
-20
lines changed

2_0_voting/voting_round_two/ExcessiveAgency.md

Lines changed: 14 additions & 20 deletions
@@ -4,11 +4,9 @@
44

55
An LLM-based system is often granted a degree of agency by its developer - the ability to call functions or interface with other systems via extensions (sometimes referred to as tools, skills or plugins by different vendors) to undertake actions in response to a prompt. The decision over which extension to invoke may also be delegated to an LLM 'agent' to dynamically determine based on input prompt or LLM output. Agent-based systems will typically make repeated calls to an LLM using output from previous invocations to ground and direct subsequent invocations.
66

7-
Excessive Agency is the vulnerability that enables damaging actions to be performed in response to unexpected, ambiguous or manipulated outputs from an LLM, regardless of what is causing the LLM to malfunction. Potential triggers include:
7+
Excessive Agency is the vulnerability that enables damaging actions to be performed in response to unexpected, ambiguous or manipulated outputs from an LLM, regardless of what is causing the LLM to malfunction. Common triggers include:
88
* hallucination/confabulation caused by poorly-engineered benign prompts, or just a poorly-performing model;
9-
* direct/indirect prompt injection from a malicious user;
10-
* prompt injection from an earlier invocation of a malicious/compromised extension;
11-
* prompt injection from a malicious/compromised peer agent (in multi-agent/collaborative systems).
9+
* direct/indirect prompt injection from a malicious user, an earlier invocation of a malicious/compromised extension, or (in multi-agent/collaborative systems) a malicious/compromised peer agent.
1210

1311
The root cause of Excessive Agency is typically one or more of:
1412
* excessive functionality;
@@ -24,8 +22,8 @@ Note: Excessive Agency differs from Insecure Output Handling which is concerned
2422
1. Excessive Functionality: An LLM agent has access to extensions which include functions that are not needed for the intended operation of the system. For example, a developer needs to grant an LLM agent the ability to read documents from a repository, but the 3rd-party extension they choose to use also includes the ability to modify and delete documents.
2523
2. Excessive Functionality: An extension may have been trialled during a development phase and dropped in favor of a better alternative, but the original plugin remains available to the LLM agent.
2624
3. Excessive Functionality: An LLM plugin with open-ended functionality fails to properly filter the input instructions for commands outside what's necessary for the intended operation of the application. E.g., an extension to run one specific shell command fails to properly prevent other shell commands from being executed.
27-
4. Excessive Permissions: An LLM extension has permissions on other systems that are not needed for the intended operation of the application. E.g., an extension intended to read data connects to a database server using an identity that not only has SELECT permissions, but also UPDATE, INSERT and DELETE permissions.
28-
5. Excessive Permissions: An LLM extension that is designed to perform operations on behalf of an individual user accesses downstream systems with a generic high-privileged identity. E.g., an extension to read the current user's document store connects to the document repository with a privileged account that has access to files belonging to all users.
25+
4. Excessive Permissions: An LLM extension has permissions on downstream systems that are not needed for the intended operation of the application. E.g., an extension intended to read data connects to a database server using an identity that not only has SELECT permissions, but also UPDATE, INSERT and DELETE permissions.
26+
5. Excessive Permissions: An LLM extension that is designed to perform operations in the context of an individual user accesses downstream systems with a generic high-privileged identity. E.g., an extension to read the current user's document store connects to the document repository with a privileged account that has access to files belonging to all users.
2927
6. Excessive Autonomy: An LLM-based application or extension fails to independently verify and approve high-impact actions. E.g., an extension that allows a user's documents to be deleted performs deletions without any confirmation from the user.
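
As a rough illustration of the Excessive Permissions case in example 4, the sketch below opens a database connection that the engine itself treats as read-only, so a manipulated model output cannot turn a read into a delete. It uses Python's built-in `sqlite3` module purely as a stand-in for the downstream database server; the `documents` table and the helper names are hypothetical and not part of the original text. On a real database server the closer equivalent would be an identity granted only SELECT.

```python
import os
import sqlite3
import tempfile


def make_demo_db(path: str) -> None:
    """Create a small demo database (stand-in for the downstream system)."""
    conn = sqlite3.connect(path)
    with conn:
        conn.execute("CREATE TABLE documents (id INTEGER PRIMARY KEY, title TEXT)")
        conn.execute("INSERT INTO documents (title) VALUES ('Q3 report')")
    conn.close()


def open_read_only(path: str) -> sqlite3.Connection:
    """Open the database in read-only mode so the engine rejects writes."""
    return sqlite3.connect(f"file:{path}?mode=ro", uri=True)


def read_titles(conn: sqlite3.Connection) -> list[str]:
    """The only operation the extension actually needs."""
    rows = conn.execute("SELECT title FROM documents ORDER BY id")
    return [row[0] for row in rows]


def try_delete(conn: sqlite3.Connection) -> bool:
    """Return True if a DELETE succeeded, False if the database refused it."""
    try:
        conn.execute("DELETE FROM documents")
        conn.commit()
        return True
    except sqlite3.OperationalError:
        return False


if __name__ == "__main__":
    db_path = os.path.join(tempfile.mkdtemp(), "demo.db")
    make_demo_db(db_path)
    ro_conn = open_read_only(db_path)
    print(read_titles(ro_conn))   # reads still work
    print(try_delete(ro_conn))    # the write is refused by the connection
    ro_conn.close()
```

The point of the design is that the restriction lives in the downstream system rather than in the prompt or in the extension's own input filtering, so it still holds when the LLM output is unexpected or manipulated.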
3028

3129

@@ -49,22 +47,18 @@ The following options will not prevent Excessive Agency, but can limit the level
4947

5048
### Example Attack Scenarios
5149

52-
An LLM-based personal assistant app is granted access to an individual’s mailbox via an extension in order to summarise the content of incoming emails. To achieve this functionality, the extension requires the ability to read messages; however, the plugin that the system developer has chosen to use also contains functions for sending messages. Additionally, the app is vulnerable to an indirect prompt injection attack, whereby a maliciously-crafted incoming email tricks the LLM into commanding the agent to call the email plugin’s 'send message' function to send spam from the user's mailbox. This could be avoided by:
53-
(a) eliminating excessive functionality by using an extension that only implements mail-reading capabilities,
54-
(b) eliminating excessive permissions by authenticating to the user's email service via an OAuth session with a read-only scope, and/or
55-
(c) eliminating excessive autonomy by requiring the user to manually review and hit 'send' on every mail drafted by the LLM extension.
50+
An LLM-based personal assistant app is granted access to an individual’s mailbox via an extension in order to summarise the content of incoming emails. To achieve this functionality, the extension requires the ability to read messages; however, the plugin that the system developer has chosen to use also contains functions for sending messages. Additionally, the app is vulnerable to an indirect prompt injection attack, whereby a maliciously-crafted incoming email tricks the LLM into commanding the agent to scan the user's inbox for sensitive information and forward it to the attacker's email address. This could be avoided by:
51+
* eliminating excessive functionality by using an extension that only implements mail-reading capabilities,
52+
* eliminating excessive permissions by authenticating to the user's email service via an OAuth session with a read-only scope, and/or
53+
* eliminating excessive autonomy by requiring the user to manually review and hit 'send' on every mail drafted by the LLM extension.
54+
5655
Alternatively, the damage caused could be reduced by implementing rate limiting on the mail-sending interface.
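
A minimal sketch of two of the controls named in this scenario, human approval before sending and rate limiting on the mail-sending interface, might look like the following. Everything here is an assumption for illustration: `GuardedMailSender`, `send_fn` and `approve_fn` are hypothetical names, and `send_fn` stands in for whatever mail API the extension would really call.

```python
import time
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class GuardedMailSender:
    """Wraps a mail-sending call with human approval and a rate limit."""

    send_fn: Callable[[str, str], None]     # hypothetical mail-sending call
    approve_fn: Callable[[str, str], bool]  # asks a human to approve the draft
    max_per_window: int = 3                 # sends allowed per time window
    window_seconds: float = 60.0
    clock: Callable[[], float] = time.monotonic
    _sent_at: list = field(default_factory=list)

    def send(self, to: str, body: str) -> str:
        now = self.clock()
        # Keep only the timestamps that still fall inside the window.
        self._sent_at = [t for t in self._sent_at if now - t < self.window_seconds]
        if len(self._sent_at) >= self.max_per_window:
            return "rate_limited"
        # Excessive autonomy control: nothing is sent without human approval.
        if not self.approve_fn(to, body):
            return "rejected"
        self.send_fn(to, body)
        self._sent_at.append(now)
        return "sent"


if __name__ == "__main__":
    outbox = []

    def ask_user(to: str, body: str) -> bool:
        answer = input(f"Send mail to {to}? [y/N] ")
        return answer.strip().lower() == "y"

    sender = GuardedMailSender(
        send_fn=lambda to, body: outbox.append((to, body)),
        approve_fn=ask_user,
    )
    print(sender.send("colleague@example.com", "Summary of today's inbox"))
```

The approval step addresses excessive autonomy, while the rate limit only bounds the damage if a malicious send is approved or the approval step is bypassed; neither replaces removing the send capability when the application only needs to read mail.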
5756

5857
### Reference Links
5958

6059
1. [Slack AI data exfil from private channels](https://promptarmor.substack.com/p/slack-ai-data-exfiltration-from-private): **PromptArmor**
61-
2. [Embrace the Red: Confused Deputy Problem](https://embracethered.com/blog/posts/2023/chatgpt-cross-plugin-request-forgery-and-prompt-injection./): **Embrace The Red**
62-
2. [NeMo-Guardrails: Interface guidelines](https://github.com/NVIDIA/NeMo-Guardrails/blob/main/docs/security/guidelines.md): **NVIDIA Github**
63-
3. [LangChain: Human-approval for tools](https://python.langchain.com/docs/modules/agents/tools/how_to/human_approval): **Langchain Documentation**
64-
4. [Simon Willison: Dual LLM Pattern](https://simonwillison.net/2023/Apr/25/dual-llm-pattern/): **Simon Willison**
65-
66-
### Related Frameworks and Taxonomies
67-
68-
Refer to this section for comprehensive information, scenarios, and strategies relating to infrastructure deployment, applied environment controls and other best practices.
69-
70-
- [API6:2023 Unrestricted Access to Sensitive Business Flows](https://owasp.org/API-Security/editions/2023/en/0xa6-unrestricted-access-to-sensitive-business-flows/) **OWASP**
60+
2. [Rogue Agents: Stop AI From Misusing Your APIs](https://www.twilio.com/en-us/blog/rogue-ai-agents-secure-your-apis): **Twilio**
61+
3. [Embrace the Red: Confused Deputy Problem](https://embracethered.com/blog/posts/2023/chatgpt-cross-plugin-request-forgery-and-prompt-injection./): **Embrace The Red**
62+
4. [NeMo-Guardrails: Interface guidelines](https://github.com/NVIDIA/NeMo-Guardrails/blob/main/docs/security/guidelines.md): **NVIDIA Github**
63+
5. [LangChain: Human-approval for tools](https://python.langchain.com/docs/modules/agents/tools/how_to/human_approval): **Langchain Documentation**
64+
6. [Simon Willison: Dual LLM Pattern](https://simonwillison.net/2023/Apr/25/dual-llm-pattern/): **Simon Willison**
