Hello everyone!
My name is Francesco Camporeale, and I am an ICT student at KTH in Stockholm. For the past few weeks, I have been diving into the codebase to research solutions for GSoC Idea #1.
I’ve made some progress locally by implementing the base Docker/Seatbelt configuration and binding a custom proxy as the default gateway to intercept HTTP/HTTPS requests. However, before finalizing my proposal, I would greatly appreciate some feedback on the policy-engine and domain filtering implementation from the mentors.
To resolve the conflict between strict security policies and the risk of "noisy" user prompts, I thought of a "user-in-the-loop" feedback cycle. Instead of the proxy halting execution to prompt the user directly, the flow would operate like this:
1. Default Deny: The proxy enforces a strict allowlist based on the .toml rules. Unlisted domains are blocked immediately.
A point I would love feedback on: what should this initial allowlist contain out of the box? To avoid the overhead of maintaining enterprise domain mappings, I think shipping the CLI with a minimal "seed list" of globally trusted domains would be the best option. Any request outside this core list naturally falls into the user-feedback cycle.
2. Payload Injection: When blocking a connection, the proxy returns an HTTP 403 error containing injected LLM-specific system instructions.
3. Agentic Prompting: This causes the active tool to fail, feeding the injected error back into the agent's context. The agent is strictly instructed by the error to halt and ask the user for explicit permission via the UI.
4. Safe Write-Back: If the user approves, the agent does not write to the file directly. Instead, it invokes a custom tool that validates and appends the domain to the .toml policy.
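To make the flow above more concrete, here is a minimal sketch of steps 1, 2, and 4 in Python. It is only an illustration under assumptions, not the prototype itself: the seed domains, the `allowed_domains` key, and the names `Policy`, `blocked_response`, and `approve_domain` are all hypothetical, and the policy is held in memory rather than read from or written to the real .toml file.

```python
import re
from dataclasses import dataclass, field

# Hypothetical seed list; what it should really contain is the open question in step 1.
SEED_ALLOWLIST = frozenset({"github.com", "pypi.org"})

# Conservative hostname pattern: dot-separated labels, no wildcards, ports, or paths.
_DOMAIN_RE = re.compile(
    r"^(?=.{1,253}$)([a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z]{2,63}$"
)


@dataclass
class Policy:
    """In-memory stand-in for the .toml policy (assumption for this sketch)."""

    allowed: set = field(default_factory=lambda: set(SEED_ALLOWLIST))

    def is_allowed(self, host: str) -> bool:
        # Step 1, default deny: only exact matches against the allowlist pass.
        return host.lower().rstrip(".") in self.allowed


def blocked_response(host: str) -> tuple:
    """Step 2: build the 403 status and a body that instructs the agent."""
    body = (
        f"Blocked by sandbox network policy: {host} is not on the allowlist. "
        "Do not retry or work around this block. Stop and ask the user "
        f"whether {host} should be added to the policy."
    )
    return 403, body


def approve_domain(policy: Policy, host: str) -> str:
    """Step 4: validate a user-approved domain, add it, and render a TOML line."""
    normalized = host.strip().lower().rstrip(".")
    if not _DOMAIN_RE.match(normalized):
        raise ValueError(f"not a valid domain name: {host!r}")
    policy.allowed.add(normalized)
    entries = ", ".join(f'"{d}"' for d in sorted(policy.allowed))
    # "allowed_domains" is a hypothetical key name, not the project's schema.
    return f"allowed_domains = [{entries}]"
```

In this sketch the agent never edits the policy itself: it can only call `approve_domain`, which rejects anything that is not a plain hostname (for example `evil.com/path`) before the allowlist changes.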
The aim of this approach is to prevent silent bypasses and prompt fatigue, while also minimizing UI code by leaning on the existing AI structure.
Is this prototype viable and does it align with the aim of the project? Any feedback on it would be great. Thank you!