
Conversation


@felipepenha commented Nov 24, 2025

Genai Red Team Handbook: Sandboxes for Local LLM and RAG Systems

Key Changes:

  • Created an initial directory structure for sandboxes.
  • Added sandboxes for Local LLM and RAG Systems at initiatives/genai_red_team_handbook/sandboxes/.
  • Included threat models for the sandboxes, with a diagram and report generated in ThreatCanvas by SecureFlag.
  • Added an example of exploitation (naive jailbreak) to serve as a template for future red teaming code at initiatives/genai_red_team_handbook/exploitation/.
  • Included extensive documentation in various README files, which are listed at

What and why:

  • sandboxes/llm_local/: A local sandbox environment that mocks an LLM API (compatible with OpenAI's interface) using a local model (via Ollama). This template is useful for testing client-side interactions, prompt injection vulnerabilities, and other security assessments without relying on external, paid APIs. It also lets developers customize the underlying LLM and orchestrate more sophisticated GenAI pipelines, adding features such as RAG and guardrail layers as needed. A minimal client sketch follows this list.

  • sandboxes/RAG_local/: A comprehensive Retrieval-Augmented Generation (RAG) sandbox that includes a mock Vector Database (Pinecone compatible), mock Object Storage (Amazon S3 compatible), and a mock LLM API (OpenAI compatible). This environment is designed specifically for Red Teaming RAG architectures, allowing researchers to explore vulnerabilities such as embedding inversion, data poisoning, and retrieval manipulation in a controlled, local setting. A retrieval round-trip sketch follows this list.

  • exploitation/example/: An example of a red team operation against a local Large Language Model (LLM) sandbox. It demonstrates how to spin up Gradio connected to a mock LLM API and execute an adversarial attack script to test safety guardrails. A harness sketch follows this list.
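
For orientation, here is a minimal client sketch against the llm_local mock. The base URL, API key placeholder, and model name are assumptions; adjust them to match the sandbox's actual configuration.

```python
# Minimal sketch: point the standard OpenAI client at the local mock API.
# base_url, api_key, and model below are assumptions about the sandbox setup.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # assumed local mock endpoint
    api_key="not-needed-locally",          # the mock ignores the key
)

response = client.chat.completions.create(
    model="llama3",  # assumed local model served via Ollama
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what prompt injection is."},
    ],
)
print(response.choices[0].message.content)
```

Because the interface is OpenAI-compatible, the same script can later be pointed at a hosted API by changing only base_url and api_key.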
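
As a sketch of how the RAG_local mocks fit together, the snippet below queries the Pinecone-compatible vector store over HTTP, pulls the referenced document from the S3-compatible store, and feeds it to the mock LLM. All endpoints, credentials, bucket and index names, metadata fields, and the query payload are assumptions to adapt to the sandbox's compose configuration.

```python
# Sketch of a RAG round-trip against the local mocks. Every endpoint,
# credential, bucket name, and metadata field here is an assumption.
import boto3
import requests
from openai import OpenAI

# 1. Nearest-neighbour query against the Pinecone-compatible mock
#    (POST /query mirrors Pinecone's data-plane API; the mock may differ).
query_vector = [0.1] * 768  # placeholder embedding
hits = requests.post(
    "http://localhost:5081/query",  # assumed mock vector DB endpoint
    json={"vector": query_vector, "topK": 3, "includeMetadata": True},
    timeout=10,
).json()

# 2. Fetch the source document from the S3-compatible mock.
s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",  # assumed mock object storage
    aws_access_key_id="test",
    aws_secret_access_key="test",
)
doc_key = hits["matches"][0]["metadata"]["s3_key"]  # assumed metadata field
document = s3.get_object(Bucket="rag-docs", Key=doc_key)["Body"].read().decode()

# 3. Ask the mock LLM to answer from the retrieved context.
llm = OpenAI(base_url="http://localhost:11434/v1", api_key="local")
answer = llm.chat.completions.create(
    model="llama3",  # assumed local model
    messages=[
        {"role": "system", "content": f"Answer using this context:\n{document}"},
        {"role": "user", "content": "What does the retrieved document say?"},
    ],
)
print(answer.choices[0].message.content)
```

Experiments such as data poisoning or retrieval manipulation would plug in by tampering with the bucket contents or the indexed vectors before step 1.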
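
For the exploitation example, a minimal sketch of a Gradio harness wired to the mock LLM API is shown below. The endpoint, model name, and title are assumptions, and the actual script under exploitation/example/ may be structured differently.

```python
# Sketch of a Gradio chat front end forwarding turns to the mock LLM API.
# Endpoint and model name are assumptions about the sandbox configuration.
import gradio as gr
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="local")

def respond(message, history):
    # Forward the chat turn to the OpenAI-compatible mock and return the reply.
    result = client.chat.completions.create(
        model="llama3",  # assumed local model
        messages=[{"role": "user", "content": message}],
    )
    return result.choices[0].message.content

gr.ChatInterface(fn=respond, title="Local LLM red team harness").launch()
```

A naive jailbreak probe can then be issued either through this UI or directly through the same client, which is what the adversarial attack script automates.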


@felipepenha requested a review from rossja as a code owner on November 24, 2025 05:26
@felipepenha changed the title from "Genai Red Team Handbook: Sandbox Template for Local LLM AI Systems" to "Genai Red Team Handbook: Sandboxes for Local LLM and RAG Systems" on Nov 25, 2025
@felipepenha force-pushed the genai-red-team-handbook branch from 81f1abc to beb509c on November 28, 2025 05:04
@felipepenha
Author

Ready for review.

@felipepenha force-pushed the genai-red-team-handbook branch from ca801ff to ea6e43f on December 1, 2025 19:38
@felipepenha
Author

I was merging main into this branch and then rebasing, and it seems this caused 9 code owners of unrelated code to be accidentally added to this PR as reviewers during the process.

It looks like I don't have the access level required to remove the extra reviewers, so you will have to pardon me:

@guerilla7
@hoeg
@itskerenkatz
@talesh
@GangGreenTemperTatum
@jsotiro
@kenhuangus
@cybershujin
@virtualsteve-star
