
Conversation


@felipepenha commented Nov 24, 2025

Genai Red Team Handbook: Sandboxes for Local LLM and RAG Systems

Key Changes:

  • Created an initial directory structure for sandboxes.
  • Added sandboxes for Local LLM and RAG Systems at initiatives/genai_red_team_handbook/sandboxes/.
  • Included threat models for the sandboxes, with a diagram and report generated in ThreatCanvas by SecureFlag.
  • Added an example of exploitation (naive jailbreak) to serve as a template for future red teaming code at initiatives/genai_red_team_handbook/exploitation/.
  • Included extensive documentation in various README files, which are listed at

What and why:

  • sandboxes/llm_local/: A local sandbox environment that mocks an LLM API (compatible with OpenAI's interface) using a local model (via Ollama). This template is useful for testing client-side interactions, prompt injection vulnerabilities, and other security assessments without relying on external, paid APIs. It also lets developers customize the underlying LLM and orchestrate more sophisticated GenAI pipelines, adding features such as RAG and guardrail layers as needed. A minimal client sketch follows this list.

  • sandboxes/RAG_local/: A comprehensive Retrieval-Augmented Generation (RAG) sandbox that includes a mock Vector Database (Pinecone compatible), mock Object Storage (Amazon S3 compatible), and a mock LLM API (OpenAI compatible). This environment is designed specifically for Red Teaming RAG architectures, allowing researchers to explore vulnerabilities such as embedding inversion, data poisoning, and retrieval manipulation in a controlled, local setting. A retrieval round-trip sketch follows this list.

  • exploitation/example/: An example of a red team operation against a local Large Language Model (LLM) sandbox. It demonstrates how to spin up Gradio connected to a mock LLM API and execute an adversarial attack script to test safety guardrails. A harness sketch follows this list.
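
For orientation, here is a minimal client sketch against the llm_local mock. The base URL, API key placeholder, and model name are assumptions; adjust them to match the sandbox's actual configuration.

```python
# Minimal sketch: point the standard OpenAI client at the local mock API.
# base_url, api_key, and model below are assumptions about the sandbox setup.
from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:11434/v1",  # assumed local mock endpoint
    api_key="not-needed-locally",          # the mock ignores the key
)

response = client.chat.completions.create(
    model="llama3",  # assumed local model served via Ollama
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Explain what prompt injection is."},
    ],
)
print(response.choices[0].message.content)
```

Because the interface is OpenAI-compatible, the same script can later be pointed at a hosted API by changing only base_url and api_key.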
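
As a sketch of how the RAG_local mocks fit together, the snippet below queries the Pinecone-compatible vector store over HTTP, pulls the referenced document from the S3-compatible store, and feeds it to the mock LLM. All endpoints, credentials, bucket and index names, metadata fields, and the query payload are assumptions to adapt to the sandbox's compose configuration.

```python
# Sketch of a RAG round-trip against the local mocks. Every endpoint,
# credential, bucket name, and metadata field here is an assumption.
import boto3
import requests
from openai import OpenAI

# 1. Nearest-neighbour query against the Pinecone-compatible mock
#    (POST /query mirrors Pinecone's data-plane API; the mock may differ).
query_vector = [0.1] * 768  # placeholder embedding
hits = requests.post(
    "http://localhost:5081/query",  # assumed mock vector DB endpoint
    json={"vector": query_vector, "topK": 3, "includeMetadata": True},
    timeout=10,
).json()

# 2. Fetch the source document from the S3-compatible mock.
s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:9000",  # assumed mock object storage
    aws_access_key_id="test",
    aws_secret_access_key="test",
)
doc_key = hits["matches"][0]["metadata"]["s3_key"]  # assumed metadata field
document = s3.get_object(Bucket="rag-docs", Key=doc_key)["Body"].read().decode()

# 3. Ask the mock LLM to answer from the retrieved context.
llm = OpenAI(base_url="http://localhost:11434/v1", api_key="local")
answer = llm.chat.completions.create(
    model="llama3",  # assumed local model
    messages=[
        {"role": "system", "content": f"Answer using this context:\n{document}"},
        {"role": "user", "content": "What does the retrieved document say?"},
    ],
)
print(answer.choices[0].message.content)
```

Experiments such as data poisoning or retrieval manipulation would plug in by tampering with the bucket contents or the indexed vectors before step 1.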
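
For the exploitation example, a minimal sketch of a Gradio harness wired to the mock LLM API is shown below. The endpoint, model name, and title are assumptions, and the actual script under exploitation/example/ may be structured differently.

```python
# Sketch of a Gradio chat front end forwarding turns to the mock LLM API.
# Endpoint and model name are assumptions about the sandbox configuration.
import gradio as gr
from openai import OpenAI

client = OpenAI(base_url="http://localhost:11434/v1", api_key="local")

def respond(message, history):
    # Forward the chat turn to the OpenAI-compatible mock and return the reply.
    result = client.chat.completions.create(
        model="llama3",  # assumed local model
        messages=[{"role": "user", "content": message}],
    )
    return result.choices[0].message.content

gr.ChatInterface(fn=respond, title="Local LLM red team harness").launch()
```

A naive jailbreak probe can then be issued either through this UI or directly through the same client, which is what the adversarial attack script automates.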


@felipepenha requested a review from rossja as a code owner on November 24, 2025 05:26
@felipepenha changed the title from "Genai Red Team Handbook: Sandbox Template for Local LLM AI Systems" to "Genai Red Team Handbook: Sandboxes for Local LLM and RAG Systems" on Nov 25, 2025
@felipepenha force-pushed the genai-red-team-handbook branch from 81f1abc to beb509c on November 28, 2025 05:04
@felipepenha
Author

Ready for review.

@felipepenha force-pushed the genai-red-team-handbook branch from ca801ff to ea6e43f on December 1, 2025 19:38
@felipepenha
Author

I was merging main into this branch and then rebasing, and it seems this caused 9 code owners of unrelated code to be accidentally added to this PR as reviewers during the process.

It looks like I don't have the access level required to remove the extra reviewers, so you will have to pardon me:

@guerilla7
@hoeg
@itskerenkatz
@talesh
@GangGreenTemperTatum
@jsotiro
@kenhuangus
@cybershujin
@virtualsteve-star
