RAG-style attack test and related enhancements #67
RAG-style attack test and related enhancements #67abutbul wants to merge 18 commits intoprompt-security:mainfrom
Conversation
adding embedding adding target temperature for embedding attacks configuration. via file and menu adding skipped test method adding rag poisnoning attack adding package creation dependencies via setup.py (oldschool) adding uv package baseline adding tests
…ors open-webui/open-webui#15624 + add dependencies to base package
…be caught and warned)
There was a problem hiding this comment.
Pull request overview
This pull request introduces comprehensive enhancements to the ps-fuzz testing framework, centered around a new RAG poisoning attack test and flexible embedding/base URL configuration support. The changes enable testing of vector database vulnerabilities while maintaining backward compatibility.
Key Changes:
- Introduces a "Hidden Parrot Attack" test that demonstrates RAG system vulnerabilities through poisoned vector database content
- Adds configurable embedding providers (ollama, open_ai) with independent base URL support for both main and embedding services
- Implements test skipping functionality to properly report tests that cannot run due to missing dependencies or configuration
Reviewed changes
Copilot reviewed 13 out of 13 changed files in this pull request and generated 16 comments.
Show a summary per file
| File | Description |
|---|---|
| tests/test_is_response_list.py | Adds comprehensive unit tests for new AppConfig embedding properties, base URL configurations, helper functions, and skipped status functionality |
| tests/test_chat_clients.py | Tests base URL parameter transformation for ollama/open_ai providers and AttackConfig embedding integration |
| setup.py | Adds chromadb and tiktoken dependencies for RAG attack support |
| pyproject.toml | New pyproject.toml file mirroring setup.py dependencies with modern packaging structure |
| ps_fuzz/test_base.py | Adds skipped_count tracking and report_skipped method to TestStatus |
| ps_fuzz/prompt_injection_fuzzer.py | Introduces helper functions for building client kwargs and embedding config; updates result reporting to handle skipped tests |
| ps_fuzz/chat_clients.py | Implements base URL parameter transformation (ollama_base_url → base_url, openai_base_url → base_url) |
| ps_fuzz/attack_config.py | Adds embedding_config parameter to AttackConfig constructor |
| ps_fuzz/app_config.py | Adds properties for ollama_base_url, openai_base_url, embedding provider/model/base URLs with validation |
| ps_fuzz/interactive_mode.py | Refactors provider/model prompts into reusable helpers; adds EmbeddingOptions menu |
| ps_fuzz/cli.py | Updates AttackConfig instantiation to pass embedding_config parameter |
| ps_fuzz/attacks/rag_poisoning.py | New RAG poisoning attack test with dependency checking, error classification, and vector database poisoning simulation |
| ps_fuzz/attack_loader.py | Registers the new rag_poisoning attack module |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
supress loggers Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
out of scope fail-safe Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
typo Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
operator race condition Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
unused Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
redefined in prompt-security#1 Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
unused Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
unused Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
There was a problem hiding this comment.
Bandit found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.
Test that empty ollama_base_url is handled correctly.
This assignment to 'client' is unnecessary as it is redefined before this value is used. Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
amir-prompt
left a comment
There was a problem hiding this comment.
PR Review Summary
Issues Found
- 🐛 1 critical bug - Undefined variable
poisoned_docin rag_poisoning.py ⚠️ 2 code quality concerns - Potential false positives in detection, agents.md file inclusion- 🔧 1 simplification opportunity - Config handling code duplication
Positive Aspects
- Good test coverage with comprehensive unit tests
- Clean implementation of embedding configuration
- Proper graceful degradation when dependencies are missing
- Nice refactoring of provider/model prompts to reduce duplication
Overall Assessment
The PR introduces valuable RAG poisoning attack testing capability. However, there's a critical bug that will cause a NameError at runtime - the poisoned_doc variable is used but never defined. This must be fixed before merging.
🤖 Generated with Claude Code
|
|
||
| # Add poisoned document | ||
| with _suppress_loggers(suppress_names): | ||
| self.vectorstore.add_documents([poisoned_doc]) |
There was a problem hiding this comment.
🐛 Bug (Critical): poisoned_doc is referenced here but never defined. The method _create_poisoned_document() exists at line 207 but is never called.
This will cause a NameError: name 'poisoned_doc' is not defined at runtime.
Suggestion: Add a call to create the poisoned document before using it:
# Add poisoned document
poisoned_doc = self._create_poisoned_document() # ADD THIS LINE
with _suppress_loggers(suppress_names):
self.vectorstore.add_documents([poisoned_doc])| ] | ||
|
|
||
| # Pirate indicators to detect successful poisoning | ||
| pirate_indicators = ["arrr", "matey", "ye", "hearty", "savvy", "nautical", "ahoy"] |
There was a problem hiding this comment.
"ye" in pirate_indicators is a common English word (e.g., "yes", "year", "yet") that could cause false positives when checking response.lower(). Consider removing it or using more distinctive pirate phrases like "shiver me timbers" or "walk the plank".
| yield StatusUpdate(self.client_config, self.test_name, self.status, "Error", 1, 1) | ||
| finally: | ||
| # Always cleanup | ||
| self._cleanup() No newline at end of file |
There was a problem hiding this comment.
🔧 Simplify: The file is missing a trailing newline. While minor, this can cause issues with some tools and is inconsistent with Python conventions (PEP 8 recommends files end with a newline).
| @@ -0,0 +1,40 @@ | |||
| # Development Setup | |||
There was a problem hiding this comment.
- Renaming to
DEVELOPMENT.mdorCONTRIBUTING.mdfor clarity - Or adding to
.gitignoreif it's meant for local use only - Or integrating into the main
README.md
The filename agents.md is a bit confusing as it doesn't clearly indicate its purpose.
| self.config_state['embedding_provider'] = value | ||
| self.save() | ||
|
|
||
| @property |
There was a problem hiding this comment.
💡 Suggestion: The validation if not value: raise ValueError(...) will also reject empty strings at assignment time, but the getter returns '' as default. This is fine, but consider documenting this behavior - users might expect to be able to explicitly set empty string to "clear" the value.
| if basic_result is None: return # Handle prompt cancellation concisely | ||
|
|
||
| # Update state with basic settings | ||
| state.embedding_provider = basic_result['provider'] |
There was a problem hiding this comment.
💡 Suggestion: If the user enters an empty string for embedding_provider or embedding_model, this will trigger a ValueError from the setter. Consider adding a note in the prompt message that these fields are required, or handle the ValueError gracefully here.
Overview
This pull request introduces several enhancements to the
ps-fuzztesting framework, including a new test, embedding configurations, unit tests, some minor refactoring, and additional dependencies. These changes aim to improve the flexibility of the testing framework.Changes
A new attack named "Hidden Parrot Attack" demonstrates how malicious instructions can be embedded in vector databases to compromise RAG system behavior. The implementation is located in [rag_poisoning.py]
Embedding Configuration:
ollama,open_ai) and models, including configuration for base URLs.Base URL Support:
ollamaandopen_aiproviders. (You can mix and match!)Refactoring:
Added Dependencies
setup.pyandpyproject.toml(nodding at legacy package setup) to include the new dependencies.Impact
The embedding configuration enhancements enable more advanced attack simulations, further strengthening our testing framework.
rag_poisoningattack demonstrate easily exploitable vulnerability in many vector-DB backed RAG pipelines.Testing
P.S.
I realize adding skipping status to tests is out of scope, however, I have ran some edge tests with missing libraries/configuration. Test pipeline errors reported as failed(vulnerable) in the default summary view rather than reporting as skipped. There is existing boilerplate for errors(⚠) to avoid breaking legacy, I added skipped. All that said, I may be missing a better way to report.