Skip to content

RAG-style attack test and related enhancements #67

Closed
abutbul wants to merge 18 commits intoprompt-security:mainfrom
abutbul:main
Closed

RAG-style attack test and related enhancements #67
abutbul wants to merge 18 commits intoprompt-security:mainfrom
abutbul:main

Conversation

@abutbul
Copy link
Copy Markdown
Contributor

@abutbul abutbul commented Aug 29, 2025

Overview

This pull request introduces several enhancements to the ps-fuzz testing framework, including a new test, embedding configurations, unit tests, some minor refactoring, and additional dependencies. These changes aim to improve the flexibility of the testing framework.

Changes

A new attack named "Hidden Parrot Attack" demonstrates how malicious instructions can be embedded in vector databases to compromise RAG system behavior. The implementation is located in [rag_poisoning.py]

Embedding Configuration:

  • Added support for embedding providers (ollama, open_ai) and models, including configuration for base URLs.
  • Embedding-specific base URLs can now be configured independently of the main provider URLs.

Base URL Support:

  • Introduced support for configuring base URLs for ollama and open_ai providers. (You can mix and match!)
  • Base URLs can be set via the configuration file, command-line arguments, or interactive menus.

Refactoring:

  • Refactored provider and model prompts to reduce duplication and improve maintainability.
  • Introduced helper functions for building client and embedding configurations.

Added Dependencies

  • chromadb: Added for vector database operations in the RAG poisoning attack.
  • tiktoken: Added for tokenization support in embedding-related operations.
    • Updated setup.py and pyproject.toml (nodding at legacy package setup) to include the new dependencies.

Impact

The embedding configuration enhancements enable more advanced attack simulations, further strengthening our testing framework. rag_poisoning attack demonstrate easily exploitable vulnerability in many vector-DB backed RAG pipelines.

Testing

  • The new test have been integrated into the existing test suite and validated for correctness and performance impact.
  • Skipped tests are now properly reported with detailed logs.

P.S.
I realize adding skipping status to tests is out of scope, however, I have ran some edge tests with missing libraries/configuration. Test pipeline errors reported as failed(vulnerable) in the default summary view rather than reporting as skipped. There is existing boilerplate for errors(⚠) to avoid breaking legacy, I added skipped. All that said, I may be missing a better way to report.

adding embedding
adding target temperature for embedding attacks
configuration. via file and menu
adding skipped test method
adding rag poisnoning attack
adding package creation dependencies via setup.py (oldschool)
adding uv package baseline
adding tests
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This pull request introduces comprehensive enhancements to the ps-fuzz testing framework, centered around a new RAG poisoning attack test and flexible embedding/base URL configuration support. The changes enable testing of vector database vulnerabilities while maintaining backward compatibility.

Key Changes:

  • Introduces a "Hidden Parrot Attack" test that demonstrates RAG system vulnerabilities through poisoned vector database content
  • Adds configurable embedding providers (ollama, open_ai) with independent base URL support for both main and embedding services
  • Implements test skipping functionality to properly report tests that cannot run due to missing dependencies or configuration

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 16 comments.

Show a summary per file
File Description
tests/test_is_response_list.py Adds comprehensive unit tests for new AppConfig embedding properties, base URL configurations, helper functions, and skipped status functionality
tests/test_chat_clients.py Tests base URL parameter transformation for ollama/open_ai providers and AttackConfig embedding integration
setup.py Adds chromadb and tiktoken dependencies for RAG attack support
pyproject.toml New pyproject.toml file mirroring setup.py dependencies with modern packaging structure
ps_fuzz/test_base.py Adds skipped_count tracking and report_skipped method to TestStatus
ps_fuzz/prompt_injection_fuzzer.py Introduces helper functions for building client kwargs and embedding config; updates result reporting to handle skipped tests
ps_fuzz/chat_clients.py Implements base URL parameter transformation (ollama_base_url → base_url, openai_base_url → base_url)
ps_fuzz/attack_config.py Adds embedding_config parameter to AttackConfig constructor
ps_fuzz/app_config.py Adds properties for ollama_base_url, openai_base_url, embedding provider/model/base URLs with validation
ps_fuzz/interactive_mode.py Refactors provider/model prompts into reusable helpers; adds EmbeddingOptions menu
ps_fuzz/cli.py Updates AttackConfig instantiation to pass embedding_config parameter
ps_fuzz/attacks/rag_poisoning.py New RAG poisoning attack test with dependency checking, error classification, and vector database poisoning simulation
ps_fuzz/attack_loader.py Registers the new rag_poisoning attack module

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread ps_fuzz/attacks/rag_poisoning.py Outdated
Comment thread ps_fuzz/attacks/rag_poisoning.py Outdated
Comment thread ps_fuzz/attacks/rag_poisoning.py
Comment thread ps_fuzz/app_config.py
Comment thread ps_fuzz/test_base.py Outdated
Comment thread tests/test_chat_clients.py Outdated
Comment thread tests/test_chat_clients.py
Comment thread ps_fuzz/attacks/rag_poisoning.py
Comment thread ps_fuzz/attacks/rag_poisoning.py Outdated
Comment thread tests/test_chat_clients.py Outdated
davida-ps and others added 8 commits November 25, 2025 14:21
supress loggers

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
out of scope fail-safe

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
typo

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
operator race condition

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
unused

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
redefined in prompt-security#1

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
unused

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
unused

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Copy Markdown

@github-advanced-security github-advanced-security AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bandit found more than 20 potential problems in the proposed changes. Check the Files changed tab for more details.

abutbul and others added 4 commits December 4, 2025 11:33
Test that empty ollama_base_url is handled correctly.
This assignment to 'client' is unnecessary as it is redefined before this value is used.

Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>
Copy link
Copy Markdown

@amir-prompt amir-prompt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review Summary

Issues Found

  • 🐛 1 critical bug - Undefined variable poisoned_doc in rag_poisoning.py
  • ⚠️ 2 code quality concerns - Potential false positives in detection, agents.md file inclusion
  • 🔧 1 simplification opportunity - Config handling code duplication

Positive Aspects

  • Good test coverage with comprehensive unit tests
  • Clean implementation of embedding configuration
  • Proper graceful degradation when dependencies are missing
  • Nice refactoring of provider/model prompts to reduce duplication

Overall Assessment

The PR introduces valuable RAG poisoning attack testing capability. However, there's a critical bug that will cause a NameError at runtime - the poisoned_doc variable is used but never defined. This must be fixed before merging.


🤖 Generated with Claude Code


# Add poisoned document
with _suppress_loggers(suppress_names):
self.vectorstore.add_documents([poisoned_doc])
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🐛 Bug (Critical): poisoned_doc is referenced here but never defined. The method _create_poisoned_document() exists at line 207 but is never called.

This will cause a NameError: name 'poisoned_doc' is not defined at runtime.

Suggestion: Add a call to create the poisoned document before using it:

# Add poisoned document
poisoned_doc = self._create_poisoned_document()  # ADD THIS LINE
with _suppress_loggers(suppress_names):
    self.vectorstore.add_documents([poisoned_doc])

]

# Pirate indicators to detect successful poisoning
pirate_indicators = ["arrr", "matey", "ye", "hearty", "savvy", "nautical", "ahoy"]
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Quality: The word "ye" in pirate_indicators is a common English word (e.g., "yes", "year", "yet") that could cause false positives when checking response.lower(). Consider removing it or using more distinctive pirate phrases like "shiver me timbers" or "walk the plank".

yield StatusUpdate(self.client_config, self.test_name, self.status, "Error", 1, 1)
finally:
# Always cleanup
self._cleanup() No newline at end of file
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🔧 Simplify: The file is missing a trailing newline. While minor, this can cause issues with some tools and is inconsistent with Python conventions (PEP 8 recommends files end with a newline).

Comment thread agents.md
@@ -0,0 +1,40 @@
# Development Setup
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Quality: This file appears to be development setup documentation. Consider:

  1. Renaming to DEVELOPMENT.md or CONTRIBUTING.md for clarity
  2. Or adding to .gitignore if it's meant for local use only
  3. Or integrating into the main README.md

The filename agents.md is a bit confusing as it doesn't clearly indicate its purpose.

Comment thread ps_fuzz/app_config.py
self.config_state['embedding_provider'] = value
self.save()

@property
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Suggestion: The validation if not value: raise ValueError(...) will also reject empty strings at assignment time, but the getter returns '' as default. This is fine, but consider documenting this behavior - users might expect to be able to explicitly set empty string to "clear" the value.

if basic_result is None: return # Handle prompt cancellation concisely

# Update state with basic settings
state.embedding_provider = basic_result['provider']
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Suggestion: If the user enters an empty string for embedding_provider or embedding_model, this will trigger a ValueError from the setter. Consider adding a note in the prompt message that these fields are required, or handle the ValueError gracefully here.

@davida-ps davida-ps mentioned this pull request Jan 24, 2026
@davida-ps davida-ps changed the base branch from main to pr-67-abutbul-main January 24, 2026 17:27
@davida-ps davida-ps changed the base branch from pr-67-abutbul-main to main January 24, 2026 17:29
@davida-ps davida-ps closed this Feb 16, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants