This repo consists of exploratory work on AI pentesting using the open-source versions of garak and promptfoo.
Note:- Please use this repo for exploration and learning purposes only, keeping ethics in mind. The responses elicited during this exploration, and the types of probes used, do not represent my beliefs.
Medium article can be found at https://medium.com/@keerthi.ningegowda/one-prompt-to-break-it-all-automated-ai-red-teaming-with-garak-and-promptfoo-315331438fbf?postPublishedType=initial
The RAG workflow at https://github.com/KeerthiNingegowda/n8n_workflow was exposed via a webhook, and the associated API was pentested to mimic production. This is probably the only way to pentest GenAI features at scale.
Note:- Basic authorization was used in the n8n workflow. Encode the username and password using Base64:
echo -n "username:password" | base64
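For example, with hypothetical placeholder credentials `uname`/`pwd` (replace with your own), the full `Authorization` header value can be built like this:

```shell
# Base64-encode "username:password" (placeholder credentials -- use your own)
AUTH=$(echo -n "uname:pwd" | base64)

# This is the header value the pentesting tools must send to the n8n webhook
echo "Authorization: Basic $AUTH"
```

The resulting header goes into whatever HTTP configuration your tool uses (e.g. the `headers` section of a garak REST config or a promptfoo target).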
To track the request-response cycle between these pentesting tools and the n8n workflow, you can use Burp Suite Community Edition. The tool intercepts the CLI's HTTP request-response traffic via a proxy. Long story short, you get to:
- See each prompt, its associated buffs, and the response from the AI model
- Manipulate requests from garak on the fly, without an entire orchestration setup
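One simple way to route garak's traffic through Burp is via the standard proxy environment variables, which Python's HTTP stack honors. This sketch assumes Burp's default proxy listener at 127.0.0.1:8080:

```shell
# Point HTTP(S) traffic at Burp's proxy listener (default 127.0.0.1:8080)
export HTTP_PROXY=http://127.0.0.1:8080
export HTTPS_PROXY=http://127.0.0.1:8080

# Then run garak as usual; its requests should appear in Burp's Proxy tab, e.g.:
# python3 -m garak -m rest.RestGenerator -G ./garak_testing/telecom_config.json --probes dan.AntiDAN
```

If your target uses HTTPS, you will also need to trust Burp's CA certificate, otherwise TLS verification will fail.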
Before starting with pentesting, do check out the original paper by the authors of garak - https://arxiv.org/pdf/2406.11036 . A snapshot of their framework is below:-

To list all the probes via the CLI:
garak --list_probes
Example - running the DAN (Do Anything Now) probes:
python3 -m garak -m rest.RestGenerator -G ./garak_testing/telecom_config.json --probes dan.AntiDAN --report_prefix ~/Desktop/AI_pentesting_RAG_chatbot/garak_reports/antiDAN
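The `-G` config file tells garak how to talk to the REST endpoint. A minimal sketch is below; the field names follow garak's REST generator documentation, but the URL, credentials, request field, and response field here are placeholders, not the actual contents of `telecom_config.json`:

```json
{
  "rest": {
    "RestGenerator": {
      "name": "n8n RAG webhook (placeholder)",
      "uri": "http://localhost:5678/webhook/chat",
      "method": "post",
      "headers": {
        "Authorization": "Basic dW5hbWU6cHdk",
        "Content-Type": "application/json"
      },
      "req_template_json_object": {"query": "$INPUT"},
      "response_json": true,
      "response_json_field": "output"
    }
  }
}
```

`$INPUT` is where garak substitutes each probe prompt, and `response_json_field` names the JSON key in the webhook's reply that holds the model's answer.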
Pro-tip:- Use the -g option to control the number of generations per prompt for each probe. Use the --parallel_attempts option to parallelize sending requests.
Sample result from a probe that hijacks an LLM's safety filters and makes it say bad things about humans:

From this exploration, I think garak is more appropriate at the model level than at the application level. Garak also has some limited application-level probes covering other modalities, such as images.
Promptfoo is quite similar to garak, except that it is more dynamic and developer-facing. In garak, probes are created from a database of known attacks and vulnerabilities, whereas in promptfoo the probes are more context-specific: you need to provide information about the type of application you are testing to be able to pentest it with promptfoo.
The differences between promptfoo and garak are discussed well at https://www.promptfoo.dev/blog/promptfoo-vs-garak/
A simple snapshot of their methodology
Arguably, promptfoo has a more comprehensive set of test suites/strategies than garak. It is very easy to configure how many test cases to run, which is quite tricky to do in garak. Note that an API key is not necessary for some of the basic testing strategies. The type of plugins you use will determine how long your testing takes to run.
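A minimal `promptfooconfig.yaml` for a red-team run might look like the sketch below. The structure follows promptfoo's redteam configuration docs, but the target URL, credentials, purpose, and plugin/strategy choices are placeholders, not the config actually used in this repo:

```yaml
description: "Red team of a RAG chatbot (placeholder config)"

targets:
  - id: http
    config:
      url: http://localhost:5678/webhook/chat
      method: POST
      headers:
        Authorization: "Basic dW5hbWU6cHdk"
      body:
        query: "{{prompt}}"

redteam:
  purpose: "Customer-support RAG chatbot over internal documents"
  numTests: 5          # easy to cap the number of test cases per plugin
  plugins:
    - harmful:hate
    - pii
  strategies:
    - jailbreak
```

The `purpose` field is where you supply the application context mentioned above, and `numTests` is the knob for controlling test-case volume.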
To run a test with the YAML file, from the directory where promptfooconfig.yaml is present:
promptfoo redteam run
To view the report, from the same directory:
promptfoo redteam report
Sample response for a successful document extraction in RAG:

https://arxiv.org/html/2410.16527v2
Note:- promptfoo is not included in this paper's comparison, but the paper has a lot of interesting insights from similar tools like Giskard.
Use trufflehog to check whether you left any secrets/API keys lying around. Based on the results, you can double-check your code/results.
You can also rely on GitGuardian, which looks more efficient than trufflehog.
PS:- I intentionally left my n8n basic auth credentials in 😂. GitGuardian caught it and trufflehog didn't. Just as a reminder, the n8n workflow runs locally and is not even exposed within the local network.
