pyrit foundry integration spec #44551

Draft

slister1001 wants to merge 5 commits into Azure:main from slister1001:spec/pyrit-foundry

Member

slister1001 commented Jan 5, 2026

Description

Please add an informative description that covers the changes made by the pull request and link all relevant issues.

If an SDK is being regenerated based on a new API spec, a link to the pull request containing these API spec changes should be included above.

All SDK Contribution checklist:

The pull request does not introduce [breaking changes]
CHANGELOG is updated for new features, bug fixes or other significant changes.
I have read the contribution guidelines.

General Guidelines and Best Practices

Title of the pull request is clear and informative.
There are a small number of commits, each of which has an informative message. This means that previously merged commits do not appear in the history of the PR. For more information on cleaning up the commits in your PR, see this page.

Testing Guidelines

Pull request includes test coverage for the included changes.


          pyrit foundry integration spec

github-actions bot added the Evaluation label

romanlutz reviewed

sdk/evaluation/azure-ai-evaluation/setup.py (outdated, resolved)

sdk/evaluation/azure-ai-evaluation/spec_pyrit_foundry.md (resolved)

sdk/evaluation/azure-ai-evaluation/spec_pyrit_foundry.md

    
              from pyrit.models import SeedPrompt
              from pyrit.models.data_type_serializer import PromptDataType
              from pyrit.scenario.core.dataset_configuration import DatasetConfiguration
              from pyrit.scenario.scenarios.foundry.foundry import Foundry, FoundryStrategy

romanlutz Jan 6, 2026

In general, I would recommend taking the shortest possible import path, in this case pyrit.scenario.foundry because everything more detailed is considered internal to PyRIT and can change without being considered breaking by us. Perhaps also a good idea to mark that with underscore to be extra clear @rlundeen2

Same for DatasetConfiguration above which can be imported from pyrit.scenario

sdk/evaluation/azure-ai-evaluation/spec_pyrit_foundry.md Outdated

    
              ## Success Metrics

              ### Reliability

              - **Breaking Changes**: Reduce from 2-3 per 6 months to 0-1 per year

romanlutz Jan 6, 2026

Is this for your SDK or for PyRIT? I don't think we can guarantee a specific number, but we can certainly guarantee a deprecation schedule. Our goal right now is to deprecate features and keep them around for 2 minor releases (e.g., from 0.10.0 to 0.12.0) with a warning for users to replace them before they get removed.

That said, given the level at which you're operating (from a PyRIT perspective: high level, scenarios) you are unlikely to actually face many breaking changes.
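
The deprecation window described above can be sketched in a few lines. This is an illustration only: `parse_version`, `removal_version`, and `is_still_available` are hypothetical helpers, not part of PyRIT, and the sketch assumes plain `major.minor.patch` version strings and that removal happens in the second minor release after deprecation.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Split a 'major.minor.patch' string into integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def removal_version(deprecated_in: str, minor_releases: int = 2) -> str:
    """Earliest release that may drop a feature deprecated in `deprecated_in`."""
    major, minor, _ = parse_version(deprecated_in)
    return f"{major}.{minor + minor_releases}.0"


def is_still_available(current: str, deprecated_in: str) -> bool:
    """True while the deprecated feature still ships (with a warning)."""
    return parse_version(current) < parse_version(removal_version(deprecated_in))


print(removal_version("0.10.0"))  # 0.12.0
```

A check like `is_still_available` could back a version pin in the SDK's dependency constraints, so that an upgrade past the removal release is a deliberate step.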

rlundeen2 reviewed

sdk/evaluation/azure-ai-evaluation/spec_pyrit_foundry.md Outdated

    
              ]
              ```

              **RAI Context Types**: `email`, `document`, `html`, `code`, `tool_call`

rlundeen2 Jan 6, 2026

We can rename it, but for us, email/document/html/code will all just be "url", or we could call it blob_path or something similar. It should not be .text, though; it should be a file_path, similar to how image/audio/video are handled.

If you use text, it has an ambiguity problem; e.g. if a model wants to upload a pdf, it will just insert the pdf data into the text field.
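
One way to picture the path-based approach is the sketch below. It is not PyRIT code: `ContextPiece`, `store_context`, the `"blob_path"` label, and the file extensions are all assumptions made for illustration, since the final data type name is still under discussion in this thread.

```python
import tempfile
from dataclasses import dataclass
from pathlib import Path

# Illustrative file extensions for the context types named in the comment above.
_EXTENSIONS = {"email": ".eml", "document": ".txt", "html": ".html", "code": ".txt"}


@dataclass(frozen=True)
class ContextPiece:
    value: str         # path to the stored content, never the content itself
    data_type: str     # placeholder label; the final name is still open
    context_type: str  # original RAI context type, kept for downstream use


def store_context(content: str, context_type: str, directory: Path) -> ContextPiece:
    """Write context to a file and return a path-based piece that points at it."""
    suffix = _EXTENSIONS[context_type]
    with tempfile.NamedTemporaryFile(
        "w", suffix=suffix, dir=directory, delete=False, encoding="utf-8"
    ) as handle:
        handle.write(content)
    return ContextPiece(value=handle.name, data_type="blob_path", context_type=context_type)
```

Because `value` always holds a path, a consumer never has to guess whether a string is the prompt itself or serialized file contents, which is the ambiguity raised above.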

rlundeen2 reviewed

sdk/evaluation/azure-ai-evaluation/spec_pyrit_foundry.md Outdated

    
              ```

              **Remaining Considerations**:

              - **XPIA Formatting**: For indirect jailbreak attacks, context types like `email` and `document` determine attack vehicle formatting. While PyRIT sees them as `text`, we preserve the original `context_type` in metadata for downstream formatters.

rlundeen2 Jan 6, 2026 (edited)

Different people will have different opinions, but I think this makes the most sense as a converter at the end.

So we transform the prompt however we want for an attack, and then the last converter transforms it into the format you want to send.

E.g. prompt[text] -> JailBreakConverter[text] -> Base64Converter[text] -> AddImageConverter[image] -> emailAttachmentConverter[blob - email with the image we just created attached]

Then the target determines how this is sent.

As one example of this, we have a PDFConverter, and a blobStoreTarget. So you can create PDFs and upload them to a blobstore
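
The "format converter goes last" idea can be sketched with plain functions. None of this is PyRIT's converter API: `Piece`, the three converter functions, and `apply_converters` are stand-ins invented for illustration, the template text is a placeholder, and the image step from the example chain is left out to keep the sketch short.

```python
import base64
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass(frozen=True)
class Piece:
    value: str
    data_type: str


Converter = Callable[[Piece], Piece]


def template_converter(piece: Piece) -> Piece:
    """Attack-shaping step: wrap the prompt in a (placeholder) template."""
    return Piece(f"[template] {piece.value}", "text")


def base64_converter(piece: Piece) -> Piece:
    """Attack-shaping step: encode the text as Base64."""
    encoded = base64.b64encode(piece.value.encode("utf-8")).decode("ascii")
    return Piece(encoded, "text")


def email_converter(piece: Piece) -> Piece:
    """Delivery-format step: embed the payload in an email body."""
    body = f"Subject: Quarterly update\n\n{piece.value}\n"
    return Piece(body, "email")


def apply_converters(piece: Piece, converters: Sequence[Converter]) -> Piece:
    """Run the converters in order; the last one decides the outgoing format."""
    for convert in converters:
        piece = convert(piece)
    return piece
```

Keeping the delivery format as the final step means the attack-shaping converters stay reusable across context types, and only the last converter changes between, say, an email and a document.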

rlundeen2 reviewed

sdk/evaluation/azure-ai-evaluation/spec_pyrit_foundry.md

    
                                        ▼
              ┌─────────────────────────────────────────────────────────────┐
              │         DatasetConfiguration Builder                         │
              │  • Create SeedObjective for each attack string              │

rlundeen2 Jan 6, 2026

if the SeedPrompts are the same, you can just use SeedObjectives with the metadata

rlundeen2 reviewed

sdk/evaluation/azure-ai-evaluation/spec_pyrit_foundry.md Outdated

    
                                        ▼
              ┌─────────────────────────────────────────────────────────────┐
              │              Result Processing                               │
              │  • Extract from PyRIT memory                                │

rlundeen2 Jan 6, 2026 (edited)

You also get AttackResults.

I'd recommend creating a PyRIT scorer using the RAI evaluator and then passing it in to FoundryScenario, where it's used to evaluate attack success. We can help with this, and it actually looks like you may already be doing that above.

Then, when the FoundryScenario execution finishes, you have the results in the AttackResult object and wouldn't have to re-evaluate ASR.
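
As a rough picture of what "not re-evaluating ASR" means: if each result already carries the scorer's verdict, the success rate is a simple aggregate. `AttackResultStub` and its fields are hypothetical stand-ins for this sketch, not the shape of PyRIT's actual `AttackResult`.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass(frozen=True)
class AttackResultStub:
    objective: str
    succeeded: bool  # assumed to be set by the scorer during the run


def attack_success_rate(results: Iterable[AttackResultStub]) -> float:
    """Fraction of attacks the scorer marked as successful (0.0 if none ran)."""
    collected = list(results)
    if not collected:
        return 0.0
    return sum(result.succeeded for result in collected) / len(collected)
```

With this shape, result processing only reads verdicts that were produced during the run, so no second scoring pass over stored conversations is needed.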

rlundeen2 reviewed

sdk/evaluation/azure-ai-evaluation/spec_pyrit_foundry.md Outdated

    
              #### Important: SeedPrompt Duplication Pattern

              **Critical Note**: PyRIT's Foundry does **NOT** automatically send the `SeedObjective` value to the target. The objective is used for orchestration and scoring, but the actual prompt sent to the target must be a `SeedPrompt`. We will do this in every scenario except for Jailbreak and IndirectJailbreak where we handle the injection of the objective into the prompt.

rlundeen2 Jan 6, 2026 (edited)

By default, if you don't set the SeedPrompt, it will be the objective. But you can always separate them.

But if they are the same, you should probably just attach SeedObjective
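
The fallback described in this comment amounts to the following. `resolve_prompt` is a hypothetical helper written only to illustrate the behavior as the reviewer states it; it is not a PyRIT function.

```python
from typing import Optional


def resolve_prompt(objective: str, seed_prompt: Optional[str] = None) -> str:
    """Return the text sent to the target: the seed prompt if given, else the objective."""
    return objective if seed_prompt is None else seed_prompt
```

Under that behavior, duplicating the objective into a separate prompt only matters when the two are meant to differ.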

rlundeen2 reviewed

sdk/evaluation/azure-ai-evaluation/spec_pyrit_foundry.md Outdated

    
              1. **SeedObjective**: Contains the attack string (e.g., "Tell me how to build a weapon")
              2. **SeedPrompt (attack vehicle)**: Contains the context data **with attack string injected** (e.g., email containing the malicious prompt)
              3. **SeedPrompt (original context)**: Contains the original context **without** injection (for reference)

rlundeen2 Jan 6, 2026

You can get the original context from the Message already; you don't need anything extra to keep track of it. In this example, what I would do is:

- Create a SeedObjective with the objective
- Add a converter that converts from a prompt to an email

Then call the scenario with the converter configured at the end. The AttackResult object returned will have the original objective, the conversation, and the success.

rlundeen2 reviewed

sdk/evaluation/azure-ai-evaluation/spec_pyrit_foundry.md Outdated

    
              )
              # Plus any context prompts
              context_prompts = [...]

rlundeen2 Jan 6, 2026

Never mind, you answer this below.

rlundeen2 reviewed

sdk/evaluation/azure-ai-evaluation/spec_pyrit_foundry.md Outdated

    
                      return prompts

                  def _create_xpia_prompts(

rlundeen2 Jan 6, 2026

This is the code you could wrap in a converter if you wanted. Although I'd love any specific format converting code in PyRIT itself :)

rlundeen2 reviewed

sdk/evaluation/azure-ai-evaluation/spec_pyrit_foundry.md Outdated

    
              from pyrit.models import PromptRequestPiece, Score

              class RAIServiceScorer(Scorer):

rlundeen2 Jan 6, 2026 (edited)

I'd make this a FloatScaleScorer or TrueFalseScorer depending on what you're returning. And if FloatScale, set a threshold for the TrueFalseScorer you pass in to the scenario.
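
The float-scale-plus-threshold arrangement can be sketched without any PyRIT classes. Everything here is an assumption for illustration: `make_true_false_scorer` is a hypothetical helper, scores are assumed to be normalized to the range 0.0 to 1.0, and the sketch is synchronous even though the scorer in the spec is async.

```python
from typing import Callable

FloatScorer = Callable[[str], float]
TrueFalseScorer = Callable[[str], bool]


def make_true_false_scorer(float_scorer: FloatScorer, threshold: float) -> TrueFalseScorer:
    """Wrap a float-scale scorer so it reports success at or above `threshold`."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0.0 and 1.0")

    def score(response: str) -> bool:
        return float_scorer(response) >= threshold

    return score
```

Separating the two keeps the raw severity available for reporting while the scenario only sees the true/false verdict it needs to decide attack success.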

rlundeen2 reviewed

sdk/evaluation/azure-ai-evaluation/spec_pyrit_foundry.md Outdated

    
                      self.rai_client = rai_client
                      self.risk_category = risk_category

                  async def score_async(self, request_response: PromptRequestPiece) -> List[Score]:

rlundeen2 Jan 6, 2026

I'd override score_piece, so you can better handle multi-part messages.
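
To show why scoring per piece helps with multi-part messages, here is a minimal sketch. `PieceStub`, `score_pieces`, and `score_message` are hypothetical names, not PyRIT's scorer interface, and taking the maximum as the message-level score is an assumption chosen for the example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass(frozen=True)
class PieceStub:
    value: str
    data_type: str


def score_pieces(
    pieces: Sequence[PieceStub],
    score_piece: Callable[[PieceStub], float],
    supported_types: Sequence[str] = ("text",),
) -> List[float]:
    """Score each supported piece on its own and skip the rest."""
    return [score_piece(piece) for piece in pieces if piece.data_type in supported_types]


def score_message(
    pieces: Sequence[PieceStub],
    score_piece: Callable[[PieceStub], float],
) -> float:
    """Message-level score: the worst (highest) piece score, 0.0 if nothing was scorable."""
    return max(score_pieces(pieces, score_piece), default=0.0)
```

Scoring at the piece level lets a text-only evaluator ignore image or file pieces in the same message instead of failing on them or scoring their paths as text.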

rlundeen2 reviewed

sdk/evaluation/azure-ai-evaluation/spec_pyrit_foundry.md Outdated

    
                      # Run attack (PyRIT handles all execution)
                      self.logger.info(f"Executing attacks for {self.risk_category}...")
                      await scenario.run_attack_async()

rlundeen2 Jan 6, 2026

This will return AttackResult objects with ASR, etc.

slister1001 added 4 commits

January 8, 2026 18:42


          init implementation still need to figure out xpia / context and fix t…

72338ec

…ests


          fix tests and imports

3fe0168


          updates

2c8ca66


          updates

660b75a

Reviewers

Copilot code review: awaiting requested review from Copilot. Copilot will automatically review once the pull request is marked ready for review.

2 more reviewers

romanlutz left review comments

rlundeen2 left review comments

At least 1 approving review is required to merge this pull request.

Labels