“Rewrite-then-Edit” two-stage pipelines & system-prompt leaks: do they systematically boost RISEBench scores? #7

@wyhlovecpp

Congratulations on the NeurIPS Oral. I still have some questions, though. Several models on your leaderboard (e.g., Gemini-2.5-Flash-Image, Seedream, BAGEL w/ CoT) appear to run a two-stage pipeline: an MLLM first rewrites or structures the edit instruction (often with CoT), then a second stage performs the actual image edit. System-prompt leaks shared by the community for some closed models, such as Seedream and nano banana, also suggest this “rewrite-then-generate” template.

Questions / requests for clarification:

  1. If we apply a strong, neutral MLLM rewriter (e.g., GPT-5 / Gemini / Qwen-VL) to all RISEBench prompts before feeding them to open-source editors (e.g., FLUX.1-Kontext-dev, Qwen-Image-Edit), do scores jump substantially?
  2. Does the leaderboard distinguish direct-instruction from rewrite-then-edit settings? Could you provide a reproducible toggle and a sub-leaderboard to avoid apples-to-oranges comparisons?
  3. Would you consider an official Prompt-Rewrite Protocol (with a fixed rewriter + template) so we can separate gains due to “instruction enhancement” from the “editor’s intrinsic capability”?
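
To make the request concrete, here is a rough sketch of what such a toggle could look like. Everything in it is an assumption for illustration: `REWRITE_TEMPLATE`, `Sample`, `run_setting`, and `compare` are hypothetical names, not part of RISEBench, and the rewriter, editor, and judge are passed in as plain callables so any MLLM or editor could be plugged in.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional

# Hypothetical fixed template; RISEBench does not define one.
REWRITE_TEMPLATE = (
    "Rewrite the following image-edit instruction so that it is explicit "
    "and self-contained. Do not add new requirements.\n"
    "Instruction: {instruction}"
)


@dataclass
class Sample:
    sample_id: str
    image_path: str
    instruction: str


# All three callables are placeholders for real models (assumptions).
Rewriter = Callable[[str], str]         # rewrite prompt -> rewritten instruction
Editor = Callable[[str, str], str]      # (image_path, instruction) -> output path
Judge = Callable[[Sample, str], float]  # (sample, output) -> score


def run_setting(
    samples: List[Sample],
    editor: Editor,
    judge: Judge,
    rewriter: Optional[Rewriter] = None,
) -> float:
    """Mean score for one setting; rewriter=None means direct instruction."""
    scores = []
    for s in samples:
        instruction = s.instruction
        if rewriter is not None:
            # Same template and same rewriter for every sample and every editor.
            instruction = rewriter(REWRITE_TEMPLATE.format(instruction=s.instruction))
        output = editor(s.image_path, instruction)
        scores.append(judge(s, output))
    return sum(scores) / len(scores) if scores else 0.0


def compare(
    samples: List[Sample], editor: Editor, judge: Judge, rewriter: Rewriter
) -> Dict[str, float]:
    """Score the same editor with and without the rewrite stage."""
    direct = run_setting(samples, editor, judge, rewriter=None)
    rewritten = run_setting(samples, editor, judge, rewriter=rewriter)
    return {
        "direct": direct,
        "rewrite_then_edit": rewritten,
        "rewrite_gain": rewritten - direct,
    }
```

Reporting `direct` and `rewrite_then_edit` side by side for each editor would give the sub-leaderboard from question 2, and `rewrite_gain` is a first-order estimate of how much of a score comes from instruction enhancement rather than the editor itself.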

Thanks. Clear guidance here would be helpful.
