[Script request]: Step1X-Edit: A Practical Framework for General Image Editing #6535
Replies: 2 comments 1 reply
-
Do you have a few GPUs and GPU docks to spare? Then we can take a look at it and get it halfway up and running.
-
What happened, guys? Is your Proxmox running on a potato?
-
Application Name
Step1X-Edit
Website
https://github.com/stepfun-ai/Step1X-Edit
Description
We introduce a state-of-the-art image editing model, Step1X-Edit, which aims to provide performance comparable to closed-source models such as GPT-4o and Gemini2 Flash. More specifically, we adopt a Multimodal LLM to process the reference image and the user's editing instruction. A latent embedding is extracted and integrated with a diffusion image decoder to obtain the target image. To train the model, we build a data generation pipeline to produce a high-quality dataset. For evaluation, we develop GEdit-Bench, a novel benchmark rooted in real-world user instructions. Experimental results on GEdit-Bench demonstrate that Step1X-Edit outperforms existing open-source baselines by a substantial margin and approaches the performance of leading proprietary models. For more details, please refer to our technical report.
Due Diligence