@RandomGitUser321 RandomGitUser321 commented Jun 21, 2024

What does this PR do?

This PR refactors the pipeline to mirror other common pipelines by adding callback_on_step_end and callback_on_step_end_tensor_inputs, along with cfg rescaling. In the callbacks, you can interrupt generation and retrieve the latents and/or the pos/neg embeds. The older callback method will continue to provide step_idx, t and latents, but nothing more.

I've added deprecation warnings for those still using the legacy callback and callback_steps arguments instead of the newer callback_on_step_end and callback_on_step_end_tensor_inputs, as well as an error if you try to use both at the same time.
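The argument check described above could look roughly like the sketch below. The helper name `check_callback_args` is hypothetical, and the standard-library `warnings` module stands in for whatever deprecation utility the pipeline actually uses; this illustrates the behaviour, it is not the PR's code.

```python
import warnings


def check_callback_args(callback=None, callback_steps=None, callback_on_step_end=None):
    """Hypothetical helper: validate legacy vs. new callback arguments."""
    legacy_used = callback is not None or callback_steps is not None
    if legacy_used and callback_on_step_end is not None:
        # Mixing both styles is ambiguous, so refuse outright.
        raise ValueError(
            "Pass either `callback`/`callback_steps` or `callback_on_step_end`, not both."
        )
    if legacy_used:
        # The legacy arguments still work, but the caller is told to migrate.
        warnings.warn(
            "`callback` and `callback_steps` are deprecated; "
            "use `callback_on_step_end` instead.",
            FutureWarning,
            stacklevel=2,
        )
```

Raising on the mixed case, rather than silently preferring one style, keeps it obvious which callback will run.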

To sum it up, this adds:

  • callback_on_step_end and callback_on_step_end_tensor_inputs, which allow you to obtain latents and pos/neg embeds
  • deprecation warnings for the older callback and callback_steps methods
  • the ability to use self._interrupt=True on callback_on_step_end
  • cfg rescaling
Some snippets from my app that I used to verify that it works:
def interrupt_callback(self, i, t, callback_kwargs):
    # using latching variable for onkeypress event to trigger
    if not queue_latch:
        self._interrupt = True
    
    latents = callback_kwargs["latents"]
    with torch.no_grad():
        image = pipe.vae.decode(latents / 0.13025, return_dict=False)[0]
        image = pipe.image_processor.postprocess(image, output_type="pil")
        image[0].save(f"{i}.png")

    return callback_kwargs

and

latents = pipe(
    prompt_embeds=prompt_embeds,
    negative_prompt_embeds=negative_embeds,
    prompt_attention_mask=prompt_attention_mask,
    negative_prompt_attention_mask=negative_prompt_attention_mask,
    num_images_per_prompt=1,
    height=height,
    width=width,
    num_inference_steps=steps,
    guidance_scale=cfg,
    generator=seedgen,
    callback_on_step_end=interrupt_callback,  ###############
    callback_on_step_end_tensor_inputs=["latents"],  ########
    output_type="latent",
).images
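For readers unfamiliar with how a pipeline consumes these two arguments, here is a minimal sketch of the pattern, assuming the PixArt-Sigma loop follows the same convention as other diffusers pipelines. `ToyPipeline` and `stop_after_three` are hypothetical stand-ins with placeholder values instead of tensors; they are not the code from this PR.

```python
class ToyPipeline:
    """Stand-in for a diffusers pipeline; not the real PixArt-Sigma code."""

    def __init__(self):
        self._interrupt = False

    def __call__(self, timesteps, callback_on_step_end=None,
                 callback_on_step_end_tensor_inputs=("latents",)):
        self._interrupt = False
        latents = 0.0  # placeholder for the latent tensor
        prompt_embeds = "embeds"  # placeholder for the text embeddings
        steps_run = 0
        for i, t in enumerate(timesteps):
            if self._interrupt:
                # Once a callback sets the flag, the remaining steps are skipped.
                continue
            latents = latents + 1.0  # placeholder for the scheduler step
            steps_run += 1
            if callback_on_step_end is not None:
                # Hand the callback only the tensors it asked for by name.
                available = {"latents": latents, "prompt_embeds": prompt_embeds}
                callback_kwargs = {k: available[k] for k in callback_on_step_end_tensor_inputs}
                callback_outputs = callback_on_step_end(self, i, t, callback_kwargs)
                # The callback may return modified tensors.
                latents = callback_outputs.pop("latents", latents)
                prompt_embeds = callback_outputs.pop("prompt_embeds", prompt_embeds)
        return latents, steps_run


def stop_after_three(pipe, i, t, callback_kwargs):
    # Request an interrupt at the end of the third step (index 2).
    if i == 2:
        pipe._interrupt = True
    return callback_kwargs


latents, steps_run = ToyPipeline()(
    [999, 750, 500, 250, 0], callback_on_step_end=stop_after_three
)
```

Because the flag is checked at the top of each iteration, the step during which the callback fires still completes; only the later steps are skipped.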

A test image showing the latent callbacks working at each step (this can also be used to generate real-time previews in apps, as shown in my update 4 comment):
image

Example of the interrupt callback working while using my app:
image

Example of cfg rescale working (gif compression degrades the quality a lot):
cfg
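As a rough illustration of what cfg rescaling does, the sketch below applies the guidance-rescale formula used by other diffusers pipelines: scale the guided prediction so its standard deviation matches the text-conditioned prediction, then blend with the unscaled result. It is an assumption that this PR uses the same formula. The function name `rescale_guided_noise` is hypothetical, and it works on one flat list of floats per sample instead of tensors so that it runs without torch.

```python
import statistics


def rescale_guided_noise(noise_cfg, noise_pred_text, guidance_rescale=0.0):
    """Rescale one sample's guided prediction toward the std of the text-conditioned one."""
    std_text = statistics.stdev(noise_pred_text)
    std_cfg = statistics.stdev(noise_cfg)
    # Match the guided prediction's spread to the text-conditioned prediction's.
    rescaled = [x * (std_text / std_cfg) for x in noise_cfg]
    # Blend rescaled and original results; 0.0 leaves the input unchanged.
    return [
        guidance_rescale * r + (1.0 - guidance_rescale) * x
        for r, x in zip(rescaled, noise_cfg)
    ]


noise_pred_uncond = [0.1, -0.2, 0.3, 0.0]
noise_pred_text = [0.4, -0.5, 0.6, 0.1]
guidance_scale = 4.5
# Standard classifier-free guidance combination.
noise_cfg = [
    u + guidance_scale * (c - u)
    for u, c in zip(noise_pred_uncond, noise_pred_text)
]
rescaled = rescale_guided_noise(noise_cfg, noise_pred_text, guidance_rescale=1.0)
```

With `guidance_rescale=1.0` the output has the same standard deviation as the text-conditioned prediction, which counteracts the over-exposure that high guidance scales tend to cause.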

Before submitting

Who can review?

Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.

I had to change some things relative to the original file to get the callbacks working correctly. I tried to base much of the layout on newer diffusers pipelines like SD3,
@RandomGitUser321 RandomGitUser321 changed the title from "pixart sigma: add in an interrupt callback" to "pixart sigma: add in an interrupt callback + latent callbacks" Jun 23, 2024
@RandomGitUser321

This comment was marked as resolved.

@RandomGitUser321 RandomGitUser321 changed the title from "pixart sigma: add in an interrupt callback + latent callbacks" to "pixart sigma: callbacks(interrupt, latent, pos/neg embeds) and cfg_rescale" Jun 28, 2024
Collaborator

@yiyixuxu yiyixuxu left a comment

thanks!
I think this PR should focus on adding the new callback and cfg_rescale; we should not include the changes introduced for encode_prompt in this PR.

additionally, can you test out the dynamic classifier-free guidance on pixart using the new callback API? https://huggingface.co/docs/diffusers/using-diffusers/callback#dynamic-classifier-free-guidance
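A minimal sketch of such a test callback, adapted from the pattern on the linked docs page, is shown below. It assumes the pipeline exposes `num_timesteps` and `_guidance_scale` and concatenates the embeddings as [negative, positive], as other diffusers pipelines do; whether the PixArt pipeline in this PR matches that is an assumption. The name `disable_cfg_late` and the `fake_pipe` stand-in are hypothetical. Slicing by `len()` is used in place of `chunk(2)` so the same function works on a tensor or a plain list.

```python
from types import SimpleNamespace


def disable_cfg_late(pipe, step_index, timestep, callback_kwargs, cutoff=0.4):
    """Turn guidance off once `cutoff` of the steps have run."""
    if step_index == int(pipe.num_timesteps * cutoff):
        prompt_embeds = callback_kwargs["prompt_embeds"]
        # Keep only the second (conditional) half of the batch.
        callback_kwargs["prompt_embeds"] = prompt_embeds[len(prompt_embeds) // 2:]
        pipe._guidance_scale = 0.0
    return callback_kwargs


# Stand-in objects so the sketch runs without a model.
fake_pipe = SimpleNamespace(num_timesteps=10, _guidance_scale=4.5)
kwargs = {"prompt_embeds": ["negative", "positive"]}
kwargs = disable_cfg_late(fake_pipe, 4, 600, kwargs)
```

With a real pipeline this would be passed as `callback_on_step_end=disable_cfg_late` with `callback_on_step_end_tensor_inputs=["prompt_embeds"]`. PixArt also carries a prompt attention mask, which would presumably need the same halving.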

@RandomGitUser321
Author

RandomGitUser321 commented Jun 28, 2024

thanks! I think this PR should focus on adding the new callback and cfg_rescale; we should not include the changes introduced for encode_prompt in this PR.

I addressed this in the code comments: I reverted the negative_prompt changes and will open a new PR for them after this one is finished.

additionally, can you test out the dynamic classifier-free guidance on pixart using the new callback API? https://huggingface.co/docs/diffusers/using-diffusers/callback#dynamic-classifier-free-guidance

I'll try to take a look at it as well, but after I figure out how to handle the old callback/callback_steps deprecation.

The legacy callback will require (self, step_idx, t, latents), but has 1:1 parity with the newer callback_on_step_end method. I also included a deprecation warning and an error if both are used at the same time.
@RandomGitUser321
Author

RandomGitUser321 commented Jun 29, 2024

@yiyixuxu Alright, I reworked the legacy callback/callback_steps back in. The legacy callback will require (self, step_idx, t, latents), but has 1:1 parity with the newer callback_on_step_end method. I also included a deprecation warning and an error if both are used at the same time.

I tested both in my app and was still able to get latent callbacks for previews and to interrupt the process (only with the newer method). If needed, I can add a warning if the callback(self, step_idx, t, latents) line toward the very end of the code detects only three inputs instead of four, as a hint that people just need to add a self (or something similar) to the def somefunction(self, i, t, latents): function they use for their callbacks.

For the struck-out parts, read my comment below. I reverted the old callback to callback(step_idx, t, latents) again.

@RandomGitUser321
Author

@yiyixuxu @sayakpaul I reverted the negative_prompt changes and fully updated my original post to be more clear about the changes.

Since the original implementation doesn't appear to have the ability to interrupt, I'm just going to roll this back. If people want to interrupt, they need to use the newer method anyway, since the older callback method is being deprecated.

The legacy callback will still provide step_idx, t and latents, like before.
@RandomGitUser321
Author

RandomGitUser321 commented Jun 30, 2024

Since the legacy implementation of callback doesn't appear to have the ability to interrupt, I'm just going to roll this back. If people want to interrupt, they should be using the newer method anyways, since the older callback method is being deprecated.

The legacy callback will still function the same as before: callback(step_idx, t, latents)
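For anyone migrating, an existing legacy function can be wrapped so it runs through the newer mechanism. The adapter below is a hypothetical sketch, not part of this PR; it also simplifies the old `callback_steps` handling to a plain every-N-steps check.

```python
def adapt_legacy_callback(legacy_callback, callback_steps=1):
    """Hypothetical adapter: wrap callback(step_idx, t, latents) as a new-style callback."""

    def on_step_end(pipe, i, t, callback_kwargs):
        # Mirror the old `callback_steps` behaviour: fire every N-th step.
        if i % callback_steps == 0:
            legacy_callback(i, t, callback_kwargs["latents"])
        # New-style callbacks must hand the kwargs back to the pipeline.
        return callback_kwargs

    return on_step_end


seen = []
new_style = adapt_legacy_callback(
    lambda step_idx, t, latents: seen.append(step_idx), callback_steps=2
)
for i, t in enumerate([999, 666, 333, 0]):
    new_style(None, i, t, {"latents": [0.0]})
```

The wrapped function would then be passed as `callback_on_step_end=new_style` together with `callback_on_step_end_tensor_inputs=["latents"]`.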

@github-actions
Contributor

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@github-actions github-actions bot added the stale Issues that haven't received updates label Sep 14, 2024
@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

@github-actions github-actions bot removed the stale Issues that haven't received updates label Nov 17, 2024
@github-actions github-actions bot added the stale Issues that haven't received updates label Dec 11, 2024
@sayakpaul sayakpaul removed the stale Issues that haven't received updates label Dec 12, 2024
@sayakpaul
Member

@RandomGitUser321 apologies for the delay on our end. We would love to come back to this PR. What is currently blocking it? How can we help?

@github-actions github-actions bot added the stale Issues that haven't received updates label Jan 5, 2025