https://github.com/Saquib764/omini-kontext/blob/main/src/train/model.py#L123
x_0, x_ids = encode_images(self.flux_pipe, target_images)
x_init, init_img_ids = encode_images(self.flux_pipe, input_images)
# Prepare reference image with delta
x_ref, ref_img_ids = encode_images(self.flux_pipe, reference_images)
# Apply position delta to reference image IDs
delta = reference_deltas[0]
ref_img_ids[:, 0] += delta[0]
ref_img_ids[:, 1] += delta[1]
ref_img_ids[:, 2] += delta[2]
# Combine input and reference images
condition = torch.cat([x_init, x_ref], dim=1)
condition_ids = torch.cat([init_img_ids, ref_img_ids], dim=0)
For comparison, the Flux Kontext code builds the condition IDs like this:
cond_latents_ids = FluxKontextPipeline._prepare_latent_image_ids(
    cond_model_input.shape[0],
    cond_model_input.shape[2] // 2,
    cond_model_input.shape[3] // 2,
    accelerator.device,
    weight_dtype,
)
cond_latents_ids[..., 0] = 1
latent_image_ids = torch.cat([latent_image_ids, cond_latents_ids], dim=0)
Why isn't 1 added to init_img_ids[:, 0], as Flux Kontext does for its condition IDs?
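To make the question concrete, here is a minimal sketch in plain Python (no torch, so it runs anywhere). It assumes that encode_images returns IDs laid out as one [index, row, col] triple per latent token with the first column initialised to 0, which is how FluxKontextPipeline._prepare_latent_image_ids lays them out; the helper make_ids and the 2x2 grid size are hypothetical and only for illustration. It counts how many input-image IDs coincide with target IDs with and without the offset on column 0.

```python
def make_ids(height, width):
    """Hypothetical stand-in for the ID layout: one [index, row, col] per token."""
    return [[0, r, c] for r in range(height) for c in range(width)]


# Target latents and input-image latents on the same 2x2 token grid (assumed size).
target_ids = make_ids(2, 2)
init_ids = make_ids(2, 2)

# As in the first snippet: init_img_ids are used unchanged.
overlap_without_offset = sum(1 for tok in init_ids if tok in target_ids)

# As in the Flux Kontext snippet: column 0 of the condition IDs is set to 1.
kontext_init_ids = [[1, r, c] for _, r, c in init_ids]
overlap_with_offset = sum(1 for tok in kontext_init_ids if tok in target_ids)

print(overlap_without_offset, overlap_with_offset)  # prints "4 0"
```

Under these assumptions, every input-image token shares its position ID with a target token unless column 0 is offset. Whether that overlap is intended here cannot be determined from the snippets alone.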