Open
Description
In the following code, I believe the intent is to process the mask so that it matches the dimensions of the image latents. However, I am not sure about the mask_np = 1 - mask_np part: in what context would that inversion be necessary?
import numpy as np
import torch
from PIL import Image


def preprocess_mask(mask: Image.Image, scale_factor: int = 8) -> torch.Tensor:
    """
    Preprocess a mask for the model.
    """
    # Convert to grayscale
    mask = mask.convert("L")
    # Resize to integer multiple of 32
    w, h = mask.size
    w, h = map(lambda x: x - x % 32, (w, h))
    mask = mask.resize((w // scale_factor, h // scale_factor), resample=Image.Resampling.NEAREST)
    # Convert to numpy array and rescale
    mask_np = np.array(mask).astype(np.float32) / 255.0
    # Tile and transpose
    mask_np = np.tile(mask_np, (4, 1, 1))
    mask_np = mask_np[None].transpose(0, 1, 2, 3)  # what does this step do?
    # Invert to repaint white and keep black
    mask_np = 1 - mask_np
    return torch.from_numpy(mask_np)
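To make the two questioned lines concrete, here is a small NumPy-only sketch on a hypothetical 2x2 mask. It shows that transpose(0, 1, 2, 3) leaves the array unchanged, and what the inversion does to the values. The blending formula at the end is only my assumption about how such a mask might be consumed downstream; it does not appear in the quoted function, and the latent values are made up for illustration.

```python
import numpy as np

# Hypothetical 2x2 mask after resizing and scaling to [0, 1]:
# 1.0 = white (repaint), 0.0 = black (keep).
mask_np = np.array([[1.0, 0.0],
                    [0.0, 0.0]], dtype=np.float32)

# Tile to 4 channels -> shape (4, 2, 2).
tiled = np.tile(mask_np, (4, 1, 1))

# Add a leading batch axis -> shape (1, 4, 2, 2).
batched = tiled[None]

# transpose(0, 1, 2, 3) lists the axes in their existing order,
# so the result has the same shape and the same values.
transposed = batched.transpose(0, 1, 2, 3)
same_after_transpose = np.array_equal(batched, transposed)

# Invert: white (repaint) becomes 0, black (keep) becomes 1.
keep_weight = 1 - transposed

# ASSUMPTION (not in the quoted code): the inverted mask is used as a
# per-pixel "keep" weight when blending original and generated latents.
original_latents = np.full((1, 4, 2, 2), 10.0, dtype=np.float32)
generated_latents = np.full((1, 4, 2, 2), -10.0, dtype=np.float32)
blended = original_latents * keep_weight + generated_latents * (1 - keep_weight)

print(batched.shape, same_after_transpose)
print(blended[0, 0])
```

If the downstream code does blend this way, the inversion would simply convert a "white means repaint" mask into a "1 means keep the original latent" weight, and the answer would then depend on which convention the consumer of the mask expects.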