
mask #187

@michaelku1

Description

In the following code, I believe the intent is to process the mask so it can be applied at the same dimensions as the image latents; however, I am not sure about the `mask_np = 1 - mask_np` part. In what context would that inversion be necessary?

import numpy as np
import torch
from PIL import Image

def preprocess_mask(mask: Image.Image, scale_factor: int = 8) -> torch.Tensor:
    """
    Preprocess a mask for the model.
    """
    # Convert to grayscale
    mask = mask.convert("L")

    # Floor dimensions to an integer multiple of 32, then downscale to latent resolution
    w, h = mask.size
    w, h = map(lambda x: x - x % 32, (w, h))
    mask = mask.resize((w // scale_factor, h // scale_factor), resample=Image.Resampling.NEAREST)

    # Convert to numpy array and rescale to [0, 1]
    mask_np = np.array(mask).astype(np.float32) / 255.0

    # Tile across the 4 latent channels and add a batch dimension
    mask_np = np.tile(mask_np, (4, 1, 1))
    mask_np = mask_np[None].transpose(0, 1, 2, 3)  # what does this step do?

    # Invert to repaint white and keep black
    mask_np = 1 - mask_np

    return torch.from_numpy(mask_np)
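For context, here is a minimal toy sketch of what the inversion seems to be for, assuming the usual inpainting convention that white pixels mark the region to repaint. The names `orig_latents` and `noisy_latents` are made-up stand-ins, not the pipeline's real variables; the point is only that after `1 - mask_np`, white (repaint) becomes 0 and black (keep) becomes 1, so multiplying the original latents by the mask preserves the black regions:

```python
import numpy as np
import torch

# Toy 2x2 grayscale mask: 255 = white (repaint), 0 = black (keep)
mask = np.array([[255, 0],
                 [0, 255]], dtype=np.float32) / 255.0

# After inversion, white (repaint) -> 0 and black (keep) -> 1
mask = 1 - mask

orig_latents = torch.full((1, 4, 2, 2), 5.0)   # stand-in for encoded image latents
noisy_latents = torch.zeros((1, 4, 2, 2))      # stand-in for freshly denoised latents

mask_t = torch.from_numpy(mask)                # broadcasts over batch/channel dims
blended = orig_latents * mask_t + noisy_latents * (1 - mask_t)

# Original values survive only where the mask was black (keep);
# white regions are replaced by the denoised latents.
print(blended[0, 0])
```

So the inversion would be needed whenever the blending step treats mask value 1 as "keep the original" while the input image uses white to mark the area to repaint.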
