
mask #187

@michaelku1

Description

In the following code, I believe the intent is to process the mask so it can be applied at the same dimensions as the image latents; however, I am not sure about the `mask_np = 1 - mask_np` part. In what context would that inversion be necessary?

import numpy as np
import torch
from PIL import Image

def preprocess_mask(mask: Image.Image, scale_factor: int = 8) -> torch.Tensor:
    """
    Preprocess a mask for the model.
    """
    # Convert to grayscale
    mask = mask.convert("L")

    # Floor dimensions to an integer multiple of 32, then downscale to latent resolution
    w, h = mask.size
    w, h = map(lambda x: x - x % 32, (w, h))
    mask = mask.resize((w // scale_factor, h // scale_factor), resample=Image.Resampling.NEAREST)

    # Convert to numpy array and rescale to [0, 1]
    mask_np = np.array(mask).astype(np.float32) / 255.0

    # Tile across the 4 latent channels and add a batch dimension
    mask_np = np.tile(mask_np, (4, 1, 1))
    mask_np = mask_np[None].transpose(0, 1, 2, 3)  # what does this step do?

    # Invert to repaint white and keep black
    mask_np = 1 - mask_np

    return torch.from_numpy(mask_np)
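For context, here is a minimal toy sketch of what the inversion seems to be for, assuming the usual inpainting convention that white pixels mark the region to repaint. The names `orig_latents` and `noisy_latents` are made-up stand-ins, not the pipeline's real variables; the point is only that after `1 - mask_np`, white (repaint) becomes 0 and black (keep) becomes 1, so multiplying the original latents by the mask preserves the black regions:

```python
import numpy as np
import torch

# Toy 2x2 grayscale mask: 255 = white (repaint), 0 = black (keep)
mask = np.array([[255, 0],
                 [0, 255]], dtype=np.float32) / 255.0

# After inversion, white (repaint) -> 0 and black (keep) -> 1
mask = 1 - mask

orig_latents = torch.full((1, 4, 2, 2), 5.0)   # stand-in for encoded image latents
noisy_latents = torch.zeros((1, 4, 2, 2))      # stand-in for freshly denoised latents

mask_t = torch.from_numpy(mask)                # broadcasts over batch/channel dims
blended = orig_latents * mask_t + noisy_latents * (1 - mask_t)

# Original values survive only where the mask was black (keep);
# white regions are replaced by the denoised latents.
print(blended[0, 0])
```

So the inversion would be needed whenever the blending step treats mask value 1 as "keep the original" while the input image uses white to mark the area to repaint.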
