Positional Stickiness #8

@afiaka87

For lack of a better term, I've noticed during training that the VitGAN tends to get stuck on one, two, or three "positional blobs" (I don't see four happen very often, if at all).

Does this match your experience? Effectively, what I'm seeing is that the VitGAN needs to slide from one generation to the next in its latent space. In doing so, it seems to find it easier to create two "spots" in the image that are highly likely to contain specific concepts from each caption.

Any idea whether this is good or bad? In my experience with the "chimera" examples, it seems to hurt things.

progress_0000422000
progress_0000421900
progress_0000421400

I hope you can see what I mean: there's one position in particular that seems designated for the "head" of the animal. But it also biases the outputs for other captions as well, for instance:

tri - x 4 0 0 tx a cylinder made of coffee beans . a cylinder with the texture of coffee beans .
progress_0000418200
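
One rough way to put a number on this, rather than eyeballing progress images, might be to average a crude per-image saliency map over outputs generated from many different captions and check whether the same pixels keep lighting up. The sketch below is only an illustration under my own assumptions: the function names `positional_stickiness_map` and `stickiness_score` are hypothetical (not part of this repo), "saliency" is simply the absolute deviation from each image's mean brightness, and the images are assumed to already be a NumPy array of shape `(N, H, W)` or `(N, H, W, C)`.

```python
import numpy as np


def positional_stickiness_map(images):
    """Average a crude saliency map over N generated images.

    images: array of shape (N, H, W) or (N, H, W, C), ideally one image
    per *different* caption. Returns an (H, W) heatmap.
    NOTE: hypothetical helper, not part of the VitGAN codebase.
    """
    imgs = np.asarray(images, dtype=np.float64)
    if imgs.ndim == 4:
        imgs = imgs.mean(axis=-1)  # collapse colour channels to grayscale
    # Crude saliency (an assumption): how far each pixel is from the
    # mean brightness of its own image.
    saliency = np.abs(imgs - imgs.mean(axis=(1, 2), keepdims=True))
    # Normalise each map to sum to 1 so every image gets equal weight;
    # completely flat images are left as all zeros.
    total = saliency.sum(axis=(1, 2), keepdims=True)
    saliency = saliency / np.where(total > 0, total, 1.0)
    return saliency.mean(axis=0)


def stickiness_score(heatmap):
    """Peak of the heatmap relative to a perfectly uniform one.

    A value near 1.0 means no preferred position; larger values mean
    some location is salient across many captions.
    """
    h, w = heatmap.shape
    return float(heatmap.max() * h * w)
```

Comparing `stickiness_score` across checkpoints, or against a set of real training images, would give a baseline; the absolute value on its own probably means little, since real photos also tend to be centre-biased.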
