Positional Stickiness #8

I've noticed during training that the VitGAN tends to get stuck on one, two, or three "positional blobs", for lack of a better term. I rarely, if ever, see four.

Does this match your experience? Effectively, what I'm seeing is that the VitGAN needs to slide from one generation to the next in its latent space. In doing so, it seems to find it easier to just create two "spots" in the image that are highly likely to contain specific concepts from each caption.

Any idea whether this is bad or good? In my experience with the "chimera" examples, it seems to hurt things.

![progress_0000422000](https://user-images.githubusercontent.com/3994972/127195754-29a5ef99-e270-488d-a5ed-53a49478ad40.png)
![progress_0000421900](https://user-images.githubusercontent.com/3994972/127195766-53960862-7d50-4d3b-828e-b620dbd1bea3.png)
![progress_0000421400](https://user-images.githubusercontent.com/3994972/127195789-4a8ab821-6e6c-4c5e-94fc-d44429d097d0.png)

I hope you can see what I mean: one position in particular seems designated for the "head" of the animal. It also biases the outputs for other captions. For instance:

`tri - x 4 0 0 tx a cylinder made of coffee beans . a cylinder with the texture of coffee beans .`
![progress_0000418200](https://user-images.githubusercontent.com/3994972/127196036-14e8ea0b-40de-4fe6-b1b6-632b8196c01f.png)
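
One rough way to put a number on this, rather than eyeballing progress images, is to check how much spatial structure is shared across outputs for unrelated captions. The sketch below is only an illustration and is not taken from the VitGAN code: the function name `positional_stickiness` and the score itself are assumptions, and it expects the generated images to already be available as a NumPy array. It averages the (brightness-centered) images over the batch; if the generator keeps reusing the same "spots", that average keeps visible structure and the score moves toward 1, while unrelated images push it toward 1/N.

```python
import numpy as np


def positional_stickiness(images):
    """Hypothetical diagnostic, not part of any library.

    images: float array of shape (N, H, W) or (N, H, W, C), holding the
    outputs for N different captions.
    Returns (mean_map, score), where mean_map is the (H, W) structure shared
    across the batch and score lies in [0, 1].
    """
    imgs = np.asarray(images, dtype=np.float64)
    if imgs.ndim == 4:
        imgs = imgs.mean(axis=-1)  # collapse colour channels to one intensity map

    # Remove each image's own mean brightness so only spatial layout is compared.
    imgs = imgs - imgs.mean(axis=(1, 2), keepdims=True)

    # Structure that survives averaging over captions is position-bound.
    mean_map = imgs.mean(axis=0)

    shared_energy = np.mean(mean_map ** 2)
    total_energy = np.mean(imgs ** 2)
    score = shared_energy / total_energy if total_energy > 0 else 0.0
    return mean_map, float(score)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    batch = rng.normal(size=(64, 32, 32))   # stand-in for generated images
    batch[:, 4:12, 20:28] += 5.0            # a blob fixed at one position
    mean_map, score = positional_stickiness(batch)
    peak = np.unravel_index(np.argmax(mean_map), mean_map.shape)
    print("stickiness score:", score, "peak position:", peak)
```

Plotting `mean_map` as a heatmap should show where the blobs sit, and tracking the score over training steps would show whether the stickiness grows or fades. Using a set of captions that have no reason to share a layout matters here; otherwise a high score may just reflect similar prompts.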
