Commit 545ab02

committed
No centering
1 parent 3c3df5d commit 545ab02

1 file changed

+2
-8
lines changed

index.md

Lines changed: 2 additions & 8 deletions
@@ -82,9 +82,7 @@ Three specific issues arise from the ViT sparse sampling problem;
 These issues hinder efficient optimization of SFS under standard tokenization - in other words, we posit that **grids cannot align every salient region**.
 
 ![Issues with Grid Tokenization](/figures/nocover.png)
-<div align="center">
 *Figure 1: A $5 \times 5$ patch grid (gray) with three optimal region placements for sparse feature selection. **(a)** The green patch is well aligned (A), yellow straddles two cells (B), and red lies on a corner (C) and leaks into four cells. Translating the grid only swaps which peak is misaligned---one patch is always bad. **(b)** Our subpixel tokenizer drops fixed-size windows (\textcolor{ok}{green} squares) directly on each peak, eliminating the alignment trade-off while still allowing conventional grid tokens when they \emph{are} well aligned.*
-</div>
 
 
 ## Methodology: SPoT in a Nutshell
@@ -115,19 +113,15 @@ This means that models can be evaluated with the exact same features as a standa
 By removing the strict adherence to grids in ViTs, we can leverage more continuous spatial priors for token placements for optimal feature extraction.
 We compare several spatial priors, each encoding different assumptions about feature importance and spatial distribution.
 
-
-![Spatial Priors](/figures/spatialprior.png)
-<div align="center">
-*Figure 2: An illustration of different spatial priors investigated with SPoT.*
-</div>
-
 - *Uniform*: randomly samples locations with no spatial bias, assuming all regions are equally important.
 - *Gaussian*: randomly samples locations with a central bias, which encodes a prior belief that subjects are typically centered in images.
 - *Sobol*: provides quasirandom sampling aimed at uniform coverage while reducing overlap.
 - *Isotropic*: deterministically distributes tokens evenly in a subpixel grid, emphasizing coverage.
 - *Center*: deterministically distributes tokens evenly with slight central-bias.
 - *Salient*: encodes object-centric bias by placing tokens based on regions identified as visually salient from a pretrained saliency model.
 
+![Spatial Priors](/figures/spatialprior.png)
+*Figure 2: An illustration of different spatial priors investigated with SPoT.*
 
 ### Exploring Oracle Neighbourhoods with SPoT-ON
 In addition to investigating spatial different spatial priors, we also look to directly explore differentiable optimization for token placement.
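
For readers of this diff, the spatial priors listed in the second hunk can be illustrated with a minimal sketch. This is not the SPoT implementation: the function names, the normalized `[0, 1]` coordinate convention, the `sigma` spread, and the `gamma` warp used for the *Center* prior are all assumptions made for illustration. *Sobol* and *Salient* are left out because they need a quasirandom generator and a pretrained saliency model, respectively.

```python
import math
import random


def uniform_prior(n, rng):
    """Uniform: n random (x, y) locations in [0, 1)^2 with no spatial bias."""
    return [(rng.random(), rng.random()) for _ in range(n)]


def gaussian_prior(n, rng, sigma=0.2):
    """Gaussian: random locations biased toward the image centre (0.5, 0.5).

    sigma is a hypothetical spread; samples are clamped to stay inside the image.
    """
    def clamp(v):
        return min(max(v, 0.0), 1.0)

    return [(clamp(rng.gauss(0.5, sigma)), clamp(rng.gauss(0.5, sigma)))
            for _ in range(n)]


def isotropic_prior(side):
    """Isotropic: deterministic, evenly spaced centres of a side x side grid."""
    step = 1.0 / side
    return [((i + 0.5) * step, (j + 0.5) * step)
            for j in range(side) for i in range(side)]


def center_prior(side, gamma=1.5):
    """Center: the even grid, warped toward the centre (assumed warp, gamma >= 1)."""
    def warp(v):
        d = 2.0 * (v - 0.5)  # signed offset from the centre, in [-1, 1]
        return 0.5 + 0.5 * math.copysign(abs(d) ** gamma, d)

    return [(warp(x), warp(y)) for x, y in isotropic_prior(side)]


rng = random.Random(0)  # fixed seed so the random priors are reproducible
placements = {
    "uniform": uniform_prior(25, rng),
    "gaussian": gaussian_prior(25, rng),
    "isotropic": isotropic_prior(5),
    "center": center_prior(5),
}
```

Each prior returns 25 token centres, matching the token count of the $5 \times 5$ grid in Figure 1. The random priors differ only in the distribution they draw from, and the deterministic ones only in how the even grid is warped.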
