Commit 54a7dfd

update readme
1 parent c73f656 commit 54a7dfd

1 file changed, +3 -3 lines changed

README.md

Lines changed: 3 additions & 3 deletions
@@ -64,18 +64,18 @@ To use VILA-HD models, please refer to [VILA-HD repo](https://github.com/NVlabs/

 ## Performance

-### Performance of PS3 models
+### Comparing to other high-res encoding approaches such as AnyRes and S<sup>2</sup>

 See Table 1 in the paper for full results.

 | Vision Model | Pre-Trained Weights | Max Resolution | # High-Res Token | TextVQA | ChartQA | DocVQA | InfoVQA | OCRBench | V*Bench | RealWorldQA | Avg |
 |---------------------|-------------------------------------------------------------------------|----------------|------------------|---------|---------|--------|---------|----------|---------|-------------|------|
 | SigLIP | | 378 | 0 | 62.3 | 56.6 | 51.9 | 30.7 | 387 | 51.8 | 57.1 | 49.9 |
 | SigLIP + AnyRes | | 1512 | 3136 | 67.4 | 58.4 | 67.9 | 34.1 | 468 | 60.2 | 59.0 | 56.3 |
-| SigLIP + S2 | | 1512 | 2916 | 66.1 | 71.0 | 78.3 | 41.1 | 526 | 55.2 | 61.0 | 60.8 |
+| SigLIP + S<sup>2</sup> | | 1512 | 2916 | 66.1 | 71.0 | 78.3 | 41.1 | 526 | 55.2 | 61.0 | 60.8 |
 | **PS3-1.5K-SigLIP** | [nvidia/PS3-1.5K-SigLIP](https://huggingface.co/nvidia/PS3-1.5K-SigLIP) | 1512 | 3645 | 69.3 | 71.1 | 79.4 | 41.3 | 534 | 64.0 | 63.8 | 63.2 |
 | SigLIP + AnyRes | | 3780 | 19600 | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM |
-| SigLIP + S2 | | 3780 | 18225 | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM |
+| SigLIP + S<sup>2</sup> | | 3780 | 18225 | OOM | OOM | OOM | OOM | OOM | OOM | OOM | OOM |
 | **PS3-4K-SigLIP** | [nvidia/PS3-4K-SigLIP](https://huggingface.co/nvidia/PS3-4K-SigLIP) | 3780 | 3840 | 69.8 | 70.9 | 79.1 | 40.5 | 543 | 67.8 | 64.7 | 63.9 |


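The README table does not say how the Avg column is derived. The reported values are consistent with a plain mean over the seven benchmarks after dividing the OCRBench score (which is on a 0 to 1000 scale) by 10. The sketch below checks that reading against the non-OOM rows. The scaling rule is an assumption inferred from the numbers, not something stated in the diff, and `average_score` is a hypothetical helper name.

```python
# Scores copied from the table rows, in column order:
# TextVQA, ChartQA, DocVQA, InfoVQA, OCRBench, V*Bench, RealWorldQA.
# Each entry pairs the per-benchmark scores with the reported Avg value.
ROWS = {
    "SigLIP (378)": ([62.3, 56.6, 51.9, 30.7, 387, 51.8, 57.1], 49.9),
    "SigLIP + AnyRes (1512)": ([67.4, 58.4, 67.9, 34.1, 468, 60.2, 59.0], 56.3),
    "SigLIP + S2 (1512)": ([66.1, 71.0, 78.3, 41.1, 526, 55.2, 61.0], 60.8),
    "PS3-1.5K-SigLIP": ([69.3, 71.1, 79.4, 41.3, 534, 64.0, 63.8], 63.2),
    "PS3-4K-SigLIP": ([69.8, 70.9, 79.1, 40.5, 543, 67.8, 64.7], 63.9),
}

OCRBENCH_INDEX = 4  # position of OCRBench in each score list


def average_score(scores):
    """Mean of the seven scores with OCRBench rescaled to 0-100.

    Assumption: OCRBench is divided by 10 before averaging. This is
    inferred from the table, not documented in the README.
    """
    scaled = list(scores)
    scaled[OCRBENCH_INDEX] = scaled[OCRBENCH_INDEX] / 10
    return round(sum(scaled) / len(scaled), 1)


for name, (scores, reported) in ROWS.items():
    print(f"{name}: computed {average_score(scores)}, reported {reported}")
```

Under this assumption the computed value matches the reported Avg for every row listed. The OOM rows are left out because they have no scores to average.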