[ICLR'26] A-TPT: Angular Diversity Calibration Properties for Test-Time Prompt Tuning of Vision-Language Models
- January 31, 2026: Evaluation code for A-TPT is released.
- January 26, 2026: Paper accepted at ICLR 2026! 🎉
- October 29, 2025: Paper is available on arXiv! 📄
This repository provides the official PyTorch implementation of our ICLR 2026 paper:
A-TPT: Angular Diversity Calibration Properties for Test-Time Prompt Tuning of Vision-Language Models

Authors: Shihab Aaqil Ahamed, Udaya S.K.P. Miriya Thanthrige, Ranga Rodrigo, Muhammad Haris Khan
Our major contributions are summarized as follows:
- We introduce a numerical optimization method, called A-TPT, for better calibration of test-time prompt tuning for VLMs. This resolves the suboptimal performance of existing leading calibration techniques for test-time prompt tuning.
- We introduce a novel notion of angular diversity that effectively promotes diversity among textual features, thereby improving the calibration capabilities of VLMs both when $N > |D|$ and when $N < |D|$. This is accomplished by maximizing the minimum pairwise angular distance between normalized textual features.
- We conduct extensive experiments to validate the generalizability of our approach on different datasets, including medical datasets, across various baselines. The results show that A-TPT surpasses state-of-the-art methods in calibration performance. We also provide thorough analyses, including theoretical aspects. Moreover, our approach achieves better calibration than the zero-shot CLIP model.
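
To make the quantity in the second contribution concrete, here is a minimal sketch of the minimum pairwise angular distance between normalized feature vectors. It is written in plain Python for readability and is not taken from this repository; the function names and the toy 2-d vectors are assumptions for illustration only.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def min_pairwise_angle(features):
    """Smallest angle (in radians) between any two L2-normalized vectors."""
    unit = [l2_normalize(f) for f in features]
    smallest = math.pi  # upper bound on the angle between two vectors
    for i in range(len(unit)):
        for j in range(i + 1, len(unit)):
            cos_ij = sum(a * b for a, b in zip(unit[i], unit[j]))
            cos_ij = max(-1.0, min(1.0, cos_ij))  # guard acos against rounding
            smallest = min(smallest, math.acos(cos_ij))
    return smallest

# Toy stand-ins for per-class textual features (hypothetical values).
text_features = [[1.0, 0.0], [0.0, 2.0], [1.0, 1.0]]
angle = min_pairwise_angle(text_features)
print(round(math.degrees(angle), 2))
```

A larger value means the closest pair of features is further apart on the unit sphere. One plausible way to use it as a training signal is to add its negative to the tuning loss, though the exact formulation and weighting used by A-TPT are defined in the paper, not in this sketch.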
[Figure: two panels, Hard Prompt and Tuned Prompt]

    # Clone this repo
    git clone https://github.com/MB-Shihab-Aaqil-Ahamed/A-TPT.git
    cd A-TPT
    # Create a conda environment
    conda env create -f environment.yml
    conda activate atpt

We evaluate our method (A-TPT) on fine-grained classification and natural distribution shift datasets:
- For fine-grained classification, we consider 11 datasets:
  - ImageNet
  - Flower102
  - OxfordPets
  - SUN397
  - DTD
  - Food101
  - StanfordCars
  - Aircraft
  - UCF101
  - EuroSAT
  - Caltech101
- For natural distribution shift, we consider 4 datasets:
  - ImageNet-V2
  - ImageNet-A
  - ImageNet-R
  - ImageNet-Sketch

Prepare the datasets by following the instructions in the TPT GitHub repository.
In each of the bash script (.sh) files, set {data_root} to the location of your datasets. You can change the pretrained CLIP backbone by setting the {arch} parameter to either ‘RN50’ or ‘ViT-B/16’, and you can switch between baselines by setting {run_type} to ‘tpt’, ‘tpt_ts’, or ‘tpt_atpt’.
- Baseline (CLIP)

      bash scripts/test_baseline.sh {dataset}

- Test-Time Prompt Tuning (TPT)

      # for fine-grained classification
      bash scripts/test_tpt_fg.sh {dataset}
      # for natural distribution shift
      bash scripts/test_tpt_ds.sh {dataset}
      # for temperature scaling experiments, change the run_type to tpt_ts in the .sh file.

- Ours (A-TPT)

      # for fine-grained classification
      bash scripts/test_tpt_atpt_fg.sh {dataset}
      # for natural distribution shift
      bash scripts/test_tpt_atpt_ds.sh {dataset}

The command-line argument {dataset} can be specified as follows: for fine-grained classification datasets, ‘I’, ‘DTD’, ‘Flower102’, ‘Food101’, ‘Cars’, ‘SUN397’, ‘Aircraft’, ‘Pets’, ‘Caltech101’, ‘UCF101’, or ‘eurosat’, and for datasets with natural distribution shifts, ‘V2’, ‘A’, ‘R’, or ‘K’.
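
When running many datasets in sequence, it can help to build the commands programmatically. The following is a small sketch that maps a method and a {dataset} code to the matching script listed above. The helper name `build_command` and the method keys `baseline`, `tpt`, and `tpt_atpt` are hypothetical labels chosen for this example; only the script paths and dataset codes come from this README.

```python
# Dataset codes as listed in this README.
FINE_GRAINED = ["I", "DTD", "Flower102", "Food101", "Cars", "SUN397",
                "Aircraft", "Pets", "Caltech101", "UCF101", "eurosat"]
DISTRIBUTION_SHIFT = ["V2", "A", "R", "K"]

# (method key, dataset group) -> script path; the keys are illustrative labels.
SCRIPTS = {
    ("baseline", "fg"): "scripts/test_baseline.sh",
    ("baseline", "ds"): "scripts/test_baseline.sh",
    ("tpt", "fg"): "scripts/test_tpt_fg.sh",
    ("tpt", "ds"): "scripts/test_tpt_ds.sh",
    ("tpt_atpt", "fg"): "scripts/test_tpt_atpt_fg.sh",
    ("tpt_atpt", "ds"): "scripts/test_tpt_atpt_ds.sh",
}

def build_command(method, dataset):
    """Return the argv list for one evaluation run, validating the dataset code."""
    if dataset in FINE_GRAINED:
        group = "fg"
    elif dataset in DISTRIBUTION_SHIFT:
        group = "ds"
    else:
        raise ValueError(f"unknown dataset code: {dataset!r}")
    if (method, group) not in SCRIPTS:
        raise ValueError(f"unknown method: {method!r}")
    return ["bash", SCRIPTS[(method, group)], dataset]

# Print the A-TPT commands for every dataset without executing them.
for code in FINE_GRAINED + DISTRIBUTION_SHIFT:
    print(" ".join(build_command("tpt_atpt", code)))
```

To actually launch a run, each list can be passed to `subprocess.run(cmd, check=True)` from the repository root, after {data_root} has been set in the scripts.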
CLIP ViT-B/16 (512-d)
| Method | Metric | ImageNet | DTD | Flowers102 | Food101 | SUN397 | Aircrafts | OxfordPets | Caltech101 | UCF101 | EuroSAT | Stanford Cars | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline | Acc. | 66.70 | 44.30 | 67.30 | 83.60 | 62.50 | 23.90 | 88.00 | 92.90 | 65.00 | 41.30 | 65.30 | 63.70 |
| | ECE | 2.12 | 8.50 | 3.00 | 2.39 | 2.53 | 5.11 | 4.37 | 5.50 | 3.59 | 13.89 | 4.25 | 4.43 |
| TPT | Acc. | 69.00 | 46.70 | 69.00 | 84.70 | 64.50 | 23.40 | 87.10 | 93.80 | 67.30 | 42.40 | 66.30 | 65.00 |
| | ECE | 10.60 | 21.20 | 13.50 | 3.98 | 11.30 | 16.80 | 5.77 | 4.51 | 2.54 | 13.20 | 5.16 | 11.60 |
| C-TPT | Acc. | 68.50 | 46.00 | 69.80 | 83.70 | 64.80 | 24.85 | 88.20 | 93.63 | 65.70 | 43.20 | 65.80 | 64.57 |
| | ECE | 3.15 | 11.90 | 5.04 | 3.43 | 5.04 | 4.36 | 1.90 | 4.24 | 2.54 | 13.20 | 1.59 | 5.13 |
| O-TPT | Acc. | 67.33 | 45.68 | 70.07 | 84.13 | 64.23 | 23.64 | 87.95 | 93.95 | 64.16 | 42.84 | 64.53 | 64.41 |
| | ECE | 1.96 | 7.88 | 3.87 | 1.46 | 4.93 | 3.68 | 1.90 | 3.80 | 2.34 | 12.98 | 1.78 | 4.23 |
| A-TPT (Ours) | Acc. | 67.70 | 45.51 | 69.22 | 83.64 | 66.04 | 23.76 | 88.33 | 93.87 | 66.16 | 44.06 | 65.78 | 64.92 |
| | ECE | 1.45 | 4.76 | 3.61 | 1.37 | 3.28 | 3.14 | 1.17 | 2.76 | 2.12 | 3.92 | 1.09 | 2.61 |

CLIP RN50 (1024-d)
| Method | Metric | ImageNet | DTD | Flowers102 | Food101 | SUN397 | Aircrafts | OxfordPets | Caltech101 | UCF101 | EuroSAT | Stanford Cars | Average |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| Baseline | Acc. | 58.10 | 40.00 | 61.00 | 74.00 | 58.60 | 15.60 | 83.80 | 85.80 | 58.40 | 23.70 | 55.70 | 55.90 |
| | ECE | 2.09 | 9.91 | 3.19 | 3.11 | 3.54 | 6.45 | 5.91 | 4.33 | 3.05 | 15.40 | 4.70 | 5.61 |
| TPT | Acc. | 60.70 | 41.50 | 62.50 | 74.90 | 61.10 | 17.00 | 84.50 | 87.00 | 59.50 | 28.30 | 58.00 | 57.70 |
| | ECE | 11.40 | 25.70 | 13.40 | 5.25 | 9.24 | 16.10 | 3.65 | 5.04 | 12.40 | 22.50 | 3.76 | 11.70 |
| C-TPT | Acc. | 60.20 | 42.20 | 65.20 | 74.70 | 61.00 | 17.00 | 84.10 | 86.90 | 59.70 | 27.80 | 56.50 | 57.75 |
| | ECE | 3.01 | 19.80 | 4.14 | 1.86 | 2.93 | 10.70 | 2.77 | 2.07 | 3.83 | 15.10 | 1.94 | 6.19 |
| O-TPT | Acc. | 58.97 | 41.90 | 65.61 | 74.22 | 60.85 | 16.77 | 83.40 | 86.86 | 58.84 | 28.35 | 56.44 | 57.47 |
| | ECE | 3.10 | 16.53 | 2.50 | 1.20 | 3.20 | 8.18 | 3.50 | 2.75 | 2.60 | 14.71 | 1.69 | 5.45 |
| A-TPT (Ours) | Acc. | 58.44 | 40.90 | 64.89 | 74.10 | 60.46 | 14.58 | 83.48 | 86.57 | 60.24 | 32.14 | 57.08 | 57.53 |
| | ECE | 2.49 | 6.41 | 2.39 | 1.11 | 2.90 | 6.14 | 2.47 | 1.98 | 2.34 | 2.51 | 1.38 | 2.92 |

CLIP ViT-B/16 (512-d)
| Method | Metric | ImageNet-A | ImageNet-V2 | ImageNet-R | ImageNet-S | Average |
|---|---|---|---|---|---|---|
| Baseline | Acc. | 47.80 | 60.80 | 74.00 | 46.10 | 57.20 |
| | ECE | 8.61 | 3.01 | 3.58 | 4.95 | 5.04 |
| TPT | Acc. | 52.60 | 63.00 | 76.70 | 47.50 | 59.90 |
| | ECE | 16.40 | 11.10 | 4.36 | 16.10 | 12.00 |
| C-TPT | Acc. | 51.60 | 62.70 | 76.00 | 47.90 | 59.60 |
| | ECE | 8.16 | 6.23 | 1.54 | 7.35 | 5.82 |
| O-TPT | Acc. | 49.87 | 61.65 | 72.55 | 47.12 | 57.80 |
| | ECE | 7.22 | 3.97 | 1.46 | 6.87 | 4.88 |
| A-TPT (Ours) | Acc. | 50.39 | 60.90 | 74.87 | 46.09 | 58.06 |
| | ECE | 6.45 | 2.96 | 1.39 | 4.87 | 3.92 |

CLIP RN50 (1024-d)
| Method | Metric | ImageNet-A | ImageNet-V2 | ImageNet-R | ImageNet-S | Average |
|---|---|---|---|---|---|---|
| Baseline | Acc. | 21.70 | 51.40 | 56.00 | 33.30 | 40.60 |
| | ECE | 21.30 | 3.33 | 2.07 | 3.15 | 7.46 |
| TPT | Acc. | 25.20 | 54.60 | 58.90 | 35.10 | 43.50 |
| | ECE | 31.00 | 13.10 | 9.18 | 13.70 | 16.70 |
| C-TPT | Acc. | 23.40 | 54.70 | 58.00 | 35.10 | 42.80 |
| | ECE | 25.40 | 8.58 | 4.57 | 9.70 | 12.10 |
| O-TPT | Acc. | 23.07 | 53.11 | 54.47 | 33.98 | 41.16 |
| | ECE | 24.56 | 3.87 | 4.47 | 5.85 | 9.69 |
| A-TPT (Ours) | Acc. | 21.66 | 51.48 | 55.78 | 33.37 | 40.57 |
| | ECE | 21.14 | 3.10 | 3.96 | 3.09 | 7.82 |
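
The tables above report accuracy (Acc.) and expected calibration error (ECE). For readers unfamiliar with the metric, the following is a minimal sketch of the standard binned ECE, assuming equal-width confidence bins. The function name, the bin count, and the toy inputs are assumptions for illustration and are not taken from this repository's evaluation code.

```python
def expected_calibration_error(confidences, correct, n_bins=15):
    """Binned ECE: sample-weighted mean of |accuracy - confidence| per bin.

    confidences: max softmax probability per sample, in [0, 1].
    correct:     1 if the prediction was right, else 0.
    """
    n = len(confidences)
    conf_sum = [0.0] * n_bins
    hit_sum = [0.0] * n_bins
    count = [0] * n_bins
    for conf, hit in zip(confidences, correct):
        # Bin k covers [k/n_bins, (k+1)/n_bins); conf == 1.0 goes in the last bin.
        k = min(int(conf * n_bins), n_bins - 1)
        conf_sum[k] += conf
        hit_sum[k] += hit
        count[k] += 1
    ece = 0.0
    for k in range(n_bins):
        if count[k] == 0:
            continue  # empty bins contribute nothing
        avg_conf = conf_sum[k] / count[k]
        avg_acc = hit_sum[k] / count[k]
        ece += (count[k] / n) * abs(avg_acc - avg_conf)
    return ece

# Toy example: two confident correct predictions, two less confident mixed ones.
ece = expected_calibration_error([0.95, 0.95, 0.65, 0.65], [1, 1, 1, 0], n_bins=10)
print(f"ECE = {100 * ece:.2f}%")
```

Lower ECE means the predicted confidences track the observed accuracy more closely, which is why a method can improve ECE substantially while accuracy stays roughly unchanged.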

[Figure: per-dataset plots for C-TPT, O-TPT, and A-TPT on Food, DTD, FLW, Air, UCF, Cars, and SUN]

[Figure: per-dataset plots for C-TPT, O-TPT, and A-TPT on Air, UCF, Cars, and SUN]

The computational resources for this research were supported by the Accelerating Higher Education Expansion and Development (AHEAD) Operation Grant No. 6026-LK/8743-LK from the Ministry of Higher Education, Sri Lanka, funded by the World Bank and the National Research Council of Sri Lanka Grant No. 19-080.
We would also like to thank the authors of CoOp/CoCoOp, TPT, and C-TPT for open-sourcing their code and for their data preparation instructions.
If you find our work or this repository useful in your research, please consider giving it a star ⭐ and citing it.

    @inproceedings{ahamed2026atpt,
      title = {A-TPT: Angular Diversity Calibration Properties for Test-Time Prompt Tuning of Vision-Language Models},
      author = {Shihab Aaqil Ahamed and Udaya Sampath K. Perera Miriya Thanthrige and Ranga Rodrigo and Muhammad Haris Khan},
      booktitle = {The Fourteenth International Conference on Learning Representations},
      year = {2026},
      url = {https://openreview.net/forum?id=VhlSBZebEw}
    }

If you have any questions, please feel free to reach out at shihabaaqilahamed@gmail.com



































