Skip to content

MB-Shihab-Aaqil-Ahamed/A-TPT

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

21 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

[ICLR'26] A-TPT: Angular Diversity Calibration Properties for Test-Time Prompt Tuning of Vision-Language Models

🚀 News

  • January 31, 2026: Evaluation codes for A-TPT are released.
  • January 26, 2026: Paper accepted at ICLR 2026! 🎉
  • October 29, 2025: Paper is available on arXiv! 📄

For more details, please feel free to check out our:

Paper arXiv Project Page

This repository provides the official PyTorch implementation of our ICLR 2026 paper:

A-TPT: Angular Diversity Calibration Properties for Test-Time Prompt Tuning of Vision-Language Models Authors: Shihab Aaqil Ahamed, Udaya S.K.P. Miriya Thanthrige, Ranga Rodrigo, Muhammad Haris Khan

Our major contributions are summarized as follows:

  • We introduce a numerical optimization method, called A-TPT, for better calibration of test-time prompt tuning for VLMs. This resolves the suboptimal performance of existing leading calibration techniques for test-time prompt tuning.
  • We introduce novel angular diversity that effectively promotes the diversity among textual features, thereby improving the calibration capabilities of VLMs when $N > |D|$ and $N < |D|$. This is accomplished by maximizing the minimum pairwise angular distance between normalized textual features.
  • We conduct extensive experiments to validate the generalizability of our approach on different datasets, including medical datasets, across various baselines. The results show that A-TPT surpasses state-of-the-art methods in calibration performance. We also provide thorough analyses, including theoretical aspects. Moreover, our approach provides superior calibration compared to the zero-shot CLIP model, which reveals improved calibration.

Angular Diversity (AD) vs. Expected Calibration Error (ECE)

Hard Prompt Tuned Prompt

Comparison with C-TPT and O-TPT

Installation

# Clone this repo
git clone https://github.com/MB-Shihab-Aaqil-Ahamed/A-TPT.git
cd A-TPT

# Create a conda enviroment
conda env create -f environment.yml
conda activate atpt

Datasets

We evaluate our method (A-TPT) on fine-grained and natural distribution shift datasets:

  • For fine-grained classification, we consider 11 datasets:

    • ImageNet
    • Flower102
    • OxfordPets
    • SUN397
    • DTD
    • Food101
    • StanfordCars
    • Aircraft
    • UCF101
    • EuroSAT
    • Caltech101
  • For natural distribution shift, we consider 4 datasets:

    • ImageNet-V2
    • ImageNet-A
    • ImageNet-R
    • ImageNet-Sketch

Prepare the datasets based on the following GitHub repository TPT.

Experiments

In each of the bash script .sh files, change the {data_root} accordingly. And, you can change the CLIP pretrained backbone by modifying the {arch} parameter to either ‘RN50’ or ‘ViT-B/16’. Also, you can change baselines by modifying the {run_type} to either ‘tpt’ or ‘tpt_ts’ or ‘tpt_atpt’.

  1. Baseline (CLIP)
bash scripts/test_baseline.sh {dataset}
  1. Test-Time Prompt Tuning (TPT)
# for Fine-grained classification
bash scripts/test_tpt_fg.sh {dataset}

# for natural distribution shift
bash scripts/test_tpt_ds.sh {dataset}

# for temperature scaling experiments, change the run_type to tpt_ts in the .sh file.
  1. Ours (A-TPT)
# for Fine-grained classification
bash scripts/test_tpt_atpt_fg.sh {dataset}

# for natural distribution shift
bash scripts/test_tpt_atpt_ds.sh {dataset}

The command line argument {dataset} can be specified as follows: for fine-grained classification datasets, ‘I’, ‘DTD’, ‘Flower102’, ‘Food101’, ‘Cars’, ‘SUN397’, ‘Aircraft’, ‘Pets’, ‘Caltech101’, ‘UCF101’, or ‘eurosat’, and for datasets with natural distribution shifts, ‘V2’, ‘A’, ‘R’, or ‘K’.

Results

Comparison on Fine-grained Datasets

CLIP ViT-B/16 (512-d)

Method Metric ImageNet DTD Flowers102 Food101 SUN397 Aircrafts OxfordPets Caltech101 UCF101 EuroSAT Stanford Cars Average
Baseline Acc. 66.70 44.30 67.30 83.60 62.50 23.90 88.00 92.90 65.00 41.30 65.30 63.70
ECE 2.12 8.50 3.00 2.39 2.53 5.11 4.37 5.50 3.59 13.89 4.25 4.43
TPT Acc. 69.00 46.70 69.00 84.70 64.50 23.40 87.10 93.80 67.30 42.40 66.30 65.00
ECE 10.60 21.20 13.50 3.98 11.30 16.80 5.77 4.51 2.54 13.20 5.16 11.60
C-TPT Acc. 68.50 46.00 69.80 83.70 64.80 24.85 88.20 93.63 65.70 43.20 65.80 64.57
ECE 3.15 11.90 5.04 3.43 5.04 4.36 1.90 4.24 2.54 13.20 1.59 5.13
O-TPT Acc. 67.33 45.68 70.07 84.13 64.23 23.64 87.95 93.95 64.16 42.84 64.53 64.41
ECE 1.96 7.88 3.87 1.46 4.93 3.68 1.90 3.80 2.34 12.98 1.78 4.23
A-TPT Acc. 67.70 45.51 69.22 83.64 66.04 23.76 88.33 93.87 66.16 44.06 65.78 64.92
(Ours) ECE 1.45 4.76 3.61 1.37 3.28 3.14 1.17 2.76 2.12 3.92 1.09 2.61

CLIP RN50 (1024-d)

Method Metric ImageNet DTD Flowers102 Food101 SUN397 Aircrafts OxfordPets Caltech101 UCF101 EuroSAT Stanford Cars Average
Baseline Acc. 58.10 40.00 61.00 74.00 58.60 15.60 83.80 85.80 58.40 23.70 55.70 55.90
ECE 2.09 9.91 3.19 3.11 3.54 6.45 5.91 4.33 3.05 15.40 4.70 5.61
TPT Acc. 60.70 41.50 62.50 74.90 61.10 17.00 84.50 87.00 59.50 28.30 58.00 57.70
ECE 11.40 25.70 13.40 5.25 9.24 16.10 3.65 5.04 12.40 22.50 3.76 11.70
C-TPT Acc. 60.20 42.20 65.20 74.70 61.00 17.00 84.10 86.90 59.70 27.80 56.50 57.75
ECE 3.01 19.80 4.14 1.86 2.93 10.70 2.77 2.07 3.83 15.10 1.94 6.19
O-TPT Acc. 58.97 41.90 65.61 74.22 60.85 16.77 83.40 86.86 58.84 28.35 56.44 57.47
ECE 3.10 16.53 2.50 1.20 3.20 8.18 3.50 2.75 2.60 14.71 1.69 5.45
A-TPT Acc. 58.44 40.90 64.89 74.10 60.46 14.58 83.48 86.57 60.24 32.14 57.08 57.53
(Ours) ECE 2.49 6.41 2.39 1.11 2.90 6.14 2.47 1.98 2.34 2.51 1.38 2.92

Comparison on Natural Distribution Shift Datasets

CLIP ViT-B/16 (512-d)

Method Metric ImageNet-A ImageNet-V2 ImageNet-R ImageNet-S Average
Baseline Acc. 47.80 60.80 74.00 46.10 57.20
ECE 8.61 3.01 3.58 4.95 5.04
TPT Acc. 52.60 63.00 76.70 47.50 59.90
ECE 16.40 11.10 4.36 16.10 12.00
C-TPT Acc. 51.60 62.70 76.00 47.90 59.60
ECE 8.16 6.23 1.54 7.35 5.82
O-TPT Acc. 49.87 61.65 72.55 47.12 57.80
ECE 7.22 3.97 1.46 6.87 4.88
A-TPT Acc. 50.39 60.90 74.87 46.09 58.06
(Ours) ECE 6.45 2.96 1.39 4.87 3.92

CLIP RN50 (1024-d)

Method Metric ImageNet-A ImageNet-V2 ImageNet-R ImageNet-S Average
Baseline Acc. 21.70 51.40 56.00 33.30 40.60
ECE 21.30 3.33 2.07 3.15 7.46
TPT Acc. 25.20 54.60 58.90 35.10 43.50
ECE 31.00 13.10 9.18 13.70 16.70
C-TPT Acc. 23.40 54.70 58.00 35.10 42.80
ECE 25.40 8.58 4.57 9.70 12.10
O-TPT Acc. 23.07 53.11 54.47 33.98 41.16
ECE 24.56 3.87 4.47 5.85 9.69
A-TPT Acc. 21.66 51.48 55.78 33.37 40.57
(Ours) ECE 21.14 3.10 3.96 3.09 7.82

Qualitative Results - Reliability Diagrams

CLIP ViT-B/16

Method Food DTD FLW Air UCF Cars SUN
C-TPT
O-TPT
A-TPT

CLIP RN50 Backbone

Method Air UCF Cars SUN
C-TPT
O-TPT
A-TPT

Acknowledgement

The computational resources for this research were supported by the Accelerating Higher Education Expansion and Development (AHEAD) Operation Grant No. 6026-LK/8743-LK from the Ministry of Higher Education, Sri Lanka, funded by the World Bank and the National Research Council of Sri Lanka Grant No. 19-080.

Also we would like thank the authors of the CoOp/CoCoOp, TPT and C-TPT for releasing their code open-source and their instructions for data preparation.

Citation

If you find our work, this repository useful in your research, please consider giving a star ⭐ and citation.

@inproceedings{ahamed2026atpt,
   title     = {A-TPT: Angular Diversity Calibration Properties for Test-Time Prompt Tuning of Vision-Language Models},
   author    = {Shihab Aaqil Ahamed and Udaya Sampath K. Perera Miriya Thanthrige and Ranga Rodrigo and Muhammad Haris Khan},
   booktitle = {The Fourteenth International Conference on Learning Representations},
   year      = {2026},
   url       = {https://openreview.net/forum?id=VhlSBZebEw}
}

Contact

If you have any questions, please feel free to reach out at shihabaaqilahamed@gmail.com

About

[ICLR'26] Official code for "A-TPT: Angular Diversity Calibration Properties for Test-Time Prompt Tuning of Vision-Language Models"

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors