And a big thanks to all GitHub sponsors who helped with some of my costs before...

## What's New

### Dec 5, 2022

* Pre-release (`0.8.0dev0`) of multi-weight support (`model_arch.pretrained_tag`)
  * vision_transformer, maxvit, and convnext are the first three model implementations with support
  * model names are changing with this (previous `_21k`, etc. names will merge); still sorting out deprecation handling
  * bugs are likely, but I need feedback, so please try it out
  * if stability is needed, please use the 0.6.x pypi releases or clone from the [0.6.x branch](https://github.com/rwightman/pytorch-image-models/tree/0.6.x)
* Support for PyTorch 2.0 compile added to the train/validate/inference/benchmark scripts, use the `--torchcompile` argument
* Inference script allows more control over output: select top-k for class index + probability output as JSON, CSV, or parquet
* Add a full set of fine-tuned CLIP image tower weights from both LAION-2B and the original OpenAI CLIP models
34+
| model | top1 | param_count | gmac | macts | hub |
|:---|---:|---:|---:|---:|:---|
| vit_huge_patch14_clip_336.laion2b_ft_in12k_in1k | 88.6 | 632.5 | 391 | 407.5 | [link](https://huggingface.co/timm/vit_huge_patch14_clip_336.laion2b_ft_in12k_in1k) |
| vit_large_patch14_clip_336.openai_ft_in12k_in1k | 88.3 | 304.5 | 191.1 | 270.2 | [link](https://huggingface.co/timm/vit_large_patch14_clip_336.openai_ft_in12k_in1k) |
| vit_huge_patch14_clip_224.laion2b_ft_in12k_in1k | 88.2 | 632 | 167.4 | 139.4 | [link](https://huggingface.co/timm/vit_huge_patch14_clip_224.laion2b_ft_in12k_in1k) |
| vit_large_patch14_clip_336.laion2b_ft_in12k_in1k | 88.2 | 304.5 | 191.1 | 270.2 | [link](https://huggingface.co/timm/vit_large_patch14_clip_336.laion2b_ft_in12k_in1k) |
| vit_large_patch14_clip_224.openai_ft_in12k_in1k | 88.2 | 304.2 | 81.1 | 88.8 | [link](https://huggingface.co/timm/vit_large_patch14_clip_224.openai_ft_in12k_in1k) |
| vit_large_patch14_clip_224.laion2b_ft_in12k_in1k | 87.9 | 304.2 | 81.1 | 88.8 | [link](https://huggingface.co/timm/vit_large_patch14_clip_224.laion2b_ft_in12k_in1k) |
| vit_large_patch14_clip_224.openai_ft_in1k | 87.9 | 304.2 | 81.1 | 88.8 | [link](https://huggingface.co/timm/vit_large_patch14_clip_224.openai_ft_in1k) |
| vit_large_patch14_clip_336.laion2b_ft_in1k | 87.9 | 304.5 | 191.1 | 270.2 | [link](https://huggingface.co/timm/vit_large_patch14_clip_336.laion2b_ft_in1k) |
| vit_huge_patch14_clip_224.laion2b_ft_in1k | 87.6 | 632 | 167.4 | 139.4 | [link](https://huggingface.co/timm/vit_huge_patch14_clip_224.laion2b_ft_in1k) |
| vit_large_patch14_clip_224.laion2b_ft_in1k | 87.3 | 304.2 | 81.1 | 88.8 | [link](https://huggingface.co/timm/vit_large_patch14_clip_224.laion2b_ft_in1k) |
| vit_base_patch16_clip_384.laion2b_ft_in12k_in1k | 87.2 | 86.9 | 55.5 | 101.6 | [link](https://huggingface.co/timm/vit_base_patch16_clip_384.laion2b_ft_in12k_in1k) |
| vit_base_patch16_clip_384.openai_ft_in12k_in1k | 87 | 86.9 | 55.5 | 101.6 | [link](https://huggingface.co/timm/vit_base_patch16_clip_384.openai_ft_in12k_in1k) |
| vit_base_patch16_clip_384.laion2b_ft_in1k | 86.6 | 86.9 | 55.5 | 101.6 | [link](https://huggingface.co/timm/vit_base_patch16_clip_384.laion2b_ft_in1k) |
| vit_base_patch16_clip_384.openai_ft_in1k | 86.2 | 86.9 | 55.5 | 101.6 | [link](https://huggingface.co/timm/vit_base_patch16_clip_384.openai_ft_in1k) |
| vit_base_patch16_clip_224.laion2b_ft_in12k_in1k | 86.2 | 86.6 | 17.6 | 23.9 | [link](https://huggingface.co/timm/vit_base_patch16_clip_224.laion2b_ft_in12k_in1k) |
| vit_base_patch16_clip_224.openai_ft_in12k_in1k | 85.9 | 86.6 | 17.6 | 23.9 | [link](https://huggingface.co/timm/vit_base_patch16_clip_224.openai_ft_in12k_in1k) |
| vit_base_patch32_clip_448.laion2b_ft_in12k_in1k | 85.8 | 88.3 | 17.9 | 23.9 | [link](https://huggingface.co/timm/vit_base_patch32_clip_448.laion2b_ft_in12k_in1k) |
| vit_base_patch16_clip_224.laion2b_ft_in1k | 85.5 | 86.6 | 17.6 | 23.9 | [link](https://huggingface.co/timm/vit_base_patch16_clip_224.laion2b_ft_in1k) |
| vit_base_patch32_clip_384.laion2b_ft_in12k_in1k | 85.4 | 88.3 | 13.1 | 16.5 | [link](https://huggingface.co/timm/vit_base_patch32_clip_384.laion2b_ft_in12k_in1k) |
| vit_base_patch16_clip_224.openai_ft_in1k | 85.3 | 86.6 | 17.6 | 23.9 | [link](https://huggingface.co/timm/vit_base_patch16_clip_224.openai_ft_in1k) |
| vit_base_patch32_clip_384.openai_ft_in12k_in1k | 85.2 | 88.3 | 13.1 | 16.5 | [link](https://huggingface.co/timm/vit_base_patch32_clip_384.openai_ft_in12k_in1k) |
| vit_base_patch32_clip_224.laion2b_ft_in12k_in1k | 83.3 | 88.2 | 4.4 | 5 | [link](https://huggingface.co/timm/vit_base_patch32_clip_224.laion2b_ft_in12k_in1k) |
| vit_base_patch32_clip_224.laion2b_ft_in1k | 82.6 | 88.2 | 4.4 | 5 | [link](https://huggingface.co/timm/vit_base_patch32_clip_224.laion2b_ft_in1k) |
| vit_base_patch32_clip_224.openai_ft_in1k | 81.9 | 88.2 | 4.4 | 5 | [link](https://huggingface.co/timm/vit_base_patch32_clip_224.openai_ft_in1k) |
61+
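Under the new multi-weight scheme, a full model name combines an architecture with a pretrained tag as `model_arch.pretrained_tag`. A minimal sketch of how such a name decomposes (the model name is taken from the table above; the `timm.create_model` call is left commented out since it downloads weights):

```python
# A full model name under the multi-weight scheme is
# '<model_arch>.<pretrained_tag>'; splitting on the first '.' recovers both.
name = 'vit_base_patch16_clip_224.laion2b_ft_in1k'
model_arch, _, pretrained_tag = name.partition('.')
print(model_arch)      # vit_base_patch16_clip_224
print(pretrained_tag)  # laion2b_ft_in1k

# The full name is passed to timm as usual (downloads weights, so not run here):
# import timm
# model = timm.create_model(name, pretrained=True)
```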
* Port of MaxViT Tensorflow weights from the official impl at https://github.com/google-research/maxvit
  * There were larger than expected accuracy drops for the upscaled 384/512 in21k fine-tune weights, so a detail may be missing, but the 21k FT did seem sensitive to small preprocessing differences
64+
| model | top1 | param_count | gmac | macts | hub |
|:---|---:|---:|---:|---:|:---|
| maxvit_xlarge_tf_512.in21k_ft_in1k | 88.5 | 475.8 | 534.1 | 1413.2 | [link](https://huggingface.co/timm/maxvit_xlarge_tf_512.in21k_ft_in1k) |
| maxvit_xlarge_tf_384.in21k_ft_in1k | 88.3 | 475.3 | 292.8 | 668.8 | [link](https://huggingface.co/timm/maxvit_xlarge_tf_384.in21k_ft_in1k) |
| maxvit_base_tf_512.in21k_ft_in1k | 88.2 | 119.9 | 138 | 704 | [link](https://huggingface.co/timm/maxvit_base_tf_512.in21k_ft_in1k) |
| maxvit_large_tf_512.in21k_ft_in1k | 88 | 212.3 | 244.8 | 942.2 | [link](https://huggingface.co/timm/maxvit_large_tf_512.in21k_ft_in1k) |
| maxvit_large_tf_384.in21k_ft_in1k | 88 | 212 | 132.6 | 445.8 | [link](https://huggingface.co/timm/maxvit_large_tf_384.in21k_ft_in1k) |
| maxvit_base_tf_384.in21k_ft_in1k | 87.9 | 119.6 | 73.8 | 332.9 | [link](https://huggingface.co/timm/maxvit_base_tf_384.in21k_ft_in1k) |
| maxvit_base_tf_512.in1k | 86.6 | 119.9 | 138 | 704 | [link](https://huggingface.co/timm/maxvit_base_tf_512.in1k) |
| maxvit_large_tf_512.in1k | 86.5 | 212.3 | 244.8 | 942.2 | [link](https://huggingface.co/timm/maxvit_large_tf_512.in1k) |
| maxvit_base_tf_384.in1k | 86.3 | 119.6 | 73.8 | 332.9 | [link](https://huggingface.co/timm/maxvit_base_tf_384.in1k) |
| maxvit_large_tf_384.in1k | 86.2 | 212 | 132.6 | 445.8 | [link](https://huggingface.co/timm/maxvit_large_tf_384.in1k) |
| maxvit_small_tf_512.in1k | 86.1 | 69.1 | 67.3 | 383.8 | [link](https://huggingface.co/timm/maxvit_small_tf_512.in1k) |
| maxvit_tiny_tf_512.in1k | 85.7 | 31 | 33.5 | 257.6 | [link](https://huggingface.co/timm/maxvit_tiny_tf_512.in1k) |
| maxvit_small_tf_384.in1k | 85.5 | 69 | 35.9 | 183.6 | [link](https://huggingface.co/timm/maxvit_small_tf_384.in1k) |
| maxvit_tiny_tf_384.in1k | 85.1 | 31 | 17.5 | 123.4 | [link](https://huggingface.co/timm/maxvit_tiny_tf_384.in1k) |
| maxvit_large_tf_224.in1k | 84.9 | 211.8 | 43.7 | 127.4 | [link](https://huggingface.co/timm/maxvit_large_tf_224.in1k) |
| maxvit_base_tf_224.in1k | 84.9 | 119.5 | 24 | 95 | [link](https://huggingface.co/timm/maxvit_base_tf_224.in1k) |
| maxvit_small_tf_224.in1k | 84.4 | 68.9 | 11.7 | 53.2 | [link](https://huggingface.co/timm/maxvit_small_tf_224.in1k) |
| maxvit_tiny_tf_224.in1k | 83.4 | 30.9 | 5.6 | 35.8 | [link](https://huggingface.co/timm/maxvit_tiny_tf_224.in1k) |
85+
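The inference-script change above (select top-k for class index + probability output) boils down to keeping the k highest-probability class indices per sample. A minimal stdlib sketch, with an invented probability vector standing in for real model output:

```python
# Hedged sketch of top-k selection over per-class probabilities, as the
# updated inference script's top-k output does. Probabilities are made up
# for illustration; a real run would use softmaxed model logits.
probs = [0.05, 0.70, 0.05, 0.15, 0.05]
k = 2
topk = sorted(enumerate(probs), key=lambda ip: ip[1], reverse=True)[:k]
print(topk)  # [(1, 0.7), (3, 0.15)] -> (class index, probability) pairs
```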
### Oct 15, 2022
* Train and validation script enhancements
* Non-GPU (i.e. CPU) device support