PlantAIM: A New Baseline Model Integrating Global Attention and Local Features for Enhanced Plant Disease Identification
This paper has been accepted in Smart Agricultural Technology (2025) [Link]
Plant diseases significantly affect the quality and yield of agricultural production. Detection has conventionally relied on plant pathologists, but recent advances in deep learning, particularly the Vision Transformer (ViT) and the Convolutional Neural Network (CNN), have made automated plant disease identification feasible. Despite their prominence, significant gaps remain in our understanding of how these models differ in feature extraction and representation, particularly in complex multi-crop disease identification tasks. The challenge arises from the simultaneous need to learn crop-specific and disease-specific features to accurately identify crop species and their associated diseases. To address this, we introduce the Plant Disease Global-Local Features Fusion Attention Model (PlantAIM), a new hybrid framework that fuses the global attention mechanisms of ViT with the local feature extraction capabilities of CNN. PlantAIM aims to improve the model's ability to simultaneously learn and focus on crop-specific and disease-specific features. We conduct extensive evaluations to assess the robustness and generalizability of PlantAIM against state-of-the-art (SOTA) models, including scenarios with limited training samples and real-world environmental data. Our results show that PlantAIM achieves superior performance. This research not only deepens our understanding of feature learning in ViT and CNN models but also sets a new benchmark in the dynamic field of plant disease identification. The code will be made available upon publication.
- We introduce the novel Plant Disease Global-Local Features Fusion Attention model (PlantAIM), which combines ViT and CNN components to enhance feature extraction for multi-crop plant disease identification.
- Our experimental results demonstrate PlantAIM's exceptional robustness and generalization, achieving state-of-the-art performance in both controlled environments and real-world scenarios.
- Our feature visualization analysis reveals that CNNs emphasize plant patterns, while ViTs focus on disease symptoms.
Plant Disease Global-Local Features Fusion Attention model (PlantAIM) [code]
- Key feature: combines ViT and CNN components to enhance feature extraction for multi-crop plant disease identification.
Proposed PlantAIM architecture.
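The global-local fusion idea can be sketched in PyTorch as follows. This is a minimal illustration under stated assumptions, not the published architecture: the branch designs, dimensions, and the 38-class head (the PlantVillage class count) are placeholders, and the real model loads pretrained ViT and CNN backbones rather than the tiny stand-in branches used here.

```python
import torch
import torch.nn as nn

class GlobalLocalFusion(nn.Module):
    """Hypothetical sketch: concatenate a ViT-style global token summary
    with CNN-style pooled local features before classification."""
    def __init__(self, num_classes=38, embed_dim=768, cnn_dim=512):
        super().__init__()
        # stand-in CNN branch (a real setup would use a pretrained backbone)
        self.cnn = nn.Sequential(
            nn.Conv2d(3, cnn_dim, kernel_size=7, stride=4, padding=3),
            nn.ReLU(inplace=True),
            nn.AdaptiveAvgPool2d(1),
        )
        # stand-in ViT branch: 16x16 patch embedding + one self-attention block
        self.patch = nn.Conv2d(3, embed_dim, kernel_size=16, stride=16)
        self.attn = nn.MultiheadAttention(embed_dim, num_heads=8, batch_first=True)
        self.cls = nn.Parameter(torch.zeros(1, 1, embed_dim))
        self.head = nn.Linear(embed_dim + cnn_dim, num_classes)

    def forward(self, x):
        local = self.cnn(x).flatten(1)                     # (B, cnn_dim) local features
        tokens = self.patch(x).flatten(2).transpose(1, 2)  # (B, N, embed_dim) patch tokens
        tokens = torch.cat([self.cls.expand(x.size(0), -1, -1), tokens], dim=1)
        tokens, _ = self.attn(tokens, tokens, tokens)
        global_feat = tokens[:, 0]                         # class-token global summary
        return self.head(torch.cat([global_feat, local], dim=1))
```

For a 224x224 input, `GlobalLocalFusion()(torch.randn(2, 3, 224, 224))` yields logits of shape `(2, 38)`.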
- PV Dataset: spMohanty Github (you can group all images into a single folder to directly use the csv file provided in this repo)
- PlantDoc dataset: Kaggle
- The IPM and Bing datasets will be released soon
- Download the ViT pretrained weights: link (from the rwightman timm GitHub repo)
PlantAIM (2H) >> PyTorch implementation code
PlantAIM (1H) >> PyTorch implementation code
Notes
- The csv file (image metadata) is here
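As a hedged sketch, the metadata csv could be wired into a PyTorch `Dataset` along these lines. The column names `image_path` and `label` are hypothetical, not confirmed by this repo's csv: check the actual header and adjust before use.

```python
import os
import pandas as pd
from PIL import Image
from torch.utils.data import Dataset

class PlantDiseaseCSV(Dataset):
    """Minimal csv-driven dataset; the `image_path` and `label`
    column names are assumptions, not the repo's actual schema."""
    def __init__(self, csv_file, image_root, transform=None):
        self.df = pd.read_csv(csv_file)
        self.image_root = image_root
        self.transform = transform

    def __len__(self):
        return len(self.df)

    def __getitem__(self, idx):
        row = self.df.iloc[idx]
        # resolve the image path relative to the grouped image folder
        image = Image.open(os.path.join(self.image_root, row["image_path"])).convert("RGB")
        if self.transform is not None:
            image = self.transform(image)
        return image, int(row["label"])
```

An instance can then be passed straight to a `torch.utils.data.DataLoader` for training.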
Python 3.12.9

```shell
python -m venv py
cd .\py\Scripts
activate
pip install torch==2.6.0 torchvision==0.21.0 torchaudio==2.6.0 --index-url https://download.pytorch.org/whl/cu126
pip install -r requirements_plantaim.txt
```
Creative Commons Attribution-Noncommercial-NoDerivative Works 4.0 International License (“the CC BY-NC-ND License”)
- Pairwise Feature Learning for Unseen Plant Disease Recognition [Paper]
  The first implementation of the FF-ViT model with a moving weighted sum. The current work improves and evaluates the FF-ViT model on a larger-scale dataset.
- Unveiling Robust Feature Spaces: Image vs. Embedding-Oriented Approaches for Plant Disease Identification [Paper]
  An analysis of image versus embedding feature spaces for plant disease identification.
- PlantAIM: A New Baseline Model Integrating Global Attention and Local Features for Enhanced Plant Disease Identification [Paper] [Github]
  The Plant Disease Global-Local Features Fusion Attention model (PlantAIM), which combines ViT and CNN components to enhance feature extraction for multi-crop plant disease identification.
- Beyond Supervision: Harnessing Self-Supervised Learning in Unseen Plant Disease Recognition [Paper] [Github]
  The Cross Learning Vision Transformer (CL-ViT) model, which incorporates self-supervised learning into a supervised model.
- Can Language Improve Visual Features For Distinguishing Unseen Plant Diseases? [Paper] [Github]
  The FF-CLIP model, which incorporates textual data as language cues to guide visual features and improve the identification of unseen plant diseases.
- Deep-Plant-Disease Dataset Is All You Need for Plant Disease Identification [Paper] [Github]
  We curated the largest plant disease dataset with text descriptions, known as Deep-Plant-Disease, comprising 248,578 images across 55 crop species, 175 disease classes, and 333 unique crop-disease compositions. We also conducted comprehensive benchmarking across multiple downstream tasks in plant disease identification under diverse conditions that simulate real-world challenges.
@article{chai2025plantaim,
title={PlantAIM: A New Baseline Model Integrating Global Attention and Local Features for Enhanced Plant Disease Identification},
author={Chai, Abel Yu Hao and Lee, Sue Han and Tay, Fei Siang and Go{\"e}au, Herv{\'e} and Bonnet, Pierre and Joly, Alexis},
journal={Smart Agricultural Technology},
pages={100813},
year={2025},
publisher={Elsevier}
}



