
Take it or Leaf it: Plant Disease Classification through Convolutional Neural Networks

Basak J, Chua G, Danao B, Roberto R

Abstract

Crop diseases are a major threat to food security, but their rapid identification remains difficult in many parts of the world due to the lack of the necessary infrastructure and modern technologies. The combination of increasing global smartphone penetration and recent advances in computer vision made possible by deep learning has paved the way for smartphone-assisted disease diagnosis. Using a public dataset of 4522 images of diseased and healthy plant leaves collected under controlled conditions, we train a deep convolutional neural network to identify 3 crop species and 10 diseases (or the absence thereof). The trained model achieves an accuracy of 97.7% on a held-out test set, demonstrating the feasibility of this approach. Overall, training deep learning models on increasingly large, publicly available image datasets presents a clear path toward smartphone-assisted crop disease diagnosis on a massive global scale.

Introduction

To meet the demand of more than 7 billion people in the world, human society has harnessed the power of modern technologies to produce enough food. However, food security remains threatened by a number of factors, including a decline in pollinators (Report of the Plenary of the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services on the work of its fourth session, 2016), climate change (Tai et al., 2014), plant diseases, and others.

Plant diseases are not only a threat to food security at the global scale, but can also have disastrous consequences for smallholder farmers whose livelihoods depend on healthy crops. In the developing world, more than 80 percent of agricultural production is generated by smallholder farmers (UNEP, 2013), and reports of yield losses of more than 50% due to pests and diseases are common (Harvey et al., 2014). Furthermore, the largest fraction of hungry people (50%) live in smallholder farming households (Sanchez and Swaminathan, 2005), making smallholder farmers a group that is particularly vulnerable to pathogen-derived disruptions in food supply.

Various innovations have been developed to protect crops from disease. The widespread application of pesticides has, in the past decade, increasingly been supplemented by Integrated Pest Management (IPM) approaches (Ehler, 2006). Regardless of the approach, correctly identifying a disease when it first appears is a crucial step for effective disease management. Disease identification has historically been supported by agricultural extension organizations and local institutions such as plant clinics. In recent times, such efforts have additionally been supported by online disease-diagnosis resources, leveraging the increasing internet penetration worldwide. Even more recently, smartphone-based applications have proliferated, taking advantage of the historically unparalleled rapid uptake of mobile phone technology globally (ITU, 2015).

Smartphones offer a novel approach to disease identification because of their computing power, high-resolution displays, and extensive built-in accessories, such as advanced HD cameras. It is widely estimated that there will be between 5 and 6 billion smartphones in use by 2020. At the end of 2015, 69% of the world's population already had access to mobile broadband coverage, and mobile broadband penetration reached 47%, a 12-fold increase since 2007 (ITU, 2015). The combined factors of widespread smartphone penetration, HD cameras, and high-performance mobile processors lead to a situation where disease diagnosis based on automated image recognition, if technically feasible, can be made available at an unprecedented scale. Here, we demonstrate this technical feasibility using a deep learning approach on a subset of the 54,306 images of 14 crop species with 26 diseases (or healthy) made openly available through the PlantVillage project (Hughes and Salathé, 2015).

Computer vision, and object recognition in particular, has made tremendous advances in the past few years. The PASCAL VOC Challenge (Everingham et al., 2010), and more recently the Large Scale Visual Recognition Challenge (ILSVRC) (Russakovsky et al., 2015) based on the ImageNet dataset (Deng et al., 2009), have been widely used as benchmarks for numerous problems in computer vision, including object classification. In 2012, a large, deep convolutional neural network achieved a top-5 error of 16.4% for the classification of images into 1000 possible categories (Krizhevsky et al., 2012). In the following 3 years, various advances in deep convolutional neural networks lowered the error rate to 3.57% (Krizhevsky et al., 2012; Simonyan and Zisserman, 2014; Zeiler and Fergus, 2014; He et al., 2015; Szegedy et al., 2015). While training large neural networks can be very time-consuming, the trained models can classify images very quickly, which also makes them suitable for consumer applications on smartphones.

Deep neural networks have recently been applied successfully in many diverse domains as examples of end-to-end learning. A neural network provides a mapping from an input, such as an image of a diseased plant, to an output, such as a crop-disease pair. The nodes in a neural network are mathematical functions that take numerical inputs from the incoming edges and provide a numerical output on an outgoing edge. Deep neural networks simply map the input layer to the output layer over a series of stacked layers of nodes. The challenge is to create a deep network in such a way that the structure of the network, as well as its functions (nodes) and edge weights, correctly map the input to the output. Deep neural networks are trained by tuning the network parameters so that the mapping improves during the training process. This process is computationally challenging and has in recent times been improved dramatically by several conceptual and engineering breakthroughs (LeCun et al., 2015; Schmidhuber, 2015).

Developing accurate image classifiers for plant disease diagnosis requires a large, verified dataset of images of diseased and healthy plants. To address this need, the PlantVillage project has collected tens of thousands of images of healthy and diseased crop plants (Hughes and Salathé, 2015) and made them openly and freely available. Here, we report on the classification of 10 classes (diseases or healthy leaves) across 3 crop species, using 4522 images and a convolutional neural network approach. We measure the performance of our models by their ability to predict the correct crop-disease pair. The best performing model achieves a mean F1 score of 0.977 (overall accuracy of 97.70%), demonstrating the technical feasibility of our approach. Our results are a first step toward a smartphone-assisted plant disease diagnosis system.

Methods and Results

The PlantVillage Dataset

PlantVillage is an organization advocating for the development of technologies for the agricultural sector. The organization has collated over 50,000 images of leaves from different plants that are either healthy or affected by some disease. For this study, a sample of 4522 images was taken from the PlantVillage dataset, corresponding to a total of 10 classes.

Table 1. Samples taken from the PlantVillage leaf database

| Plant | Status | Number |
|---|---|---|
| Bell Pepper | Healthy | 499 |
| Bell Pepper | Bacterial Spot | 499 |
| Potato | Healthy | 152 |
| Potato | Early Blight | 500 |
| Potato | Late Blight | 500 |
| Tomato | Healthy | 500 |
| Tomato | Bacterial Spot | 500 |
| Tomato | Target Spot | 500 |
| Tomato | Mosaic Virus | 373 |
| Tomato | Yellow Leaf Curl | 499 |

Each image is a 256x256-pixel color image of a leaf, labeled with both the plant species and its condition. The images were taken at approximately the same position, angle, and background. To simulate images captured by low-resolution equipment or taken at a distance, the 256x256 images were also downscaled to 64x64.
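As a rough illustration, this downscaling step can be done with Pillow. The directory layout below is hypothetical, and the project's actual preprocessing code may differ:

```python
# A minimal sketch of the downscaling step using Pillow. The paths and
# directory layout are assumptions, not the project's actual code.
from pathlib import Path
from PIL import Image

SRC_DIR = Path("plantvillage/256x256")  # hypothetical source of 256x256 images
DST_DIR = Path("plantvillage/64x64")    # hypothetical destination for 64x64 copies

for src in SRC_DIR.rglob("*.jpg"):
    dst = DST_DIR / src.relative_to(SRC_DIR)
    dst.parent.mkdir(parents=True, exist_ok=True)
    # Bilinear resampling approximates the loss of detail from a
    # low-resolution camera or a photo taken at a distance.
    Image.open(src).resize((64, 64), Image.Resampling.BILINEAR).save(dst)
```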

Fig. 1. 256x256 image of a potato leaf afflicted with early blight and its downscaled version

Model and Training

The model is composed of stacked convolutional layers followed by dense layers. The convolutional layers extract features from the images, and their outputs are fed into dense hidden layers with dropout layers in between.

Table 2. Architecture of the two models; the only change is in the final max pooling layer

| Hi-Res Model Layer | Activation | Low-Res Model Layer | Activation |
|---|---|---|---|
| CNN (2x2) | ReLU | CNN (2x2) | ReLU |
| Normalization | - | Normalization | - |
| Max Pooling (3x3) | - | Max Pooling (3x3) | - |
| Dropout (25%) | - | Dropout (25%) | - |
| CNN (3x3) | ReLU | CNN (3x3) | ReLU |
| Normalization | - | Normalization | - |
| CNN (3x3) | ReLU | CNN (3x3) | ReLU |
| Normalization | - | Normalization | - |
| Max Pooling (2x2) | - | Max Pooling (2x2) | - |
| Dropout (25%) | - | Dropout (25%) | - |
| CNN (3x3) | ReLU | CNN (3x3) | ReLU |
| Normalization | - | Normalization | - |
| CNN (3x3) | ReLU | CNN (3x3) | ReLU |
| Normalization | - | Normalization | - |
| CNN (3x3) | ReLU | CNN (3x3) | ReLU |
| Normalization | - | Normalization | - |
| Max Pooling (2x2) | - | Max Pooling (4x4) | - |
| Normalization | - | Normalization | - |
| Dropout (25%) | - | Dropout (25%) | - |
| Flatten | - | Flatten | - |
| Dense (1024) | ReLU | Dense (1024) | ReLU |
| Normalization | - | Normalization | - |
| Dropout (20%) | - | Dropout (20%) | - |
| Normalization | - | Normalization | - |
| Dense (10) | Softmax | Dense (10) | Softmax |

Two separate models were generated, one for the 256x256 images and one for the 64x64 images. The architecture is the same for both, the only difference being the final max pooling layer, which was changed in the interest of reducing computation time. All convolutional layers use ReLU activation functions, while the final dense output layer uses a softmax function since the model predicts one of many classes.
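A minimal Keras sketch of the Table 2 architecture follows. The filter counts (32/64/128) and the `same` convolution padding are assumptions, since Table 2 specifies only kernel sizes; only the final pooling size differs between the two models:

```python
# A sketch of the Table 2 architecture, assuming Keras. Filter counts and
# padding="same" are assumptions; Table 2 specifies only kernel sizes.
from tensorflow.keras import layers, models

def build_model(input_shape, final_pool):
    return models.Sequential([
        layers.Conv2D(32, (2, 2), padding="same", activation="relu",
                      input_shape=input_shape),
        layers.BatchNormalization(),
        layers.MaxPooling2D((3, 3)),
        layers.Dropout(0.25),
        layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.Conv2D(64, (3, 3), padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        layers.Conv2D(128, (3, 3), padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.Conv2D(128, (3, 3), padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.Conv2D(128, (3, 3), padding="same", activation="relu"),
        layers.BatchNormalization(),
        layers.MaxPooling2D(final_pool),  # the only layer that differs
        layers.BatchNormalization(),
        layers.Dropout(0.25),
        layers.Flatten(),
        layers.Dense(1024, activation="relu"),
        layers.BatchNormalization(),
        layers.Dropout(0.20),
        layers.BatchNormalization(),
        layers.Dense(10, activation="softmax"),  # 10 crop-disease classes
    ])

hi_res = build_model((256, 256, 3), final_pool=(2, 2))
low_res = build_model((64, 64, 3), final_pool=(4, 4))
```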

The entire dataset was split into training and test sets with a 75:25 ratio. To improve the generalization of the model given the limited number of images, data augmentation was performed on the training set. Images used to train the model were rotated up to 30 degrees, height-shifted up to 15%, sheared horizontally up to 15%, zoomed in up to 20%, and could be horizontally flipped. Each training batch was drawn uniformly at random from these transformed images, as sketched below.
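These transformations map directly onto Keras' ImageDataGenerator; the directory layout and pixel rescaling below are assumptions:

```python
# A sketch of the augmentation described above, using Keras' ImageDataGenerator.
from tensorflow.keras.preprocessing.image import ImageDataGenerator

train_gen = ImageDataGenerator(
    rescale=1.0 / 255,        # assumed pixel normalization
    rotation_range=30,        # rotate up to 30 degrees
    height_shift_range=0.15,  # shift vertically up to 15%
    shear_range=0.15,         # shear transformations up to 15%, per the text
    zoom_range=0.20,          # zoom in up to 20%
    horizontal_flip=True,     # random horizontal flips
)

train_flow = train_gen.flow_from_directory(
    "plantvillage/train",     # hypothetical training directory
    target_size=(256, 256),   # (64, 64) for the low-resolution model
    batch_size=32,
    class_mode="categorical", # one-hot labels for the 10 classes
)
```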

In training, the loss function optimized is binary cross-entropy, using the Adam optimizer. Although multi-class classification problems generally call for a categorical cross-entropy loss, binary cross-entropy proved to give significantly better performance here. The optimizer used a learning rate of 6e-4 and a decay of 1.2e-5. For each trial, the model was trained over 50 epochs with a batch size of 32.
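A sketch of this training setup, assuming the Keras API (the `decay` argument belongs to the older optimizer interface, TensorFlow < 2.11):

```python
# A sketch of the training configuration: binary cross-entropy, Adam with
# learning rate 6e-4 and decay 1.2e-5, 50 epochs, batch size 32 (set in the
# generator above). `decay` assumes the legacy Keras optimizer API.
from tensorflow.keras.optimizers import Adam

model = hi_res  # or low_res; see the architecture sketch above

model.compile(
    optimizer=Adam(learning_rate=6e-4, decay=1.2e-5),
    loss="binary_crossentropy",  # outperformed categorical cross-entropy here
    metrics=["accuracy"],
)

history = model.fit(train_flow, epochs=50)
```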

The F1 score was chosen as the metric for evaluating model performance. In addition, the mean precision and mean recall of the models across all classes were also observed. The performance of the models was assessed by taking the highest average F1 score over 5 trials for each of the 256x256 and 64x64 cases.
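These metrics can be computed with scikit-learn, as in the sketch below; `x_test` and `y_test` stand for the assumed held-out 25% split with one-hot labels:

```python
# A sketch of the evaluation: per-class precision, recall, and F1, plus the
# mean F1 across classes. `x_test` and `y_test` are assumed inputs.
import numpy as np
from sklearn.metrics import classification_report, f1_score

y_pred = np.argmax(model.predict(x_test), axis=1)  # predicted class indices
y_true = np.argmax(y_test, axis=1)                 # true class indices

print(classification_report(y_true, y_pred))       # per-class precision/recall/F1
print("mean F1:", f1_score(y_true, y_pred, average="macro"))
```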

Results

Table 3 summarizes the best average F1 scores over the 5 trials for both models.

Table 3. Model performance for the two input resolutions

| Model | F1-Score |
|---|---|
| 256x256 | 97.7% |
| 64x64 | 97.2% |

Fig. 3. Confusion matrices for the 256x256 model (left) and the 64x64 model (right)
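Confusion matrices like those in Fig. 3 can be reproduced with scikit-learn; this sketch reuses `y_true` and `y_pred` from the evaluation sketch above:

```python
# A sketch of plotting a confusion matrix for one model's predictions.
import matplotlib.pyplot as plt
from sklearn.metrics import ConfusionMatrixDisplay

ConfusionMatrixDisplay.from_predictions(y_true, y_pred, xticks_rotation=45)
plt.title("256x256 model")  # repeat with the 64x64 model's predictions
plt.tight_layout()
plt.show()
```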

Analyzing the classification report, the model had difficulty classifying healthy potato leaves: leaves tagged as healthy were misclassified either as healthy bell pepper leaves or as having late blight. Misclassifying a leaf as a different healthy species is no real cause for alarm, as it can easily be controlled when batch-testing leaves. This kind of error could be attributed to leaves being in various states of folding and having irregular shapes, which confuses the model into identifying potato leaves as bell pepper leaves. On the other hand, misclassifying healthy leaves as having late blight could prove problematic when making decisions: if plants are wrongly classified as having late blight, the measures taken to mitigate its spread waste resources and can even hamper succeeding efforts to manage actual outbreaks.

Significant misclassifications also occur for tomato leaves afflicted with target spot: the majority of tomato leaves with target spot were deemed healthy by the model. This could be attributed to target spot occupying small and very irregular regions of the leaf, which could have made it difficult for the model to pick up on these salient features.

The misclassifications from the 64x64 model showed a similar trend to the 256x256 model when handling tomato leaves with target spot. This presents similar problems to those described above, although less pronounced, as there are fewer target spot misclassifications overall.

The 64x64 model also failed to differentiate well between early blight and late blight in potatoes. The appearance of the two diseases is very similar, the difference lying mainly in the shape of the affected areas on a leaf. Due to the downsampling of the images, edges became less defined, which could have resulted in poorer differentiation between the two closely related classes.

About

This machine learning project classifies plant leaf images, predicting whether a given leaf is healthy or afflicted with a disease. It uses the PlantVillage dataset, a dataset of diseased and healthy plant leaf images with corresponding labels.
