src/content/blog/building-image-classifier-fastai.md
Lines changed: 18 additions & 56 deletions
@@ -1,6 +1,6 @@
1
1
---
2
2
title: "Building an Image Classifier Really Fast Using Fastai"
3
-
author: "Your Name" # Replace with the actual author's name
3
+
author: "Sajal Sharma" # Replace with the actual author's name
4
4
pubDatetime: 2022-10-28T00:00:00Z
5
5
slug: building-image-classifier-fastai
6
6
featured: false
@@ -10,7 +10,7 @@ tags:
10
10
- Computer Vision
11
11
- fastai
12
12
description: "In this post, I demonstrate how to quickly build an image classifier using the fastai library, a powerful tool for practical deep learning. The project involves classifying images of fruit as either rotten or fresh."
13
-
canonicalURL: "" # Add if the article is published elsewhere
13
+
canonicalURL: "" # Add if the article is published elsewhere
14
14
---
15
15
16
16
## Table of contents
@@ -19,15 +19,14 @@ canonicalURL: "" # Add if the article is published elsewhere
19
19
20
20
I recently started the [fast.ai](https://course.fast.ai/Lessons/lesson1.html) course to build up my practical deep learning skills. In order to better retain what I learn, I'm going to be writing a series of posts/notebooks, implementing my own models based on the course content. This notebook is written based on what I learned from the first week of the course.
21
21
22
-
In this notebook we'll build an image classifier using [fastai](https://docs.fast.ai), a deep learning library built on top of PyTorch that provides both high-level and low-level components to quickly build state-of-the-art models for common deep learning domains.
22
+
In this notebook we'll build an image classifier using [fastai](https://docs.fast.ai), a deep learning library built on top of PyTorch that provides both high-level and low-level components to quickly build state-of-the-art models for common deep learning domains.
23
23
24
24
We'll build a model that can classify images of fruit into a binary category: rotten or not. You can imagine such a model being used inside refrigerators to detect if produce kept inside it has gone bad.
25
25
26
26
When I started learning ML in 2016, building such models was a non-trivial task. Libraries to build deep neural networks were still in their infancy (PyTorch was introduced in late 2016), and building accurate image classification models required a certain degree of specialized knowledge. All that has changed and, as you'll notice in the notebook, we can build an image classifier using just a few lines of code.
27
27
28
28
Let's get started!
29
29
30
-
31
30
```python
32
31
import os
33
32
# !pip install -Uqq fastai duckduckgo_search
@@ -39,7 +38,6 @@ We'll be needing the `duckduckgo_search` package to quickly search for, and down
Now that we know that the DuckDuckGo image search is working fine, we can download images for both rotten and fresh fruit and store them in their respective directories. We use `time.sleep` to avoid spamming the search API.
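The pacing idea can be sketched without touching the network. The helper below is a hypothetical illustration, not the notebook's code: `collect_urls`, `fake_search`, and the `pause_s` parameter are names assumed here, and the search function is passed in so that any real search call could be substituted.

```python
import time

def collect_urls(queries, search_fn, pause_s=1.0, sleep=time.sleep):
    """Run search_fn for each query, pausing between consecutive calls."""
    results = {}
    for i, query in enumerate(queries):
        if i > 0:
            sleep(pause_s)  # wait before every call except the first
        results[query] = search_fn(query)
    return results

# Stand-in for a real image search; it returns made-up URLs (assumption).
def fake_search(query):
    return [f"https://example.com/{query.replace(' ', '_')}/{n}.jpg" for n in range(2)]

pauses = []  # record the requested pauses instead of actually sleeping
urls = collect_urls(["fresh banana", "rotten banana"], fake_search,
                    pause_s=5.0, sleep=pauses.append)
```

Passing `sleep` as a parameter keeps the sketch quick to try out; leaving it at its default of `time.sleep` would give the real pause between searches.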
@@ -127,14 +114,12 @@ Searching for 'fresh banana'
127
114
Searching for 'fresh vegetables'
128
115
```
129
116
130
-
131
117
## Training our model
132
118
133
119
We have our images and the next step is to train a model. Again, it blows my mind how simple this is using fastai. I'll briefly explain what the code blocks below are doing.
134
120
135
121
First, we check that every image file can be opened, using `verify_images`, a utility from the fastai vision library. If a file can't be opened, we unlink it from our path so that it is not used in model training.
136
122
137
-
138
123
```python
139
124
# validate images
140
125
failed=verify_images(get_image_files(path))
@@ -146,21 +131,18 @@ len(failed)
146
131
0
147
132
```
148
133
149
-
150
-
151
134
Next, we'll use another building block from the fastai library, the `DataBlock` class, which we can use to represent our training data, the labels, data splitting criteria, and any data transformations.
152
135
153
136
`blocks=(ImageBlock, CategoryBlock)` is used to specify what kind of data is in the DataBlock. We have images, and categories - hence a tuple of ImageBlock and CategoryBlock classes.
154
137
155
-
`get_items` takes the function `get_image_files` as its parameter. `get_image_files` is used to find the paths of our input images.
138
+
`get_items` takes the function `get_image_files` as its parameter. `get_image_files` is used to find the paths of our input images.
156
139
157
-
`splitter=RandomSplitter(valid_pct=0.2, seed=42)` specifies that we want to randomly split our input data into training and validation sets, using 20% of the data for validation.
140
+
`splitter=RandomSplitter(valid_pct=0.2, seed=42)` specifies that we want to randomly split our input data into training and validation sets, using 20% of the data for validation.
158
141
159
142
`get_y=parent_label` specifies that the label for an image file is the name of its parent directory (the directory that the file belongs to).
160
143
161
144
`item_tfms=[Resize(192, method='squish')]` specifies the transformation performed on each file. Here we are resizing each image to 192x192 pixels by squishing it. Another option could be to `crop` the image.
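The splitting and labelling rules described in the preceding paragraphs can be illustrated with the standard library alone. This is a conceptual sketch, not fastai's implementation: `parent_label_sketch` and `random_split_sketch` are hypothetical names, the file paths are made up, and the exact items fastai would place in each set for the same seed will differ.

```python
import random
from pathlib import Path

def parent_label_sketch(path):
    """Return the name of the directory that directly contains the file."""
    return Path(path).parent.name

def random_split_sketch(items, valid_pct=0.2, seed=42):
    """Shuffle indices with a fixed seed, then hold out valid_pct of the items."""
    rng = random.Random(seed)
    idxs = list(range(len(items)))
    rng.shuffle(idxs)
    cut = int(valid_pct * len(items))
    valid = [items[i] for i in idxs[:cut]]
    train = [items[i] for i in idxs[cut:]]
    return train, valid

# Made-up file layout: one directory per class (assumption for illustration).
files = [f"fruit/fresh/{i}.jpg" for i in range(5)] + \
        [f"fruit/rotten/{i}.jpg" for i in range(5)]

train, valid = random_split_sketch(files)
train_labels = [parent_label_sketch(f) for f in train]
```

Because the seed is fixed, calling `random_split_sketch(files)` again returns the same split, which is the point of passing `seed=42` to the splitter.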
Showing a batch of images from our DataBlock, along with their labels, is a nice way of quickly checking that a sample of our data is correct (images/labels).
184
161
185
162
To train our model we will fine-tune resnet18, one of the most widely used computer vision models, on our dataset.
@@ -316,23 +282,19 @@ Probability it's fresh: 1.0000
316
282
317
283
Let's see if our model can predict if a given image is of a rotten orange or a fresh orange. We haven't explicitly downloaded images of fresh/rotten oranges for our training set, so this is a good test of generalization to "unseen data".
Not bad at all. The model seems to generalize fine. Though, a more accurate measure of generalizability would involve creating a separate test set and calculating performance metrics.
323
+
Not bad at all. The model seems to generalize fine. Though, a more accurate measure of generalizability would involve creating a separate test set and calculating performance metrics.
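As a rough sketch of what calculating performance metrics on a separate test set could look like, the snippet below computes accuracy, precision, and recall for the `rotten` class from two lists of labels. The function name `binary_metrics` and the example labels are invented for illustration; they are not results from the notebook.

```python
def binary_metrics(y_true, y_pred, positive="rotten"):
    """Compute accuracy, precision and recall, treating `positive` as the target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "precision": tp / (tp + fp) if (tp + fp) else 0.0,
        "recall": tp / (tp + fn) if (tp + fn) else 0.0,
    }

# Invented test-set labels and predictions, for illustration only.
y_true = ["rotten", "rotten", "fresh", "fresh", "rotten", "fresh"]
y_pred = ["rotten", "fresh", "fresh", "fresh", "rotten", "rotten"]

metrics = binary_metrics(y_true, y_pred)
```

Recall on the `rotten` class is probably the number to watch for the refrigerator use case, since a missed rotten item is the costlier mistake there.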
363
324
364
325
## Summary
365
326
366
327
There you have it! With a few lines of code we have created our own image classification model by fine-tuning off-the-shelf models with fastai. The high-level APIs that the library provides make the process of building an initial model a breeze.
367
-
If you want to run the notebook for yourself, you can check it out on Kaggle [here](https://www.kaggle.com/code/sajalsharma26/is-the-fruit-rotten-or-not). I urge you to try building your own
328
+
If you want to run the notebook for yourself, you can check it out on Kaggle [here](https://www.kaggle.com/code/sajalsharma26/is-the-fruit-rotten-or-not). I urge you to try building your own
368
329
classification model on images from DuckDuckGo search.
369
330
370
331
I'll be going over the rest of the fastai course in the coming weeks. Even though I have only done the first two weeks so far, I highly recommend it for anyone
371
332
interested in Machine Learning, more so for people with a coding background.
372
333
373
334
## Resources
335
+
374
336
- fastai Course: https://course.fast.ai/
375
-
- Notebook on Kaggle: https://www.kaggle.com/code/sajalsharma26/is-the-fruit-rotten-or-not
337
+
- Notebook on Kaggle: https://www.kaggle.com/code/sajalsharma26/is-the-fruit-rotten-or-not