
Commit fe898ad

initial hog descriptor
1 parent 421a929 commit fe898ad

9 files changed: +155 -0 lines changed

docs/examples/image_features/hog.jl

Lines changed: 155 additions & 0 deletions
# ---
# cover: assets/hog.gif
# title: Object Detection using HOG
# description: This demo shows how to use the HOG descriptor to build a person detector
# author: Anchit Navelkar, Ashwani Rathee
# date: 2021-07-12
# ---

# In this tutorial, we will use a linear SVM trained on Histogram of Oriented
# Gradients (HOG) feature descriptors to create a person detector. We will first
# create a person classifier and then use this classifier with a sliding window
# to identify and localize people in an image.

# The key challenge in creating a classifier is that it needs to work with
# variations in illumination, pose, and occlusion in the image. To achieve this,
# we will train the classifier on an intermediate representation of the image
# instead of the pixel-based representation. Our ideal representation (commonly
# called a feature vector) captures information which is useful for classification
# but is invariant to small changes in illumination and occlusion. The HOG descriptor
# is a gradient-based representation which is invariant to local geometric and
# photometric changes (i.e. shape and illumination changes) and so is a good
# choice for our problem. In fact, HOG descriptors are widely used for object detection.
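
# As a quick sanity check of where the number 3780 (used below) comes from: with
# the default HOG parameters in ImageFeatures (9 orientation bins, 8x8-pixel
# cells, 2x2-cell blocks at stride 1), a 128x64 image has 16x8 cells and hence
# 15x7 overlapping blocks, each contributing 2*2*9 = 36 values, for a total of
# 105 * 36 = 3780. A minimal sketch of this arithmetic, assuming a synthetic
# 128x64 image and the default `HOG()` parameters:

# ```julia
# using Images, ImageFeatures
#
# img = rand(Gray{N0f8}, 128, 64)             # stand-in for a 128x64 pedestrian crop
# desc = create_descriptor(img, HOG())        # 9 bins, 8x8 cells, 2x2 blocks by default
#
# cells = (128 ÷ 8, 64 ÷ 8)                   # (16, 8) cells
# blocks = (cells[1] - 1) * (cells[2] - 1)    # 15 * 7 = 105 overlapping blocks
# @assert length(desc) == blocks * 2 * 2 * 9  # 105 * 36 = 3780
# ```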

# Download the script to get the training data [here](https://drive.google.com/file/d/11G_9zh9N-0veQ2EL5WDGsnxRpihsqLX5/view?usp=sharing).
# Download tutorial.zip, decompress it, and run get_data.bash. (Change the
# variable `path_to_tutorial` in preprocess.jl and the path to the julia
# executable in get_data.bash.) This script will download the required datasets.
# We will start by loading the data and computing HOG features of all the images.

# ```julia
# using Images, ImageFeatures
#
# path_to_tutorial = "" # specify this path
# pos_examples = "$path_to_tutorial/tutorial/humans/"
# neg_examples = "$path_to_tutorial/tutorial/not_humans/"
#
# n_pos = length(readdir(pos_examples)) # number of positive training examples
# n_neg = length(readdir(neg_examples)) # number of negative training examples
# n = n_pos + n_neg                     # number of training examples
# data = Array{Float64}(undef, 3780, n) # array to store the HOG descriptor of each image;
#                                       # each 128x64 training image yields a 3780-element descriptor
# labels = Vector{Int}(undef, n)        # vector to store the label (1=human, 0=not human) of each image
#
# for (i, file) in enumerate([readdir(pos_examples); readdir(neg_examples)])
#     filename = "$(i <= n_pos ? pos_examples : neg_examples)/$file"
#     img = load(filename)
#     data[:, i] = create_descriptor(img, HOG())
#     labels[i] = (i <= n_pos ? 1 : 0)
# end
# ```
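
# The descriptor length above assumes every training image is exactly 128x64.
# If your images vary in size, one option is to resize them first. A minimal
# sketch, assuming `imresize` from Images.jl and the 128x64 crop size used in
# this tutorial (the helper name is our own, not part of the original script):

# ```julia
# function hog_vector(img)
#     img = size(img) == (128, 64) ? img : imresize(img, (128, 64))
#     return create_descriptor(img, HOG())
# end
# ```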

# We now have an encoded version of the images in our training data. This
# encoding captures useful information but discards extraneous information
# (illumination changes, pose variations, etc.). We will train a linear SVM on this data.

# ```julia
# using LIBSVM
# using Random
#
# # Split the dataset into a train set (2500 images) and a test set (294 images)
# random_perm = randperm(n)
# train_ind = random_perm[1:2500]
# test_ind = random_perm[2501:end]
#
# model = svmtrain(data[:, train_ind], labels[train_ind]);
# ```
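
# The split above is random, so the exact accuracy will vary from run to run.
# If you want a reproducible split, a minimal sketch (seeding the global RNG is
# our own addition, not part of the original script):

# ```julia
# Random.seed!(1234)        # fix the RNG so the train/test split is repeatable
# random_perm = randperm(n)
# ```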

# Now let's test this classifier on some images.

# ```julia
# using Printf, Statistics
#
# img = load("$pos_examples/per00003.ppm")
# descriptor = Array{Float64}(undef, 3780, 1)
# descriptor[:, 1] = create_descriptor(img, HOG())
#
# predicted_label, _ = svmpredict(model, descriptor);
# print(predicted_label) # 1=human, 0=not human
#
# # Get the test accuracy of our model
# predicted_labels, decision_values = svmpredict(model, data[:, test_ind]);
# @printf("Accuracy: %.2f%%\n", mean(predicted_labels .== labels[test_ind]) * 100) # test accuracy should be > 98%
# ```
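
# Accuracy alone can hide which kinds of mistakes the detector makes. A minimal
# sketch that tallies the 2x2 confusion matrix, assuming `predicted_labels` and
# `labels[test_ind]` from the block above:

# ```julia
# truth = labels[test_ind]
# tp = count((predicted_labels .== 1) .& (truth .== 1)) # humans correctly found
# fn = count((predicted_labels .== 0) .& (truth .== 1)) # humans missed
# fp = count((predicted_labels .== 1) .& (truth .== 0)) # false alarms
# tn = count((predicted_labels .== 0) .& (truth .== 0)) # correct rejections
# @show tp fn fp tn
# ```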

# Try testing our trained model on more images. You can see that it performs quite well.

# | ![Original](assets/human1.png) | ![Original](assets/human2.png) |
# |:------:|:---:|
# | predicted_label = 1 | predicted_label = 1 |

# | ![Original](assets/human3.png) | ![Original](assets/not-human1.jpg) |
# |:------:|:---:|
# | predicted_label = 1 | predicted_label = 0 |

# Next we will use our trained classifier with a sliding window to localize people in an image.

# ![Original](assets/humans.jpg)

# ```julia
# img = load("$path_to_tutorial/tutorial/humans.jpg")
# rows, cols = size(img)
#
# scores = Array{Float64}(undef, 22, 45)
# descriptor = Array{Float64}(undef, 3780, 1)
#
# # Apply the classifier with a sliding window and store the classification score
# # for the not-human class at every window location in the scores array
# for j = 32:10:cols-32
#     for i = 64:10:rows-64
#         box = img[i-63:i+64, j-31:j+32]
#         descriptor[:, 1] = create_descriptor(box, HOG())
#         predicted_label, s = svmpredict(model, descriptor)
#         scores[Int((i - 64) / 10)+1, Int((j - 32) / 10)+1] = s[1]
#     end
# end
# ```
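
# The hard-coded 22x45 score grid above matches this particular humans.jpg. A
# minimal, size-agnostic sketch that derives the grid from the image dimensions
# instead (same 128x64 window and stride of 10; the variable names are our own):

# ```julia
# stride = 10
# n_i = length(64:stride:rows-64) # number of vertical window positions
# n_j = length(32:stride:cols-32) # number of horizontal window positions
# scores = Array{Float64}(undef, n_i, n_j)
# ```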

# ![](assets/scores.png)

# You can see that the classifier gave a low score to the not-human class
# (i.e. a high score to the human class) at positions corresponding to humans
# in the original image.
# Below we threshold the scores and suppress non-minimal values to get
# the human locations. We then plot the bounding boxes using `ImageDraw`.

# ```julia
# using ImageDraw, ImageView
#
# scores[scores .> 0] .= 0
# object_locations = findlocalminima(scores)
#
# rectangles = [
#     [
#         ((i[2] - 1) * 10 + 1, (i[1] - 1) * 10 + 1),
#         ((i[2] - 1) * 10 + 64, (i[1] - 1) * 10 + 1),
#         ((i[2] - 1) * 10 + 64, (i[1] - 1) * 10 + 128),
#         ((i[2] - 1) * 10 + 1, (i[1] - 1) * 10 + 128),
#     ] for i in object_locations
# ];
#
# for rec in rectangles
#     draw!(img, Polygon(rec), RGB{N0f8}(0, 0, 1.0))
# end
# imshow(img)
# ```
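
# The corner arithmetic above just inverts the sliding-window indexing: score
# grid cell (r, c) corresponds to a 128x64 window whose top-left pixel is at
# row (r-1)*10 + 1 and column (c-1)*10 + 1, and ImageDraw points are (x, y),
# i.e. (column, row). A minimal helper naming that mapping (our own refactor,
# not part of the original script):

# ```julia
# # (x, y) of the window's top-left corner for a location in the score grid
# window_corner(loc) = ((loc[2] - 1) * 10 + 1, (loc[1] - 1) * 10 + 1)
# ```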

# ![](assets/boxes.jpg)

# In our example we were lucky that the people in our image were roughly
# the same size (128x64) as the examples in our training set. In general we
# would need to take bounding boxes across multiple scales (and multiple
# aspect ratios for some object classes).
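
# A minimal sketch of how that multi-scale search could look: rerun the same
# sliding-window loop over a small image pyramid built with `imresize` from
# Images.jl (the scale factors here are an arbitrary illustrative choice):

# ```julia
# for scale in (1.0, 0.75, 0.5)
#     scaled = imresize(img, ratio=scale)
#     # run the sliding-window loop above on `scaled`, then map each detection
#     # back to the original image by dividing its coordinates by `scale`
# end
# ```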

using FileIO #src
img1 = load("assets/humans.jpg") #src
img2 = load("assets/boxes.jpg") #src
save("assets/hog.gif", cat(img1[1:342, 1:342], img2[1:342, 1:342]; dims=3); fps=2) #src
