CNNs compare images piece by piece. The pieces that it looks for are called features. By finding rough feature matches in roughly the same positions in two images, CNNs get a lot better at seeing similarity than whole-image matching schemes.
- CNN will try to match the feature everywhere and in every possible position.
- In calculating the match to a feature across the whole image, we call it a filter
- to calculate the match of a feature to a patch of the image, simply multiply each pixel in the feature by the value of the corresponding pixel in the image
- then add up the values and divide by the total number of pixels
- to complete convolution, repeat the process, lining up the feature with every possible image patch
- values close to 1 show strong matches, close to -1 indicate photographic negative of our feature, 0 means no match
- way to take large images and shrink them down while preserving the most important information
- stepping a small window across an image and taking the maximum value from the window as each step.
- take high level filtered images and translates them into votes
- we have decide between X categories
- some values are better than others in classifying a particular class, these get larger votes than others
- these votes are expressed as weights or connection strengths