home | copyright ©2016, tim@menzies.us

overview | syllabus | src | submit | chat


Homework4

Big Project

Write 1 page on what your big project in November will be. You need not have all the details worked out, but what are the broad strokes?

Note, there are some project ideas at the projects page, but don't feel bound by them.

Some Stats

Nearest neighbor

Using your table class, code up a k=1 nearest-neighbor (1NN) classifier using (a) mini-batch K-means and (b) KD-trees.

Such classifiers predict that the class of the test instance is the majority class among its k nearest neighbors in the training set (for k=1, simply the class of the single nearest neighbor).
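As a baseline, raw kNN can be sketched as a linear scan over the training rows. This is only a minimal illustration: it assumes rows are plain numeric tuples and uses Euclidean distance, and the names `dist` and `knn_predict` are made up here rather than taken from the course's table class.

```python
import math
from collections import Counter

def dist(a, b):
    """Euclidean distance between two equal-length numeric rows (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def knn_predict(train, labels, query, k=1):
    """Return the majority label among the k training rows nearest to query."""
    # Rank every training row by distance to the query: O(n log n) per query.
    order = sorted(range(len(train)), key=lambda i: dist(train[i], query))
    votes = Counter(labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Toy data, invented for illustration.
train = [(1.0, 1.0), (1.2, 0.8), (5.0, 5.0), (5.2, 4.9)]
labels = ["a", "a", "b", "b"]

print(knn_predict(train, labels, (0.9, 1.1)))        # k=1: label of the single nearest row
print(knn_predict(train, labels, (5.1, 5.0), k=3))   # k=3: majority vote
```

Every query touches every training row, which is the cost the speed-ups are meant to avoid.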

Such classifiers can be very slow unless optimized via tricks like mini-batch or KD-trees.

  • kNN+KD-trees: use KD-trees to find the nearest thing.
  • kNN+mini-batch: find the nearest centroid, then find the nearest item in that cluster (note: use k=20 for mini-batch; most clusters will be empty, but what the heh?)
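Both speed-ups can be sketched in plain Python. The sketch below makes several assumptions: items are `(point, label)` pairs with numeric points, distance is Euclidean, the note's k=20 is read as the number of centroids, and empty clusters are simply skipped at query time. All function names are hypothetical. The KD-tree search is exact, while the mini-batch search is approximate, since the true nearest item may sit in a neighboring cluster.

```python
import math
import random

def dist(a, b):
    """Euclidean distance (an assumption; swap in your table's distance)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

# ---- (b) KD-tree: split on one axis per level, at the median item ----
def build_kdtree(items, depth=0):
    if not items:
        return None
    axis = depth % len(items[0][0])
    items = sorted(items, key=lambda it: it[0][axis])
    mid = len(items) // 2
    return {"item": items[mid], "axis": axis,
            "left": build_kdtree(items[:mid], depth + 1),
            "right": build_kdtree(items[mid + 1:], depth + 1)}

def kd_nearest(node, query, best=None):
    """Return (distance, item) for the stored item closest to query."""
    if node is None:
        return best
    d = dist(node["item"][0], query)
    if best is None or d < best[0]:
        best = (d, node["item"])
    gap = query[node["axis"]] - node["item"][0][node["axis"]]
    near, far = (node["left"], node["right"]) if gap < 0 else (node["right"], node["left"])
    best = kd_nearest(near, query, best)
    if abs(gap) < best[0]:  # the far side can only win if the split plane is closer
        best = kd_nearest(far, query, best)
    return best

# ---- (a) mini-batch k-means: nudge centroids using small random batches ----
def closest(centroids, point, allowed=None):
    idx = range(len(centroids)) if allowed is None else allowed
    return min(idx, key=lambda j: dist(centroids[j], point))

def minibatch_kmeans(points, k=20, batch=32, iters=50, seed=1):
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, min(k, len(points)))]
    counts = [0] * len(centroids)
    for _ in range(iters):
        sample = [rng.choice(points) for _ in range(batch)]
        owners = [closest(centroids, p) for p in sample]
        for p, j in zip(sample, owners):
            counts[j] += 1
            eta = 1.0 / counts[j]  # per-centroid learning rate shrinks over time
            centroids[j] = [(1 - eta) * c + eta * x for c, x in zip(centroids[j], p)]
    return centroids

def minibatch_nearest(centroids, clusters, query):
    """Nearest non-empty centroid first, then the nearest item inside its cluster."""
    live = [j for j, members in enumerate(clusters) if members]
    j = closest(centroids, query, live)
    return min(clusters[j], key=lambda it: dist(it[0], query))

# Toy data, invented for illustration.
rng = random.Random(0)
items = [((rng.random(), rng.random()), i % 2) for i in range(200)]
tree = build_kdtree(items)
centroids = minibatch_kmeans([p for p, _ in items], k=20)
clusters = [[] for _ in centroids]
for it in items:
    clusters[closest(centroids, it[0])].append(it)

query = (0.3, 0.7)
print(kd_nearest(tree, query)[1])
print(minibatch_nearest(centroids, clusters, query))
```

For 1NN, the predicted class is the label stored in the returned item.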

Your task:

  • Compare the runtimes between raw kNN and kNN+KD-trees or kNN+mini-batch
  • Compare the classification performance of raw kNN and kNN+KD-trees (does doing it faster mean doing it wrong?)
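One way to set up both comparisons is a small harness that times each classifier on the same queries and then checks how often their answers match. This is a sketch under assumptions: the helper names (`timed`, `agreement`), the synthetic data, and the "fast" stand-in (raw 1NN over every other training row) are all placeholders. Swap in your own kNN+KD-tree or kNN+mini-batch predictor where marked.

```python
import math
import random
import time

def dist(a, b):
    """Euclidean distance (an assumption)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def nearest_label(rows, query):
    """Raw 1NN: scan every (point, label) row, return the closest row's label."""
    return min(rows, key=lambda r: dist(r[0], query))[1]

def timed(predict, queries):
    """Run predict on every query; return (predictions, elapsed seconds)."""
    start = time.perf_counter()
    preds = [predict(q) for q in queries]
    return preds, time.perf_counter() - start

def agreement(xs, ys):
    """Fraction of positions where two label lists agree."""
    return sum(1 for x, y in zip(xs, ys) if x == y) / len(xs)

# Synthetic data, invented for illustration: label depends on the first column.
rng = random.Random(1)
def make_rows(n):
    rows = []
    for _ in range(n):
        p = (rng.random(), rng.random())
        rows.append((p, "left" if p[0] < 0.5 else "right"))
    return rows

train, test = make_rows(400), make_rows(100)
queries = [p for p, _ in test]
truth = [label for _, label in test]

raw_preds, raw_secs = timed(lambda q: nearest_label(train, q), queries)

# PLACEHOLDER for the fast variant: replace with your KD-tree or mini-batch predictor.
half = train[::2]
fast_preds, fast_secs = timed(lambda q: nearest_label(half, q), queries)

print(f"raw : {raw_secs:.4f}s  accuracy={agreement(raw_preds, truth):.2f}")
print(f"fast: {fast_secs:.4f}s  accuracy={agreement(fast_preds, truth):.2f}")
print(f"fast agrees with raw on {agreement(fast_preds, raw_preds):.0%} of queries")
```

An exact KD-tree search should agree with raw kNN on every query (barring distance ties), so any disagreement there points to a bug. The mini-batch variant is approximate, so some disagreement is expected.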