Togaware DATA MINING
Desktop Survival Guide
by Graham Williams


K-Nearest Neighbours: Classification



K-nearest neighbour algorithms (http://en.wikipedia.org/wiki/K-Nearest_Neighbor_algorithm) handle missing values, are robust to outliers, and can be good predictors. However, they tend to handle only numeric variables, are sensitive to monotonic transformations of the inputs, are not robust to irrelevant inputs, and produce models that are not easy to interpret. Because the k-nearest neighbour classifier relies on a distance function, it is sensitive to noise and to irrelevant features, since such features have the same influence on the classification as good, highly predictive features. A solution is to pre-process the data to weight the features so that irrelevant and redundant features receive a lower weight.
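As a minimal sketch (not from the book), the following R code applies the knn() function from the class package to the standard iris data. The scaling step, the choice of k = 3, and the train/test split are illustrative assumptions only; scaling the inputs is a basic form of the pre-processing mentioned above, giving each variable a comparable influence on the distance.

  library(class)

  # Scale the numeric inputs so each contributes comparably to the distance.
  inputs <- scale(iris[, 1:4])
  target <- iris$Species

  # Hold out a third of the observations for testing (arbitrary split).
  set.seed(42)
  train <- sample(nrow(iris), 100)

  # Classify each held-out observation by its 3 nearest neighbours.
  pred <- knn(train = inputs[train, ],
              test  = inputs[-train, ],
              cl    = target[train],
              k     = 3)

  # Confusion matrix of predicted versus actual species.
  table(Predicted = pred, Actual = target[-train])

A weighted distance, as suggested above, could be approximated by multiplying each scaled column by a feature weight before calling knn(), so that less relevant inputs contribute less to the distance.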



