Orange Forum • View topic - Hamming Distance

Hamming Distance

Post by mxa031 » Wed Oct 23, 2013 15:33

I am using "Test Learners" to evaluate my data and compare four classifiers. Using the Hamming distance metric for kNN (k-nearest neighbors) gives 100 percent classification accuracy, 100% sensitivity, etc. I know this is too good to be true, given that my data is continuous. I wrote Matlab code that basically does the same thing as "Test Learners", and, surprisingly, Hamming distance for kNN still gives 100% accuracy. Is it because Hamming shouldn't be used with continuous attributes? Why is it giving zero error then? Thank you.

Re: Hamming Distance

Post by Ales » Fri Oct 25, 2013 10:52

1) Hamming should not be used with continuous features.

2) It might give misleadingly good results in cross-validation if the continuous features are 'coarse-grained', i.e. have few unique values (relative to the dataset size). In this case it would work the same as if the feature were discretized into the same set of values. But this is very dangerous if the data presented to the classifier in production or in the field could take on values not in this set.
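
A minimal sketch of point 2, in plain Python rather than Orange's own implementation: Hamming distance only counts positions where two values are not exactly equal, so it can separate neighbors when continuous values repeat, but treats every training row as equally far away once a query contains values that never appeared in training. The function name `hamming` and the toy rows below are made up for illustration.

```python
def hamming(a, b):
    """Count positions where two equal-length feature vectors differ (exact equality)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return sum(x != y for x, y in zip(a, b))


# Hypothetical coarse-grained continuous features: few unique values per column,
# so exact matches between rows are common.
train = [(0.5, 1.0, 2.0), (0.5, 1.5, 2.0), (1.0, 1.5, 3.0)]

# A query built from values that all occur in the training data.
query_seen = (0.5, 1.0, 3.0)
dist_seen = [hamming(query_seen, row) for row in train]

# A query with slightly different values, none of which occur in training.
query_unseen = (0.51, 1.02, 2.9)
dist_unseen = [hamming(query_unseen, row) for row in train]

print(dist_seen)    # distances differ, so a nearest neighbor exists
print(dist_unseen)  # every distance equals the number of features
```

With the unseen query, all training rows tie at the maximum distance, so the choice of "nearest" neighbors is arbitrary even though the query is numerically very close to the first row.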

Re: Hamming Distance

Post by mxa031 » Fri Oct 25, 2013 11:06

It makes sense. Many thanks, Ales.

