Orange Forum • View topic - kNN with fixed-width binary strings

kNN with fixed-width binary strings

Post by pedrobad » Wed Jun 12, 2013 15:50

Hi,
my data is encoded as fixed-width (32-bit) binary strings ('010010...10') that represent
users' preferences (yes/no).
For each user I need to find the k best-matching users.
I did some research and found that my problem can be modeled as a kNN problem using
the Hamming distance metric (the lower the Hamming distance, the better the match).
I wrote my data to a text file and tried to work with Orange.data.Table,
Orange.classification.knn.kNNLearner and kNNClassifier, but I failed.
How should I initialize Table, kNNLearner and kNNClassifier?
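
For reference, the matching rule described above can be sketched in plain Python, independent of Orange. This is only an illustration under assumptions: the helper names `hamming` and `k_best_matches`, the user ids, and the 8-bit sample strings are hypothetical (the real data uses 32 bits).

```python
def hamming(a, b):
    """Number of positions at which two equal-length bit strings differ."""
    if len(a) != len(b):
        raise ValueError("strings must have the same width")
    return sum(x != y for x, y in zip(a, b))


def k_best_matches(users, query_id, k):
    """Return the k (user_id, distance) pairs closest to users[query_id]."""
    query = users[query_id]
    scored = [(uid, hamming(query, bits))
              for uid, bits in users.items() if uid != query_id]
    # Sort by distance first, then by id, so ties are broken deterministically.
    scored.sort(key=lambda pair: (pair[1], pair[0]))
    return scored[:k]


# Hypothetical 8-bit preference strings (the thread uses 32 bits).
users = {
    "ann": "01001010",
    "bob": "01001011",
    "cid": "11110000",
    "dee": "01101010",
}
best = k_best_matches(users, "ann", 2)
```

This brute-force version compares the query against every other user, which is likely fine for small data sets; a lower distance means a better match.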

Thanks
Best regards

Pedro

Re: kNN with fixed-width binary strings

Post by Ales » Fri Jun 14, 2013 10:03

How (in what format) did you write the data to a file? See the data format documentation.

In short, your file should look something like this:
A1  A2  ...  A32
d   d   ...  d

1   0   ...  1
...
where the elements are separated by tabs.
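
A minimal sketch of producing such a file from the binary strings, assuming the layout shown above (feature names, a type line of "d" for discrete, an empty third line, then one row per user). The function name `write_orange_tab` and the naming scheme A1..An are illustrative choices, not part of Orange.

```python
def write_orange_tab(bit_strings, path):
    """Write equal-width bit strings as a tab-delimited file with a three-line header."""
    width = len(bit_strings[0])
    if any(len(s) != width for s in bit_strings):
        raise ValueError("all strings must have the same width")
    names = ["A%d" % (i + 1) for i in range(width)]  # header line 1: A1..An
    types = ["d"] * width                            # header line 2: discrete features
    with open(path, "w") as handle:
        handle.write("\t".join(names) + "\n")
        handle.write("\t".join(types) + "\n")
        handle.write("\n")                           # header line 3: left empty, as above
        for bits in bit_strings:
            # One row per user, one bit per tab-separated column.
            handle.write("\t".join(bits) + "\n")
```

For example, `write_orange_tab(["0100", "1011"], "users.tab")` would write a four-column file; the file name here is only a placeholder.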

However, kNNLearner/kNNClassifier expect the data to contain a class variable, so you should use the FindNearest/FindNearestConstructor pair instead.

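import Orange

# Load the tab-delimited file; the file name is an assumption, use your own.
data = Orange.data.Table("users.tab")

# Build a neighbour finder that compares instances by Hamming distance.
const = Orange.classification.knn.FindNearestConstructor(
            distance_constructor=Orange.distance.Hamming())
finder = const(data)

# Ask for the 4 instances nearest to the first row of the table.
nearest_instances = finder(data[0], 4)
print nearest_instances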

Re: kNN with fixed-width binary strings

Post by pedrobad » Wed Jun 19, 2013 7:48

Thanks Ales, it works.
Now I would like to model the problem with kNN.
I'll do some more research.
