Orange Forum • View topic - how naive Bayes handles missing values

how naive Bayes handles missing values

A place to ask questions about methods in Orange and how they are used and other general support.


Post by dmillis » Tue Aug 24, 2010 1:46

My question is about how a classifier created with Orange's built-in naive Bayes learner handles missing data. I have data in a tab-delimited file, with three attributes, all continuous, and a discrete class variable that takes the values True and False. I use a question mark to denote missing values. I'm creating a simple classifier, as in the sample file classifier.py in the Orange for Beginners tutorial:

classifier = orange.BayesLearner(data)

As in the tutorial, I am observing how the classifier classifies the first few examples in the data set (without separating the data into training and test sets).

I see that the classifier can make a class prediction even for instances that have missing values (question marks) for all three attributes, and can assign a probability to the class value. It is unclear to me how the unknown values are incorporated into the probability calculation. Is there some sort of imputation that happens by default, such that the unknown values are substituted with some numerical value that can be used in the probability calculation?

Thanks for any help in clarifying this -
David.
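
For reference, the behaviour described above (a prediction and a class probability even when every attribute is unknown) is what you would see if unknown values were simply left out of the probability product rather than imputed. The sketch below illustrates that approach in plain Python. It is not Orange's code and makes no claim about what orange.BayesLearner does internally: the Gaussian likelihoods, the function names fit_gaussian_nb and predict_proba, and the toy data are all assumptions made for illustration, with None standing in for the question mark.

```python
import math

def fit_gaussian_nb(rows, labels):
    """Estimate class priors and per-attribute Gaussian parameters.

    Missing values (None) are skipped when computing means and variances.
    Assumes every class has at least one known value for every attribute.
    """
    priors, params = {}, {}
    for c in sorted(set(labels)):
        members = [r for r, y in zip(rows, labels) if y == c]
        priors[c] = len(members) / len(rows)
        params[c] = []
        for j in range(len(rows[0])):
            known = [r[j] for r in members if r[j] is not None]
            mean = sum(known) / len(known)
            var = sum((v - mean) ** 2 for v in known) / len(known)
            # Floor the variance so a single known value does not give zero.
            params[c].append((mean, max(var, 1e-9)))
    return priors, params

def predict_proba(priors, params, row):
    """Return class probabilities, dropping the term of any missing attribute."""
    log_scores = {}
    for c, prior in priors.items():
        log_p = math.log(prior)
        for value, (mean, var) in zip(row, params[c]):
            if value is None:
                continue  # unknown attribute contributes no evidence
            log_p += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        log_scores[c] = log_p
    # Normalise in a numerically stable way.
    top = max(log_scores.values())
    unnorm = {c: math.exp(s - top) for c, s in log_scores.items()}
    total = sum(unnorm.values())
    return {c: u / total for c, u in unnorm.items()}

# Toy data: three continuous attributes, class True/False, None = unknown.
rows = [[1.0, 2.0, 3.0], [1.2, None, 2.8],
        [3.0, 4.0, 1.0], [3.2, 4.1, None], [2.9, 3.9, 1.1]]
labels = [True, True, False, False, False]

priors, params = fit_gaussian_nb(rows, labels)
all_missing = predict_proba(priors, params, [None, None, None])
one_known = predict_proba(priors, params, [1.1, None, None])
```

Under this scheme an instance with all three attributes unknown is scored by the class prior alone, so no numerical value is ever substituted for the question marks. Whether Orange behaves this way, or uses a different density estimate for continuous attributes, would need to be confirmed against its documentation.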
