Orange Forum • View topic - Alternative to f _ f in

Post by sreastman » Tue Jan 29, 2013 17:30

The Classifier with Feature Selection section of the latest tutorial includes the following lines of code:

class SmallLearner(Orange.classification.PyLearner):
    def __init__(self, base_learner=Orange.classification.bayes.NaiveLearner,
                 name='small', m=5):
        self.name = name
        self.m = m
        self.base_learner = base_learner

    def __call__(self, data, weight=None):
        gain = Orange.feature.scoring.InfoGain()
        m = min(self.m, len(data.domain.features))
        best = [f for _, f in sorted((gain(x, data), x) for x in data.domain.features)[-m:]]
        domain = Orange.data.Domain(best + [data.domain.class_var])

        model = self.base_learner(Orange.data.Table(domain, data), weight)
        return Orange.classification.PyClassifier(classifier=model, name=self.name)

The line that includes "for _, f in" at first appeared a bit cryptic. After a little research I learned that in Python "_" is supposed to hold the value of the last evaluation. I discovered through running various tests that it is easy for the interpreter to get hung up on a single value, although I did not see this happen with this particular line of code. I submit the following line as an alternative, just to avoid possible problems:

best = [f[1] for f in sorted((gain(x, data), x) for x in data.domain.features)[-m:]]

Steve

Re: Alternative to f _ f in

Post by Ales » Thu Jan 31, 2013 10:52

sreastman wrote: After a little research I learned that in Python "_" is supposed to hold the value of the last evaluation.

This is a feature of the interactive command-line interpreter, not of Python as a programming/scripting language. The use here is unrelated to that.
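To make the distinction concrete, here is a small sketch, assuming Python 3 (where the module is called `builtins`; Python 2 names it `__builtin__`). It calls the default display hook by hand, which is roughly what the interactive prompt does after evaluating an expression, and then shows that an ordinary `_` in a comprehension leaves that stored value alone. The `pairs` list is made up for illustration.

```python
import builtins
import sys

# The interactive prompt hands each evaluated expression to the display hook,
# which prints its repr and stores the value in builtins._ .
sys.__displayhook__(6 * 7)
last_result = builtins._

# In ordinary code "_" is just a variable name. Binding it inside a
# comprehension does not touch the value stored in builtins._ .
pairs = [(0.3, "a"), (0.5, "b")]  # made-up (score, name) pairs
names = [f for _, f in pairs]
still_last_result = builtins._
```

Outside the interactive prompt nothing calls the display hook, so scripts and modules never get `_` set behind their back.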

Using '_' is a common idiom for tuple unpacking when you are not interested in all items in a tuple (note that it is a valid variable name).
This means the code is the same as
Code:
best = [f for score, f in sorted((gain(x, data), x) for x in data.domain.features)[-m:]]
but since 'score' is not used in the rest of the expression it is simply assigned to this 'anonymous' variable.
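A minimal sketch of the equivalence, runnable without Orange: the `scores` dictionary and the one-argument `gain` function below are hypothetical stand-ins for `Orange.feature.scoring.InfoGain`, and the feature names and numbers are invented.

```python
# Hypothetical stand-in for the InfoGain scorer: invented scores per feature name.
scores = {"age": 0.12, "income": 0.47, "height": 0.03, "city": 0.30}


def gain(feature):
    """Return the made-up score for a feature name."""
    return scores[feature]


features = list(scores)
m = 2

# Sort (score, feature) pairs in ascending order and keep the m highest-scoring ones.
ranked = sorted((gain(x), x) for x in features)[-m:]

# Three ways to pull the feature out of each (score, feature) pair.
best_underscore = [f for _, f in ranked]      # score discarded into "_"
best_named = [f for score, f in ranked]       # score bound to a name, then unused
best_indexed = [pair[1] for pair in ranked]   # indexing instead of unpacking
```

All three lists come out identical; the choice between them is purely about readability.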

Re: Alternative to f _ f in

Post by sreastman » Thu Jan 31, 2013 13:16

Good to know.

Thanks--Steve

