Changeset 7453:4db726423cce in orange


Timestamp: 02/04/11 13:43:16
Author: tomazc <tomaz.curk@…>
Branch: default
Convert: 94ee9af7eb7f672e92fb4e7257fdfd921f6ebbc3
Message: Documentation and code refactoring at Bohinj retreat.
Location: orange
Files: 6 edited

  • orange/Orange/feature/discretization.py

    r7420 r7453  
    @@ -12,7 +12,21 @@
     categorization that can be used for learning.
     
    -.. automethod:: Orange.feature.discretization.entropyDiscretization
    +.. class:: Orange.feature.discretization.EntropyDiscretization
    +
    +    Discretize the given feature's and return a discretized feature. The new
    +    attribute's values get computed automatically when they are needed.
    +
    +    :param attribute: continuous feature to discretize
    +    :type attribute: :obj:`Orange.data.feature.Feature`
    +    :param examples: data to discretize
    +    :type examples: :obj:`Orange.data.Table`
    +    :param weight: meta feature that stores weights of individual data
    +          instances
    +    :type weight: Orange.data.feature.Feature
    +    :rtype: :obj:`Orange.data.feature.Discrete`
    +
    +.. automethod:: Orange.feature.discretization.entropyDiscretization_wrapper
     
    -.. autoclass:: Orange.feature.discretization.EntropyDiscretization
    +.. autoclass:: Orange.feature.discretization.EntropyDiscretization_wrapper
     
     .. autoclass:: Orange.feature.discretization.DiscretizedLearner_Class
    @@ -51,24 +65,26 @@
             EquiDistDiscretizer, \
             IntervalDiscretizer, \
    -        ThresholdDiscretizer
    +        ThresholdDiscretizer, \
    +        EntropyDiscretization
     
     ######
     # from orngDics.py
    -def entropyDiscretization(table):
    +def entropyDiscretization_wrapper(table):
         """Take the classified table set (table) and categorize all continuous
    -    features using the entropy based discretization 
    +    features using the entropy based discretization
         :obj:`EntropyDiscretization`.
     
         :param table: data to discretize.
         :type table: Orange.data.Table
    +    :rtype: :obj:`Orange.data.Table` includes all categorical and discretized
    +    continuous features from the original data table.
     
         After categorization, features that were categorized to a single interval
         (to a constant value) are removed from table and prints their names.
    -    Returns a table that includes all categorical and discretized
    -    continuous features from the original data table.
    +    Returns a table that
     
         """
         orange.setrandseed(0)
    -    tablen=orange.Preprocessor_discretize(table, method=orange.EntropyDiscretization())
    +    tablen=orange.Preprocessor_discretize(table, method=EntropyDiscretization())
     
         attrlist=[]
    @@ -83,5 +99,5 @@
     
     
    -class EntropyDiscretization:
    +class EntropyDiscretization_wrapper:
         """This is simple wrapper class around the function
         :obj:`entropyDiscretization`.
  • orange/Orange/feature/scoring.py

    r7424 r7453  
    @@ -502,5 +502,5 @@
             :param weight: meta feature that stores weights of individual data
               instances
    -        :type weight: Orange.data.feature
    +        :type weight: Orange.data.Feature
     
             """
  • orange/doc/Orange/rst/code/scoring-info-iris.py

    r7374 r7453  
    @@ -1,18 +1,18 @@
     # Description: Shows how to assess the quality of attributes not in the dataset
     # Category:    attribute quality
    -# Classes:     EntropyDiscretization, MeasureAttribute, MeasureAttribute_info
     # Uses:        iris
    -# Referenced:  MeasureAttribute.htm
    +# Referenced:  Orange.feature.html#scoring
    +# Classes:     Orange.feature.scoring.EntropyDiscretization, Orange.feature.scoring.Measure, Orange.feature.scoring.InfoGain
     
    -import orange
    -data = orange.ExampleTable("iris")
    +import Orange
    +table = Orange.data.Table("iris")
     
    -d1 = orange.EntropyDiscretization("petal length", data)
    -print orange.MeasureAttribute_relief(d1, data)
    +d1 = Orange.feature.discretization.EntropyDiscretization("petal length", table)
    +print Orange.feature.scoring.Relief(d1, table)
     
    -meas = orange.MeasureAttribute_relief()
    -for t in meas.thresholdFunction("petal length", data):
    +meas = Orange.feature.scoring.Relief()
    +for t in meas.thresholdFunction("petal length", table):
         print "%5.3f: %5.3f" % t
     
    -thresh, score, distr = meas.bestThreshold("petal length", data)
    +thresh, score, distr = meas.bestThreshold("petal length", table)
     print "\nBest threshold: %5.3f (score %5.3f)" % (thresh, score)
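Editorial illustration (not part of the changeset): the script above asks a scoring measure for the best threshold on a continuous feature. The following is a minimal, library-independent sketch of that idea, not Orange's implementation. It scores each candidate cut point by information gain, which is a substitution made for brevity, since the script itself uses Relief. The names entropy and best_threshold and the six data points are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a sequence of class labels."""
    total = float(len(labels))
    if total == 0:
        return 0.0
    counts = Counter(labels)
    return -sum((n / total) * math.log(n / total, 2) for n in counts.values())


def best_threshold(values, labels):
    """Return (threshold, gain) for the binary split with the highest gain.

    Candidate thresholds are midpoints between consecutive distinct values.
    Returns (None, 0.0) when no split improves on the unsplit data.
    """
    pairs = sorted(zip(values, labels), key=lambda p: p[0])
    n = float(len(pairs))
    base = entropy([c for _, c in pairs])
    best = (None, 0.0)
    for i in range(1, len(pairs)):
        lo, hi = pairs[i - 1][0], pairs[i][0]
        if lo == hi:
            continue  # no cut point fits between two equal values
        left = [c for _, c in pairs[:i]]
        right = [c for _, c in pairs[i:]]
        remainder = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
        gain = base - remainder
        if gain > best[1]:
            best = ((lo + hi) / 2.0, gain)
    return best


# Made-up measurements for two classes, used only to exercise the function.
toy_values = [1.4, 1.3, 1.5, 4.7, 4.5, 4.9]
toy_labels = ["setosa"] * 3 + ["versicolor"] * 3
thresh, gain = best_threshold(toy_values, toy_labels)
print("Best threshold: %5.3f (gain %5.3f)" % (thresh, gain))
```

Using midpoints between sorted distinct values keeps the number of candidates at most one fewer than the number of instances; a different measure can be swapped in by replacing the gain computation.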
  • orange/doc/Orange/rst/code/scoring-info-lenses.py

    r7374 r7453  
    @@ -1,21 +1,22 @@
     # Description: Shows how to assess the quality of attributes
     # Category:    attribute quality
    +# Uses:        lenses
    +# Referenced:  Orange.feature.html#scoring
     # Classes:     MeasureAttribute, MeasureAttribute_info,
    -# Uses:        lenses
    -# Referenced:  MeasureAttribute.htm
     
    -import orange, random
    -data = orange.ExampleTable("lenses")
    +import Orange
    +import random
    +table = Orange.data.Table("lenses")
     
    -meas = orange.MeasureAttribute_info()
    +meas = Orange.feature.scoring.InfoGain()
     
    -astigm = data.domain["astigmatic"]
    -print "Information gain of 'astigmatic': %6.4f" % meas(astigm, data)
    +astigm = table.domain["astigmatic"]
    +print "Information gain of 'astigmatic': %6.4f" % meas(astigm, table)
     
    -classdistr = orange.Distribution(data.domain.classVar, data)
    -cont = orange.ContingencyAttrClass("tear_rate", data)
    +classdistr = Orange.data.value.Distribution(table.domain.classVar, table)
    +cont = Orange.probability.distributions.ContingencyAttrClass("tear_rate", table)
     print "Information gain of 'tear_rate': %6.4f" % meas(cont, classdistr)
     
    -dcont = orange.DomainContingency(data)
    +dcont = Orange.probability.distributions.DomainContingency(table)
     print "Information gain of the first attribute: %6.4f" % meas(0, dcont)
     print
    @@ -23,5 +24,5 @@
     print "*** A set of more exhaustive tests for different way of passing arguments to MeasureAttribute ***"
     
    -names = [a.name for a in data.domain.attributes]
    +names = [a.name for a in table.domain.attributes]
     attrs = len(names)
     
    @@ -32,37 +33,37 @@
     
     print "Computing information gain directly from examples"
    -print fstr % (("- by attribute number:",) + tuple([meas(i, data) for i in range(attrs)]))
    -print fstr % (("- by attribute name:",) + tuple([meas(i, data) for i in names]))
    -print fstr % (("- by attribute descriptor:",) + tuple([meas(i, data) for i in data.domain.attributes]))
    +print fstr % (("- by attribute number:",) + tuple([meas(i, table) for i in range(attrs)]))
    +print fstr % (("- by attribute name:",) + tuple([meas(i, table) for i in names]))
    +print fstr % (("- by attribute descriptor:",) + tuple([meas(i, table) for i in table.domain.attributes]))
     print
     
    -dcont = orange.DomainContingency(data)
    +dcont = Orange.probability.distributions.DomainContingency(table)
     print "Computing information gain from DomainContingency"
     print fstr % (("- by attribute number:",) + tuple([meas(i, dcont) for i in range(attrs)]))
     print fstr % (("- by attribute name:",) + tuple([meas(i, dcont) for i in names]))
    -print fstr % (("- by attribute descriptor:",) + tuple([meas(i, dcont) for i in data.domain.attributes]))
    +print fstr % (("- by attribute descriptor:",) + tuple([meas(i, dcont) for i in table.domain.attributes]))
     print
     
     print "Computing information gain from DomainContingency"
    -cdist = orange.Distribution(data.domain.classVar, data)
    -print fstr % (("- by attribute number:",) + tuple([meas(orange.ContingencyAttrClass(i, data), cdist) for i in range(attrs)]))
    -print fstr % (("- by attribute name:",) + tuple([meas(orange.ContingencyAttrClass(i, data), cdist) for i in names]))
    -print fstr % (("- by attribute descriptor:",) + tuple([meas(orange.ContingencyAttrClass(i, data), cdist) for i in data.domain.attributes]))
    +cdist = Orange.data.value.Distribution(table.domain.classVar, table)
    +print fstr % (("- by attribute number:",) + tuple([meas(Orange.probability.distributions.ContingencyAttrClass(i, table), cdist) for i in range(attrs)]))
    +print fstr % (("- by attribute name:",) + tuple([meas(Orange.probability.distributions.ContingencyAttrClass(i, table), cdist) for i in names]))
    +print fstr % (("- by attribute descriptor:",) + tuple([meas(Orange.probability.distributions.ContingencyAttrClass(i, table), cdist) for i in table.domain.attributes]))
     print
     
    -values = ["v%i" % i for i in range(len(data.domain[2].values)*len(data.domain[3].values))]
    -cartesian = orange.EnumVariable("cart", values = values)
    -cartesian.getValueFrom = orange.ClassifierByLookupTable(cartesian, data.domain[2], data.domain[3], values)
    +values = ["v%i" % i for i in range(len(table.domain[2].values)*len(table.domain[3].values))]
    +cartesian = Orange.data.feature.Discrete("cart", values = values)
    +cartesian.getValueFrom = Orange.classification.lookup.ClassifierByLookupTable(cartesian, table.domain[2], table.domain[3], values)
     
    -print "Information gain of Cartesian product of %s and %s: %6.4f" % (data.domain[2].name, data.domain[3].name, meas(cartesian, data))
    +print "Information gain of Cartesian product of %s and %s: %6.4f" % (table.domain[2].name, table.domain[3].name, meas(cartesian, table))
     
    -mid = orange.newmetaid()
    -data.domain.addmeta(mid, orange.EnumVariable(values = ["v0", "v1"]))
    -data.addMetaAttribute(mid)
    +mid = Orange.core.newmetaid()
    +table.domain.addmeta(mid, Orange.data.feature.Discrete(values = ["v0", "v1"]))
    +table.addMetaAttribute(mid)
     
     rg = random.Random()
     rg.seed(0)
    -for ex in data:
    -    ex[mid] = orange.Value(rg.randint(0, 1))
    +for ex in table:
    +    ex[mid] = Orange.data.value.Value(rg.randint(0, 1))
     
    -print "Information gain for a random meta attribute: %6.4f" % meas(mid, data)
    +print "Information gain for a random meta attribute: %6.4f" % meas(mid, table)
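Editorial illustration (not part of the changeset): the script above passes the same measure either raw examples or a contingency table together with a class distribution. The sketch below shows, under simplifying assumptions, why the two routes can give the same information gain: the gain depends only on the attribute-by-class counts. It does not use Orange; the helpers contingency, entropy_from_counts and info_gain and the four-row data set are hypothetical, and unknown values and instance weights are ignored.

```python
import math
from collections import Counter, defaultdict


def entropy_from_counts(counts):
    """Shannon entropy, in bits, of a distribution given as raw counts."""
    counts = [n for n in counts if n > 0]
    total = float(sum(counts))
    if total == 0:
        return 0.0
    return -sum((n / total) * math.log(n / total, 2) for n in counts)


def contingency(attr_values, classes):
    """Build {attribute value: Counter of class labels} from paired columns."""
    table = defaultdict(Counter)
    for a, c in zip(attr_values, classes):
        table[a][c] += 1
    return table


def info_gain(cont, class_counts):
    """Class entropy minus the expected class entropy within each attribute value."""
    total = float(sum(class_counts.values()))
    remainder = sum(
        (sum(row.values()) / total) * entropy_from_counts(row.values())
        for row in cont.values()
    )
    return entropy_from_counts(class_counts.values()) - remainder


# Made-up data: one attribute that determines the class, one that does not.
classes = ["a", "a", "b", "b"]
class_dist = Counter(classes)
informative = contingency(["yes", "yes", "no", "no"], classes)
uninformative = contingency(["yes", "no", "yes", "no"], classes)
print("informative:   %6.4f" % info_gain(informative, class_dist))
print("uninformative: %6.4f" % info_gain(uninformative, class_dist))
```

Precomputing the contingency once and reusing it across several measures avoids rescanning the examples, which is presumably why the script exercises that calling convention as well.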
  • orange/doc/Orange/rst/code/scoring-regression.py

    r7424 r7453  
    @@ -1,7 +1,7 @@
    -# Description: Shows how to measure the attribute quality in regression problems
    -# Category:    statistics
    -# Classes:     MeasureAttribute, MeasureAttribute_MSE
    +# Description: Shows how to measure the attribute quality in regression problems.
    +# Category:    feature scoring
     # Uses:        measure-c
    -# Referenced:  MeasureAttribute.htm
    +# Referenced:  Orange.feature.html#scoring
    +# Classes:     Orange.feature.scoring.MSE
     
     import Orange
    @@ -36,8 +36,2 @@
     print "MSE"
     printVariants(Orange.feature.scoring.MSE())
    -
    -print "Relief"
    -meas = Orange.feature.scoring.Relief()
    -print fstr % (("- no unknowns:",) + tuple([meas(i, data) for i in range(attrs)]))
    -print fstr % (("- with unknowns:",) + tuple([meas(i, data2) for i in range(attrs)]))
    -print
  • orange/doc/Orange/rst/code/scoring-relief-gainRatio.py

    r7319 r7453  
    @@ -3,5 +3,5 @@
     # Uses:        voting.tab
     # Referenced:  Orange.feature.html#scoring
    -# Classes:     Orange.feature.scoring.attMeasure, Orange.features.scoring.gainRatio
    +# Classes:     Orange.feature.scoring.attMeasure, Orange.features.scoring.GainRatio
     
     import Orange