Files:
1 added
31 edited

  • docs/reference/rst/Orange.classification.rst

    r9372 r9642  
    77 
    88   Orange.classification.bayes 
     9   Orange.classification.knn 
     10   Orange.classification.logreg 
     11   Orange.classification.lookup 
    912   Orange.classification.majority 
    10    Orange.classification.lookup 
     13   Orange.classification.rules 
    1114   Orange.classification.svm 
    1215   Orange.classification.tree 
    13    Orange.classification.logreg 
    14    Orange.classification.rules 
    15    Orange.classification.knn 
  • docs/reference/rst/Orange.distance.instances.rst

    r9372 r9641  
     1.. automodule:: Orange.distance.instances 
     2 
    13######################### 
    24Instances (``instances``) 
    35######################### 
    46 
    5 .. automodule:: Orange.distance.instances 
     7########################### 
     8Distances between Instances 
     9########################### 
     10 
      11This page describes a set of classes implementing different metrics for 
      12measuring distances (dissimilarities) between instances. 
     13 
     14Typical (although not all) measures of distance between instances require 
     15some "learning" - adjusting the measure to the data. For instance, when 
     16the dataset contains continuous features, the distances between continuous 
     17values should be normalized, e.g. by dividing the distance with the range 
     18of possible values or with some interquartile distance to ensure that all 
     19features have, in principle, similar impacts. 
     20 
     21Different measures of distance thus appear in pairs - a class that measures 
     22the distance and a class that constructs it based on the data. The abstract 
     23classes representing such a pair are `ExamplesDistance` and 
     24`ExamplesDistanceConstructor`. 
     25 
     26Since most measures work on normalized distances between corresponding 
     27features, there is an abstract intermediate class 
     28`ExamplesDistance_Normalized` that takes care of normalizing. 
     29The remaining classes correspond to different ways of defining the distances, 
     30such as Manhattan or Euclidean distance. 
     31 
     32Unknown values are treated correctly only by Euclidean and Relief distance. 
      33For other measures of distance, the distance between an unknown and a known 
      34value, or between two unknown values, is always 0.5. 
     35 
     36.. class:: ExamplesDistance 
     37 
     38    .. method:: __call__(instance1, instance2) 
     39 
      40        Returns the distance between the given instances as a floating point number. 
     41 
     42.. class:: ExamplesDistanceConstructor 
     43 
     44    .. method:: __call__([instances, weightID][, distributions][, basic_var_stat]) 
     45 
     46        Constructs an instance of ExamplesDistance. 
      47        Not all arguments are required. Most measures can be constructed 
      48        from basic_var_stat; if it is not given, they fall back to 
      49        instances or distributions. 
      50        Some (e.g. ExamplesDistance_Hamming) need no arguments at all. 
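For a concrete sense of this constructor/measure pairing, here is a minimal
sketch (assuming the iris dataset used by other examples in this changeset;
compare docs/reference/rst/code/distances-test.py)::

    import Orange

    iris = Orange.data.Table("iris")

    # The constructor "learns" normalization data from the instances...
    constructor = Orange.distance.instances.EuclideanConstructor()
    euclidean = constructor(iris)

    # ...and the resulting measure is then called with two instances.
    print euclidean(iris[0], iris[1])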
     51 
     52.. class:: ExamplesDistance_Normalized 
     53 
      54    This abstract class provides a function that takes two instances 
      55    and returns a list of normalized distances between the values of their 
      56    features. Many distance-measuring classes need such a function and are 
      57    therefore derived from this class. 
     58 
     59    .. attribute:: normalizers 
     60 
      61        A precomputed list of normalizing factors for feature values. 
     62 
      63        - If a factor is positive, differences in the feature's values 
      64          are multiplied by it; for continuous features the factor 
      65          is 1/(max_value-min_value) and for ordinal features 
      66          the factor is 1/number_of_values. If either (or both) of the 
      67          values is unknown, the distance is 0.5. 
     68        - If a factor is -1, the feature is nominal; the distance 
      69          between two values is 0 if they are the same (or at least 
     70          one is unknown) and 1 if they are different. 
     71        - If a factor is 0, the feature is ignored. 
     72 
     73    .. attribute:: bases, averages, variances 
     74 
     75        The minimal values, averages and variances 
     76        (continuous features only) 
     77 
     78    .. attribute:: domainVersion 
     79 
     80        Stores a domain version for which the normalizers were computed. 
     81        The domain version is increased each time a domain description is 
     82        changed (i.e. features are added or removed); this is used for a quick 
     83        check that the user is not attempting to measure distances between 
      84        instances that do not correspond to the normalizers. 
      85        Since domains are practically immutable (especially from Python), 
      86        you will rarely need to worry about this. 
     87 
     88    .. method:: attributeDistances(instance1, instance2) 
     89 
     90        Returns a list of floats representing distances between pairs of 
     91        feature values of the two instances. 
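A short sketch of inspecting these per-feature distances (assuming the
Euclidean measure, which derives from this class)::

    import Orange

    iris = Orange.data.Table("iris")
    euclidean = Orange.distance.instances.EuclideanConstructor(iris)

    # One normalized distance per feature; for continuous features each
    # entry is the absolute difference scaled by 1/(max_value - min_value).
    print euclidean.attributeDistances(iris[0], iris[1])
    print euclidean.normalizers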
     92 
     93 
     94.. class:: HammingConstructor 
     95.. class:: Hamming 
     96 
     97    Hamming distance between two instances is defined as the number of 
     98    features in which the two instances differ. Note that this measure 
     99    is not really appropriate for instances that contain continuous features. 
     100 
     101 
     102.. class:: MaximalConstructor 
     103.. class:: Maximal 
     104 
      105    The maximal distance between two instances is defined as the largest 
      106    of the per-feature distances. If dist is the result of 
     107    ExamplesDistance_Normalized.attributeDistances, 
     108    then Maximal returns max(dist). 
     109 
     110 
     111.. class:: ManhattanConstructor 
     112.. class:: Manhattan 
     113 
      114    Manhattan distance between two instances is the sum of the absolute values 
     115    of distances between pairs of features, e.g. ``sum(abs(x) for x in dist)`` 
     116    where dist is the result of ExamplesDistance_Normalized.attributeDistances. 
     117 
     118.. class:: EuclideanConstructor 
     119.. class:: Euclidean 
     120 
      121    Euclidean distance is the square root of the sum of squared per-feature distances, 
     122    i.e. ``sqrt(sum(x*x for x in dist))``, where dist is the result of 
     123    ExamplesDistance_Normalized.attributeDistances. 
     124 
      125    .. attribute:: distributions 
     126 
     127        An object of type 
     128        :obj:`~Orange.statistics.distribution.Distribution` that holds 
     129        the distributions for all discrete features used for 
     130        computation of distances between known and unknown values. 
     131 
      132    .. attribute:: bothSpecialDist 
     133 
     134        A list containing the distance between two unknown values for each 
     135        discrete feature. 
     136 
      137    This measure of distance deals with unknown values by computing the 
      138    expected squared distance based on the distribution obtained from the 
      139    "training" data. The squared distance between 
      140 
      141        - a known and an unknown continuous attribute equals the squared 
      142          distance between the known value and the average, plus the variance 
      143        - two unknown continuous attributes equals twice the variance 
      144        - a known and an unknown discrete attribute equals the probability 
      145          that the unknown attribute has a different value than the known one 
      146          (i.e., 1 - probability of the known value) 
      147        - two unknown discrete attributes equals the probability that two 
      148          randomly chosen values are different, which can be computed as 
      149          1 - sum of squared probabilities. 
     150 
      151    Continuous cases are handled through the averages and variances inherited 
      152    from ExamplesDistance_Normalized. The data for discrete cases are stored in 
      153    distributions (used for unknown vs. known values) and in bothSpecialDist 
      154    (the precomputed distance between two unknown values). 
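As a plain-Python illustration of the discrete cases above (the distribution
``p`` below is hypothetical)::

    # Hypothetical distribution over three values of a discrete feature.
    p = [0.5, 0.3, 0.2]

    # Known value (the first one) vs. unknown:
    # 1 - probability of the known value.
    print 1 - p[0]                   # 0.5

    # Two unknown values: the probability that two randomly drawn values
    # differ, i.e. 1 - sum of squared probabilities.
    print 1 - sum(x * x for x in p)  # 0.62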
     155 
     156.. class:: ReliefConstructor 
     157.. class:: Relief 
     158 
      159    Relief is similar to Manhattan distance, but incorporates a more correct 
      160    treatment of undefined values, which is also used by the ReliefF measure. 
     161 
      162    This class is derived directly from ExamplesDistance, not from ExamplesDistance_Normalized. 
     163 
     164 
     165.. autoclass:: PearsonR 
     166    :members: 
     167 
     168.. autoclass:: SpearmanR 
     169    :members: 
     170 
     171.. autoclass:: PearsonRConstructor 
     172    :members: 
     173 
     174.. autoclass:: SpearmanRConstructor 
     175    :members: 
  • docs/reference/rst/code/attributes.py

    r9372 r9638  
    11import Orange 
    2 data = Orange.data.Table("titanic.tab") 
    3 var = data.domain[0] 
     2titanic = Orange.data.Table("titanic.tab") 
     3var = titanic.domain[0] 
    44print var 
    55print "Attributes", var.attributes 
  • docs/reference/rst/code/bayes-mestimate.py

    r9372 r9638  
    77import Orange 
    88 
    9 table = Orange.data.Table("lenses.tab") 
     9lenses = Orange.data.Table("lenses.tab") 
    1010 
    1111bayes_L = Orange.classification.bayes.NaiveLearner(name="Naive Bayes") 
    1212bayesWithM_L = Orange.classification.bayes.NaiveLearner(m=2, name="Naive Bayes w/ m-estimate") 
    13 bayes = bayes_L(table) 
    14 bayesWithM = bayesWithM_L(table) 
     13bayes = bayes_L(lenses) 
     14bayesWithM = bayesWithM_L(lenses) 
    1515 
    1616print bayes.conditional_distributions 
  • docs/reference/rst/code/bayes-thresholdAdjustment.py

    r9372 r9638  
    99from Orange.evaluation import testing, scoring 
    1010 
    11 table = Orange.data.Table("adult_sample.tab") 
     11adult = Orange.data.Table("adult_sample.tab") 
    1212 
    1313nb = bayes.NaiveLearner(name="Naive Bayes") 
    1414adjusted_nb = bayes.NaiveLearner(adjust_threshold=True, name="Adjusted Naive Bayes") 
    1515 
    16 results = testing.cross_validation([nb, adjusted_nb], table) 
     16results = testing.cross_validation([nb, adjusted_nb], adult) 
    1717print scoring.CA(results) 
  • docs/reference/rst/code/correspondence1.py

    r9372 r9638  
    88import Orange.statistics.contingency as cont 
    99 
    10 data = Orange.data.Table("bridges") 
    11 cm = cont.VarVar("PURPOSE", "MATERIAL", data) 
     10bridges = Orange.data.Table("bridges") 
     11cm = cont.VarVar("PURPOSE", "MATERIAL", bridges) 
    1212ca = corr.CA(cm) 
    1313 
     
    1717         
    1818print "PURPOSE" 
    19 report(ca.column_factors(), data.domain["PURPOSE"].values) 
     19report(ca.column_factors(), bridges.domain["PURPOSE"].values) 
    2020print  
    2121 
    2222print "MATERIAL" 
    23 report(ca.row_factors(), data.domain["PURPOSE"].values) 
     23report(ca.row_factors(), bridges.domain["PURPOSE"].values) 
    2424print  
  • docs/reference/rst/code/data.io-read.weka.py

    r9372 r9638  
    66 
    77import Orange 
    8 table = Orange.data.io.loadARFF('iris.arff') 
    9 print table.attribute_load_status 
    10 print table.domain 
    11 print table.domain.attributes 
    12 print "\n".join(["\t".join([str(value) for value in row]) for row in table]) 
     8iris = Orange.data.io.loadARFF('iris.arff') 
     9print iris.attribute_load_status 
     10print iris.domain 
     11print iris.domain.attributes 
     12print "\n".join(["\t".join([str(value) for value in row]) for row in iris]) 
    1313print 
    1414 
    15 table = Orange.data.Table('iris.arff') 
    16 print table.attribute_load_status 
    17 print table.domain 
    18 print table.domain.attributes 
    19 print "\n".join(["\t".join([str(value) for value in row]) for row in table]) 
     15iris = Orange.data.Table('iris.arff') 
     16print iris.attribute_load_status 
     17print iris.domain 
     18print iris.domain.attributes 
     19print "\n".join(["\t".join([str(value) for value in row]) for row in iris]) 
  • docs/reference/rst/code/data.io-write.weka.py

    r9372 r9638  
    66 
    77import Orange 
    8 table = Orange.data.Table('iris.tab') 
    9 Orange.data.io.toARFF('iris.testsave.arff', table) 
    10 table.save('iris.testsave.arff') 
     8iris = Orange.data.Table('iris.tab') 
     9Orange.data.io.toARFF('iris.testsave.arff', iris) 
     10iris.save('iris.testsave.arff') 
    1111f = open('iris.testsave.arff') 
    1212for line in f: 
  • docs/reference/rst/code/distances-test.py

    r9372 r9638  
    22 
    33# Read some data 
    4 table = Orange.data.Table("iris.tab") 
     4iris = Orange.data.Table("iris.tab") 
    55 
    66# Euclidean distance constructor 
    77d2Constr = Orange.distance.instances.EuclideanConstructor() 
    8 d2 = d2Constr(table) 
     8d2 = d2Constr(iris) 
    99 
     1010# Construct a Pearson correlation distance measure directly from the data 
    11 dPears = Orange.distance.instances.PearsonRConstructor(table) 
     11dPears = Orange.distance.instances.PearsonRConstructor(iris) 
    1212 
    1313#reference instance 
    14 ref = table[0] 
     14ref = iris[0] 
    1515 
    1616print "Euclidean distances from the first data instance: " 
    1717 
    18 for ins in table[:5]: 
     18for ins in iris[:5]: 
    1919    print "%5.4f" % d2(ins, ref), 
    2020print  
     
    2222print "Pearson correlation distance from the first data instance: " 
    2323 
    24 for ins in table[:5]: 
     24for ins in iris[:5]: 
    2525    print "%5.4f" % dPears(ins, ref), 
    2626print  
  • docs/reference/rst/code/distributions-basic-stat.py

    r9372 r9638  
    11import Orange 
    22 
    3 myData = Orange.data.Table("iris.tab") 
    4 bas = Orange.statistics.basic.Domain(myData)  
     3iris = Orange.data.Table("iris.tab") 
     4bas = Orange.statistics.basic.Domain(iris)  
    55 
    66print "%20s %5s %5s %5s" % ("feature", "min", "max", "avg") 
  • docs/reference/rst/code/ensemble-forest-measure.py

    r9372 r9638  
    1212for fn in files: 
    1313    print "\nDATA:" + fn + "\n" 
    14     table = Orange.data.Table(fn) 
     14    iris = Orange.data.Table(fn) 
    1515 
    1616    measure = Orange.ensemble.forest.ScoreFeature(trees=100) 
    1717 
    1818    #call by attribute index 
    19     imp0 = measure(0, table)  
     19    imp0 = measure(0, iris)  
    2020    #call by orange.Variable 
    21     imp1 = measure(table.domain.attributes[1], table) 
     21    imp1 = measure(iris.domain.attributes[1], iris) 
    2222    print "first: %0.2f, second: %0.2f\n" % (imp0, imp1) 
    2323 
     
    2626            rand=random.Random(10)) 
    2727 
    28     imp0 = measure(0, table) 
    29     imp1 = measure(table.domain.attributes[1], table) 
     28    imp0 = measure(0, iris) 
     29    imp1 = measure(iris.domain.attributes[1], iris) 
    3030    print "first: %0.2f, second: %0.2f\n" % (imp0, imp1) 
    3131 
    3232    print "All importances:" 
    33     for at in table.domain.attributes: 
    34         print "%15s: %6.2f" % (at.name, measure(at, table)) 
     33    for at in iris.domain.attributes: 
     34        print "%15s: %6.2f" % (at.name, measure(at, iris)) 
  • docs/reference/rst/code/ensemble-forest.py

    r9372 r9638  
    1313 
    1414print "Classification: bupa.tab" 
    15 table = Orange.data.Table("bupa.tab") 
    16 results = Orange.evaluation.testing.cross_validation(learners, table, folds=3) 
     15bupa = Orange.data.Table("bupa.tab") 
     16results = Orange.evaluation.testing.cross_validation(learners, bupa, folds=3) 
    1717print "Learner  CA     Brier  AUC" 
    1818for i in range(len(learners)): 
     
    2323 
    2424print "Regression: housing.tab" 
    25 table = Orange.data.Table("housing.tab") 
    26 results = Orange.evaluation.testing.cross_validation(learners, table, folds=3) 
     25bupa = Orange.data.Table("housing.tab") 
     26results = Orange.evaluation.testing.cross_validation(learners, bupa, folds=3) 
    2727print "Learner  MSE    RSE    R2" 
    2828for i in range(len(learners)): 
  • docs/reference/rst/code/ensemble-forest2.py

    r9372 r9638  
    77import Orange 
    88 
    9 table = Orange.data.Table('bupa.tab') 
     9bupa = Orange.data.Table('bupa.tab') 
    1010 
    1111tree = Orange.classification.tree.TreeLearner() 
     
    1414 
    1515forest_learner = Orange.ensemble.forest.RandomForestLearner(base_learner=tree, trees=50, attributes=3) 
    16 forest = forest_learner(table) 
     16forest = forest_learner(bupa) 
    1717 
    1818for c in forest.classifiers: 
  • docs/reference/rst/code/ensemble.py

    r9372 r9638  
    1111bg = Orange.ensemble.bagging.BaggedLearner(tree, name="bagged tree") 
    1212 
    13 table = Orange.data.Table("lymphography.tab") 
     13lymphography = Orange.data.Table("lymphography.tab") 
    1414 
    1515learners = [tree, bs, bg] 
    16 results = Orange.evaluation.testing.cross_validation(learners, table, folds=3) 
     16results = Orange.evaluation.testing.cross_validation(learners, lymphography, folds=3) 
    1717print "Classification Accuracy:" 
    1818for i in range(len(learners)): 
  • docs/reference/rst/code/imputation-complex.py

    r9372 r9638  
    77import Orange 
    88 
    9 table = Orange.data.Table("bridges") 
     9bridges = Orange.data.Table("bridges") 
    1010 
    1111print "*** IMPUTING MINIMAL VALUES ***" 
    12 imputer = Orange.feature.imputation.ImputerConstructor_minimal(table) 
     12imputer = Orange.feature.imputation.ImputerConstructor_minimal(bridges) 
    1313print "Example w/ missing values" 
    14 print table[19] 
     14print bridges[19] 
    1515print "Imputed:" 
    16 print imputer(table[19]) 
     16print imputer(bridges[19]) 
    1717print 
    1818 
    19 impdata = imputer(table) 
     19impdata = imputer(bridges) 
    2020for i in range(20, 25): 
    21     print table[i] 
     21    print bridges[i] 
    2222    print impdata[i] 
    2323    print 
     
    2525 
    2626print "*** IMPUTING MAXIMAL VALUES ***" 
    27 imputer = Orange.feature.imputation.ImputerConstructor_maximal(table) 
     27imputer = Orange.feature.imputation.ImputerConstructor_maximal(bridges) 
    2828print "Example w/ missing values" 
    29 print table[19] 
     29print bridges[19] 
    3030print "Imputed:" 
    31 print imputer(table[19]) 
     31print imputer(bridges[19]) 
    3232print 
    3333 
    34 impdata = imputer(table) 
     34impdata = imputer(bridges) 
    3535for i in range(20, 25): 
    36     print table[i] 
     36    print bridges[i] 
    3737    print impdata[i] 
    3838    print 
     
    4040 
    4141print "*** IMPUTING AVERAGE/MAJORITY VALUES ***" 
    42 imputer = Orange.feature.imputation.ImputerConstructor_average(table) 
     42imputer = Orange.feature.imputation.ImputerConstructor_average(bridges) 
    4343print "Example w/ missing values" 
    44 print table[19] 
     44print bridges[19] 
    4545print "Imputed:" 
    46 print imputer(table[19]) 
     46print imputer(bridges[19]) 
    4747print 
    4848 
    49 impdata = imputer(table) 
     49impdata = imputer(bridges) 
    5050for i in range(20, 25): 
    51     print table[i] 
     51    print bridges[i] 
    5252    print impdata[i] 
    5353    print 
     
    5555 
    5656print "*** MANUALLY CONSTRUCTED IMPUTER ***" 
    57 imputer = Orange.feature.imputation.Imputer_defaults(table.domain) 
     57imputer = Orange.feature.imputation.Imputer_defaults(bridges.domain) 
    5858imputer.defaults["LENGTH"] = 1234 
    5959print "Example w/ missing values" 
    60 print table[19] 
     60print bridges[19] 
    6161print "Imputed:" 
    62 print imputer(table[19]) 
     62print imputer(bridges[19]) 
    6363print 
    6464 
    65 impdata = imputer(table) 
     65impdata = imputer(bridges) 
    6666for i in range(20, 25): 
    67     print table[i] 
     67    print bridges[i] 
    6868    print impdata[i] 
    6969    print 
     
    7474imputer = Orange.feature.imputation.ImputerConstructor_model() 
    7575imputer.learner_continuous = imputer.learner_discrete = Orange.classification.tree.TreeLearner(minSubset=20) 
    76 imputer = imputer(table) 
     76imputer = imputer(bridges) 
    7777print "Example w/ missing values" 
    78 print table[19] 
     78print bridges[19] 
    7979print "Imputed:" 
    80 print imputer(table[19]) 
     80print imputer(bridges[19]) 
    8181print 
    8282 
    83 impdata = imputer(table) 
     83impdata = imputer(bridges) 
    8484for i in range(20, 25): 
    85     print table[i] 
     85    print bridges[i] 
    8686    print impdata[i] 
    8787    print 
     
    9292imputer.learner_continuous = Orange.regression.mean.MeanLearner() 
    9393imputer.learner_discrete = Orange.classification.bayes.NaiveLearner() 
    94 imputer = imputer(table) 
     94imputer = imputer(bridges) 
    9595print "Example w/ missing values" 
    96 print table[19] 
     96print bridges[19] 
    9797print "Imputed:" 
    98 print imputer(table[19]) 
     98print imputer(bridges[19]) 
    9999print 
    100 impdata = imputer(table) 
     100impdata = imputer(bridges) 
    101101for i in range(20, 25): 
    102     print table[i] 
     102    print bridges[i] 
    103103    print impdata[i] 
    104104    print 
     
    107107print "*** CUSTOM IMPUTATION BY MODELS ***" 
    108108imputer = Orange.feature.imputation.Imputer_model() 
    109 imputer.models = [None] * len(table.domain) 
    110 imputer.models[table.domain.index("LANES")] = Orange.classification.ConstantClassifier(2.0) 
    111 tord = Orange.classification.ConstantClassifier(Orange.data.Value(table.domain["T-OR-D"], "THROUGH")) 
    112 imputer.models[table.domain.index("T-OR-D")] = tord 
     109imputer.models = [None] * len(bridges.domain) 
     110imputer.models[bridges.domain.index("LANES")] = Orange.classification.ConstantClassifier(2.0) 
     111tord = Orange.classification.ConstantClassifier(Orange.data.Value(bridges.domain["T-OR-D"], "THROUGH")) 
     112imputer.models[bridges.domain.index("T-OR-D")] = tord 
    113113 
    114114 
    115 len_domain = Orange.data.Domain(["MATERIAL", "SPAN", "ERECTED", "LENGTH"], table.domain) 
    116 len_data = Orange.data.Table(len_domain, table) 
     115len_domain = Orange.data.Domain(["MATERIAL", "SPAN", "ERECTED", "LENGTH"], bridges.domain) 
     116len_data = Orange.data.Table(len_domain, bridges) 
    117117len_tree = Orange.classification.tree.TreeLearner(len_data, minSubset=20) 
    118 imputer.models[table.domain.index("LENGTH")] = len_tree 
     118imputer.models[bridges.domain.index("LENGTH")] = len_tree 
    119119print len_tree 
    120120 
    121 span_var = table.domain["SPAN"] 
     121span_var = bridges.domain["SPAN"] 
    122122def compute_span(ex, rw): 
    123123    if ex["TYPE"] == "WOOD" or ex["PURPOSE"] == "WALK": 
     
    126126        return orange.Value(span_var, "MEDIUM") 
    127127 
    128 imputer.models[table.domain.index("SPAN")] = compute_span 
     128imputer.models[bridges.domain.index("SPAN")] = compute_span 
    129129 
    130130for i in range(20, 25): 
    131     print table[i] 
     131    print bridges[i] 
    132132    print impdata[i] 
    133133    print 
     
    135135 
    136136print "*** IMPUTATION WITH SPECIAL VALUES ***" 
    137 imputer = Orange.feature.imputation.ImputerConstructor_asValue(table) 
    138 original = table[19] 
    139 imputed = imputer(table[19]) 
     137imputer = Orange.feature.imputation.ImputerConstructor_asValue(bridges) 
     138original = bridges[19] 
     139imputed = imputer(bridges[19]) 
    140140print original.domain 
    141141print 
     
    151151print 
    152152 
    153 impdata = imputer(table) 
     153impdata = imputer(bridges) 
    154154for i in range(20, 25): 
    155     print table[i] 
     155    print bridges[i] 
    156156    print impdata[i] 
    157157    print 
  • docs/reference/rst/code/imputation-minimal-imputer.py

    r9372 r9638  
    1111       imputer_constructor=Orange.feature.imputation.ImputerConstructor_minimal) 
    1212 
    13 table = Orange.data.Table("voting") 
    14 res = Orange.evaluation.testing.cross_validation([ba, imba], table) 
     13voting = Orange.data.Table("voting") 
     14res = Orange.evaluation.testing.cross_validation([ba, imba], voting) 
    1515CAs = Orange.evaluation.scoring.CA(res) 
    1616 
  • docs/reference/rst/code/instance-construct.py

    r9372 r9638  
    11import Orange 
    2 data = Orange.data.Table("lenses") 
    3 domain = data.domain 
     2lenses = Orange.data.Table("lenses") 
     3domain = lenses.domain 
    44inst = Orange.data.Instance(domain, ["young", "myope", 
    55                               "yes", "reduced", "soft"]) 
  • docs/reference/rst/code/instance-metavar.py

    r9525 r9638  
    11import random 
    22import Orange 
    3 data = Orange.data.Table("lenses") 
     3lenses = Orange.data.Table("lenses") 
    44id = Orange.data.new_meta_id() 
    5 for inst in data: 
     5for inst in lenses: 
    66    inst[id] = random.random() 
    7 print data[0] 
     7print lenses[0] 
  • docs/reference/rst/code/kmeans-run-callback.py

    r9372 r9638  
    44    print "Iteration: %d, changes: %d, score: %.4f" % (km.iteration, km.nchanges, km.score) 
    55     
    6 table = Orange.data.Table("iris") 
    7 km = Orange.clustering.kmeans.Clustering(table, 3, minscorechange=0, inner_callback=callback) 
     6iris = Orange.data.Table("iris") 
     7km = Orange.clustering.kmeans.Clustering(iris, 3, minscorechange=0, inner_callback=callback) 
  • docs/reference/rst/code/kmeans-run.py

    r9372 r9638  
    11import Orange 
    22     
    3 table = Orange.data.Table("iris") 
    4 km = Orange.clustering.kmeans.Clustering(table, 3) 
     3iris = Orange.data.Table("iris") 
     4km = Orange.clustering.kmeans.Clustering(iris, 3) 
    55print km.clusters[-10:] 
  • docs/reference/rst/code/kmeans-silhouette.py

    r9372 r9638  
    11import Orange 
    22 
    3 table = Orange.data.Table("voting") 
     3voting = Orange.data.Table("voting") 
    44# table = Orange.data.Table("iris") 
    55 
    66for k in range(2, 8): 
    7     km = Orange.clustering.kmeans.Clustering(table, k, initialization=Orange.clustering.kmeans.init_diversity) 
     7    km = Orange.clustering.kmeans.Clustering(voting, k, initialization=Orange.clustering.kmeans.init_diversity) 
    88    score = Orange.clustering.kmeans.score_silhouette(km) 
    99    print k, score 
    1010 
    11 km = Orange.clustering.kmeans.Clustering(table, 3, initialization=Orange.clustering.kmeans.init_diversity) 
     11km = Orange.clustering.kmeans.Clustering(voting, 3, initialization=Orange.clustering.kmeans.init_diversity) 
    1212Orange.clustering.kmeans.plot_silhouette(km, "kmeans-silhouette.png") 
  • docs/reference/rst/code/knnExample0.py

    r9372 r9638  
    11import Orange 
    2 table = Orange.data.Table("iris") 
     2iris = Orange.data.Table("iris") 
    33 
    44knnLearner = Orange.classification.knn.kNNLearner() 
    55knnLearner.k = 10 
    6 knnClassifier = knnLearner(table) 
     6knnClassifier = knnLearner(iris) 
  • docs/reference/rst/code/knnExample1.py

    r9372 r9638  
    11import Orange 
    2 table = Orange.data.Table("iris") 
     2iris = Orange.data.Table("iris") 
    33 
    4 rndind = Orange.core.MakeRandomIndices2(table, p0=0.8) 
    5 train = table.select(rndind, 0) 
    6 test = table.select(rndind, 1) 
     4rndind = Orange.core.MakeRandomIndices2(iris, p0=0.8) 
     5train = iris.select(rndind, 0) 
     6test = iris.select(rndind, 1) 
    77 
    88knn = Orange.classification.knn.kNNLearner(train, k=10) 
  • docs/reference/rst/code/knnExample2.py

    r9372 r9638  
    11import Orange 
    2 table = Orange.data.Table("iris") 
     2iris = Orange.data.Table("iris") 
    33 
    44knn = Orange.classification.knn.kNNLearner() 
    55knn.k = 10 
    66knn.distance_constructor = Orange.core.ExamplesDistanceConstructor_Hamming() 
    7 knn = knn(table) 
     7knn = knn(iris) 
    88for i in range(5): 
    9     instance = table.randomexample() 
     9    instance = iris.randomexample() 
    1010    print instance.getclass(), knn(instance) 
  • docs/reference/rst/code/knnInstanceDistance.py

    r9525 r9638  
    11import Orange 
    22 
    3 table = Orange.data.Table("lenses") 
     3lenses = Orange.data.Table("lenses") 
    44 
    55nnc = Orange.classification.knn.FindNearestConstructor() 
     
    77 
    88did = Orange.data.new_meta_id() 
    9 nn = nnc(table, 0, did) 
     9nn = nnc(lenses, 0, did) 
    1010 
    11 print "*** Reference instance: ", table[0] 
    12 for inst in nn(table[0], 5): 
     11print "*** Reference instance: ", lenses[0] 
     12for inst in nn(lenses[0], 5): 
    1313    print inst 
  • docs/reference/rst/code/knnlearner.py

    r9372 r9638  
    66 
    77import Orange 
    8 table = Orange.data.Table("iris") 
     8iris = Orange.data.Table("iris") 
    99 
    1010print "Testing using euclidean distance" 
    11 rndind = Orange.core.MakeRandomIndices2(table, p0=0.8) 
    12 train = table.select(rndind, 0) 
    13 test = table.select(rndind, 1) 
     11rndind = Orange.core.MakeRandomIndices2(iris, p0=0.8) 
     12train = iris.select(rndind, 0) 
     13test = iris.select(rndind, 1) 
    1414 
    1515knn = Orange.classification.knn.kNNLearner(train, k=10) 
     
    2020print "\n" 
    2121print "Testing using hamming distance" 
    22 table = Orange.data.Table("iris") 
     22iris = Orange.data.Table("iris") 
    2323knn = Orange.classification.knn.kNNLearner() 
    2424knn.k = 10 
  • docs/reference/rst/code/logreg-run.py

    r9372 r9638  
    1 from Orange import * 
     1import Orange 
    22 
    3 table = data.Table("titanic") 
    4 lr = classification.logreg.LogRegLearner(table) 
     3titanic = Orange.data.Table("titanic") 
     4lr = Orange.classification.logreg.LogRegLearner(titanic) 
    55 
    66# compute classification accuracy 
    77correct = 0.0 
    8 for ex in table: 
     8for ex in titanic: 
    99    if lr(ex) == ex.getclass(): 
    1010        correct += 1 
    11 print "Classification accuracy:", correct / len(table) 
    12 classification.logreg.dump(lr) 
     11print "Classification accuracy:", correct / len(titanic) 
     12Orange.classification.logreg.dump(lr) 
  • docs/reference/rst/code/selection-best3.py

    r9372 r9644  
    66 
    77import Orange 
    8 table = Orange.data.Table("voting") 
     8voting = Orange.data.Table("voting") 
    99 
    1010n = 3 
    11 ma = Orange.feature.scoring.score_all(table) 
     11ma = Orange.feature.scoring.score_all(voting) 
    1212best = Orange.feature.selection.bestNAtts(ma, n) 
    1313print 'Best %d features:' % n 
  • orange/Orange/distance/instances.py

    r9125 r9639  
    1 """ 
    2  
    3 ########################### 
    4 Distances between Instances 
    5 ########################### 
    6  
    7 This page describes a bunch of classes for different metrics for measure 
    8 distances (dissimilarities) between instances. 
    9  
    10 Typical (although not all) measures of distance between instances require 
    11 some "learning" - adjusting the measure to the data. For instance, when 
    12 the dataset contains continuous features, the distances between continuous 
    13 values should be normalized, e.g. by dividing the distance with the range 
    14 of possible values or with some interquartile distance to ensure that all 
    15 features have, in principle, similar impacts. 
    16  
    17 Different measures of distance thus appear in pairs - a class that measures 
    18 the distance and a class that constructs it based on the data. The abstract 
    19 classes representing such a pair are `ExamplesDistance` and 
    20 `ExamplesDistanceConstructor`. 
    21  
    22 Since most measures work on normalized distances between corresponding 
    23 features, there is an abstract intermediate class 
    24 `ExamplesDistance_Normalized` that takes care of normalizing. 
    25 The remaining classes correspond to different ways of defining the distances, 
    26 such as Manhattan or Euclidean distance. 
    27  
    28 Unknown values are treated correctly only by Euclidean and Relief distance. 
    29 For other measure of distance, a distance between unknown and known or between 
    30 two unknown values is always 0.5. 
    31  
    32 .. class:: ExamplesDistance 
    33  
    34     .. method:: __call__(instance1, instance2) 
    35  
    36         Returns a distance between the given instances as floating point number.  
    37  
    38 .. class:: ExamplesDistanceConstructor 
    39  
    40     .. method:: __call__([instances, weightID][, distributions][, basic_var_stat]) 
    41  
    42         Constructs an instance of ExamplesDistance. 
    43         Not all the data needs to be given. Most measures can be constructed 
    44         from basic_var_stat; if it is not given, they can help themselves 
    45         either by instances or distributions. 
    46         Some (e.g. ExamplesDistance_Hamming) even do not need any arguments. 
    47  
    48 .. class:: ExamplesDistance_Normalized 
    49  
    50     This abstract class provides a function which is given two instances 
    51     and returns a list of normalized distances between values of their 
    52     features. Many distance measuring classes need such a function and are 
    53     therefore derived from this class 
    54  
    55     .. attribute:: normalizers 
    56      
    57         A precomputed list of normalizing factors for feature values 
    58          
    59         - If a factor positive, differences in feature's values 
    60           are multiplied by it; for continuous features the factor 
    61           would be 1/(max_value-min_value) and for ordinal features 
    62           the factor is 1/number_of_values. If either (or both) of 
    63           features are unknown, the distance is 0.5 
    64         - If a factor is -1, the feature is nominal; the distance 
    65           between two values is 0 if they are same (or at least 
    66           one is unknown) and 1 if they are different. 
    67         - If a factor is 0, the feature is ignored. 
    68  
    69     .. attribute:: bases, averages, variances 
    70  
    71         The minimal values, averages and variances 
    72         (continuous features only) 
    73  
    74     .. attribute:: domainVersion 
    75  
    76         Stores a domain version for which the normalizers were computed. 
    77         The domain version is increased each time a domain description is 
    78         changed (i.e. features are added or removed); this is used for a quick 
    79         check that the user is not attempting to measure distances between 
    80         instances that do not correspond to normalizers. 
    81         Since domains are practicably immutable (especially from Python), 
    82         you don't need to care about this anyway.  
    83  
    84     .. method:: attributeDistances(instance1, instance2) 
    85  
    86         Returns a list of floats representing distances between pairs of 
    87         feature values of the two instances. 
    88  
    89  
    90 .. class:: Hamming, HammingConstructor 
    91  
    92     Hamming distance between two instances is defined as the number of 
    93     features in which the two instances differ. Note that this measure 
    94     is not really appropriate for instances that contain continuous features. 
    95  
    96  
    97 .. class:: Maximal, MaximalConstructor 
    98  
    99     The maximal between two instances is defined as the maximal distance 
    100     between two feature values. If dist is the result of 
    101     ExamplesDistance_Normalized.attributeDistances, 
    102     then Maximal returns max(dist). 
    103  
    104  
    105 .. class:: Manhattan, ManhattanConstructor 
    106  
    107     Manhattan distance between two instances is a sum of absolute values 
    108     of distances between pairs of features, e.g. ``apply(add, [abs(x) for x in dist])`` 
    109     where dist is the result of ExamplesDistance_Normalized.attributeDistances. 
    110  
    111 .. class:: Euclidean, EuclideanConstructor 
    112  
    113  
    114     Euclidean distance is a square root of sum of squared per-feature distances, 
    115     i.e. ``sqrt(apply(add, [x*x for x in dist]))``, where dist is the result of 
    116     ExamplesDistance_Normalized.attributeDistances. 
    117  
    118     .. method:: distributions  
    119  
    120         An object of type 
    121         :obj:`Orange.statistics.distribution.Distribution` that holds 
    122         the distributions for all discrete features used for 
    123         computation of distances between known and unknown values. 
    124  
    125     .. method:: bothSpecialDist 
    126  
    127         A list containing the distance between two unknown values for each 
    128         discrete feature. 
    129  
    130     This measure of distance deals with unknown values by computing the 
    131     expected square of distance based on the distribution obtained from the 
    132     "training" data. Squared distance between 
    133  
    134         - A known and unknown continuous attribute equals squared distance 
    135           between the known and the average, plus variance 
    136         - Two unknown continuous attributes equals double variance 
     137         - A known and unknown discrete attribute equals the probability 
    138           that the unknown attribute has different value than the known 
    139           (i.e., 1 - probability of the known value) 
    140         - Two unknown discrete attributes equals the probability that two 
    141           random chosen values are equal, which can be computed as 
    142           1 - sum of squares of probabilities. 
    143  
    144     Continuous cases can be handled by averages and variances inherited from 
    145     ExamplesDistance_normalized. The data for discrete cases are stored in 
    146     distributions (used for unknown vs. known value) and in bothSpecial 
    147     (the precomputed distance between two unknown values). 
    148  
    149 .. class:: Relief, ReliefConstructor 
    150  
    151     Relief is similar to Manhattan distance, but incorporates a more 
    152     correct treatment of undefined values, which is used by ReliefF measure. 
    153  
    154 This class is derived directly from ExamplesDistance, not from ExamplesDistance_Normalized.         
    155              
    156  
    157 .. autoclass:: PearsonR 
    158     :members: 
    159  
    160 .. autoclass:: SpearmanR 
    161     :members: 
    162  
    163 .. autoclass:: PearsonRConstructor 
    164     :members: 
    165  
    166 .. autoclass:: SpearmanRConstructor 
    167     :members:     
    168  
    169  
    170 """ 
    171  
    1721import Orange 
    1732 
  • orange/Orange/feature/selection.py

    r9349 r9645  
    66.. index:: feature selection 
    77 
    8 .. index::  
     8.. index:: 
    99   single: feature; feature selection 
    1010 
    11 Some machine learning methods may perform better if they learn only from a  
    12 selected subset of "best" features.  
    13  
    14 The performance of some machine learning method can be improved by learning  
    15 only from a selected subset of data, which includes the most informative or  
    16 "best" features. This so-called filter approaches can boost the performance  
    17 of learner both in terms of predictive accuracy, speed-up induction, and 
    18 simplicity of resulting models. Feature scores are estimated prior to the 
    19 modelling, that is, without knowing of which machine learning method will be 
     11Some machine learning methods perform better if they learn only from a 
     12selected subset of the most informative or "best" features. 
     13 
     14This so-called filter approach can boost the performance 
      15of a learner in terms of predictive accuracy, induction speed, and 
      16simplicity of the resulting models. Feature scores are estimated before 
      17modeling, without knowing which machine learning method will be 
    2018used to construct a predictive model. 
    2119 
    22 :download:`selection-best3.py <code/selection-best3.py>` (uses :download:`voting.tab <code/voting.tab>`): 
      20Example script (:download:`selection-best3.py <code/selection-best3.py>`): 
    2321 
    2422.. literalinclude:: code/selection-best3.py 
     
    3230    synfuels-corporation-cutback 
    3331 
    34 .. automethod:: Orange.feature.selection.FilterAttsAboveThresh 
    35  
    36 .. autoclass:: Orange.feature.selection.FilterAttsAboveThresh_Class 
     32.. autoclass:: Orange.feature.selection.FilterAboveThreshold 
    3733   :members: 
    3834 
    39 .. automethod:: Orange.feature.selection.FilterBestNAtts 
    40  
    41 .. autoclass:: Orange.feature.selection.FilterBestNAtts_Class 
     35.. autoclass:: Orange.feature.selection.FilterBestN 
    4236   :members: 
    4337 
    44 .. automethod:: Orange.feature.selection.FilterRelief 
    45  
    46 .. autoclass:: Orange.feature.selection.FilterRelief_Class 
     38.. autoclass:: Orange.feature.selection.FilterRelief 
    4739   :members: 
    4840 
     
    5547   :members: 
    5648 
    57 These functions support in the design of feature subset selection for 
     49These functions support the design of feature subset selection for 
    5850classification problems. 
    5951 
    60 .. automethod:: Orange.feature.selection.bestNAtts 
    61  
    62 .. automethod:: Orange.feature.selection.attsAboveThreshold 
    63  
    64 .. automethod:: Orange.feature.selection.selectBestNAtts 
    65  
    66 .. automethod:: Orange.feature.selection.selectAttsAboveThresh 
    67  
    68 .. automethod:: Orange.feature.selection.filterRelieff 
     52.. automethod:: Orange.feature.selection.best_n 
     53 
     54.. automethod:: Orange.feature.selection.above_threshold 
     55 
     56.. automethod:: Orange.feature.selection.select_best_n 
     57 
     58.. automethod:: Orange.feature.selection.select_above_threshold 
     59 
     60.. automethod:: Orange.feature.selection.select_relief 
    6961 
    7062.. rubric:: Examples 
    7163 
    72 Following is a script that defines a new classifier that is based 
    73 on naive Bayes and prior to learning selects five best features from 
    74 the data set. The new classifier is wrapped-up in a special class (see 
      64The following script defines a new naive Bayes classifier that 
      65selects the five best features from the data set before learning. 
      66The new classifier is wrapped in a special class (see 
    7567<a href="../ofb/c_pythonlearner.htm">Building your own learner</a> 
    7668lesson in <a href="../ofb/default.htm">Orange for Beginners</a>). The 
    77 script compares this filtered learner naive Bayes that uses a complete 
     69script compares this filtered learner with one that uses a complete 
    7870set of features. 
    7971 
     
    165157from Orange.feature.scoring import score_all 
    166158 
    167 # from orngFSS 
    168 def bestNAtts(scores, N): 
     159def best_n(scores, N): 
    169160    """Return the best N features (without scores) from the list returned 
    170161    by :obj:`Orange.feature.scoring.score_all`. 
     
    180171    return map(lambda x:x[0], scores[:N]) 
    181172 
    182 def attsAboveThreshold(scores, threshold=0.0): 
     173bestNAtts = best_n 
     174 
     175def above_threshold(scores, threshold=0.0): 
    183176    """Return features (without scores) from the list returned by 
    184177    :obj:`Orange.feature.scoring.score_all` with score above or 
     
    196189    return map(lambda x:x[0], pairs) 
    197190 
    198 def selectBestNAtts(data, scores, N): 
     191attsAboveThreshold = above_threshold 
     192 
     193 
     194def select_best_n(data, scores, N): 
    199195    """Construct and return a new set of examples that includes a 
    200196    class and only N best features from a list scores. 
     
    210206 
    211207    """ 
    212     return data.select(bestNAtts(scores, N)+[data.domain.classVar.name]) 
    213  
    214  
    215 def selectAttsAboveThresh(data, scores, threshold=0.0): 
     208    return data.select(best_n(scores, N) + [data.domain.classVar.name]) 
     209 
     210selectBestNAtts = select_best_n 
     211 
     212 
     213def select_above_threshold(data, scores, threshold=0.0): 
    216214    """Construct and return a new set of examples that includes a class and  
    217215    features from the list returned by  
     
    229227   
    230228    """ 
    231     return data.select(attsAboveThreshold(scores, threshold)+[data.domain.classVar.name]) 
    232  
    233 def filterRelieff(data, measure=orange.MeasureAttribute_relief(k=20, m=50), margin=0): 
     229    return data.select(above_threshold(scores, threshold) + [data.domain.classVar.name]) 
     230 
     231selectAttsAboveThresh = select_above_threshold 
     232 
     233 
     234def select_relief(data, measure=orange.MeasureAttribute_relief(k=20, m=50), margin=0): 
    234235    """Take the data set and use an attribute measure to remove the worst  
    235236    scored attribute (those below the margin). Repeats, until no attribute has 
     
    252253    """ 
    253254    measl = score_all(data, measure) 
    254     while len(data.domain.attributes)>0 and measl[-1][1]<margin: 
    255         data = selectBestNAtts(data, measl, len(data.domain.attributes)-1) 
     255    while len(data.domain.attributes) > 0 and measl[-1][1] < margin: 
      256        data = select_best_n(data, measl, len(data.domain.attributes) - 1) 
    256257#        print 'remaining ', len(data.domain.attributes) 
    257258        measl = score_all(data, measure) 
    258259    return data 
    259260 
    260 ############################################################################## 
    261 # wrappers 
    262  
    263 def FilterAttsAboveThresh(data=None, **kwds): 
    264     filter = apply(FilterAttsAboveThresh_Class, (), kwds) 
    265     if data: 
    266         return filter(data) 
    267     else: 
    268         return filter 
    269    
    270 class FilterAttsAboveThresh_Class: 
    271     """Stores filter's parameters and can be later called with the data to 
    272     return the data table with only selected features.  
    273      
    274     This class is used in the function :obj:`selectAttsAboveThresh`. 
    275      
    276     :param measure: an attribute measure (derived from  
    277       :obj:`Orange.feature.scoring.Measure`). Defaults to  
    278       :obj:`Orange.feature.scoring.Relief` for k=20 and m=50.   
      261filterRelieff = select_relief 
     262 
     263 
     264class FilterAboveThreshold(object): 
      265    """Stores filter parameters and can later be called with data to 
      266    return a data table containing only the selected features. 
     267 
     268    This class uses the function :obj:`select_above_threshold`. 
     269 
     270    :param measure: an attribute measure (derived from 
     271      :obj:`Orange.feature.scoring.Measure`). Defaults to 
     272      :obj:`Orange.feature.scoring.Relief` for k=20 and m=50. 
    279273    :param threshold: score threshold for attribute selection. Defaults to 0. 
    280274    :type threshold: float 
    281       
     275 
    282276    Some examples of how to use this class are:: 
    283277 
    284         filter = Orange.feature.selection.FilterAttsAboveThresh(threshold=.15) 
     278        filter = Orange.feature.selection.FilterAboveThreshold(threshold=.15) 
    285279        new_data = filter(data) 
    286         new_data = Orange.feature.selection.FilterAttsAboveThresh(data) 
    287         new_data = Orange.feature.selection.FilterAttsAboveThresh(data, threshold=.1) 
    288         new_data = Orange.feature.selection.FilterAttsAboveThresh(data, threshold=.1, 
     280        new_data = Orange.feature.selection.FilterAboveThreshold(data) 
     281        new_data = Orange.feature.selection.FilterAboveThreshold(data, threshold=.1) 
     282        new_data = Orange.feature.selection.FilterAboveThreshold(data, threshold=.1, 
    289283                   measure=Orange.feature.scoring.Gini()) 
    290284 
    291285    """ 
    292     def __init__(self, measure=orange.MeasureAttribute_relief(k=20, m=50),  
    293                threshold=0.0): 
     286    def __new__(cls, data=None, 
     287                measure=orange.MeasureAttribute_relief(k=20, m=50), 
     288                threshold=0.0): 
     289 
     290        if data is None: 
     291            self = object.__new__(cls, measure=measure, threshold=threshold) 
     292            return self 
     293        else: 
     294            self = cls(measure=measure, threshold=threshold) 
     295            return self(data) 
     296 
     297    def __init__(self, measure=orange.MeasureAttribute_relief(k=20, m=50), \ 
     298                 threshold=0.0): 
     299 
    294300        self.measure = measure 
    295301        self.threshold = threshold 
     
    297303    def __call__(self, data): 
    298304        """Take data and return features with scores above given threshold. 
    299          
     305 
    300306        :param data: an data table 
    301307        :type data: Orange.data.table 
     
    303309        """ 
    304310        ma = score_all(data, self.measure) 
    305         return selectAttsAboveThresh(data, ma, self.threshold) 
    306  
    307 def FilterBestNAtts(data=None, **kwds): 
    308     """Similarly to :obj:`FilterAttsAboveThresh`, wrap around class 
    309     :obj:`FilterBestNAtts_Class`. 
    310      
    311     :param measure: an attribute measure (derived from  
    312       :obj:`Orange.feature.scoring.Measure`). Defaults to  
    313       :obj:`Orange.feature.scoring.Relief` for k=20 and m=50.   
     311        return select_above_threshold(data, ma, self.threshold) 
     312 
     313FilterAttsAboveThresh = FilterAboveThreshold 
     314FilterAttsAboveThresh_Class = FilterAboveThreshold 
     315 
     316 
     317class FilterBestN(object): 
      318    """Stores filter parameters and can later be called with data to 
      319    return a data table containing only the selected features. 
     320 
     321    :param measure: an attribute measure (derived from 
     322      :obj:`Orange.feature.scoring.Measure`). Defaults to 
     323      :obj:`Orange.feature.scoring.Relief` for k=20 and m=50. 
    314324    :param n: number of best features to return. Defaults to 5. 
    315325    :type n: int 
    316326 
    317327    """ 
    318     filter = apply(FilterBestNAtts_Class, (), kwds) 
    319     if data: return filter(data) 
    320     else: return filter 
    321    
    322 class FilterBestNAtts_Class: 
     328    def __new__(cls, data=None, 
     329                measure=orange.MeasureAttribute_relief(k=20, m=50), 
     330                n=5): 
     331 
     332        if data is None: 
     333            self = object.__new__(cls, measure=measure, n=n) 
     334            return self 
     335        else: 
     336            self = cls(measure=measure, n=n) 
     337            return self(data) 
     338 
    323339    def __init__(self, measure=orange.MeasureAttribute_relief(k=20, m=50), n=5): 
    324340        self.measure = measure 
    325341        self.n = n 
     342 
    326343    def __call__(self, data): 
    327344        ma = score_all(data, self.measure) 
    328345        self.n = min(self.n, len(data.domain.attributes)) 
    329         return selectBestNAtts(data, ma, self.n) 
    330  
    331 def FilterRelief(data=None, **kwds): 
      346        return select_best_n(data, ma, self.n) 
     347 
     348FilterBestNAtts = FilterBestN 
     349FilterBestNAtts_Class = FilterBestN 
     350 
     351class FilterRelief(object): 
    332352    """Similarly to :obj:`FilterBestNAtts`, wrap around class  
    333353    :obj:`FilterRelief_Class`. 
     
    339359    :type margin: float 
    340360 
    341     """     
    342     filter = apply(FilterRelief_Class, (), kwds) 
    343     if data: 
    344         return filter(data) 
    345     else: 
    346         return filter 
    347    
    348 class FilterRelief_Class: 
     361    """ 
     362    def __new__(cls, data=None, 
     363                measure=orange.MeasureAttribute_relief(k=20, m=50), 
     364                margin=0): 
     365 
     366        if data is None: 
     367            self = object.__new__(cls, measure=measure, margin=margin) 
     368            return self 
     369        else: 
     370            self = cls(measure=measure, margin=margin) 
     371            return self(data) 
     372 
    349373    def __init__(self, measure=orange.MeasureAttribute_relief(k=20, m=50), margin=0): 
    350374        self.measure = measure 
    351375        self.margin = margin 
     376 
    352377    def __call__(self, data): 
    353         return filterRelieff(data, self.measure, self.margin) 
     378        return select_relief(data, self.measure, self.margin) 
     379 
     380FilterRelief_Class = FilterRelief 
    354381 
    355382############################################################################## 
    356383# wrapped learner 
    357384 
    358 def FilteredLearner(baseLearner, examples = None, weight = None, **kwds): 
     385 
     386def FilteredLearner(baseLearner, examples=None, weight=None, **kwds): 
    359387    """Return the corresponding learner that wraps  
    360388    :obj:`Orange.classification.baseLearner` and a data selection method.  
     
    393421        fdata = self.filter(data) 
    394422        model = self.baseLearner(fdata, weight) 
    395         return FilteredClassifier(classifier = model, domain = model.domain) 
     423        return FilteredClassifier(classifier=model, domain=model.domain) 
    396424 
    397425class FilteredClassifier: 
    398426    def __init__(self, **kwds): 
    399427        self.__dict__.update(kwds) 
    400     def __call__(self, example, resultType = orange.GetValue): 
     428    def __call__(self, example, resultType=orange.GetValue): 
    401429        return self.classifier(example, resultType) 
    402430    def atts(self): 
    403         return self.domain.attributes   
     431        return self.domain.attributes 
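A hedged usage sketch of the wrapped learner defined above (the ``filter``
and ``name`` keyword arguments are assumptions based on the module's
docstring examples, not confirmed by this diff)::

    import Orange

    nb = Orange.classification.bayes.NaiveLearner()
    # 'filter' and 'name' are assumed to be forwarded through **kwds
    # to the underlying wrapper class.
    fl = Orange.feature.selection.FilteredLearner(nb,
        filter=Orange.feature.selection.FilterBestN(n=5), name="filtered")

    voting = Orange.data.Table("voting")
    classifier = fl(voting)
    print classifier.atts()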
  • orange/Orange/network/network.py

    r9623 r9646  
    778778                       mdsType=MdsType.componentMDS, scalingRatio=0, \ 
    779779                       mdsFromCurrentPos=0): 
    780         """Position the network components according to similarities among  
     780        """Position the network components according to similarities among 
    781781        them. 
    782          
     782 
    783783        """ 
    784784 
Note: See TracChangeset for help on using the changeset viewer.