Changeset 8305:02a451f26d30 in orange


Timestamp: 06/11/11 11:13:25 (3 years ago)
Author: Noughmad <Noughmad@…>
Branch: default
Convert: 625adcb4fa4db8a1ff4c1d12948b62f892a1adce
Message: Merge changes from trunk
Files: 44 added, 8 deleted, 71 edited

  • orange/MANIFEST.in

    r8264 r8305  
    1           include *.py *.txt *.c
          1     include *.py *.txt *.c *.cfg
    2     2
    3     3     recursive-include Orange *
  • orange/Orange/__init__.py

    r8264 r8305  
    13    13        try:
    14    14            __import__(name, globals(), locals(), [], -1)
    15              except:
          15        except Exception:
    16    16            warnings.warn("Could not import: " + name, UserWarning, 2)
    17    17
    …
    47    47
    48    48    _import("projection")
          49    _import("projection.linear")
    49    50    _import("projection.mds")
    50    51    _import("projection.som")
    …
    81    82
    82    83    _import("misc")
          84    _import("misc.environ")
    83    85    _import("misc.counters")
    84    86    _import("misc.addons")
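
Note on the `except:` to `except Exception:` change above: a bare `except:` also catches `KeyboardInterrupt` and `SystemExit`, so an interrupted import would have been reported as a mere warning. The sketch below illustrates the narrower clause. `try_import` is a hypothetical stand-in, not Orange's `_import`, and it calls `__import__(name)` without the Python 2 relative-import arguments used in the changeset.

```python
import warnings


def try_import(name):
    """Import a module by name; warn instead of failing (hypothetical helper)."""
    try:
        __import__(name)
        return True
    except Exception:
        # ImportError and other ordinary errors end up here.
        # KeyboardInterrupt and SystemExit derive from BaseException,
        # not from Exception, so they still propagate to the caller.
        warnings.warn("Could not import: " + name, UserWarning, 2)
        return False
```

With this clause a missing optional module still produces only a warning, while Ctrl-C during start-up stops the program as expected.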
  • orange/Orange/associate/__init__.py

    r8264 r8305  
    44============================== 
    55 
    6 Orange provides two algorithms for induction of `association rules <http://en.wikipedia.org/wiki/Association_rule_learning>`_. One is the basic Agrawal's algorithm with dynamic induction of supported itemsets and rules that is designed specifically for datasets with a large number of different items. This is, however, not really suitable for feature-based machine learning problems, which are at the primary focus of Orange. We have thus adapted the original algorithm to be more efficient for the latter type of data, and to induce the rules in which, for contrast to Agrawal's rules, both sides don't only contain features (like "bread, butter -> jam") but also their values ("bread = wheat, butter = yes -> jam = plum"). As a further variation, the algorithm can be limited to search only for classification rules in which the sole feature to appear on the right side of the rule is the class feature. 
    7  
    8 It is also possible to extract item sets instead of association rules. These are often more interesting than the rules themselves. 
    9  
    10 Besides association rule inducer, Orange also provides a rather simplified method for classification by association rules. 
     6Orange provides two algorithms for induction of 
     7`association rules <http://en.wikipedia.org/wiki/Association_rule_learning>`_. 
     8One is the basic Agrawal's algorithm with dynamic induction of supported 
     9itemsets and rules that is designed specifically for datasets with a 
     10large number of different items. This is, however, not really suitable 
     11for feature-based machine learning problems, which are at the primary focus 
     12of Orange. We have thus adapted the original algorithm to be more efficient 
     13for the latter type of data, and to induce the rules in which, for contrast 
     14to Agrawal's rules, both sides don't only contain features 
     15(like "bread, butter -> jam") but also their values 
     16("bread = wheat, butter = yes -> jam = plum"). As a further variation, the 
     17algorithm can be limited to search only for classification rules in which 
     18the sole feature to appear on the right side of the rule is the class feature. 
     19 
     20It is also possible to extract item sets instead of association rules. These 
     21are often more interesting than the rules themselves. 
     22 
     23Besides association rule inducer, Orange also provides a rather simplified 
     24method for classification by association rules. 
    1125 
    1226=================== 
     
    1428=================== 
    1529 
    16 The class that induces rules by Agrawal's algorithm, accepts the data examples of two forms. The first is the standard form in which each example is described by values of a fixed list of features (defined in domain). The algorithm, however, disregards the feature values and only checks whether the value is defined or not. The rule shown above ("bread, butter -> jam") actually means that if "bread" and "butter" are defined, then "jam" is defined as well. It is expected that most of values will be undefined - if this is not so, you need to use the other association rules inducer, described in the chapter :ref:`non-sparse-examples`. 
    17  
    18 Since the usual representation of examples described above is rather unsuitable for sparse examples, AssociationRulesSparseInducer can also use examples represented a bit differently. Sparse examples have no fixed features - the examples' domain is empty, there are neither ordinary nor class features. All values assigned to example are given as meta-attributes. All meta-attributes need, however, be `registered with the domain descriptor <http://orange.biolab.si/doc/reference/Domain.htm#meta-attributes>`_. If you have data of this kind, the most suitable format for it is the `basket format <http://orange.biolab.si/doc/reference/fileformats.htm#basket>`_. 
    19  
    20 In both cases, the examples are first translated into an internal AssociationRulesSparseInducer's internal format for sparse datasets. The algorithm first dynamically builds all itemsets (sets of features) that have at least the prescribed support. Each of these is then used to derive rules with requested confidence. 
    21  
    22 If examples were given in the sparse form, so are the left and right side of the induced rules. If examples were given in the standard form, so are the examples in association rules. 
     30The class that induces rules by Agrawal's algorithm, accepts the data examples 
     31of two forms. The first is the standard form in which each example is 
     32described by values of a fixed list of features (defined in domain). 
     33The algorithm, however, disregards the feature values and only checks whether 
     34the value is defined or not. The rule shown above ("bread, butter -> jam") 
     35actually means that if "bread" and "butter" are defined, then "jam" is defined 
     36as well. It is expected that most of values will be undefined - if this is not 
     37so, you need to use the other association rules inducer, described in the 
     38chapter :ref:`non-sparse-examples`. 
     39 
     40Since the usual representation of examples described above is rather unsuitable 
     41for sparse examples, AssociationRulesSparseInducer can also use examples 
     42represented a bit differently. Sparse examples have no fixed 
     43features - the examples' domain is empty, there are neither ordinary nor class 
     44features. All values assigned to example are given as meta-attributes. 
     45All meta-attributes need, however, be `registered with the domain descriptor 
     46<http://orange.biolab.si/doc/reference/Domain.htm#meta-attributes>`_. 
     47If you have data of this kind, the most suitable format for it is the 
     48`basket format <http://orange.biolab.si/doc/reference/fileformats.htm#basket>`_. 
     49 
     50In both cases, the examples are first translated into an internal 
     51AssociationRulesSparseInducer's internal format for sparse datasets. The 
     52algorithm first dynamically builds all itemsets (sets of features) that have 
     53at least the prescribed support. Each of these is then used to derive rules 
     54with requested confidence. 
     55 
     56If examples were given in the sparse form, so are the left and right side 
     57of the induced rules. If examples were given in the standard form, so are 
     58the examples in association rules. 
    2359 
    2460.. class:: Orange.associate.AssociationRulesSparseInducer 
     
    3470    .. attribute:: storeExamples 
    3571     
    36     Tells the inducer to store the examples covered by each rule and those confirming it. 
     72    Tells the inducer to store the examples covered by each rule and 
     73    those confirming it. 
    3774     
    3875    .. attribute:: maxItemSets 
     
    4279.. _maxItemSets: 
    4380 
    44 The maxItemSets attribute deserves some explanation. The algorithm's running time (and its memory consumption) depends on the minimal support; the lower the requested support, the more eligible itemsets will be found. There is no general rule for knowing the itemset in advance (generally, value should be around 0.3, but this depends upon the number of different items, the diversity of examples...) so it's very easy to set the limit too low. In this case, the algorithm can induce hundreds of thousands of itemsets until it runs out of memory. To prevent this, it will stop inducing itemsets and report an error when the prescribed maximum maxItemSets is exceeded. In this case, you should increase the required support. On the other hand, you can (reasonably) increase the maxItemSets to as high as you computer is able to handle. 
    45  
    46 We shall test the rule inducer on a dataset consisting of a brief description of Spanish Inquisition, given by Palin et al: 
     81The maxItemSets attribute deserves some explanation. The algorithm's 
     82running time (and its memory consumption) depends on the minimal support; 
     83the lower the requested support, the more eligible itemsets will be found. 
     84There is no general rule for knowing the itemset in advance (generally, value 
     85should be around 0.3, but this depends upon the number of different items, the 
     86diversity of examples...) so it's very easy to set the limit too low. In this 
     87case, the algorithm can induce hundreds of thousands of itemsets until it 
     88runs out of memory. To prevent this, it will stop inducing itemsets and 
     89report an error when the prescribed maximum maxItemSets is exceeded. 
     90In this case, you should increase the required support. On the other hand, 
     91you can (reasonably) increase the maxItemSets to as high as you computer is 
     92able to handle. 
     93 
     94We shall test the rule inducer on a dataset consisting of a brief description 
     95of Spanish Inquisition, given by Palin et al: 
    4796 
    4897    NOBODY expects the Spanish Inquisition! Our chief weapon is surprise...surprise and fear...fear and surprise.... Our two weapons are fear and surprise...and ruthless efficiency.... Our *three* weapons are fear, surprise, and ruthless efficiency...and an almost fanatical devotion to the Pope.... Our *four*...no... *Amongst* our weapons.... Amongst our weaponry...are such elements as fear, surprise.... I'll come in again. 
     
    85134    0.500   0.714   our -> surprise 
    86135 
    87 If examples are weighted, weight can be passed as an additional argument to call operator. 
    88  
    89 To get only a list of supported item sets, one should call the method getItemsets. The result is a list whose elements are tuples with two elements. The first is a tuple with indices of features in the item set. Sparse examples are usually represented with meta-attributes, so this indices will be negative. The second element is  a list of indices supporting the item set, that is, containing all the items in the set. If storeExamples is False, the second element is None. :: 
     136If examples are weighted, weight can be passed as an additional argument to 
     137call operator. 
     138 
     139To get only a list of supported item sets, one should call the method 
     140getItemsets. The result is a list whose elements are tuples with two elements. 
     141The first is a tuple with indices of features in the item set. Sparse examples 
     142are usually represented with meta-attributes, so this indices will be negative. 
     143The second element is  a list of indices supporting the item set, that is, 
     144containing all the items in the set. If storeExamples is False, the second 
     145element is None. :: 
    90146 
    91147    inducer = Orange.associate.AssociationRulesSparseInducer(support = 0.5, storeExamples = True) 
    92148    itemsets = inducer.getItemsets(data) 
    93149     
    94 Now itemsets is a list of itemsets along with the examples supporting them since we set storeExamples to True. :: 
     150Now itemsets is a list of itemsets along with the examples supporting them 
     151since we set storeExamples to True. :: 
    95152 
    96153    >>> itemsets[5] 
     
    99156    ['surprise', 'our']    
    100157     
    101 The sixth itemset contains features with indices -11 and -7, that is, the words "surprise" and "our". The examples supporting it are those with indices 1,2, 3, 6 and 9. 
    102  
    103 This way of representing the itemsets is not very programmer-friendly, but it is much more memory efficient than and faster to work with than using objects like Variable and Example. 
     158The sixth itemset contains features with indices -11 and -7, that is, the 
     159words "surprise" and "our". The examples supporting it are those with 
     160indices 1,2, 3, 6 and 9. 
     161 
     162This way of representing the itemsets is not very programmer-friendly, but 
     163it is much more memory efficient than and faster to work with than using 
     164objects like Variable and Example. 
    104165 
    105166.. _non-sparse-examples: 
     
    109170=================== 
    110171 
    111 The other algorithm for association rules provided by Orange (AssociationRulesInducer) is optimized for non-sparse examples in the usual Orange form. Each example is described by values of a fixed set of features. Unknown values are ignored, while values of features are not (as opposite to the above-described algorithm for sparse rules). In addition, the algorithm can be directed to search only for classification rules, in which the only feature on the right-hand side is the class Feature. 
     172The other algorithm for association rules provided by Orange 
     173(AssociationRulesInducer) is optimized for non-sparse examples in the usual 
     174Orange form. Each example is described by values of a fixed set of features. 
     175Unknown values are ignored, while values of features are not (as opposite to 
     176the above-described algorithm for sparse rules). In addition, the algorithm 
     177can be directed to search only for classification rules, in which the only 
     178feature on the right-hand side is the class Feature. 
    112179 
    113180.. class:: Orange.associate.AssociationRulesInducer(float asupp, float aconf) 
     
    126193    .. attribute:: classificationRules 
    127194     
    128     If 1 (default is 0), the algorithm constructs classification rules instead of general association rules. 
     195    If 1 (default is 0), the algorithm constructs classification rules instead 
     196    of general association rules. 
    129197     
    130198    .. attribute:: storeExamples 
    131199     
    132     Tells the inducer to store the examples covered by each rule and those confirming it 
     200    Tells the inducer to store the examples covered by each rule and those 
     201    confirming it 
    133202     
    134203    .. attribute:: maxItemSets 
     
    136205    The maximal number of itemsets. 
    137206 
    138 Meaning of all attributes (except the new one, classificationRules) is the same as for AssociationRulesSparseInducer. See the description of :ref:`maxItemSets <maxItemSets>` there. The example uses `lenses.tab`_: :: 
     207Meaning of all attributes (except the new one, classificationRules) is the 
     208same as for AssociationRulesSparseInducer. See the description of 
     209:ref:`maxItemSets <maxItemSets>` there. The example uses `lenses.tab`_: :: 
    139210 
    140211    import Orange 
     
    169240    0.500  1.000  tear_rate=reduced -> lenses=none 
    170241     
    171 AssociationRulesInducer can also work with weighted examples; the ID of weight feature should be passed as an additional argument in a call. 
    172  
    173 Itemsets are induced in a similar fashion as for sparse data, except that the first element of the tuple, the item set, is represented not by indices of features, as before, but with tuples (feature-index, value-index): :: 
     242AssociationRulesInducer can also work with weighted examples; the ID of weight 
     243feature should be passed as an additional argument in a call. 
     244 
     245Itemsets are induced in a similar fashion as for sparse data, except that the 
     246first element of the tuple, the item set, is represented not by indices of 
     247features, as before, but with tuples (feature-index, value-index): :: 
    174248 
    175249    inducer = Orange.associate.AssociationRulesInducer(support = 0.3, storeExamples = True) 
     
    181255    (((2, 1), (4, 0)), [2, 6, 10, 14, 15, 18, 22, 23]) 
    182256     
    183 meaning that the ninth itemset contains the second value of the third feature (2, 1), and the first value of the fifth (4, 0). 
     257meaning that the ninth itemset contains the second value of the third feature 
     258(2, 1), and the first value of the fifth (4, 0). 
    184259 
    185260================= 
     
    187262================= 
    188263 
    189 Both classes for induction of association rules return the induced rules in AssociationRules which is basically a list of instances of AssociationRule. 
     264Both classes for induction of association rules return the induced rules in 
     265AssociationRules which is basically a list of instances of AssociationRule. 
    190266 
    191267.. class:: Orange.associate.AssociationRules 
     
    193269    .. attribute:: left, right 
    194270     
    195         The left and the right side of the rule. Both are given as Example. In rules created by AssociationRulesSparseInducer from examples that contain all values as meta-values, left and right are examples in the same form. Otherwise, values in left that do not appear in the rule are "don't care", and value in right are "don't know". Both can, however, be tested by isSpecial (see documentation on  `Value <http://orange.biolab.si/doc/reference/Value.htm>`_). 
     271        The left and the right side of the rule. Both are given as Example. 
     272        In rules created by AssociationRulesSparseInducer from examples that 
     273        contain all values as meta-values, left and right are examples in the 
     274        same form. Otherwise, values in left that do not appear in the rule 
     275        are "don't care", and value in right are "don't know". Both can, 
     276        however, be tested by isSpecial (see documentation on 
     277        `Value <http://orange.biolab.si/doc/reference/Value.htm>`_). 
    196278     
    197279    .. attribute:: nLeft, nRight 
    198280     
    199         The number of features (ie defined values) on the left and on the right side of the rule. 
     281        The number of features (i.e. defined values) on the left and on the 
     282        right side of the rule. 
    200283     
    201284    .. attribute:: nAppliesLeft, nAppliesRight, nAppliesBoth 
    202285     
    203         The number of (learning) examples that conform to the left, the right and to both sides of the rule. 
     286        The number of (learning) examples that conform to the left, the right 
     287        and to both sides of the rule. 
    204288     
    205289    .. attribute:: nExamples 
     
    233317    .. attribute:: examples, matchLeft, matchBoth 
    234318     
    235         If storeExamples was True during induction, examples contains a copy of the example table used to induce the rules. Attributes matchLeft and matchBoth are lists of integers, representing the indices of examples which match the left-hand side of the rule and both sides, respectively. 
     319        If storeExamples was True during induction, examples contains a copy 
     320        of the example table used to induce the rules. Attributes matchLeft 
     321        and matchBoth are lists of integers, representing the indices of 
     322        examples which match the left-hand side of the rule and both sides, 
     323        respectively. 
    236324 
    237325    .. method:: AssociationRule(left, right, nAppliesLeft, nAppliesRight, nAppliesBoth, nExamples) 
     
    241329    .. method:: AssociationRule(left, right, support, confidence) 
    242330     
    243         Construct association rule and sets its support and confidence. If you intend to pass on such a rule you should set other attributes manually - AssociationRules's constructor cannot compute anything from arguments support and confidence. 
     331        Construct association rule and sets its support and confidence. If 
     332        you intend to pass on such a rule you should set other attributes 
     333        manually - AssociationRules's constructor cannot compute anything 
     334        from arguments support and confidence. 
    244335     
    245336    .. method:: AssociationRule(rule) 
    246337     
    247         Given an association rule as the argument, constructor copies of the rule. 
     338        Given an association rule as the argument, constructor copies of the 
     339        rule. 
    248340     
    249341    .. method:: appliesLeft(example) 
     
    253345    .. method:: appliesBoth(example) 
    254346     
    255         Tells whether the example fits into the left, right or both sides of the rule, respectively. If the rule is represented by sparse examples, the given example must be sparse as well. 
    256      
    257 Association rule inducers do not store evidence about which example supports which rule (although this information is available during induction its discarded afterwards). Let us write a function that finds the examples that confirm the rule (fit both sides of it) and those that contradict it (fit the left-hand side but not the right). The example uses the `lenses.tab`_: :: 
     347        Tells whether the example fits into the left, right or both sides of 
     348        the rule, respectively. If the rule is represented by sparse examples, 
     349        the given example must be sparse as well. 
     350     
     351Association rule inducers do not store evidence about which example supports 
     352which rule (although this information is available during induction its 
     353discarded afterwards). Let us write a function that finds the examples that 
     354confirm the rule (fit both sides of it) and those that contradict it (fit the 
     355left-hand side but not the right). The example uses the `lenses.tab`_: :: 
    258356 
    259357    import Orange 
     
    280378    print 
    281379 
    282 The latter printouts get simpler and faster if we instruct the inducer to store the examples. We can then do, for instance, this: :: 
     380The latter printouts get simpler and faster if we instruct the inducer to 
     381store the examples. We can then do, for instance, this: :: 
    283382 
    284383    print "Match left: " 
     
    287386    print "\\n".join(str(rule.examples[i]) for i in rule.matchBoth) 
    288387 
    289 The "contradicting" examples are then those whose indices are found in matchLeft but not in matchBoth. The memory friendlier and the faster way to compute this is as follows: :: 
     388The "contradicting" examples are then those whose indices are found in 
     389matchLeft but not in matchBoth. The memory friendlier and the faster way 
     390to compute this is as follows: :: 
    290391 
    291392    >>> [x for x in rule.matchLeft if not x in rule.matchBoth] 
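
Note on the documentation above: the relationship between support, confidence, `matchLeft` and `matchBoth` can be sketched without Orange. The code below is a minimal illustration using the standard definitions (support = examples matching both sides / all examples, confidence = examples matching both sides / examples matching the left side). The names `rule_stats`, `contradicting` and the toy `baskets` are assumptions made for this sketch, not part of the Orange API.

```python
def rule_stats(baskets, left, right):
    """Return (support, confidence, match_left, match_both) for left -> right.

    baskets is a list of sets of items; left and right are sets of items.
    """
    left, right = set(left), set(right)
    # Indices of baskets that contain every item of the left side.
    match_left = [i for i, b in enumerate(baskets) if left <= b]
    # Of those, the baskets that also contain every item of the right side.
    match_both = [i for i in match_left if right <= baskets[i]]
    support = len(match_both) / float(len(baskets))
    confidence = len(match_both) / float(len(match_left)) if match_left else 0.0
    return support, confidence, match_left, match_both


def contradicting(match_left, match_both):
    """Indices that fit the left side but not the right, in original order."""
    both = set(match_both)  # set membership avoids rescanning a list
    return [i for i in match_left if i not in both]


# Hypothetical toy data, loosely modelled on the "bread, butter -> jam" rule.
baskets = [
    {"bread", "butter", "jam"},
    {"bread", "butter"},
    {"bread", "jam"},
    {"milk"},
]
support, confidence, match_left, match_both = rule_stats(
    baskets, {"bread", "butter"}, {"jam"})
```

Converting `match_both` to a set first keeps the filter linear in the length of `match_left`, whereas testing membership in a list rescans that list for every index.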
  • orange/Orange/classification/bayes.py

    r8264 r8305  
    1717 
    1818The following example demonstrates a straightforward invocation of 
    19 this algorithm (`bayes-run.py`_, uses `iris.tab`_): 
     19this algorithm (`bayes-run.py`_, uses `titanic.tab`_): 
    2020 
    2121.. literalinclude:: code/bayes-run.py 
     
    7878.. _adult-sample.tab: code/adult-sample.tab 
    7979.. _iris.tab: code/iris.tab 
     80.. _titanic.tab: code/iris.tab 
    8081.. _lenses.tab: code/lenses.tab 
    8182 
     
    174175from Orange.core import BayesClassifier as _BayesClassifier 
    175176from Orange.core import BayesLearner as _BayesLearner 
     177 
    176178 
    177179class NaiveLearner(Orange.classification.Learner): 
     
    237239    """ 
    238240     
    239     def __new__(cls, instances = None, weightID = 0, **argkw): 
     241    def __new__(cls, instances = None, weight_id = 0, **argkw): 
    240242        self = Orange.classification.Learner.__new__(cls, **argkw) 
    241243        if instances: 
    242244            self.__init__(**argkw) 
    243             return self.__call__(instances, weightID) 
     245            return self.__call__(instances, weight_id) 
    244246        else: 
    245247            return self 
    246248         
    247     def __init__(self, adjustTreshold=False, m=0, estimatorConstructor=None, 
    248                  conditionalEstimatorConstructor=None, 
    249                  conditionalEstimatorConstructorContinuous=None,**argkw): 
    250         self.adjustThreshold = adjustTreshold 
     249    def __init__(self, adjust_threshold=False, m=0, estimator_constructor=None, 
     250                 conditional_estimator_constructor=None, 
     251                 conditional_estimator_constructor_continuous=None,**argkw): 
     252        self.adjust_threshold = adjust_threshold 
    251253        self.m = m 
    252         self.estimatorConstructor = estimatorConstructor 
    253         self.conditionalEstimatorConstructor = conditionalEstimatorConstructor 
    254         self.conditionalEstimatorConstructorContinuous = conditionalEstimatorConstructorContinuous 
     254        self.estimator_constructor = estimator_constructor 
     255        self.conditional_estimator_constructor = conditional_estimator_constructor 
     256        self.conditional_estimator_constructor_continuous = conditional_estimator_constructor_continuous 
    255257        self.__dict__.update(argkw) 
    256258 
     
    265267        """ 
    266268        bayes = _BayesLearner() 
    267         if self.estimatorConstructor: 
    268             bayes.estimatorConstructor = self.estimatorConstructor 
     269        if self.estimator_constructor: 
     270            bayes.estimator_constructor = self.estimator_constructor 
    269271            if self.m: 
    270                 if not hasattr(bayes.estimatorConstructor, "m"): 
    271                     raise AttributeError, "invalid combination of attributes: 'estimatorConstructor' does not expect 'm'" 
     272                if not hasattr(bayes.estimator_constructor, "m"): 
     273                    raise AttributeError, "invalid combination of attributes: 'estimator_constructor' does not expect 'm'" 
    272274                else: 
    273                     self.estimatorConstructor.m = self.m 
     275                    self.estimator_constructor.m = self.m 
    274276        elif self.m: 
    275             bayes.estimatorConstructor = Orange.core.ProbabilityEstimatorConstructor_m(m = self.m) 
    276         if self.conditionalEstimatorConstructor: 
    277             bayes.conditionalEstimatorConstructor = self.conditionalEstimatorConstructor 
    278         elif bayes.estimatorConstructor: 
    279             bayes.conditionalEstimatorConstructor = Orange.core.ConditionalProbabilityEstimatorConstructor_ByRows() 
    280             bayes.conditionalEstimatorConstructor.estimatorConstructor=bayes.estimatorConstructor 
    281         if self.conditionalEstimatorConstructorContinuous: 
    282             bayes.conditionalEstimatorConstructorContinuous = self.conditionalEstimatorConstructorContinuous 
    283         if self.adjustThreshold: 
    284             bayes.adjustThreshold = self.adjustThreshold 
     277            bayes.estimator_constructor = Orange.core.ProbabilityEstimatorConstructor_m(m = self.m) 
     278        if self.conditional_estimator_constructor: 
     279            bayes.conditional_estimator_constructor = self.conditional_estimator_constructor 
     280        elif bayes.estimator_constructor: 
     281            bayes.conditional_estimator_constructor = Orange.core.ConditionalProbabilityEstimatorConstructor_ByRows() 
     282            bayes.conditional_estimator_constructor.estimator_constructor=bayes.estimator_constructor 
     283        if self.conditional_estimator_constructor_continuous: 
     284            bayes.conditional_estimator_constructor_continuous = self.conditional_estimator_constructor_continuous 
     285        if self.adjust_threshold: 
     286            bayes.adjust_threshold = self.adjust_threshold 
    285287        return NaiveClassifier(bayes(instances, weight)) 
    286              
     288NaiveLearner = Orange.misc.deprecated_members( 
     289{     "adjustThreshold": "adjust_threshold", 
     290      "estimatorConstructor": "estimator_constructor", 
     291      "conditionalEstimatorConstructor": "conditional_estimator_constructor", 
     292      "conditionalEstimatorConstructorContinuous":"conditional_estimator_constructor_continuous", 
     293      "weightID": "weight_id" 
     294}, in_place=False)(NaiveLearner) 
     295 
     296 
    287297class NaiveClassifier(Orange.classification.Classifier): 
    288298    """ 
     
    368378        classes=" "*20+ ((' %10s'*nValues) % tuple([i[:10] for i in self.classVar.values])) 
    369379         
    370         return "\n".join( 
     380        return "\n".join([ 
    371381            classes, 
    372382            "class probabilities "+(frmtStr % tuple(self.distribution)), 
    373383            "", 
    374             "\n".join(["\n".join( 
     384            "\n".join(["\n".join([ 
    375385                "Attribute " + i.variable.name, 
    376386                classes, 
    377387                "\n".join( 
    378388                    ("%20s" % i.variable.values[v][:20]) + (frmtStr % tuple(i[v])) 
    379                     for v in xrange(len(i.variable.values))) 
    380                 ) for i in self.conditionalDistributions])) 
     389                    for v in xrange(len(i.variable.values)))] 
     390                ) for i in self.conditionalDistributions])]) 
    381391             
    382392 
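
Note on the renaming in bayes.py above: the changeset keeps the old camelCase names usable by wrapping `NaiveLearner` with `Orange.misc.deprecated_members`. The general idea of such a wrapper can be sketched as follows. `alias_members` is a hypothetical, simplified stand-in written for this note, not Orange's implementation, and it covers only instance attributes (not keyword arguments such as `weightID`).

```python
import warnings


def alias_members(name_map):
    """Class decorator redirecting old attribute names to new ones (hypothetical)."""
    def decorate(cls):
        class Wrapped(cls):
            def __getattr__(self, name):
                # Reached only when normal attribute lookup has failed.
                if name in name_map:
                    warnings.warn("%s is deprecated; use %s" % (name, name_map[name]),
                                  DeprecationWarning, 2)
                    return getattr(self, name_map[name])
                raise AttributeError(name)

            def __setattr__(self, name, value):
                # Store the value under the new name, whichever name was used.
                new_name = name_map.get(name, name)
                if new_name != name:
                    warnings.warn("%s is deprecated; use %s" % (name, new_name),
                                  DeprecationWarning, 2)
                super(Wrapped, self).__setattr__(new_name, value)

        Wrapped.__name__ = cls.__name__
        return Wrapped
    return decorate


@alias_members({"adjustThreshold": "adjust_threshold"})
class ExampleLearner(object):
    """Toy class used only to demonstrate the decorator."""
    def __init__(self):
        self.adjust_threshold = False
```

Returning a subclass rather than patching the class mirrors the `in_place=False` call in the diff only loosely; how Orange handles that flag is not shown in this changeset.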
  • orange/Orange/classification/knn.py

    r8264 r8305  
    88******************* 
    99 
    10 The module includes implementation of `nearest neighbors  
     10The module includes implementation of the `nearest neighbors  
    1111algorithm <http://en.wikipedia.org/wiki/K-nearest_neighbor_algorithm>`_ and classes 
    12 for finding nearest instances according to chosen distance metrics. 
     12for finding the nearest instances according to chosen distance metrics. 
    1313 
    1414k-nearest neighbor algorithm 
    1515============================ 
    1616 
    17 Nearest neighbors algorithm is one of most basic,  
     17The nearest neighbors algorithm is one of the most basic,  
    1818`lazy <http://en.wikipedia.org/wiki/Lazy_learning>`_ machine learning algorithms. 
    19 The learner only needs to store the training data instances, while the classifier 
    20 does all the work by searching this list for the most similar instances for  
     19The learner only needs to store the instances of training data, while the classifier 
     20does all the work by searching this list for the instances most similar to  
    2121the data instance being classified: 
    2222 
     
    2828    :type instances: Orange.data.Table 
    2929     
    30     :param k: number of nearest neighbours used in classification 
     30    :param k: number of nearest neighbors used in classification 
    3131    :type k: int 
    3232     
     
    6464instances. distance_constructor is used if given; otherwise, Euclidean  
    6565metrics will be used. :class:`kNNLearner` then constructs an instance of  
    66 :class:`FindNearest_BruteForce`. Together with ID of meta feature with  
     66:class:`FindNearest_BruteForce`. Together with the ID of the meta feature with  
    6767weights of instances, :attr:`kNNLearner.k` and :attr:`kNNLearner.rank_weight`, 
    6868it is passed to a :class:`kNNClassifier`. 
     
    8787    .. method:: find_nearest(instance) 
    8888     
    89     A component that finds nearest neighbors of a given instance. 
     89    A component which finds the nearest neighbors of a given instance. 
    9090         
    9191    :param instance: given instance 
     
    102102    .. attribute:: rank_weight 
    103103     
    104         Enables weighting by ranks (default: :obj:`true`). 
     104        Enables weighting by rank (default: :obj:`true`). 
    105105     
    106106    .. attribute:: weight_ID 
     
    111111     
    112112        The number of learning instances. It is used to compute the number of  
    113         neighbours if :attr:`kNNClassifier.k` is zero. 
     113        neighbors if the value of :attr:`kNNClassifier.k` is zero. 
    114114 
    115115When called to classify an instance, the classifier first calls  
     
    123123If :meth:`kNNClassifier.find_nearest` returns only one neighbor  
    124124(this is the case if :obj:`k=1`), :class:`kNNClassifier` returns the 
    125 neighbour's class. 
    126  
    127 Otherwise, the retrieved neighbours vote about the class prediction 
     125neighbor's class. 
     126 
     127Otherwise, the retrieved neighbors vote about the class prediction 
    128128(or probability of classes). Votes are weighted in two ways. First, if
    129129instances are weighted, their weights are respected. Secondly, nearer 
    130 neighbours have greater impact on the prediction; weight of instance 
     130neighbors have a greater impact on the prediction; the weight of instance 
    131131is computed as exp(-t:sup:`2`/s:sup:`2`), where the meaning of t depends 
    132132on the setting of :obj:`rank_weight`. 
    133133 
    134 * if :obj:`rank_weight` is :obj:`false`, :obj:`t` is a distance from the 
     134* if :obj:`rank_weight` is :obj:`false`, :obj:`t` is the distance from the 
    135135  instance being classified 
    136136* if :obj:`rank_weight` is :obj:`true`, neighbors are ordered and :obj:`t` 
     
    141141is 0.001. 
    142142 
    143 Weighting gives the classifier certain insensitivity to the number of 
     143Weighting gives the classifier a certain insensitivity to the number of 
    144144neighbors used, making it possible to use large :obj:`k`'s. 
    145145 
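Note on the weighting described in this hunk: the formula exp(-t^2/s^2) can be sketched in a few lines. This is only an illustration, not the kNNClassifier code. The function names are hypothetical, ranks are assumed to start at 1, and s is assumed to be chosen so that the farthest neighbor gets the impact 0.001 mentioned above.

```python
import math
from collections import defaultdict


def neighbor_weights(ts, farthest_impact=0.001):
    """Weights exp(-t^2 / s^2); s is set so the largest t gets farthest_impact."""
    t_max = max(ts)
    if t_max == 0:
        # All neighbors coincide with the query; weight them equally.
        return [1.0] * len(ts)
    s_squared = t_max ** 2 / -math.log(farthest_impact)
    return [math.exp(-(t ** 2) / s_squared) for t in ts]


def vote(neighbors, rank_weight=True):
    """neighbors: list of (distance, class_label, instance_weight) tuples.

    Returns a dict mapping class labels to normalized vote shares.
    """
    ordered = sorted(neighbors, key=lambda n: n[0])
    if rank_weight:
        ts = list(range(1, len(ordered) + 1))  # assumption: 1-based ranks
    else:
        ts = [distance for distance, _, _ in ordered]
    votes = defaultdict(float)
    for (_, label, instance_weight), w in zip(ordered, neighbor_weights(ts)):
        # Both weightings apply: the instance weight and the rank/distance weight.
        votes[label] += instance_weight * w
    total = sum(votes.values())
    return {label: v / total for label, v in votes.items()}
```

Because the weights fall off quickly, far neighbors contribute almost nothing, which is why a large k changes the prediction very little.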
     
    151151-------- 
    152152 
    153 We will test the learner on 'iris' data set. We shall split it onto train 
    154 (80%) and test (20%) sets, learn on training instances and test on five 
    155 randomly selected test instances, in part of  
    156 (`knnlearner.py`_, uses `iris.tab`_): 
     153The learner will be tested on the 'iris' data set. The data will be split 
     154into training (80%) and testing (20%) instances. We will use the former 
     155for "training" the classifier and test it on five randomly selected 
     156testing instances, as shown in this part of (`knnlearner.py`_, uses `iris.tab`_):
    157157 
    158158.. literalinclude:: code/knnExample1.py 
     
    166166    Iris-setosa Iris-setosa 
    167167 
    168 The secret of kNN's success is that the instances in iris data set appear in 
     168The secret to kNN's success is that the instances in the iris data set appear in 
    169169three well separated clusters. The classifier's accuracy will remain 
    170 excellent even with very large or small number of neighbors. 
    171  
    172 As many experiments have shown, a selection of instances distance measure 
    173 does not have a greater and predictable effect on the performance of kNN 
    174 classifiers. So there is not much point in changing the default. If you 
    175 decide to do so, you need to set the distance_constructor to an instance 
     170excellent even with a very large or very small number of neighbors. 
     171 
     172As many experiments have shown, the choice of instance distance measure
     173has neither a large nor a predictable effect on the performance of kNN
     174classifiers. Therefore there is not much point in changing the default. If you
     175decide to do so, the distance_constructor must be set to an instance
    176176of one of the classes for distance measuring. This can be seen in the following 
    177177part of (`knnlearner.py`_, uses `iris.tab`_): 
     
    198198========================= 
    199199 
    200 Orange provides classes for finding the nearest neighbors of the given 
    201 reference instance. While we might add some smarter classes in future, we 
    202 now have only two - abstract classes that defines the general behavior of 
     200Orange provides classes for finding the nearest neighbors of a given 
     201reference instance. While we might add some smarter classes in the future, we 
     202now have only two - abstract classes that define the general behavior of 
    203203neighbor searching classes, and classes that implement brute force search. 
    204204 
    205 As usually in Orange, there is a pair of classes: a class that does the work 
     205As is the norm in Orange, there is a pair of classes: a class that does the work 
    206206(:class:`FindNearest`) and a class that constructs it ("learning" - getting the 
    207207instances and arranging them in an appropriate data structure that allows for 
     
    210210.. class:: FindNearest 
    211211 
    212     A class for brute force search for nearest neighbours. It stores a table  
     212    A class for a brute force search for nearest neighbors. It stores a table  
    213213    of instances (it's its own copy of instances, not only Orange.data.Table 
    214     with references to another Orange.data.Table). When asked for neighbours, 
     214    with references to another Orange.data.Table). When asked for neighbors, 
    215215    it measures distances to all instances, stores them in a heap and returns  
    216216    the first k as an Orange.data.Table with references to instances stored in 
     
    219219    .. attribute:: distance 
    220220     
    221         a component that measures distance between examples 
     221        a component that measures the distance between examples 
    222222     
    223223    .. attribute:: examples 
     
    234234    :type instance: Orange.data.Instance 
    235235     
    236     :param n: number of neighbours 
     236    :param n: number of neighbors 
    237237    :type n: int 
    238238     
     
    241241.. class:: FindNearestConstructor() 
    242242 
    243     A class that constructs FindNearest. It calls the inherited  
    244     distance_constructor and then passes the constructed distance measure, 
    245     among with instances, weight_ID and distance_ID, to the just constructed 
    246     instance of FindNearest_BruteForce. 
    247      
     243     
     244    A class that constructs FindNearest. It calls the inherited 
     245    distance_constructor, which constructs a distance measure. 
     246    The distance measure, along with the instances, weight_ID and 
     247    distance_ID, is then passed to the just constructed instance 
     248    of FindNearest_BruteForce. 
     249 
    248250    If there are more instances with the same distance fighting for the last 
    249251    places, the tie is resolved by randomly picking the appropriate number of 
    250252    instances. A local random generator is constructed and initiated by a 
    251253    constant computed from the reference instance. The effect of this is that 
    252     same random neighbours will be chosen for the instance each time 
     254    the same random neighbors will be chosen for the instance each time 
    253255    FindNearest_BruteForce 
    254256    is called. 
     
    257259     
    258260        A component of class ExamplesDistanceConstructor that "learns" to 
    259         measure distances between instances. Learning can be, for instances, 
    260         storing the ranges of continuous features or the number of value of 
     261        measure distances between instances. Learning can mean, for instance, 
     262        storing the ranges of continuous features or the number of values of 
    261263        a discrete feature (see the page about measuring distances for more 
    262264        information). The result of learning is an instance of  
     
    266268    .. attribute:: include_same 
    267269     
    268         Tells whether to include the examples that are same as the reference; 
    269         default is true. 
     270        Tells whether or not to include the examples that are the same as the reference; 
     271        the default is true. 
    270272     
    271273    .. method:: __call__(table, weightID, distanceID) 
    272274     
    273         Constructs an instance of FindNearest that would return neighbours of 
     275        Constructs an instance of FindNearest that would return neighbors of 
    274276        a given instance, obeying weight_ID when counting them (also, some  
    275         measures of distance might consider weights as well) and store the  
     277        measures of distance might consider weights as well) and storing the  
    276278        distances in a meta attribute with ID distance_ID. 
    277279     
     
    291293 
    292294The following script (`knnInstanceDistance.py`_, uses `lenses.tab`_)  
    293 shows how to find the five nearest neighbours of the first instance 
     295shows how to find the five nearest neighbors of the first instance 
    294296in the lenses dataset. 
    295297 
  • orange/Orange/classification/tree.py

    r8264 r8305  
    22312231        in a node belong to the same class. 
    22322232 
    2233     .. attribute:: mForPruning 
     2233    .. attribute:: m_pruning 
    22342234 
    22352235        If non-zero, invokes an error-based bottom-up post-pruning, 
     
    22952295        if getattr(self, "sameMajorityPruning", 0): 
    22962296            tree = Pruner_SameMajority(tree) 
    2297         if getattr(self, "mForPruning", 0): 
    2298             tree = Pruner_m(tree, m=self.mForPruning) 
     2297        if getattr(self, "m_pruning", 0): 
     2298            tree = Pruner_m(tree, m=self.m_pruning) 
    22992299 
    23002300        return TreeClassifier(baseClassifier=tree)  
     
    23942394 
    23952395        return learner 
     2396 
     2397 
     2398TreeLearner = Orange.misc.deprecated_members({ 
     2399          "mForPruning": "m_pruning", 
     2400}, wrap_methods=[])(TreeLearner) 
    23962401 
    23972402# 
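Note on the rename: the hunk above keeps the old name `mForPruning` usable by wrapping `TreeLearner` with `Orange.misc.deprecated_members`. The general idea of aliasing an old attribute name to a new one can be sketched as below. The decorator `deprecated_attribute_aliases` and the class `Learner` are hypothetical stand-ins, and the warning behavior is an assumption; this is not the implementation in Orange.misc.

```python
import warnings


def deprecated_attribute_aliases(name_map):
    """Class decorator that routes old attribute names to new ones.

    Hypothetical sketch of the aliasing idea; not Orange's own code.
    """
    def decorate(cls):
        for old, new in name_map.items():
            # Default arguments bind the current names for each property.
            def getter(self, _old=old, _new=new):
                warnings.warn("%s is deprecated, use %s" % (_old, _new),
                              DeprecationWarning, stacklevel=2)
                return getattr(self, _new)

            def setter(self, value, _old=old, _new=new):
                warnings.warn("%s is deprecated, use %s" % (_old, _new),
                              DeprecationWarning, stacklevel=2)
                setattr(self, _new, value)

            setattr(cls, old, property(getter, setter))
        return cls
    return decorate


@deprecated_attribute_aliases({"mForPruning": "m_pruning"})
class Learner(object):
    def __init__(self):
        self.m_pruning = 0


learner = Learner()
with warnings.catch_warnings():
    warnings.simplefilter("ignore")
    learner.mForPruning = 2   # old name still works
print(learner.m_pruning)
```

Code written against the old name keeps running, while the class body only has to deal with `m_pruning`.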
     
    28742879            for i, branch in enumerate(node.branches): 
    28752880                if branch: 
    2876                     internalBranchName = internalName+chr(i+65) 
     2881                    internalBranchName = "%s-%d" % (internalName,i) 
    28772882                    self.fle.write('%s -> %s [ label="%s" ]\n' % \ 
    28782883                        (_quoteName(internalName),  
     
    28832888        else: 
    28842889            self.fle.write('%s [ shape=%s label="%s"]\n' % \ 
    2885                 (internalName, self.leafShape,  
     2890                (_quoteName(internalName), self.leafShape,  
    28862891                self.formatString(self.leafStr, node, parent))) 
    28872892 
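Note on the node names: the two naming schemes in this hunk can be compared directly. One plausible motivation for the change (an assumption, since the changeset message does not say) is that `chr(i+65)` stops producing letters once a node has more than 26 branches. The new names contain a hyphen, which is not allowed in an unquoted DOT identifier, so they have to be quoted. `quote_name` below is a hypothetical helper, not the `_quoteName` used in tree.py.

```python
def branch_name_old(parent, i):
    # Old scheme: append a letter, 'A' for branch 0, 'B' for branch 1, ...
    return parent + chr(i + 65)


def branch_name_new(parent, i):
    # New scheme: append the branch index after a hyphen.
    return "%s-%d" % (parent, i)


def quote_name(name):
    """Hypothetical helper: wrap a DOT identifier in double quotes."""
    return '"%s"' % name.replace('"', '\\"')


print(branch_name_old("n", 1))               # n followed by the letter B
print(branch_name_old("n", 26))              # past 'Z': punctuation, not a letter
print(quote_name(branch_name_new("n", 26)))  # quoted, so the hyphen is safe
```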
     
    29672972 
    29682973    def dot(self, fileName, leafStr = "", nodeStr = "", leafShape="plaintext", nodeShape="plaintext", **argkw): 
    2969         """ Prints the tree to a file in a format used by  
     2974        """ Print the tree to a file in a format used by  
    29702975        `GraphViz <http://www.research.att.com/sw/tools/graphviz>`_. 
    29712976        Uses the same parameters as :meth:`dump` defined above 
    29722977        plus two parameters which define the shape used for internal 
    2973         nodes and laves of the tree: 
    2974  
    2975         :param leafShape: Shape of the outline around leves of the tree.  
     2978        nodes and leaves of the tree: 
     2979 
     2980        :param leafShape: Shape of the outline around leaves of the tree.  
    29762981            If "plaintext", no outline is used (default: "plaintext"). 
    29772982        :type leafShape: string 
     
    29832988        for various outlines supported by GraphViz. 
    29842989        """ 
    2985         fle = type(fileName) == str and file(fileName, "wt") or fileName 
     2990        fle = type(fileName) == str and open(fileName, "wt") or fileName 
    29862991 
    29872992        _TreeDumper(leafStr, nodeStr, argkw.get("userFormats", []) +  
  • orange/Orange/clustering/hierarchical.py

    r8264 r8305  
    1313An example. 
    1414 
    15 .. automethod:: Orange.clustering.hierarchical.clustering 
     15Utility Functions 
     16================= 
     17 
     18.. autofunction:: clustering 
     19.. autofunction:: clustering_features 
     20.. autofunction:: cluster_to_list 
     21.. autofunction:: top_clusters 
     22.. autofunction:: top_cluster_membership 
     23.. autofunction:: order_leaves 
     24 
     25.. autofunction:: postorder 
     26.. autofunction:: preorder 
     27.. autofunction:: dendrogram_layout 
     28.. autofunction:: dendrogram_draw 
     29.. autofunction:: prune 
     30.. autofunction:: pruned 
     31.. autofunction:: cluster_depths 
     32.. autofunction:: instance_distance_matrix 
     33.. autofunction:: feature_distance_matrix 
     34.. autofunction:: joining_cluster 
     35.. autofunction:: cophenetic_distances 
     36.. autofunction:: cophenetic_correlation 
    1637 
    1738""" 
     39 
    1840import orange 
    1941import Orange 
     
    2648import sys 
    2749 
     50SINGLE = HierarchicalClustering.Single 
     51AVERAGE = HierarchicalClustering.Average 
     52COMPLETE = HierarchicalClustering.Complete 
     53WARD = HierarchicalClustering.Ward 
    2854 
    2955def clustering(data, 
    3056               distanceConstructor=orange.ExamplesDistanceConstructor_Euclidean, 
    31                linkage=orange.HierarchicalClustering.Average, 
     57               linkage=AVERAGE, 
    3258               order=False, 
    3359               progressCallback=None): 
    34     """Return a hierarhical clustering of the data set.""" 
     60    """ Return a hierarchical clustering of the instances in a data set. 
     61     
     62    :param data: Input data table for clustering. 
     63    :type data: :class:`Orange.data.Table` 
     64    :param distance_constructor: Instance distance constructor 
     65    :type distance_constructor: :class:`Orange.distances.ExamplesDistanceConstructor` 
     66    :param linkage: Linkage flag. Must be one of global module level flags: 
     67     
     68        - SINGLE 
     69        - AVERAGE 
     70        - COMPLETE 
     71        - WARD 
     72         
     73    :type linkage: int 
     74    :param order: If `True` run `order_leaves` on the resulting clustering. 
     75    :type order: bool 
     76    :param progress_callback: A function (taking one argument) to use for 
     77        reporting on the progress. 
     78    :type progress_callback: function 
     79     
     80    """ 
    3581    distance = distanceConstructor(data) 
    3682    matrix = orange.SymMatrix(len(data)) 
     
    3884        for j in range(i+1): 
    3985            matrix[i, j] = distance(data[i], data[j]) 
    40     root = orange.HierarchicalClustering(matrix, linkage=linkage, progressCallback=(lambda value, obj=None: progressCallback(value*100.0/(2 if order else 1))) if progressCallback else None) 
     86    root = HierarchicalClustering(matrix, linkage=linkage, progressCallback=(lambda value, obj=None: progressCallback(value*100.0/(2 if order else 1))) if progressCallback else None) 
    4187    if order: 
    4288        order_leaves(root, matrix, progressCallback=(lambda value: progressCallback(50.0 + value/2)) if progressCallback else None) 
     
    4490 
    4591def clustering_features(data, distance=None, linkage=orange.HierarchicalClustering.Average, order=False, progressCallback=None): 
    46     """Return hierarhical clustering of attributes in the data set.""" 
     92    """ Return hierarchical clustering of attributes in a data set. 
     93     
     94    :param data: Input data table for clustering. 
     95    :type data: :class:`Orange.data.Table` 
     96    :param distance: Attribute distance constructor  
     97        .. note:: currently not used. 
     98    :param linkage: Linkage flag. Must be one of global module level flags: 
     99     
     100        - SINGLE 
     101        - AVERAGE 
     102        - COMPLETE 
     103        - WARD 
     104         
     105    :type linkage: int 
     106    :param order: If `True` run `order_leaves` on the resulting clustering. 
     107    :type order: bool 
     108    :param progress_callback: A function (taking one argument) to use for 
     109        reporting the on the progress. 
     110    :type progress_callback: function 
     111     
     112    """ 
    47113    matrix = orange.SymMatrix(len(data.domain.attributes)) 
    48114    for a1 in range(len(data.domain.attributes)): 
     
    55121 
    56122def cluster_to_list(node, prune=None): 
    57     """Return a list of clusters down from the node of hierarchical clustering.""" 
     123    """ Return a list of clusters down from the node of hierarchical clustering. 
     124     
     125    :param node: Cluster node. 
     126    :type node: :class:`HierarchicalCluster` 
     127    :param prune: If not `None` it must be a positive integer. Any cluster 
     128        with fewer than `prune` items will be left out of the list. 
     129    :type prune: int or `NoneType` 
     130     
     131    """ 
    58132    if prune: 
    59133        if len(node) <= prune: 
     
    64138 
    65139def top_clusters(root, k): 
    66     """Return k topmost clusters from hierarchical clustering.""" 
     140    """ Return k topmost clusters from hierarchical clustering. 
     141     
     142    :param root: Root cluster. 
     143    :type root: :class:`HierarchicalCluster` 
     144    :param k: Number of top clusters. 
     145    :type k: int 
     146     
     147    """ 
    67148    candidates = set([root]) 
    68149    while len(candidates) < k: 
     
    74155 
    75156def top_cluster_membership(root, k): 
    76     """Return data instances' cluster membership (list of indices) to k topmost clusters.""" 
     157    """ Return data instances' cluster membership (list of indices) to k topmost clusters. 
     158     
     159    :param root: Root cluster. 
     160    :type root: :class:`HierarchicalCluster` 
     161    :param k: Number of top clusters. 
     162    :type k: int 
     163     
     164    """ 
    77165    clist = top_clusters(root, k) 
    78166    cmap = [None] * len(root) 
     
    84172def order_leaves(tree, matrix, progressCallback=None): 
    85173    """Order the leaves in the clustering tree. 
    86  
     174     
    87175    (based on Ziv Bar-Joseph et al. (Fast optimal leaf ordering for hierarchical clustering') 
    88     Arguments: 
    89         tree   --binary hierarchical clustering tree of type orange.HierarchicalCluster 
    90         matrix --orange.SymMatrix that was used to compute the clustering 
    91         progressCallback --function used to report progress 
    92     """ 
    93 #    from Orange.misc import recursion_limit 
     176     
     177    :param tree: Binary hierarchical clustering tree. 
     178    :type tree: :class:`HierarchicalCluster` 
     179    :param matrix: SymMatrix that was used to compute the clustering. 
     180    :type matrix: :class:`Orange.core.SymMatrix` 
     181    :param progress_callback: Function used to report on progress. 
     182    :type progress_callback: function 
     183     
     184    .. note:: The ordering is done in place. 
     185     
     186    """ 
    94187     
    95188    objects = getattr(tree.mapping, "objects", None) 
     
    270363        tree.mapping.setattr("objects", objects) 
    271364 
     365""" Matplotlib dendrogram plotting. 
     366""" 
    272367try: 
    273368    import numpy 
     
    505600         
    506601         
     602""" Dendrogram plotting using Orange.misc.render 
     603""" 
     604 
    507605from orngMisc import ColorPalette, EPSRenderer 
    508606class DendrogramPlot(object): 
     
    643741         
    644742def dendrogram_draw(filename, *args, **kwargs): 
     743    """ Plot the dendrogram to `filename`. 
     744     
     745    .. todo:: Finish documentation. 
     746    """ 
    645747    import os 
    646748    from orngMisc import PILRenderer, EPSRenderer, SVGRenderer 
     
    650752    d = DendrogramPlot(*args, **kwargs) 
    651753    d.plot(filename) 
    652      
    653      
    654 """ 
    655 Utility functions 
    656 ================= 
    657  
    658 """ 
    659754     
    660755def postorder(cluster): 
     
    739834    return result 
    740835     
     836def clone(cluster): 
     837    """ Clone a cluster, including its subclusters. 
     838     
     839    :param cluster: Cluster to clone 
     840    :type cluster: :class:`HierarchicalCluster` 
     841    """ 
     842    import copy 
     843    clones = {} 
     844    mapping = copy.copy(cluster.mapping) 
     845    for node in postorder(cluster): 
     846        node_clone = copy.copy(node) 
     847        if node.branches: 
     848            node_clone.branches = [clones[b] for b in node.branches] 
     849        node_clone.mapping = mapping 
     850        clones[node] = node_clone 
     851         
     852    return clones[cluster] 
    741853     
    742854def pruned(root_cluster, level=None, height=None, condition=None): 
    743855    """ Return a new pruned clustering instance. 
    744856     
    745     .. note:: This uses `copy.deepcopy` to create a copy of the root_cluster 
     857    .. note:: This uses `clone` to create a copy of the root_cluster 
    746858        instance. 
    747859     
     
    762874     
    763875    """ 
    764     import copy 
    765      
    766     # XXX This is unsafe HierarchicalCluster should take care of copying 
    767     if hasattr(root_cluster.mapping, "objects"): 
    768         objects = root_cluster.mapping.objects 
    769         root_cluster.mapping.objects = None 
    770         has_objects = True 
    771     else: 
    772         has_objects = False 
    773          
    774     root_cluster = copy.deepcopy(root_cluster) 
    775      
     876    root_cluster = clone(root_cluster) 
    776877    prune(root_cluster, level, height, condition) 
    777      
    778     if has_objects: 
    779         root_cluster.mapping.objects = objects 
    780          
    781878    return root_cluster 
    782879     
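Note on `clone`: the function added in this file copies each node shallowly in postorder, so the copies of the children already exist when their parent is reached, and gives every copy the same single copy of the mapping. The sketch below reproduces that pattern on a toy `Node` class. `Node` and this `postorder` generator are hypothetical stand-ins for HierarchicalCluster and the module's own `postorder`.

```python
import copy


class Node(object):
    """Toy stand-in for a cluster node (hypothetical, for illustration)."""
    def __init__(self, mapping, branches=None):
        self.mapping = mapping
        self.branches = branches


def postorder(node):
    """Yield all nodes of the tree, children before their parent."""
    if node.branches:
        for branch in node.branches:
            for n in postorder(branch):
                yield n
    yield node


def clone(root):
    clones = {}
    mapping = copy.copy(root.mapping)      # one shared copy of the mapping
    for node in postorder(root):
        node_clone = copy.copy(node)       # shallow copy of this node only
        if node.branches:
            # Children were visited first, so their clones already exist.
            node_clone.branches = [clones[b] for b in node.branches]
        node_clone.mapping = mapping
        clones[node] = node_clone
    return clones[root]


shared = [0, 1, 2]
tree = Node(shared, [Node(shared), Node(shared)])
tree_copy = clone(tree)
print(tree_copy.branches[0] is tree.branches[0])
```

Compared with `copy.deepcopy`, this copies the mapping only once and at one level, so the objects the mapping refers to are never duplicated. That appears to be what the removed workaround around `mapping.objects` was trying to achieve.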
  • orange/Orange/clustering/mixture.py

    r7885 r7981  
    1313import sys, os 
    1414import numpy 
     15import random 
    1516import Orange.data 
    1617 
     
    2526         
    2627    def __call__(self, instance): 
    27         """ Return the conditional probability of instance. 
     28        """ Return the probability of instance. 
    2829        """ 
    2930        return numpy.sum(prob_est([instance], self.weights, self.means, self.covariances)) 
    3031         
    3132    def __getitem__(self, index): 
    32         """ Return the index-th gaussian 
     33        """ Return the index-th gaussian. 
    3334        """  
    3435        return GMModel([1.0], self.means[index: index + 1], self.covariances[index: index + 1]) 
    35      
    36 #    def __getslice__(self, slice): 
    37 #        pass 
    3836 
    3937    def __len__(self): 
     
    4139     
    4240     
    43 def init_random(array, n_centers, *args, **kwargs): 
    44     """ Init random means 
    45     """ 
     41def init_random(data, n_centers, *args, **kwargs): 
     42    """ Init random means and correlations from a data table. 
     43     
     44    :param data: data table 
     45    :type data: :class:`Orange.data.Table` 
     46    :param n_centers: Number of centers and correlations to return. 
     47    :type n_centers: int 
     48     
     49    """ 
     50    if isinstance(data, Orange.data.Table): 
     51        array, w, c = data.toNumpyMA() 
     52    else: 
     53        array = numpy.asarray(data) 
     54         
    4655    min, max = array.max(0), array.min(0) 
    4756    dim = array.shape[1] 
     
    5261    correlations = [numpy.asmatrix(numpy.eye(dim)) for i in range(n_centers)] 
    5362    return means, correlations 
    54      
     63 
     64def init_kmeans(data, n_centers, *args, **kwargs): 
     65    """ Init with k-means algorithm. 
     66     
     67    :param data: data table 
     68    :type data: :class:`Orange.data.Table` 
     69    :param n_centers: Number of centers and correlations to return. 
     70    :type n_centers: int 
     71     
     72    """ 
     73    if not isinstance(data, Orange.data.Table): 
     74        raise TypeError("Orange.data.Table instance expected!") 
     75    from Orange.clustering.kmeans import Clustering 
     76    km = Clustering(data, centroids=n_centers, maxiters=20, nstart=3) 
     77    centers = Orange.data.Table(km.centroids) 
     78    centers, w, c = centers.toNumpyMA() 
     79    dim = len(data.domain.attributes) 
     80    correlations = [numpy.asmatrix(numpy.eye(dim)) for i in range(n_centers)] 
     81    return centers, correlations 
    5582     
    5683def prob_est1(data, mean, covariance, inv_covariance=None): 
    57     """ Return the probability of data given mean and covariance matrix  
     84    """ Return the probability of data given mean and covariance matrix 
    5885    """ 
    5986    data = numpy.asmatrix(data) 
     
    6289        inv_covariance = numpy.linalg.pinv(covariance) 
    6390         
    64     inv_covariance = numpy.asmatrix(inv_covariance)     
     91    inv_covariance = numpy.asmatrix(inv_covariance) 
    6592     
    6693    diff = data - mean 
     
    76103    assert(det != 0.0) 
    77104    p /= det 
    78 #    if det != 0.0: 
    79 #        p /= det 
    80 #    else: 
    81 #        p = numpy.ones(p.shape) / p.shape[0] 
    82105    return p 
    83106 
    84107 
    85108def prob_est(data, weights, means, covariances, inv_covariances=None): 
    86     """ Return the probability estimation of data given weighted, means and 
     109    """ Return the probability estimation of data given weights, means and 
    87110    covariances. 
    88111       
     
    103126    """ An EM solver for gaussian mixture model 
    104127    """ 
     128    _TRACE_MEAN = False 
    105129    def __init__(self, data, weights, means, covariances): 
    106130        self.data = data 
     
    182206        """ Run the EM algorithm. 
    183207        """ 
    184          
    185 #        from pylab import plot, show, draw, ion 
    186 #        ion() 
    187 #        plot(self.data[:, 0], self.data[:, 1], "ro") 
    188 #        vec_plot = plot(self.means[:, 0], self.means[:, 1], "bo")[0] 
     208        if self._TRACE_MEAN: 
     209            from pylab import plot, show, draw, ion 
     210            ion() 
     211            plot(self.data[:, 0], self.data[:, 1], "ro") 
     212            vec_plot = plot(self.means[:, 0], self.means[:, 1], "bo")[0] 
     213         
    189214        curr_iter = 0 
    190215         
     
    193218            self.one_step() 
    194219             
    195 #            vec_plot.set_xdata(self.means[:, 0]) 
    196 #            vec_plot.set_ydata(self.means[:, 1]) 
    197 #            draw() 
     220            if self._TRACE_MEAN: 
     221                vec_plot.set_xdata(self.means[:, 0]) 
     222                vec_plot.set_ydata(self.means[:, 1]) 
     223                draw() 
    198224             
    199225            curr_iter += 1 
    200             print curr_iter 
    201             print abs(old_objective - self.log_likelihood) 
     226#            print curr_iter 
     227#            print abs(old_objective - self.log_likelihood) 
    202228            if abs(old_objective - self.log_likelihood) < eps or curr_iter > max_iter: 
    203229                break 
    204230         
    205231         
    206 class GASolver(object): 
    207     """ A toy genetic algorithm solver  
    208     """ 
    209     def __init__(self, data, weights, means, covariances): 
    210         raise NotImplementedError 
    211  
    212  
    213 class PSSolver(object): 
    214     """ A toy particle swarm solver 
    215     """ 
    216     def __init__(self, data, weights, means, covariances): 
    217         raise NotImplementedError 
    218  
    219 class HybridSolver(object): 
    220     """ A hybrid solver 
    221     """ 
    222     def __init__(self, data, weights, means, covariances): 
    223         raise NotImplementedError 
     232#class GASolver(object): 
     233#    """ A toy genetic algorithm solver  
     234#    """ 
     235#    def __init__(self, data, weights, means, covariances): 
     236#        raise NotImplementedError 
     237# 
     238# 
     239#class PSSolver(object): 
     240#    """ A toy particle swarm solver 
     241#    """ 
     242#    def __init__(self, data, weights, means, covariances): 
     243#        raise NotImplementedError 
     244# 
     245#class HybridSolver(object): 
     246#    """ A hybrid solver 
     247#    """ 
     248#    def __init__(self, data, weights, means, covariances): 
     249#        raise NotImplementedError 
    224250     
    225251     
    226252class GaussianMixture(object): 
     253    """ Computes the gaussian mixture model from an Orange data-set. 
     254    """ 
    227255    def __new__(cls, data=None, weightId=None, **kwargs): 
    228256        self = object.__new__(cls) 
     
    233261            return self 
    234262         
    235     def __init__(self, n_centers=3, init_function=init_random): 
     263    def __init__(self, n_centers=3, init_function=init_kmeans): 
    236264        self.n_centers = n_centers 
    237265        self.init_function = init_function 
    238266         
    239267    def __call__(self, data, weightId=None): 
     268        means, correlations = self.init_function(data, self.n_centers) 
     269        means = numpy.asmatrix(means) 
    240270        array, _, _ = data.to_numpy_MA() 
    241271        solver = EMSolver(array, numpy.ones((self.n_centers)) / self.n_centers, 
    242                           *self.init_function(array, self.n_centers)) 
     272                          means, correlations) 
    243273        solver.run() 
    244274        return GMModel(solver.weights, solver.means, solver.covariances) 
     
    246276         
    247277def plot_model(data_array, mixture, axis=(0, 1), samples=20, contour_lines=20): 
    248      
     278    """ Plot the scatterplot of data_array and the contour lines of the 
     279    probability for the mixture. 
     280      
     281    """ 
    249282    import matplotlib 
    250283    import matplotlib.pylab as plt 
     
    257290     
    258291    weights = mixture.weights 
    259     means = [m[axis] for m in mixture.means] 
     292    means = mixture.means[:, axis] 
    260293     
    261294    covariances = [cov[axis,:][:, axis] for cov in mixture.covariances]  
     
    283316                cmap=cm.gray, extent=extent) 
    284317     
     318    plt.plot(means[:, 0], means[:, 1], "b+") 
    285319    plt.show() 
    286320     
    287 def test(): 
     321def test(seed=0): 
    288322#    data = Orange.data.Table(os.path.expanduser("../../doc/datasets/brown-selected.tab")) 
    289     data = Orange.data.Table(os.path.expanduser("~/Documents/brown-selected-fss.tab")) 
    290 #    data = Orange.data.Table("../../doc/datasets/iris.tab") 
     323#    data = Orange.data.Table(os.path.expanduser("~/Documents/brown-selected-fss.tab")) 
     324    data = Orange.data.Table(os.path.expanduser("~/Documents/brown-selected-fss-1.tab")) 
     325    data = Orange.data.Table("../../doc/datasets/iris.tab") 
    291326#    data = Orange.data.Table(Orange.data.Domain(data.domain[:2], None), data) 
    292     numpy.random.seed(0) 
    293     gmm = GaussianMixture(data, n_centers=3) 
    294     plot_model(data, gmm, axis=(0,1), samples=40, contour_lines=20) 
     327    numpy.random.seed(seed) 
     328    random.seed(seed) 
     329    gmm = GaussianMixture(data, n_centers=3, init_function=init_kmeans) 
     330    plot_model(data, gmm, axis=(0, 1), samples=40, contour_lines=100) 
    295331 
    296332     
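The `__call__` method above starts the EM solver from uniform component weights (`numpy.ones(n) / n`). As a rough illustration of what the fitted mixture represents, here is a minimal one-dimensional sketch using only the standard library. The helper names `gaussian_pdf` and `mixture_pdf` are hypothetical and are not part of Orange; the real model is multivariate and uses full covariance matrices.

```python
import math


def gaussian_pdf(x, mean, variance):
    """Density of a univariate normal distribution at x."""
    norm = math.sqrt(2.0 * math.pi * variance)
    return math.exp(-(x - mean) ** 2 / (2.0 * variance)) / norm


def mixture_pdf(x, weights, means, variances):
    """Weighted sum of component densities: p(x) = sum_k w_k * N(x; m_k, v_k)."""
    return sum(w * gaussian_pdf(x, m, v)
               for w, m, v in zip(weights, means, variances))


# Uniform starting weights, mirroring numpy.ones(n_centers) / n_centers.
n_centers = 3
weights = [1.0 / n_centers] * n_centers
means = [-2.0, 0.0, 2.0]        # illustrative values, not fitted to any data
variances = [1.0, 1.0, 1.0]     # illustrative values, not fitted to any data

density_at_zero = mixture_pdf(0.0, weights, means, variances)
```

A contour plot such as the one drawn by `plot_model` evaluates the analogous two-dimensional density on a grid of sample points.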
  • orange/Orange/data/io.py

    r6848 r8305  
    1 from orange import \ 
     1import os 
     2 
     3import Orange 
     4import Orange.data.variable 
     5import Orange.misc 
     6from Orange.core import \ 
    27     BasketFeeder, FileExampleGenerator, BasketExampleGenerator, \ 
    3      C45ExampleGenerator, TabDelimExampleGenerator 
     8     C45ExampleGenerator, TabDelimExampleGenerator, registerFileType 
     9 
     10 
     11def loadARFF(filename, create_on_new = Orange.data.variable.Variable.MakeStatus.Incompatible, **kwargs): 
     12    """Return :class:`Orange.data.Table` containing data from file in Weka ARFF format""" 
     13    if not os.path.exists(filename) and os.path.exists(filename + ".arff"): 
     14        filename = filename + ".arff"  
     15    f = open(filename,'r') 
     16     
     17    attributes = [] 
     18    attributeLoadStatus = [] 
     19     
     20    name = '' 
     21    state = 0 # header 
     22    data = [] 
     23    for l in f.readlines(): 
     24        l = l.rstrip("\n") # strip \n 
     25        l = l.replace('\t',' ') # get rid of tabs 
     26        x = l.split('%')[0] # strip comments 
     27        if len(x.strip()) == 0: 
     28            continue 
     29        if state == 0 and x[0] != '@': 
     30            print "ARFF import ignoring:",x 
     31        if state == 1: 
     32            dd = x.split(',') 
     33            r = [] 
     34            for xs in dd: 
     35                y = xs.strip(" ") 
     36                if len(y) > 0: 
     37                    if y[0]=="'" or y[0]=='"': 
     38                        r.append(xs.strip("'\"")) 
     39                    else: 
     40                        ns = xs.split() 
     41                        for ls in ns: 
     42                            if len(ls) > 0: 
     43                                r.append(ls) 
     44                else: 
     45                    r.append('?') 
     46            data.append(r[:len(attributes)]) 
     47        else: 
     48            y = [] 
     49            for cy in x.split(' '): 
     50                if len(cy) > 0: 
     51                    y.append(cy) 
     52            if str.lower(y[0][1:]) == 'data': 
     53                state = 1 
     54            elif str.lower(y[0][1:]) == 'relation': 
     55                name = str.strip(y[1]) 
     56            elif str.lower(y[0][1:]) == 'attribute': 
     57                if y[1][0] == "'": 
     58                    atn = y[1].strip("' ") 
     59                    idx = 1 
     60                    while y[idx][-1] != "'": 
     61                        idx += 1 
     62                        atn += ' '+y[idx] 
     63                    atn = atn.strip("' ") 
     64                else: 
     65                    atn = y[1] 
     66                z = x.split('{') 
     67                w = z[-1].split('}') 
     68                if len(z) > 1 and len(w) > 1: 
     69                    # there is a list of values 
     70                    vals = [] 
     71                    for y in w[0].split(','): 
     72                        sy = y.strip(" '\"") 
     73                        if len(sy)>0: 
     74                            vals.append(sy) 
     75                    a, s = Orange.data.variable.Variable.make(atn, Orange.data.Type.Discrete, vals, [], create_on_new) 
     76                else: 
     77                    # real... 
     78                    a, s = Orange.data.variable.Variable.make(atn, Orange.data.Type.Continuous, [], [], create_on_new) 
     79                     
     80                attributes.append(a) 
     81                attributeLoadStatus.append(s) 
     82    # generate the domain 
     83    d = Orange.data.Domain(attributes) 
     84    lex = [] 
     85    for dd in data: 
     86        e = Orange.data.Instance(d,dd) 
     87        lex.append(e) 
     88    t = Orange.data.Table(d,lex) 
     89    t.name = name 
     90    t.attribute_load_status = attributeLoadStatus 
     91    return t 
     92loadARFF = Orange.misc.deprecated_keywords( 
     93{"createOnNew": "create_on_new"} 
     94)(loadARFF) 
     95 
     96 
     97def toARFF(filename,table,try_numericize=0): 
     98    """Save :class:`Orange.data.Table` to file in Weka's ARFF format""" 
     99    t = table 
     100    if filename[-5:] == ".arff": 
     101        filename = filename[:-5] 
     102    #print filename 
     103    f = open(filename+'.arff','w') 
     104    f.write('@relation %s\n'%t.domain.classVar.name) 
     105    # attributes 
     106    ats = [i for i in t.domain.attributes] 
     107    ats.append(t.domain.classVar) 
     108    for i in ats: 
     109        real = 1 
     110        if i.varType == 1: 
     111            if try_numericize: 
     112                # try if all values numeric 
     113                for j in i.values: 
     114                    try: 
     115                        x = float(j) 
     116                    except: 
     117                        real = 0 # failed 
     118                        break 
     119            else: 
     120                real = 0 
     121        iname = str(i.name) 
     122        if iname.find(" ") != -1: 
     123            iname = "'%s'"%iname 
     124        if real==1: 
     125            f.write('@attribute %s real\n'%iname) 
     126        else: 
     127            f.write('@attribute %s { '%iname) 
     128            x = [] 
     129            for j in i.values: 
     130                s = str(j) 
     131                if s.find(" ") == -1: 
     132                    x.append("%s"%s) 
     133                else: 
     134                    x.append("'%s'"%s) 
     135            for j in x[:-1]: 
     136                f.write('%s,'%j) 
     137            f.write('%s }\n'%x[-1]) 
     138 
     139    # examples 
     140    f.write('@data\n') 
     141    for j in t: 
     142        x = [] 
     143        for i in range(len(ats)): 
     144            s = str(j[i]) 
     145            if s.find(" ") == -1: 
     146                x.append("%s"%s) 
     147            else: 
     148                x.append("'%s'"%s) 
     149        for i in x[:-1]: 
     150            f.write('%s,'%i) 
     151        f.write('%s\n'%x[-1]) 
     152 
     153def toC50(filename,table,try_numericize=0): 
     154    """Save :class:`Orange.data.Table` to file in C50 format""" 
     155    t = table 
     156    # export names 
     157    f = open('%s.names' % filename,'w') 
     158    f.write('%s.\n\n' % t.domain.class_var.name) 
     159    # attributes 
     160    ats = [i for i in t.domain.attributes] 
     161    ats.append(t.domain.classVar) 
     162    for i in ats: 
     163        real = 1 
     164        # try if real 
     165        if i.varType == 1 and try_numericize: 
     166            # try if all values numeric 
     167            for j in i.values: 
     168                try: 
     169                    x = float(j) 
     170                except: 
     171                    real = 0 # failed 
     172                    break 
     173        if real==1: 
     174            f.write('%s: continuous.\n'%i.name) 
     175        else: 
     176            f.write('%s: '%i.name) 
     177            x = [] 
     178            for j in i.values: 
     179                x.append('%s'%j) 
     180            for j in x[:-1]: 
     181                f.write('%s,'%j) 
     182            f.write('%s.\n'%x[-1]) 
     183    # examples 
     184    f.close() 
     185     
     186    f = open('%s.data'%filename,'w') 
     187    for j in t: 
     188        x = [] 
     189        for i in range(len(ats)): 
     190            x.append('%s'%j[i]) 
     191        for i in x[:-1]: 
     192            f.write('%s,'%i) 
     193        f.write('%s\n'%x[-1]) 
     194 
     195def toR(filename,t): 
     196    """Save :class:`Orange.data.Table` to file in R format""" 
     197    if str.upper(filename[-2:]) == ".R": 
     198        filename = filename[:-2] 
     199    f = open(filename+'.R','w') 
     200 
     201    atyp = [] 
     202    aord = [] 
     203    labels = [] 
     204    as0 = [] 
     205    for a in t.domain.attributes: 
     206        as0.append(a) 
     207    as0.append(t.domain.class_var) 
     208    for a in as0: 
     209        labels.append(str(a.name)) 
     210        atyp.append(a.var_type) 
     211        aord.append(a.ordered) 
     212 
     213    f.write('data <- data.frame(\n') 
     214    for i in xrange(len(labels)): 
     215        if atyp[i] == 2: # continuous 
     216            f.write('"%s" = c('%(labels[i])) 
     217            for j in xrange(len(t)): 
     218                if t[j][i].isSpecial(): 
     219                    f.write('NA') 
     220                else: 
     221                    f.write(str(t[j][i])) 
     222                if (j == len(t)-1): 
     223                    f.write(')') 
     224                else: 
     225                    f.write(',') 
     226        elif atyp[i] == 1: # discrete 
     227            if aord[i]: # ordered 
     228                f.write('"%s" = ordered('%labels[i]) 
     229            else: 
     230                f.write('"%s" = factor('%labels[i]) 
     231            f.write('levels=c(') 
     232            for j in xrange(len(as0[i].values)): 
     233                f.write('"x%s"'%(as0[i].values[j])) 
     234                if j == len(as0[i].values)-1: 
     235                    f.write('),c(') 
     236                else: 
     237                    f.write(',') 
     238            for j in xrange(len(t)): 
     239                if t[j][i].isSpecial(): 
     240                    f.write('NA') 
     241                else: 
     242                    f.write('"x%s"'%str(t[j][i])) 
     243                if (j == len(t)-1): 
     244                    f.write('))') 
     245                else: 
     246                    f.write(',') 
     247        else: 
     248            raise ValueError("Unknown attribute type.") 
     249        if (i < len(labels)-1): 
     250            f.write(',\n') 
     251    f.write(')\n') 
     252     
     253def toLibSVM(filename, example): 
     254    """Save :class:`Orange.data.Table` to file in LibSVM format""" 
     255    import Orange.classification.svm 
     256    Orange.classification.svm.tableToSVMFormat(example, open(filename, "wb")) 
     257     
     258def loadLibSVM(filename, create_on_new=Orange.data.variable.Variable.MakeStatus.Incompatible, **kwargs): 
     259    """Return :class:`Orange.data.Table` containing data from file in LibSVM format""" 
     260    attributeLoadStatus = {} 
     261    def make_float(name): 
     262        attr, s = Orange.data.variable.Variable.make(name, Orange.data.Type.Continuous, [], [], create_on_new) 
     263        attributeLoadStatus[attr] = s 
     264        return attr 
     265     
     266    def make_disc(name, unordered): 
     267        attr, s = Orange.data.variable.Variable.make(name, Orange.data.Type.Discrete, [], unordered, create_on_new) 
     268        attributeLoadStatus[attr] = s 
     269        return attr 
     270     
     271    data = [line.split() for line in open(filename, "rb").read().splitlines() if line.strip()] 
     272    vars = type("attr", (dict,), {"__missing__": lambda self, key: self.setdefault(key, make_float(key))})() 
     273    item = lambda i, v: (vars[i], vars[i](v)) 
     274    values = [dict([item(*val.split(":"))  for val in ex[1:]]) for ex in data] 
     275    classes = [ex[0] for ex in data] 
     276    disc = all(["." not in c for c in classes]) 
     277    attributes = sorted(vars.values(), key=lambda var: int(var.name)) 
     278    classVar = make_disc("class", sorted(set(classes))) if disc else make_float("target") 
     279    attributeLoadStatus = [attributeLoadStatus[attr] for attr in attributes] + \ 
     280                          [attributeLoadStatus[classVar]] 
     281    domain = Orange.data.Domain(attributes, classVar) 
     282    table = Orange.data.Table([Orange.data.Instance(domain, [ex.get(attr, attr("?")) for attr in attributes] + [c]) for ex, c in zip(values, classes)]) 
     283    table.attribute_load_status = attributeLoadStatus 
     284    return table 
     285loadLibSVM = Orange.misc.deprecated_keywords( 
     286{"createOnNew": "create_on_new"} 
     287)(loadLibSVM) 
     288 
     289registerFileType("R", None, toR, ".R") 
     290registerFileType("Weka", loadARFF, toARFF, ".arff") 
     291registerFileType("C50", None, toC50, [".names", ".data", ".test"]) 
     292registerFileType("libSVM", loadLibSVM, toLibSVM, ".svm") 
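The quoting rule that `toARFF` applies when it writes attribute declarations can be shown in isolation. The sketch below is a standalone illustration that does not depend on Orange; the helper names `quote_arff_token` and `arff_attribute_line` are hypothetical. It assumes only what the function above does: names and values containing a space are wrapped in single quotes, and nominal values are joined with commas inside braces.

```python
def quote_arff_token(token):
    """Wrap a token in single quotes if it contains a space."""
    text = str(token)
    return "'%s'" % text if " " in text else text


def arff_attribute_line(name, values=None):
    """Build one @attribute line; values=None means a real-valued attribute."""
    quoted_name = quote_arff_token(name)
    if values is None:
        return "@attribute %s real" % quoted_name
    quoted_values = ",".join(quote_arff_token(v) for v in values)
    return "@attribute %s { %s }" % (quoted_name, quoted_values)


# A continuous attribute whose name contains a space, and a nominal one.
print(arff_attribute_line("sepal length"))
print(arff_attribute_line("iris", ["Iris-setosa", "Iris-versicolor"]))
```

Note that this simple rule does not escape quote characters inside a token, and neither does the function in the changeset.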
  • orange/Orange/data/sample.py

    r8264 r8305  
     1""" 
     2Example sampling is one of the basic procedures in machine learning. If 
     3for nothing else, everybody needs to split dataset into training and 
     4testing examples.  
     5  
     6It is easy to select a subset of examples in Orange. The key idea is the 
     7use of indices: first construct a list of indices, one corresponding 
     8to each example. Then you can select examples by indices, say take 
     9all examples with index 3. Or with index other than 3. It is obvious 
     10that this is useful for many typical setups, such as 70-30 splits or 
     11cross-validation.  
     12  
     13Orange provides methods for making such selections, such as 
     14:obj:`Orange.data.Table.select`.  And, of course, it provides methods 
     15for constructing indices for different kinds of splits. For instance, 
     16for the most common used sampling method, cross-validation, the Orange's 
     17class :obj:`SubsetIndicesCV` prepares a list of indices that assign a 
     18fold to each example. 
     19 
     20Classes that construct such indices are derived from a basic 
     21abstract :obj:`SubsetIndices`. There are three different classes 
     22provided. :obj:`SubsetIndices2` constructs a list of 0's and 1's in 
     23prescribed proportion; it can be used for, for instance, 70-30 divisions 
     24on training and testing examples. A more general :obj:`SubsetIndicesN` 
     25construct a list of indices from 0 to N-1 in given proportions. Finally, 
     26the most often used :obj:`SubsetIndicesCV` prepares indices for 
     27cross-validation. 
     28 
     29Subset indices are more deterministic than in versions of Orange prior to 
     30September 2003. See examples in the section about :obj:`SubsetIndices2` 
     31for details. 
     32  
     33.. class:: SubsetIndices 
     34 
     35    .. data:: Stratified 
     36 
     37    .. data:: NotStratified 
     38 
     39    .. data:: StratifiedIfPossible 
     40         
     41        Constants for setting :obj:`stratified`. If 
     42        :obj:`StratifiedIfPossible`, Orange will try to construct 
     43        stratified indices, but fall back to non-stratified if anything 
     44        goes wrong. For stratified indices, it needs to see the example 
     45        table (see the calling operator below), and the class should be 
     46        discrete and have no unknown values. 
     47 
     48 
     49    .. attribute:: stratified 
     50 
     51        Defines whether the division should be stratified, that is, 
     52        whether all subsets should have approximately equal class 
     53        distributions. Possible values are :obj:`Stratified`, 
     54        :obj:`NotStratified` and :obj:`StratifiedIfPossible` (default). 
     55 
     56    .. attribute:: randseed 
     57     
     58    .. attribute:: random_generator 
     59 
     60        These two fields deal with the way :obj:`SubsetIndices` generates 
     61        random numbers. 
     62 
     63        If :obj:`random_generator` (of type :obj:`orange.RandomGenerator`) 
     64        is set, it is used. The same random generator can be shared 
     65        between different objects; this can be useful when constructing an 
     66        experiment that depends on a single random seed. If you use this, 
     67        :obj:`SubsetIndices` will return a different set of indices each 
     68        time it's called, even with the same arguments. 
     69 
     70        If :obj:`random_generator` is not given, but :attr:`randseed` is 
     71        (positive values denote a defined :obj:`randseed`), the value is 
     72        used to initiate a new, temporary local random generator. This 
     73        way, the indices generator will always give the same indices for 
     74        the same data. 
     75 
     76        If neither of the two is defined, a new random generator 
     77        is constructed each time the object is called (note that 
     78        this is unlike some other classes, such as :obj:`Variable`, 
     79        :obj:`Distribution` and :obj:`Orange.data.Table`, that store 
     80        such generators for future use; the generator constructed by 
     81        :obj:`SubsetIndices` is disposed after use) and initialized 
     82        with random seed 0. This thus has the same effect as setting 
     83        :obj:`randseed` to 0. 
     84 
     85        The example for :obj:`SubsetIndices2` shows the difference 
     86        between those options. 
     87 
     88    .. method:: __call__(examples) 
     89 
     90        :obj:`SubsetIndices` can be called to return a list of 
     91        indices. The argument can be either the desired length of the list 
     92        (presumably corresponding to a length of some list of examples) 
     93        or a set of examples, given as :obj:`Orange.data.Table` or plain 
     94        Python list. It is obvious that in the former case, indices 
     95        cannot correspond to a stratified division; if :obj:`stratified` 
     96        is set to :obj:`Stratified`, an exception is raised. 
     97 
     98.. class:: SubsetIndices2 
     99 
     100    This object prepares a list of 0's and 1's. 
     101  
     102    .. attribute:: p0 
     103 
     104        The proportion or a number of 0's. If :obj:`p0` is less than 
     105        1, it's a proportion. For instance, if :obj:`p0` is 0.2, 20% 
     106        of indices will be 0's and 80% will be 1's. If :obj:`p0` 
     107        is 1 or more, it gives the exact number of 0's. For instance, 
     108        with :obj:`p0` of 10, you will get a list with 10 0's and 
     109        the rest of the list will be 1's. 
     110  
     111Say that you have loaded the lenses domain into ``data``. We'll split 
     112it into two datasets, the first containing only 6 examples and the other 
     113containing the rest (from `randomindices2.py`_): 
     114  
     115.. _randomindices2.py: code/randomindices2.py 
     116.. _lenses.tab: code/lenses.tab 
     117 
     118.. literalinclude:: code/randomindices2.py 
     119    :lines: 11-17 
     120 
     121Output:: 
     122 
     123    <1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1> 
     124    6 18 
     125  
     126No surprises here. Let's now see what's with those random seeds and generators. First, we shall simply construct and print five lists of random indices.  
     127  
     128.. literalinclude:: code/randomindices2.py 
     129    :lines: 19-21 
     130 
     131Output:: 
     132 
     133    Indices without playing with random generator 
     134    <0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1> 
     135    <0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1> 
     136    <0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1> 
     137    <0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1> 
     138    <0, 0, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1> 
     139 
     140 
     141We ran it five times and got the same result each time. 
     142 
     143.. literalinclude:: code/randomindices2.py 
     144    :lines: 23-26 
     145 
     146Output:: 
     147 
     148    Indices with random generator 
     149    <1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1> 
     150    <1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 0, 0, 1, 1, 1> 
     151    <1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1, 0, 1> 
     152    <1, 1, 1, 1, 1, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0, 1, 1, 0> 
     153    <1, 1, 0, 1, 0, 1, 1, 0, 1, 1, 1, 1, 0, 1, 1, 0, 1, 1, 1, 1, 1, 0, 1, 1> 
     154 
     155We have constructed a private random generator for random indices and 
     156got five different lists. If you run the whole script again, however, 
     157you'll get the same five lists, since the generator will be constructed 
     158again and start generating numbers from the beginning. Again, you should 
     159get the same indices on any operating system. 
     160 
     161.. literalinclude:: code/randomindices2.py 
     162    :lines: 28-32 
     163 
     164Output:: 
     165 
     166    Indices with randseed 
     167    <1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1> 
     168    <1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1> 
     169    <1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1> 
     170    <1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1> 
     171    <1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1> 
     172 
     173 
     174Here we have set the random seed and removed the random generator 
     175(otherwise the seed would have no effect as the generator has the 
     176priority). Each time we run the indices generator, it constructs a 
     177private random generator and initializes it with the given seed, and 
     178consequently always returns the same indices. 
     179 
     180Let's play with :obj:`SubsetIndices2.p0`. There are 24 examples in the 
     181dataset. Setting :obj:`SubsetIndices2.p0` to 0.25 instead of 6 shouldn't 
     182alter the indices. Let's check it. 
     183 
     184.. literalinclude:: code/randomindices2.py 
     185    :lines: 35-37 
     186 
     187Output:: 
     188 
     189    Indices with p0 set as probability (not 'a number of') 
     190    <1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1> 
     191 
     192Finally, let's observe the effects of :obj:`~SubsetIndices.stratified`. By 
     193default, indices are stratified if it's possible and, in our case, 
     194it is and they are. 
     195 
     196.. literalinclude:: code/randomindices2.py 
     197    :lines: 39-49 
     198 
     199Output:: 
     200 
     201    ... with stratification 
     202    <1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1> 
     203    <0.625, 0.167, 0.208> 
     204    <0.611, 0.167, 0.222> 
     205 
     206We explicitly requested stratification and got the same indices as 
     207before. That's OK. We also printed out the distribution for the whole 
     208dataset and for the selected dataset (as we gave no second parameter, 
     209the examples with non-null indices got selected). They are not the same, but 
     210they are pretty close. :obj:`SubsetIndices2` did what it could. Now let's 
     211try without stratification. The script is much the same except for changing 
     212:obj:`~SubsetIndices.stratified` to :obj:`~SubsetIndices.NotStratified`. 
     213 
     214.. literalinclude:: code/randomindices2.py 
     215    :lines: 51-62 
     216 
     217Output:: 
     218     
     219    ... and without stratification 
     220    <0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1> 
     221    <0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1> 
     222    <0.625, 0.167, 0.208> 
     223    <0.611, 0.167, 0.222> 
     224 
     225 
     226Different indices and ... just look at the distribution. Could be worse 
     227but, well, :obj:`~SubsetIndices.NotStratified` doesn't mean that Orange 
     228will make an effort to get uneven distributions. It just won't mind 
     229about them. 
     230 
     231For a final test, you can set the class of one of the examples to unknown 
     232and rerun the last script with setting :obj:`~SubsetIndices.stratified` 
     233once to :obj:`~SubsetIndices.Stratified` and once to 
     234:obj:`~SubsetIndices.StratifiedIfPossible`. In the first case you'll 
     235get an error and in the second you'll get non-stratified indices. 
     236 
     237.. literalinclude:: code/randomindices2.py 
     238    :lines: 64-70 
     239 
     240Output:: 
     241 
     242    ... stratified 'if possible' 
     243    <1, 1, 1, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 1, 1, 1, 1, 0, 1, 0, 1, 1, 1, 1> 
     244 
     245    ... stratified 'if possible', after removing the first example's class 
     246    <0, 1, 1, 0, 1, 1, 1, 1, 1, 1, 0, 1, 1, 0, 1, 0, 1, 1, 1, 1, 0, 1, 1, 1> 
     247  
     248.. class:: SubsetIndicesN 
     249 
     250    A straight generalization of :obj:`SubsetIndices2`, so there's not 
     251    much to be told about it. 
     252 
     253    .. attribute:: p 
     254 
     255        A list of proportions of examples that go to each fold. If 
     256        :obj:`p` has a length of 3, the returned list will have four 
     257        different indices, the first three will have probabilities as 
     258        defined in :obj:`p` while the last will have a probability of 
     259        (1 - sum of elements of :obj:`p`). 
     260 
     261:obj:`SubsetIndicesN` does not support stratification; setting 
     262:obj:`stratified` to :obj:`Stratified` will yield an error. 
     263 
     264.. _randomindicesn.py: code/randomindicesn.py 
     265 
     266Let us construct a list of indices that would assign half of examples 
     267to the first set and a quarter to the second and third (part of 
     268`randomindicesn.py`_, uses `lenses.tab`_): 
     269 
     270.. literalinclude:: code/randomindicesn.py 
     271    :lines: 9-14 
     272 
     273Output:: 
     274 
     275    <1, 0, 0, 2, 0, 1, 1, 0, 2, 0, 2, 2, 1, 0, 0, 0, 2, 0, 0, 0, 1, 2, 1, 0> 
     276 
     277Count them and you'll see there are 12 zeros and 6 each of ones and twos out of 24. 
     278  
     279.. class:: SubsetIndicesCV 
     280  
     281    :obj:`SubsetIndicesCV` computes indices for cross-validation. 
     282 
     283    It constructs a list of indices between 0 and :obj:`folds` -1 
     284    (inclusive), with an equal number of each (if the number of examples 
     285    is not divisible by :obj:`folds`, the last folds will have one 
     286    example less). 
     287 
     288    .. attribute:: folds 
     289 
     290        Number of folds. Default is 10. 
     291  
     292.. _randomindicescv.py: code/randomindicescv.py 
     293  
     294We shall prepare indices for an ordinary ten-fold cross validation and 
     295indices for 10 examples for 5-fold cross validation. For the latter, 
     296we shall only pass the number of examples, which, of course, prevents 
     297the stratification (part of `randomindicescv.py`_, uses `lenses.tab`_): 
     298 
     299.. literalinclude:: code/randomindicescv.py 
     300    :lines: 7-12 
     301 
     302Output:: 
     303 
     304    Indices for ordinary 10-fold CV 
     305    <1, 1, 3, 8, 8, 3, 2, 7, 5, 0, 1, 5, 2, 9, 4, 7, 4, 9, 3, 6, 0, 2, 0, 6> 
     306    Indices for 5 folds on 10 examples 
     307    <3, 0, 1, 0, 3, 2, 4, 4, 1, 2> 
     308 
     309 
     310Since examples don't divide evenly into ten folds, the first four folds 
     311have one example more - there are three 0's, 1's, 2's and 3's, but only 
     312two 4's, 5's, and so on. 
     313 
     314""" 
     315 
     316pass 
     317 
    1318from orange import \ 
    2319     MakeRandomIndices as SubsetIndices, \ 
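The docstring above describes how `p0` is interpreted, how a fixed seed makes the indices reproducible, and how cross-validation folds are sized. The following sketch mimics those three behaviours with the standard library only. It is not Orange's implementation: the function names `indices2` and `indices_cv` are hypothetical, the indices are not stratified, and the lists it produces will not match the outputs shown in the documentation.

```python
import random


def indices2(n, p0, seed=0):
    """Return n shuffled 0/1 indices; p0 < 1 is a proportion, otherwise a count."""
    zeros = int(round(p0 * n)) if p0 < 1 else int(p0)
    labels = [0] * zeros + [1] * (n - zeros)
    # A private generator with a fixed seed gives the same list on every call.
    random.Random(seed).shuffle(labels)
    return labels


def indices_cv(n, folds=10, seed=0):
    """Return n fold indices in 0..folds-1; the first n % folds folds get one extra."""
    labels = [i % folds for i in range(n)]
    random.Random(seed).shuffle(labels)
    return labels


# With 24 examples, a count of 6 and a proportion of 0.25 select the same split.
by_count = indices2(24, 6)
by_proportion = indices2(24, 0.25)

# 24 examples in 10 folds: folds 0-3 hold three examples, the rest hold two.
cv = indices_cv(24, folds=10)
fold_sizes = [cv.count(k) for k in range(10)]
```

Passing a different `seed` plays the role of changing `randseed`, while reusing one `random.Random` instance across calls would correspond to sharing a `random_generator`.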
  • orange/Orange/data/variable.py

    r8264 r8305  
    1111-------------------- 
    1212 
    13 Variable descriptors can be constructed directly, using constructors and passing 
    14 attributes as parameters, or by a factory function 
    15 :func:`Orange.data.variable.make`, which either retrieves an existing descriptor 
    16 or constructs a new one. 
     13Variable descriptors can be constructed either directly, using  
     14constructors and passing attributes as parameters, or by a  
     15factory function :func:`Orange.data.variable.make`, which either  
     16retrieves an existing descriptor or constructs a new one. 
    1717 
    1818.. class:: Variable 
     
    2525        variables are considered the same only if they have the same descriptor 
    2626        (e.g. even multiple variables in the same table can have the same name). 
    27         This should however be avoided since it may result in unpredictable 
    28         behaviour. 
     27        This should, however, be avoided since it may result in unpredictable 
     28        behavior. 
    2929     
    3030    .. attribute:: var_type 
     
    4444     
    4545        A flag telling whether the values of a discrete variable are ordered. At 
    46         the moment, no builtin method treats ordinal variables differently than 
    47         nominal. 
     46        the moment, no built-in method treats ordinal variables differently than 
     47        nominal ones. 
    4848     
    4949    .. attribute:: distributed 
    5050     
    51         A flag telling whether the values of this variables are distributions. 
    52         As for flag ordered, no methods treat such variables in any special 
     51        A flag telling whether the values of the variables are distributions. 
     52        As for the flag ordered, no methods treat such variables in any special 
    5353        manner. 
    5454     
     
    7373    .. method:: __call__(obj) 
    7474     
    75            Convert a string, number or other suitable object into a variable 
     75           Convert a string, number, or other suitable object into a variable 
    7676           value. 
    7777            
     
    8282    .. method:: randomvalue() 
    8383 
    84            Return a random value of the variable. 
     84           Return a random value for the variable. 
    8585        
    8686           :rtype: :class:`Orange.data.Value` 
     
    103103    .. attribute:: values 
    104104     
    105         A list with symbolic names for variable's values. Values are stored as 
     105        A list with symbolic names for variables' values. Values are stored as 
    106106        indices referring to this list. Therefore, modifying this list  
    107         instantly changes (symbolic) names of values as they are printed out or 
     107        instantly changes the (symbolic) names of values as they are printed out or 
    108108        referred to by the user. 
    109109     
     
    111111         
    112112            The size of the list is also used to indicate the number of 
    113             possible values for this variable. Changing the size, especially 
    114             shrinking the list can have disastrous effects and is therefore not 
    115             really recommendable. Also, do not add values to the list by 
     113            possible values for this variable. Changing the size - especially 
     114            shrinking the list - can have disastrous effects and is therefore not 
     115            really recommended. Also, do not add values to the list by 
    116116            calling its append or extend method: call the :obj:`add_value` 
    117117            method instead. 
     
    122122    .. attribute:: base_value 
    123123 
    124             Stores the base value for the variable as an index into `values`. 
     124            Stores the base value for the variable as an index in `values`. 
    125125            This can be, for instance, a "normal" value, such as "no 
    126126            complications" as opposed to abnormal "low blood pressure". The 
    127127            base value is used by certain statistics, continuization and, 
    128             potentially, learning algorithms. Default is -1 and means that 
     128            potentially, learning algorithms. The default is -1 which means that 
    129129            there is no base value. 
    130130     
     
    156156        Tells Orange to monitor the number of decimals when the value is 
    157157        converted from a string (when the values are read from a file or 
    158         converted by, e.g. ``inst[0]="3.14"``). The value of ``0`` means that 
    159         the number of decimals should not be adjusted, while 1 and 2 mean that 
    160         adjustments are on, with 2 denoting that no values have been converted 
    161         yet. 
    162  
    163         By default, adjustment of number of decimals goes as follows. 
     158        converted by, e.g. ``inst[0]="3.14"``):  
     159        0: the number of decimals is not adjusted automatically; 
     160        1: automatic adjustment is enabled and at least one value has already been converted; 
     161        2: automatic adjustment is enabled, but no values have been converted yet. 
     162 
     163        By default, adjustment of the number of decimals goes as follows: 
    164164     
    165165        If the variable was constructed when data was read from a file, it will  
     
    170170     
    171171        If the variable is created in a script, it will have, by default, three 
    172         decimals places. This can be changed either by setting the value 
     172        decimal places. This can be changed either by setting the value 
    173173        from a string (e.g. ``inst[0]="3.14"``, but not ``inst[0]=3.14``) or by 
    174174        manually setting the `number_of_decimals`. 
     
    183183    Bases: :class:`Variable` 
    184184 
    185     Descriptor for variables that contains strings. No method can use them for  
    186     learning; some will complain and other will silently ignore them when they  
     185    Descriptor for variables that contain strings. No method can use them for  
     186    learning; some will complain and others will silently ignore them when they  
    187187    encounter them. They can be, however, useful for meta-attributes; if  
    188     instances in dataset have unique id's, the most efficient way to store them  
     188    instances in a dataset have unique IDs, the most efficient way to store them  
    189189    is to read them as meta-attributes. In general, never use discrete  
    190190    attributes with many (say, more than 50) values. Such attributes are  
     
    194194    When converting strings into values and back, empty strings are treated  
    195195    differently than usual. For other types, an empty string can be used to 
    196     denote undefined values, while :obj:`StringVariable` will take empty string 
    197     as an empty string -- that is, except when loading or saving into file. 
     196    denote undefined values, while :obj:`StringVariable` will take empty strings 
     197    as empty strings -- except when loading from or saving to a file. 
    198198    Empty strings in files are interpreted as undefined; to specify an empty 
    199     string, enclose the string into double quotes; these get removed when the 
     199    string, enclose the string in double quotes; these are removed when the 
    200200    string is loaded. 
    201201 
     
    205205    Bases: :class:`Variable` 
    206206 
    207     Base class for descriptors defined in Python. It is fully functional, 
     207    Base class for descriptors defined in Python. It is fully functional 
    208208    and can be used as a descriptor for attributes that contain arbitrary Python 
    209209    values. Since this is an advanced topic, PythonVariables are described on a  
     
    215215 
    216216Values of variables are often computed from other variables, such as in 
    217 discretization. The mechanism described below usually occurs behind the scenes, 
     217discretization. The mechanism described below usually functions behind the scenes, 
    218218so understanding it is required only for implementing specific transformations. 
    219219 
     
    226226    :lines: 7-17 
    227227     
    228 The new variable is named ``e2``; we define it by descriptor of type  
     228The new variable is named ``e2``; we define it with a descriptor of type  
    229229:obj:`Discrete`, with appropriate name and values ``"not 1"`` and ``1`` (we  
    230230chose this order so that the ``not 1``'s index is ``0``, which can be, if  
     
    234234 
    235235``checkE`` is a function that is passed an instance and another argument we  
    236 don't care about here. If the instance's ``e`` equals ``1``, the function  
     236do not care about here. If the instance's ``e`` equals ``1``, the function  
    237237returns value ``1``, otherwise it returns ``not 1``. Both are returned as  
    238238values, not plain strings. 
    239239 
    240 In most circumstances, value of ``e2`` can be computed on the fly - we can  
    241 pretend that the variable exists in the data, although it doesn't (but  
     240In most circumstances the value of ``e2`` can be computed on the fly - we can  
     241pretend that the variable exists in the data, although it does not (but  
    242242can be computed from it). For instance, we can compute the information gain of 
    243243variable ``e2`` or its distribution without actually constructing data containing 
     
    254254    new_data = Orange.data.Table(new_domain, data)  
    255255 
    256 Automatic computation is useful when the data is split onto training and  
    257 testing examples. Training instanced can be modified by adding, removing  
     256Automatic computation is useful when the data is split into training and  
     257testing examples. Training instances can be modified by adding, removing  
    258258and transforming variables (in a typical setup, continuous variables  
    259259are discretized prior to learning, therefore the original variables are  
    260 replaced by new ones), while test instances are left as they  
     260replaced by new ones). Test instances, on the other hand, are left as they  
    261261are. When they are classified, the classifier automatically converts the  
    262262testing instances into the new domain, which includes recomputation of  
     
    271271----------------------------- 
    272272 
    273 All variables have a field :obj:`~Variable.attributes`. It is a dictionary 
     273All variables have a field :obj:`~Variable.attributes`, a dictionary 
    274274which can contain strings. Although the current implementation allows all 
    275275types of values, we strongly advise using only strings. An example: 
     
    277277.. literalinclude:: code/attributes.py 
    278278 
    279 The attributes can only be saved to a .tab file. They are listed in the 
     279These attributes can only be saved to a .tab file. They are listed in the 
    280280third line in <name>=<value> format, after other attribute specifications 
    281281(such as "meta" or "class"), and are separated by spaces.  
     
    285285 
    286286There are situations when variable descriptors need to be reused. Typically, the  
    287 user loads some training examples, trains a classifier and then loads a separate 
     287user loads some training examples, trains a classifier, and then loads a separate 
    288288test set. For the classifier to recognize the variables in the second data set, 
    289289the descriptors, not just the names, need to be the same.  
    290290 
    291 When constructing new descriptors for data read from a file or at unpickling, 
     291When constructing new descriptors for data read from a file or during unpickling, 
    292292Orange checks whether an appropriate descriptor (with the same name and, in case 
    293293of discrete variables, also values) already exists and reuses it. When new 
     
    296296the same name may already exist. 
    297297 
    298 The search for existing variable is based on four attributes: the variable's name, 
    299 type, ordered values and unordered values. As for the latter two, the values can  
     298The search for an existing variable is based on four attributes: the variable's name, 
     299type, ordered values, and unordered values. As for the latter two, the values can  
    300300be explicitly ordered by the user, e.g. in the second line of the tab-delimited  
    301 file, for instance to order sizes as small-medium-big. 
     301file. For instance, sizes can be ordered as small, medium, or big. 
    302302 
    303303The search for existing variables can end with one of the following statuses. 
     
    307307 
    308308Orange.data.variable.Variable.MakeStatus.Incompatible (3) 
    309     There is (or are) variables with matching name and type, but their 
     309    There are variables with matching name and type, but their 
    310310    values are incompatible with the prescribed ordered values. For example, 
    311311    if the existing variable already has values ["a", "b"] and the new one 
    312312    wants ["b", "a"], the old variable cannot be reused. The existing list can, 
    313     however be appended the new values, so searching for ["a", "b", "c"] would 
    314     succeed. So will also the search for ["a"], since the extra existing value 
    315     does not matter. The formal rule is thus that the values are compatible if ``existing_values[:len(ordered_values)] == ordered_values[:len(existing_values)]``. 
     313    however, be extended with the new values, so searching for ["a", "b", "c"] would 
     314    succeed. Likewise a search for ["a"] would be successful, since the extra existing value 
     315    does not matter. The formal rule is thus that the values are compatible iff ``existing_values[:len(ordered_values)] == ordered_values[:len(existing_values)]``. 
    316316 
    317317Orange.data.variable.Variable.MakeStatus.NoRecognizedValues (2) 
     
    322322    name with values "M" and "F" (or, well, "no" and "yes" :). Reuse of this  
    323323    variable is possible, though this should probably be a new variable since it  
    324     obviously comes from a different data set. If we do decide for reuse, the  
     324    obviously comes from a different data set. If we do decide to reuse the variable, the  
    325325    old variable will get some unneeded new values and the new one will inherit  
    326326    some from the old. 
     
    340340 
    341341When loading the data using :obj:`Orange.data.Table`, Orange takes the safest  
    342 approach and, by default, reuses everything that is compatible, that is, up to  
     342approach and, by default, reuses everything that is compatible up to  
    343343and including ``NoRecognizedValues``. Unintended reuse would be obvious from the 
    344344variable having too many values, which the user can notice and fix. More on that  
     
    347347There are two functions for reusing the attributes instead of creating new ones. 
    348348 
    349 .. function:: Orange.data.variable.make(name, type, ordered_values, unordered_values[, create_new_on]) 
    350  
    351     Find and return an existing variable or create a new one if none existing 
     349.. function:: Orange.data.variable.make(name, type, ordered_values, unordered_values[, createNewOn]) 
     350 
     351    Find and return an existing variable or create a new one if none of the existing 
    352352    variables matches the given name, type and values. 
    353353     
    354     The optional `create_new_on` specifies the status at which a new variable is 
     354    The optional `create_new_on` specifies the status at which a new variable is 
    355355    created. The status must be at most ``Incompatible`` since incompatible (or 
    356356    non-existing) variables cannot be reused. If it is set lower, for instance  
    357357    to ``MissingValues``, a new variable is created even if there exists 
    358     a variable which only misses same values. If set to ``OK``, the function 
     358    a variable which is only missing some values. If set to ``OK``, the function 
    359359    always creates a new variable. 
    360360     
    361361    The function returns a tuple containing a variable descriptor and the 
    362     status of the best matching variable. So, if ``create_new_on`` is set to 
     362    status of the best matching variable. So, if ``create_new_on`` is set to 
    363363    ``MissingValues``, and there exists a variable whose status is, say, 
    364364    ``UnrecognizedValues``, a variable would be created, while the second  
     
    368368    indicator whether the returned variable is reused or not. This can be, 
    369369    however, read from the status code: if it is smaller than the specified 
    370     ``create_new_on``, the variable is reused, otherwise we got a new descriptor. 
     370    ``create_new_on``, the variable is reused, otherwise a new descriptor has been constructed. 
    371371 
    372372    The exception to the rule is when ``create_new_on`` is OK. In this case, the  
     
    380380    :param unordered_values: a list of values, for which the order does not 
    381381        matter 
    382     :param create_new_on: gives condition for constructing a new variable instead 
     382    :param create_new_on: gives the condition for constructing a new variable instead 
    383383        of using the existing one 
    384384     
    385385    :return_type: a tuple (:class:`Orange.data.variable.Variable`, int) 
    386386     
    387 .. function:: Orange.data.variable.retrieve(name, type, ordered_values, onordered_values[, create_new_on]) 
     387.. function:: Orange.data.variable.retrieve(name, type, ordered_values, unordered_values[, createNewOn]) 
    388388 
    389389    Find and return an existing variable, or :obj:`None` if no match is found. 
     
    395395    :param unordered_values: a list of values, for which the order does not 
    396396        matter 
    397     :param create_new_on: gives condition for constructing a new variable instead 
     397    :param create_new_on: gives the condition for constructing a new variable instead 
    398398        of using the existing one 
    399399 
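
The reuse rule described for ``create_new_on`` can be sketched in plain Python. This is only an illustration, not Orange's implementation: the constant names and the helper ``is_reused`` are hypothetical, and the numeric codes are the ones quoted in the documentation (``OK`` = 0 through ``NotFound`` = 4).

```python
# Hypothetical names; the numeric codes follow the statuses quoted in the docs.
OK, MISSING_VALUES, NO_RECOGNIZED_VALUES, INCOMPATIBLE, NOT_FOUND = range(5)


def is_reused(status, create_new_on):
    """Return True if the best matching descriptor would be reused.

    The docs state that a variable is reused when its status code is
    smaller than ``create_new_on``; otherwise a new descriptor is built.
    """
    return status < create_new_on


# A variable that only misses some values is reused under the permissive
# threshold, but not when new descriptors are requested from MissingValues on.
print(is_reused(MISSING_VALUES, INCOMPATIBLE))
print(is_reused(NO_RECOGNIZED_VALUES, MISSING_VALUES))
print(is_reused(OK, OK))
```

With ``create_new_on`` set to ``OK`` the comparison can never succeed, which matches the statement that a new variable is then always created.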
     
    405405executed only once (in a Python session) and in this order. 
    406406 
    407 :func:`Orange.data.variable.make` can be used for construction of new variables. :: 
     407:func:`Orange.data.variable.make` can be used for the construction of new variables. :: 
    408408     
    409409    >>> v1, s = Orange.data.variable.make("a", Orange.data.Type.Discrete, ["a", "b"]) 
     
    411411    4 <a, b> 
    412412 
    413 No surprises here: new variable is created and the status is ``NotFound``. :: 
     413No surprises here: a new variable is created and the status is ``NotFound``. :: 
    414414 
    415415    >>> v2, s = Orange.data.variable.make("a", Orange.data.Type.Discrete, ["a"], ["c"]) 
     
    419419The status is 1 (``MissingValues``), yet the variable is reused (``v2 is v1``). 
    420420``v1`` gets a new value, ``"c"``, which was given as an unordered value. It does 
    421 not matter that the new variable does not need value ``b``. :: 
     421not matter that the new variable does not need the value ``b``. :: 
    422422 
    423423    >>> v3, s = Orange.data.variable.make("a", Orange.data.Type.Discrete, ["a", "b", "c", "d"]) 
     
    425425    1 True <a, b, c, d> 
    426426 
    427 This is similar as before, except that the new value, ``d`` is not among the 
     427This is like before, except that the new value, ``d`` is not among the 
    428428ordered values. :: 
    429429 
     
    440440    0 True <a, b, c, d> <a, b, c, d> 
    441441 
    442 The new variable has values ``c`` and ``a``, but does not 
    443 mind about the order, so the existing attribute is ``OK``. :: 
     442The new variable has values ``c`` and ``a``, but the order is not important,  
     443so the existing attribute is ``OK``. :: 
    444444 
    445445    >>> v6, s = Orange.data.variable.make("a", Orange.data.Type.Discrete, None, ["e"]) 
     
    447447    2 True <a, b, c, d, e> <a, b, c, d, e> 
    448448 
    449 The new variable has different values than the existing (status is 2, 
    450 ``NoRecognizedValues``), but the existing is reused nevertheless. Note that we 
     449The new variable has different values than the existing variable (status is 2, 
     450``NoRecognizedValues``), but the existing one is nonetheless reused. Note that we 
    451451gave ``e`` in the list of unordered values. If it was among the ordered, the 
    452452reuse would fail. :: 
     
    468468Finally, this is a perfect match, but any reuse is prohibited, so a new  
    469469variable is created. 
    470  
    471  
    472470 
    473471""" 
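
The formal compatibility rule for ordered values quoted in the ``Incompatible`` entry can be checked with a few lines of plain Python. The function name ``values_compatible`` is hypothetical and the sketch does not touch Orange itself; it only evaluates the slicing expression from the documentation.

```python
def values_compatible(existing_values, ordered_values):
    """Evaluate the rule from the docs: the two lists are compatible if
    they agree on their common prefix."""
    n_ordered = len(ordered_values)
    n_existing = len(existing_values)
    return existing_values[:n_ordered] == ordered_values[:n_existing]


existing = ["a", "b"]
print(values_compatible(existing, ["b", "a"]))       # order differs
print(values_compatible(existing, ["a", "b", "c"]))  # existing list can be extended
print(values_compatible(existing, ["a"]))            # extra existing value is harmless
```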
  • orange/Orange/ensemble/bagging.py

    r8264 r8305  
    117117            for i in range(len(freq)): 
    118118                freq[i] = freq[i]/len(self.classifiers) 
     119            freq = Orange.statistics.distribution.Discrete(freq) 
    119120            if resultType == orange.GetProbabilities: 
    120121                return freq 
     122            elif resultType == orange.GetBoth: 
     123                return (value, freq) 
    121124            else: 
    122                 return (value, freq) 
     125                return value 
     126             
    123127        elif self.classVar.varType ==Orange.data.Type.Continuous: 
    124128            votes = [c(instance, orange.GetBoth if resultType==\ 
     
    132136                prob = defaultdict(float) 
    133137                for c, p in votes: 
    134                     try: 
     138                    try:  
    135139                        prob[float(c)] += p[c] / wsum 
    136140                    except IndexError: # p[c] sometimes fails with index error 
    137141                        prob[float(c)] += 1.0 / wsum 
    138142                prob = orange.ContDistribution(prob) 
    139                 return self.classVar(pred), prob if resultType == orange.GetBoth\ 
     143                return (self.classVar(pred), prob) if resultType == orange.GetBoth\ 
    140144                    else prob 
    141145            elif resultType == orange.GetValue: 
    142146                pred = sum([float(c) for c in votes]) / wsum 
    143147                return self.classVar(pred) 
     148             
     149    def __reduce__(self): 
     150        return type(self), (self.classifiers, self.name, self.classVar), dict(self.__dict__) 
     151     
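
The ``__reduce__`` method added here returns a ``(callable, args, state)`` triple, which is the protocol that ``pickle`` and ``copy`` use to rebuild an object. A minimal, self-contained sketch of that pattern follows; the class ``VotingClassifier`` and its attribute names are stand-ins chosen for illustration, not the Orange classes.

```python
import copy


class VotingClassifier(object):
    """Hypothetical stand-in for an ensemble classifier."""

    def __init__(self, classifiers, name, class_var):
        self.classifiers = classifiers
        self.name = name
        self.class_var = class_var

    def __reduce__(self):
        # (factory, constructor arguments, extra instance state)
        return (type(self),
                (self.classifiers, self.name, self.class_var),
                dict(self.__dict__))


original = VotingClassifier(["c1", "c2"], "bagged", "y")
original.note = "set after construction"

# Rebuild by hand, the way the reduce protocol prescribes.
factory, args, state = original.__reduce__()
rebuilt = factory(*args)
rebuilt.__dict__.update(state)

# copy.copy goes through the same protocol.
clone = copy.copy(original)
print(rebuilt.name, rebuilt.note, clone.note)
```

Passing ``dict(self.__dict__)`` as the third element means attributes set after construction survive the round trip as well.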
  • orange/Orange/ensemble/boosting.py

    r8264 r8305  
    131131              :class:`Orange.statistics.Distribution` or a tuple with both 
    132132        """ 
    133         votes = [0.] * len(self.classVar.values) 
     133        votes = Orange.statistics.distribution.Discrete(self.classVar) 
    134134        for c, e in self.classifiers: 
    135135            votes[int(c(instance))] += e 
     
    144144        if resultType == orange.GetProbabilities: 
    145145            return votes 
     146        elif resultType == orange.GetBoth: 
     147            return (value, votes) 
    146148        else: 
    147             return (value, votes) 
     149            return value 
     150         
     151    def __reduce__(self): 
     152        return type(self), (self.classifiers, self.name, self.classVar), dict(self.__dict__) 
     153     
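
The corrected branches distinguish three result types instead of returning a tuple for everything that is not ``GetProbabilities``. The sketch below shows that dispatch on a weighted vote. It is an assumption-laden illustration: the constants ``GET_VALUE``, ``GET_PROBABILITIES`` and ``GET_BOTH`` and the function ``vote`` are hypothetical, plain lists stand in for Orange distributions, and the normalization step is assumed rather than taken from the lines shown.

```python
# Hypothetical stand-ins for orange.GetValue, GetProbabilities and GetBoth.
GET_VALUE, GET_PROBABILITIES, GET_BOTH = 0, 1, 2


def vote(predictions, weights, n_classes, result_type=GET_VALUE):
    """Combine class-index predictions by weighted voting."""
    votes = [0.0] * n_classes
    for class_index, weight in zip(predictions, weights):
        votes[class_index] += weight

    total = sum(votes)
    probabilities = [v / total for v in votes] if total > 0 else votes
    value = max(range(n_classes), key=probabilities.__getitem__)

    if result_type == GET_PROBABILITIES:
        return probabilities
    elif result_type == GET_BOTH:
        return (value, probabilities)
    else:
        return value


print(vote([0, 1, 1], [1.0, 1.0, 2.0], 2))
print(vote([0, 1, 1], [1.0, 1.0, 2.0], 2, GET_BOTH))
```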
  • orange/Orange/ensemble/forest.py

    r8264 r8305  
    240240            elif resultType == orange.GetProbabilities: return cprob 
    241241            else: return (cvalue, cprob) 
     242             
     243    def __reduce__(self): 
     244        return type(self), (self.classifiers, self.name, self.domain, self.classVar), dict(self.__dict__) 
    242245 
    243246### MeasureAttribute_randomForests 
  • orange/Orange/evaluation/testing.py

    r8264 r8305  
    865865        if callback: 
    866866            callback() 
    867     classifiers = [learner(learnset, learnweight) for learner in learners] 
    868     for i in range(len(learners)): classifiers[i].name = getattr(learners[i], 'name', 'noname') 
     867    for i in range(len(learners)): 
     868        classifiers[i].name = getattr(learners[i], 'name', 'noname') 
    869869    testResults = test_on_data(classifiers, (testset, testweight), testResults, iterationNumber, storeExamples) 
    870870    if storeclassifiers: 
     
    917917        if callback: 
    918918            callback() 
    919     for i in range(len(learners)): classifiers[i].name = getattr(learners[i], "name", "noname") 
     919    for i in range(len(learners)): 
     920        classifiers[i].name = getattr(learners[i], "name", "noname") 
    920921    testResults = test_on_data(classifiers, (testset, learnweight), testResults, iterationNumber, storeExamples) 
    921922    if storeclassifiers: 
  • orange/Orange/misc/__init__.py

    r8264 r8305  
    106106.. automodule:: Orange.misc.serverfiles 
    107107 
     108========= 
     109`environ` 
     110========= 
     111 
     112.. index:: environment 
     113 
     114.. automodule:: Orange.misc.environ 
     115 
    108116""" 
    109  
     117import environ 
    110118import counters 
    111119import selection 
    112120import render 
    113121import serverfiles 
     122 
    114123# addons is intentionally not imported; if it were, add-ons' directories would 
    115124# be added to the python path. If that sounds OK, this can be changed ... 
     
    231240        cache = {} 
    232241         
    233         functools.wraps(func) 
     242        @functools.wraps(func) 
    234243        def wrapped(*args, **kwargs): 
    235244            key = args + tuple(sorted(kwargs.items())) 
     
    250259         
    251260        wrapped.clear = clear 
     261        wrapped._cache = cache 
    252262         
    253263        return wrapped 
    254264    return decorating_function 
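
A bare call to ``functools.wraps(func)`` builds a decorator and discards it, so the wrapper keeps its own name and docstring; it only takes effect when applied with ``@``. The following simplified memoizer illustrates this together with the ``clear`` and ``_cache`` attributes. It is a sketch only: ``memoize`` and ``square`` are hypothetical names, and the cache-size handling of the real decorator is omitted.

```python
import functools


def memoize(func):
    """Cache results of ``func`` keyed by its arguments (simplified sketch)."""
    cache = {}

    @functools.wraps(func)  # copies __name__, __doc__ etc. onto the wrapper
    def wrapped(*args, **kwargs):
        key = args + tuple(sorted(kwargs.items()))
        if key not in cache:
            cache[key] = func(*args, **kwargs)
        return cache[key]

    wrapped.clear = cache.clear  # empty the cache in place
    wrapped._cache = cache       # expose the cache for inspection
    return wrapped


@memoize
def square(x):
    """Return x squared."""
    return x * x


square(3)
print(square.__name__, square.__doc__, square._cache)
```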
     265 
     266 
     267class recursion_limit(object): 
     268    """ A context manager that sets a new recursion limit.  
     269     
     270    """ 
     271    def __init__(self, limit=1000): 
     272        self.limit = limit 
     273         
     274    def __enter__(self): 
     275        self.old_limit = sys.getrecursionlimit() 
     276        sys.setrecursionlimit(self.limit) 
     277     
     278    def __exit__(self, exc_type, exc_val, exc_tb): 
     279        sys.setrecursionlimit(self.old_limit) 
    255280 
    256281 
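
A short usage sketch for the ``recursion_limit`` context manager. The class is repeated so that the example is self-contained; the chosen limit and the variable names are arbitrary.

```python
import sys


class recursion_limit(object):
    """A context manager that sets a new recursion limit."""

    def __init__(self, limit=1000):
        self.limit = limit

    def __enter__(self):
        self.old_limit = sys.getrecursionlimit()
        sys.setrecursionlimit(self.limit)

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Runs on normal exit and on exceptions, so the limit is always restored.
        sys.setrecursionlimit(self.old_limit)


before = sys.getrecursionlimit()
with recursion_limit(before + 2000):
    inside = sys.getrecursionlimit()
after = sys.getrecursionlimit()
print(inside - before, after == before)
```

Because ``__exit__`` returns ``None``, an exception raised inside the ``with`` block still propagates after the old limit has been restored.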
     
    316341    Example :: 
    317342             
    318         >>> @deprecated_members({"fooBar": "foo_bar", "setFooBar":"set_foo_bar"}, 
    319         ...                    wrap_methods=["set_foo_bar", "__init__"]) 
    320         ... class A(object): 
     343        >>> class A(object): 
    321344        ...     def __init__(self, foo_bar="bar"): 
    322345        ...         self.set_foo_bar(foo_bar) 
     
    324347        ...     def set_foo_bar(self, foo_bar="bar"): 
    325348        ...         self.foo_bar = foo_bar 
    326         ...          
     349        ... 
     350        ... A = deprecated_members( 
     351        ... {"fooBar": "foo_bar",  
     352        ...  "setFooBar":"set_foo_bar"}, 
     353        ... wrap_methods=["set_foo_bar", "__init__"])(A) 
     354        ...  
    327355        ... 
    328356        >>> a = A(fooBar="foo") 
     
    333361        __main__:1: DeprecationWarning: 'setFooBar' is deprecated. Use 'set_foo_bar' instead! 
    334362         
    335     """ 
     363    .. note:: This decorator does nothing if 
     364        :obj:`Orange.misc.environ.orange_no_deprecated_members` environment 
     365        variable is set to `True`. 
     366         
     367    """ 
     368    if environ.orange_no_deprecated_members: 
     369        return lambda cls: cls 
     370     
    336371    def is_wrapped(method): 
    337372        """ Is member method already wrapped. 
     
    387422        Arg 
    388423         
    389     """ 
     424    .. note:: This decorator does nothing if 
     425        :obj:`Orange.misc.environ.orange_no_deprecated_members` environment 
     426        variable is set to `True`. 
     427         
     428    """ 
     429    if environ.orange_no_deprecated_members: 
     430        return lambda func: func 
     431     
    390432    def decorator(func): 
    391433        @wraps(func) 
     
    418460        123 
    419461         
    420     """ 
     462    .. note:: This decorator does nothing and returns None if 
     463        :obj:`Orange.misc.environ.orange_no_deprecated_members` environment 
     464        variable is set to `True`. 
     465         
     466    """ 
     467    if environ.orange_no_deprecated_members: 
     468        return None 
     469     
    421470    def fget(self): 
    422471        deprecation_warning(old_name, new_name, stacklevel=3) 
     
    433482    prop = property(fget, fset, fdel, 
    434483                    doc="A deprecated member '%s'. Use '%s' instead." % (old_name, new_name)) 
    435     return prop 
    436  
     484    return prop  
    437485 
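
The early returns guarded by ``environ.orange_no_deprecated_members`` turn the deprecation helpers into no-ops. A minimal sketch of that opt-out pattern for a keyword-renaming decorator follows. All names here (``rename_keywords``, ``disabled``, ``area``) are hypothetical; the ``disabled`` argument stands in for the environment flag and the code is not the Orange implementation.

```python
import functools
import warnings


def rename_keywords(name_map, disabled=False):
    """Accept old keyword names, warn, and forward them under new names."""
    if disabled:
        # Opt-out: hand back the function untouched, with no wrapper at all.
        return lambda func: func

    def decorator(func):
        @functools.wraps(func)
        def wrapped(*args, **kwargs):
            for old, new in name_map.items():
                if old in kwargs:
                    warnings.warn("'%s' is deprecated. Use '%s' instead!"
                                  % (old, new), DeprecationWarning, stacklevel=2)
                    kwargs[new] = kwargs.pop(old)
            return func(*args, **kwargs)
        return wrapped
    return decorator


def area(width, height):
    return width * height


compat_area = rename_keywords({"w": "width"})(area)
plain_area = rename_keywords({"w": "width"}, disabled=True)(area)

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = compat_area(w=2, height=3)
print(result, len(caught), plain_area is area)
```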
    438486""" 
     
    477525    """  
    478526    return type(self), (), dict(self.__dict__) 
    479      
     527 
     528 
     529     
  • orange/Orange/misc/addons.py

    r8264 r8305  
    108108import re 
    109109import os 
     110import sys 
    110111import glob 
    111112import time 
     
    116117import platform 
    117118 
    118 import orngEnviron 
     119import Orange.misc.environ 
    119120import widgetParser 
    120121from fileutil import * 
    121122from fileutil import _zip_open 
    122123from zipfile import ZipFile 
     124 
     125import warnings 
    123126 
    124127socket.setdefaulttimeout(120)  # In seconds. 
     
    300303            xmldoc = xml.dom.minidom.parse(addon_xml_path) 
    301304        except Exception, e: 
    302             print "Could not load addon.xml because \"%s\"; a new one will be"+\ 
    303                 " created." % e 
     305            warnings.warn(("Could not load addon.xml because \"%s\"; a new one "
     306                           "will be created.") % e, Warning, 0) 
    304307            impl = xml.dom.minidom.getDOMImplementation() 
    305308            xmldoc = impl.createDocument(None, "OrangeAddOn", None) 
     
    369372        xmldoc.writexml(codecs.open(addon_xml_path, 'w', "utf-8"), 
    370373                        encoding="UTF-8") 
    371         print "Updated addon.xml written." 
     374        sys.stderr.write("Updated addon.xml written.\n") 
    372375 
    373376        ########################## 
     
    375378        ########################## 
    376379        localcss = os.path.join(self.directory_documentation(), "style.css") 
    377         orangecss = os.path.join(orngEnviron.orangeDocDir, "style.css") 
     380        orangecss = os.path.join(Orange.misc.environ.doc_install_dir, "style.css") 
    378381        if not os.path.isfile(localcss): 
    379382            if os.path.isfile(orangecss): 
    380383                import shutil 
    381384                shutil.copy(orangecss, localcss) 
    382                 print "doc/style.css created." 
     385                sys.stderr.write("doc/style.css created.\n") 
    383386            else: 
    384387                raise PackingException("Could not find style.css in orange"+\ 
     
    407410                            " be.")) 
    408411            indexFile.close() 
    409             print "doc/index.html written." 
     412            sys.stderr.write("doc/index.html written.\n") 
    410413             
    411414        ########################## 
     
    415418        if not os.path.isdir(wdocdir): os.mkdir(wdocdir) 
    416419        open(os.path.join(wdocdir, "index.html"), 'w').write(self.iconlist_html()) 
    417         print "Widget list (doc/widgets/index.html) written." 
     420        sys.stderr.write("Widget list (doc/widgets/index.html) written.\n") 
    418421 
    419422        ########################## 
     
    427430 
    428431        import shutil 
    429         iconbg_file = os.path.join(orngEnviron.picsDir, "background_32.png") 
    430         iconun_file = os.path.join(orngEnviron.picsDir, "Unknown.png") 
     432        iconbg_file = os.path.join(Orange.misc.environ.icons_install_dir, "background_32.png") 
     433        iconun_file = os.path.join(Orange.misc.environ.icons_install_dir, "Unknown.png") 
    431434        if not os.path.isdir(icondocdir): os.mkdir(icondocdir) 
    432435        if os.path.isfile(iconbg_file): shutil.copy(iconbg_file, icondocdir) 
     
    442445            if not os.path.isdir(proticondocdir): os.mkdir(proticondocdir) 
    443446            distutils.dir_util.copy_tree(proticondir, proticondocdir) 
    444         print "Widget icons copied to doc/widgets/." 
     447        sys.stderr.write("Widget icons copied to doc/widgets/.\n") 
    445448 
    446449 
     
    11781181            except Exception, e: 
    11791182                if force: 
    1180                     print "Couldn't load data from repository '%s': %s" % (self.name, e) 
     1183                    warnings.warn("Couldn't load data from repository '%s': %s" 
     1184                                  % (self.name, e), Warning, 0) 
    11811185                    return 
    11821186                raise e 
     
    12281232            for version in versions: 
    12291233                if version.version == addon.version: 
    1230                     print "Ignoring the second occurence of addon '%s', version '%s'." % (addon.name, addon.version_str) 
     1234                    warnings.warn(("Ignoring the second occurrence of addon '%s'"
     1235                                   ", version '%s'.") % (addon.name, 
     1236                                                         addon.version_str), 
     1237                                  Warning, 0) 
    12311238                    return 
    12321239            versions.append(addon) 
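
When a message is split across lines with ``+`` and then formatted with ``%``, operator precedence matters: ``%`` binds more tightly than ``+``, so it applies only to the last string fragment. The sketch below contrasts the two forms with made-up values; ``name`` and ``version`` are placeholders.

```python
name, version = "demo", "1.0"

error = None
try:
    # '%' is applied to ", version '%s'." alone, which has a single
    # placeholder but receives two arguments.
    broken = "Ignoring addon '%s'" + ", version '%s'." % (name, version)
except TypeError as exc:
    error = str(exc)

# Parenthesized adjacent literals are joined first, then formatted as a whole.
fixed = ("Ignoring addon '%s'"
         ", version '%s'.") % (name, version)

print(error)
print(fixed)
```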
     
    13021309                                    self._add_addon(addon) 
    13031310                                except Exception, e: 
    1304                                     print "Ignoring node nr. %d in repository '%s' because of an error: %s" % (i+1, self.name, e) 
     1311                                    warnings.warn(("Ignoring node nr. %d in "
     1312                                                   "repository '%s' because of"
     1313                                                   " an error: %s") % (i+1, 
     1314                                                                       self.name, 
     1315                                                                       e), 
     1316                                                  Warning, 0) 
    13051317                        self.has_web_script = True 
    13061318                        return True 
    13071319                    except Exception, e: 
    1308                         print "Warning: a problem occurred using server-side script on repository '%s': %s.\nAll add-ons need to be downloaded for their metadata to be extracted!" % (self.name, e) 
     1320                        warnings.warn("A problem occurred using server-side script on repository '%s': %s.\nAll add-ons need to be downloaded for their metadata to be extracted!" 
     1321                                      % (self.name, str(e)), Warning, 0) 
    13091322 
    13101323                    # Invoking script failed - trying to get and parse a directory listing 
     
    13191332                    if len(addOnFiles)==0: 
    13201333                        if firstload: 
    1321                             raise RepositoryException("Unable to load repository data: this is not an Orange add-on repository!") 
     1334                            raise RepositoryException("Unable to load "
     1335                                                      "repository data: this "
     1336                                                      "is not an Orange "
     1337                                                      "add-on repository!")
    13221338                        else: 
    1323                             print "Repository '%s' is empty ..." % self.name 
     1339                            warnings.warn("Repository '%s' is empty ..." % 
     1340                                          self.name, Warning, 0) 
    13241341                    self.addons = {} 
    13251342                    for addOnFile in addOnFiles: 
     
    13281345                            self._add_packed_addon(addOnTmpFile, addOnFile) 
    13291346                        except Exception, e: 
    1330                             print "Ignoring '%s' in repository '%s' because of an error: %s" % (addOnFile, self.name, e) 
     1347                            warnings.warn("Ignoring '%s' in repository '%s' "
     1348                                          "because of an error: %s" %
     1349                                          (addOnFile, self.name, e),
     1350                                          Warning, 0)
    13311351                elif protocol == "file": # A local repository: open each and every archive to obtain data 
    13321352                    dir = self.url.replace("file://","") 
     
    13391359                                                  os.path.split(addOnFile)[1]) 
    13401360                        except Exception, e: 
    1341                             print "Ignoring '%s' in repository '%s' because of an error: %s" % (addOnFile, self.name, e) 
     1361                            warnings.warn("Ignoring '%s' in repository '%s' "
     1362                                          "because of an error: %s" %
     1363                                          (addOnFile, self.name, e),
     1364                                          Warning, 0)
    13421365                return True 
    13431366            finally: 
     
    14401463                addOn = OrangeAddOnInstalled(addOnDir) 
    14411464            except Exception, e: 
    1442                 print "Add-on in directory '%s' has no valid descriptor (addon.xml): %s" % (addOnDir, e) 
     1465                warnings.warn("Add-on in directory '%s' has no valid descriptor (addon.xml): %s" % (addOnDir, e), Warning, 0) 
    14431466                continue 
    14441467            if addOn.id in installed_addons: 
    1445                 print "Add-on in directory '%s' has the same ID as the addon in '%s'!" % (addOnDir, installed_addons[addOn.id].directory) 
     1468                warnings.warn("Add-on in directory '%s' has the same ID as the addon in '%s'!" % (addOnDir, installed_addons[addOn.id].directory), Warning, 0) 
    14461469                continue 
    14471470            installed_addons[addOn.id] = addOn 
     
    14521475    within Canvas settings directory.  
    14531476    """ 
    1454     canvasSettingsDir = os.path.realpath(orngEnviron.directoryNames["canvasSettingsDir"]) 
     1477    canvasSettingsDir = os.path.realpath(Orange.misc.environ.canvas_settings_dir) 
    14551478    listFileName = os.path.join(canvasSettingsDir, "repositoryList.pickle") 
    14561479    return listFileName 
     
    14801503            file.close() 
    14811504        except Exception, e: 
    1482             print "Unable to load repository list! Error: %s" % e 
     1505            warnings.warn("Unable to load repository list! Error: %s" % e, Warning, 0) 
    14831506    try: 
    14841507        update_default_repositories(refresh=refresh) 
    14851508    except Exception, e: 
    1486         print "Unable to refresh default repositories: %s" % (e) 
     1509        warnings.warn("Unable to refresh default repositories: %s" % (e), Warning, 0) 
    14871510 
    14881511    if refresh: 
     
    14921515                r.refreshdata(force=False) 
    14931516            except Exception, e: 
    1494                 print "Unable to refresh repository %s! Error: %s" % (r.name, e) 
     1517                warnings.warn("Unable to refresh repository %s! Error: %s" % (r.name, e), Warning, 0) 
    14951518    save_repositories() 
    14961519 
     
    15061529        cPickle.dump(available_repositories, open(listFileName, 'wb')) 
    15071530    except Exception, e: 
    1508         print "Unable to save repository list! Error: %s" % e 
     1531        warnings.warn("Unable to save repository list! Error: %s" % e, Warning, 0) 
    15091532     
    15101533 
     
    15701593                                                           else platform.machine(), 
    15711594                                                           ".".join(map(str, sys.version_info[:2])) )) )]: 
    1572             if os.path.isdir(p) and not any([orngEnviron.samepath(p, x) 
     1595            if os.path.isdir(p) and not any([Orange.misc.environ.samepath(p, x) 
    15731596                                             for x in sys.path]): 
    15741597                if p not in sys.path: 
     
    15961619    """ 
    15971620    Install an add-on from given .oao package. Installation means unpacking the 
    1598     .oao file to an appropriate directory (:obj:`orngEnviron.addOnsDirUser` or 
    1599     :obj:`orngEnviron.addOnsDirSys`, depending on the 
     1621    .oao file to an appropriate directory (:obj:`Orange.misc.environ.add_ons_dir_user` or 
     1622    :obj:`Orange.misc.environ.add_ons_dir`, depending on the
    16001623    :obj:`global_install` parameter), creating an 
    16011624    :class:`OrangeAddOnInstalled` instance and adding this object into the 
     
    16231646                raise InstallationException("Refusing to install unsafe package: it contains file named '%s'!" % filename) 
    16241647         
    1625         root = orngEnviron.addOnsDirSys if global_install else orngEnviron.addOnsDirUser 
     1648        root = Orange.misc.environ.add_ons_dir if global_install else Orange.misc.environ.add_ons_dir_user 
    16261649         
    16271650        try: 
     
    17141737    add-on installation directory (:obj:`orngEnviron.addOnsDirUser`). 
    17151738    """ 
    1716     load_installed_addons_from_dir(orngEnviron.addOnsDirSys) 
    1717     load_installed_addons_from_dir(orngEnviron.addOnsDirUser) 
     1739    load_installed_addons_from_dir(Orange.misc.environ.add_ons_dir) 
     1740    load_installed_addons_from_dir(Orange.misc.environ.add_ons_dir_user) 
    17181741 
    17191742def refresh_addons(reload_path=False): 
     
    17431766     
    17441767def __read_addon_lists(userOnly=False): 
    1745     return __read_addons_list(os.path.join(orngEnviron.orangeSettingsDir, "add-ons.txt"), 
     1768    return __read_addons_list(os.path.join(Orange.misc.environ.orange_settings_dir, "add-ons.txt"), 
    17461769                              False) + ([] if userOnly else 
    1747                                         __read_addons_list(os.path.join(orngEnviron.orangeDir, "add-ons.txt"), 
     1770                                        __read_addons_list(os.path.join(Orange.misc.environ.install_dir, "add-ons.txt"), 
    17481771                                                           True)) 
    17491772 
    17501773def __write_addon_lists(addons, user_only=False): 
    1751     file(os.path.join(orngEnviron.orangeSettingsDir, "add-ons.txt"), "wt").write("\n".join(["%s\t%s" % (a.name, a.directory) for a in addons if not a.systemwide])) 
     1774    file(os.path.join(Orange.misc.environ.orange_settings_dir, "add-ons.txt"), "wt").write("\n".join(["%s\t%s" % (a.name, a.directory) for a in addons if not a.systemwide])) 
    17521775    if not user_only: 
    1753         file(os.path.join(orngEnviron.orangeDir        , "add-ons.txt"), "wt").write("\n".join(["%s\t%s" % (a.name, a.directory) for a in addons if     a.systemwide])) 
     1776        file(os.path.join(Orange.misc.environ.install_dir        , "add-ons.txt"), "wt").write("\n".join(["%s\t%s" % (a.name, a.directory) for a in addons if     a.systemwide])) 
    17541777 
    17551778def register_addon(name, path, add = True, refresh=True, systemwide=False): 
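
Note on the wrapped warning messages in this file: `%` binds more tightly than `+`, so in an expression such as `"a %s " + "b %s" % (x, y)` only the last literal is formatted. The sketch below illustrates the difference. `format_ignored` and `broken_format` are hypothetical helpers written for this note, not part of Orange. Adjacent string literals are joined at compile time, so a single `%` then formats the whole message.

```python
import warnings


def format_ignored(addon_file, repo_name, error):
    """Hypothetical helper: build the 'Ignoring ...' message.

    Adjacent string literals are concatenated at compile time, so the
    single % below formats the whole message.
    """
    return ("Ignoring '%s' in repository '%s' "
            "because of an error: %s" % (addon_file, repo_name, error))


def broken_format(addon_file, repo_name, error):
    """Same message built with '+': % applies only to the last literal."""
    return ("Ignoring '%s' in repository '%s' " +
            "because of an error: %s" % (addon_file, repo_name, error))


message = format_ignored("a.oao", "main", "bad archive")

try:
    broken_format("a.oao", "main", "bad archive")
    broken_raised = False
except TypeError:
    # The last literal has one placeholder but receives three arguments.
    broken_raised = True

# warnings.warn routes the message through the warnings machinery,
# so callers can filter or capture it instead of reading stdout.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    warnings.warn(message, Warning)
```

Capturing with `warnings.catch_warnings(record=True)` is one way a caller or a test could inspect these messages now that they are no longer printed.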
  • orange/Orange/network/__init__.py

    r8264 r8305  
    570570    import warnings 
    571571    warnings.warn("Warning: install networkx to use the 'Orange.network' module.")  
     572 
     573import community 
    572574 
    573575class MdsTypeClass(): 
  • orange/Orange/network/network.py

    r8264 r8305  
     1import copy 
     2import math 
     3import numpy 
    14import networkx as nx 
     5 
     6import Orange 
     7import orangeom 
     8 
    29import readwrite 
    3 import Orange 
    410 
    511from networkx import algorithms  
    612from networkx.classes import function 
     13 
     14 
     15class MdsTypeClass(): 
     16    def __init__(self): 
     17        self.componentMDS = 0 
     18        self.exactSimulation = 1 
     19        self.MDS = 2 
     20 
     21MdsType = MdsTypeClass() 
     22 
    723 
    824class BaseGraph(): 
     
    1329         
    1430    def items(self): 
    15         if len(self._items) != self.number_of_nodes(): 
     31        if self._items is not None and \ 
     32                        len(self._items) != self.number_of_nodes(): 
    1633            print "Warning: items length does not match the number of nodes." 
    1734             
     
    1936     
    2037    def set_items(self, items=None): 
    21         if items: 
     38        if items is not None: 
    2239            if not isinstance(items, Orange.data.Table): 
    2340                raise TypeError('items must be of type \'Orange.data.Table\'') 
     
    2845         
    2946    def links(self): 
    30         if len(self._links) != self.number_of_edges(): 
     47        if self._links is not None \ 
     48                    and len(self._links) != self.number_of_edges(): 
    3149            print "Warning: links length does not match the number of edges." 
    3250             
    3351        return self._links 
    3452     
    35     def set_links(self, links): 
    36         if links: 
     53    def set_links(self, links=None): 
     54        if links is not None: 
    3755            if not isinstance(links, Orange.data.Table): 
    3856                raise TypeError('links must be of type \'Orange.data.Table\'') 
     
    4260        self._links = links 
    4361         
     62    def to_orange_network(self): 
     63        """Convert the network to Orange NetworkX standard. All node IDs are transformed to range [0, no_of_nodes - 1]."""  
     64        if isinstance(self, Orange.network.Graph): 
     65            G = Orange.network.Graph() 
     66        elif isinstance(self, Orange.network.DiGraph): 
     67            G = Orange.network.DiGraph() 
     68        elif isinstance(self, Orange.network.MultiGraph): 
     69            G = Orange.network.MultiGraph() 
     70        elif isinstance(self, Orange.network.MultiDiGraph): 
      71            G = Orange.network.MultiDiGraph()
      72        else:
      73            raise TypeError('Unsupported graph type: %s' % type(self).__name__)
     74         
     75        node_list = sorted(self.nodes()) 
     76        node_to_index = dict(zip(node_list, range(self.number_of_nodes()))) 
     77        index_to_node = dict(zip(range(self.number_of_nodes()), node_list)) 
     78         
     79        G.add_nodes_from(zip(range(self.number_of_nodes()), [copy.deepcopy(self.node[nid]) for nid in node_list])) 
     80        G.add_edges_from(((node_to_index[u], node_to_index[v], copy.deepcopy(self.edge[u][v])) for u,v in self.edges())) 
     81         
     82        for id in G.node.keys(): 
     83            G.node[id]['old_id'] = index_to_node[id] 
     84         
     85        if self.items(): 
     86            G.set_items(self.items()) 
     87 
     88        if self.links(): 
     89            G.set_links(self.links()) 
     90         
     91        return G 
     92         
    4493    ### TODO: OVERRIDE METHODS THAT CHANGE GRAPH STRUCTURE, add warning prints 
     94     
     95    def items_vars(self): 
     96        """Return a list of features in network items.""" 
     97        vars = [] 
     98        if (self._items is not None): 
     99            if isinstance(self._items, Orange.data.Table): 
     100                vars = list(self._items.domain.variables) 
     101             
     102                metas = self._items.domain.getmetas(0) 
     103                vars.extend(var for i, var in metas.iteritems()) 
     104        return vars 
     105     
     106    def links_vars(self): 
     107        """Return a list of features in network links.""" 
     108        vars = [] 
     109        if (self._links is not None): 
     110            if isinstance(self._links, Orange.data.Table): 
     111                vars = list(self._links.domain.variables) 
     112             
     113                metas = self._links.domain.getmetas(0) 
     114                vars.extend(var for i, var in metas.iteritems()) 
     115        return [x for x in vars if str(x.name) != 'u' and str(x.name) != 'v']     
    45116     
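
The relabelling step in `to_orange_network` can be sketched without NetworkX. This is an illustration with plain dictionaries, not the Orange implementation; `relabel_consecutive` and its arguments are hypothetical names, and node IDs are assumed to be sortable.

```python
def relabel_consecutive(nodes, edges):
    """Map arbitrary sortable node IDs to 0..n-1 (hypothetical helper).

    Returns the relabelled edges and, for each new index, the original
    ID, mirroring the 'old_id' attribute stored by to_orange_network.
    """
    node_list = sorted(nodes)
    node_to_index = dict((node, i) for i, node in enumerate(node_list))
    new_edges = [(node_to_index[u], node_to_index[v]) for u, v in edges]
    old_ids = dict(enumerate(node_list))
    return new_edges, old_ids


new_edges, old_ids = relabel_consecutive(["c", "a", "b"],
                                         [("a", "c"), ("c", "b")])
```

Keeping the reverse mapping (`old_ids`) lets results computed on the consecutive indices be reported in terms of the original node IDs.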
    46117class Graph(BaseGraph, nx.Graph): 
     
    75146         
    76147    __init__.__doc__ = nx.MultiDiGraph.__init__.__doc__ 
     148 
     149class GraphLayout(orangeom.GraphLayout): 
     150     
     151    """A graph layout optimization class.""" 
     152     
     153    def __init__(self): 
     154        self.graph = None 
     155        self.items_matrix = None 
     156         
     157    def set_graph(self, graph=None, positions=None): 
     158        """Initialize graph structure 
     159         
     160        :param graph: NetworkX graph 
     161         
     162        """ 
     163        self.graph = graph 
     164         
     165        if positions is not None and len(positions) == graph.number_of_nodes(): 
     166            orangeom.GraphLayout.set_graph(self, graph, positions) 
     167        else: 
     168            orangeom.GraphLayout.set_graph(self, graph) 
     169             
     170    def random(self): 
     171        orangeom.GraphLayout.random(self) 
     172         
     173    def fr(self, steps, temperature, coolFactor=0, weighted=False): 
     174        return orangeom.GraphLayout.fr(self, steps, temperature, coolFactor, weighted) 
     175         
     176    def fr_radial(self, center, steps, temperature): 
     177        return orangeom.GraphLayout.fr_radial(self, center, steps, temperature) 
     178     
     179    def circular_original(self): 
     180        orangeom.GraphLayout.circular_original(self) 
     181     
     182    def circular_random(self): 
     183        orangeom.GraphLayout.circular_random(self) 
     184     
     185    def circular_crossing_reduction(self): 
     186        orangeom.GraphLayout.circular_crossing_reduction(self) 
     187     
     188    def get_vertices_in_rect(self, x1, y1, x2, y2): 
     189        return orangeom.GraphLayout.get_vertices_in_rect(self, x1, y1, x2, y2) 
     190     
     191    def closest_vertex(self, x, y): 
     192        return orangeom.GraphLayout.closest_vertex(self, x, y) 
     193     
     194    def vertex_distances(self, x, y): 
     195        return orangeom.GraphLayout.vertex_distances(self, x, y) 
     196     
     197    def rotate_vertices(self, components, phi):  
      198        """Rotate each network component by its given angle.
     199         
     200        :param components: list of network components 
     201        :type components: list of lists of vertex indices 
     202        :param phi: list of component rotation angles (unit: radians) 
     203        """   
     204        #print phi  
     205        for i in range(len(components)): 
     206            if phi[i] == 0: 
     207                continue 
     208             
     209            component = components[i] 
     210             
     211            x = self.coors[0][component] 
     212            y = self.coors[1][component] 
     213             
     214            x_center = x.mean() 
     215            y_center = y.mean() 
     216             
     217            x = x - x_center 
     218            y = y - y_center 
     219             
     220            r = numpy.sqrt(x ** 2 + y ** 2) 
     221            fi = numpy.arctan2(y, x) 
     222             
     223            fi += phi[i] 
     224            #fi += factor * M[i] * numpy.pi / 180 
     225                 
     226            x = r * numpy.cos(fi) 
     227            y = r * numpy.sin(fi) 
     228             
     229            self.coors[0][component] = x + x_center 
     230            self.coors[1][component] = y + y_center 
     231     
     232    def rotate_components(self, maxSteps=100, minMoment=0.000000001,  
     233                          callbackProgress=None, callbackUpdateCanvas=None): 
     234        """Rotate the network components using a spring model.""" 
      235        if self.items_matrix is None:
      236            return 1
      237         
      238        if self.graph is None:
     239            return 1 
     240         
     241        if self.items_matrix.dim != self.graph.number_of_nodes(): 
     242            return 1 
     243         
     244        self.stopRotate = 0 
     245         
     246        # rotate only components with more than one vertex 
     247        components = [component for component \ 
     248            in Orange.network.nx.algorithms.components.connected_components(self.graph) \ 
     249            if len(component) > 1] 
     250        vertices = set(range(self.graph.number_of_nodes())) 
     251        step = 0 
     252        M = [1] 
     253        temperature = [[30.0, 1] for i in range(len(components))] 
     254        dirChange = [0] * len(components) 
     255        while step < maxSteps and (max(M) > minMoment or \ 
     256                                min(M) < -minMoment) and not self.stopRotate: 
     257            M = [0] * len(components)  
     258             
     259            for i in range(len(components)): 
     260                component = components[i] 
     261                 
     262                outer_vertices = vertices - set(component) 
     263                 
     264                x = self.coors[0][component] 
     265                y = self.coors[1][component] 
     266                 
     267                x_center = x.mean() 
     268                y_center = y.mean() 
     269                 
     270                for j in range(len(component)): 
     271                    u = component[j] 
     272 
     273                    for v in outer_vertices: 
     274                        d = self.items_matrix[u, v] 
     275                        u_x = self.coors[0][u] 
     276                        u_y = self.coors[1][u] 
     277                        v_x = self.coors[0][v] 
     278                        v_y = self.coors[1][v] 
     279                        L = [(u_x - v_x), (u_y - v_y)] 
     280                        R = [(u_x - x_center), (u_y - y_center)] 
     281                        e = math.sqrt((v_x - x_center) ** 2 + \ 
     282                                      (v_y - y_center) ** 2) 
     283                         
     284                        M[i] += (1 - d) / (e ** 2) * numpy.cross(R, L) 
     285             
     286            tmpM = numpy.array(M) 
     287            #print numpy.min(tmpM), numpy.max(tmpM),numpy.average(tmpM),numpy.min(numpy.abs(tmpM)) 
     288             
     289            phi = [0] * len(components) 
     290            #print "rotating", temperature, M 
     291            for i in range(len(M)): 
     292                if M[i] > 0: 
     293                    if temperature[i][1] < 0: 
     294                        temperature[i][0] = temperature[i][0] * 5 / 10 
     295                        temperature[i][1] = 1 
     296                        dirChange[i] += 1 
     297                         
     298                    phi[i] = temperature[i][0] * numpy.pi / 180 
     299                elif M[i] < 0:   
     300                    if temperature[i][1] > 0: 
     301                        temperature[i][0] = temperature[i][0] * 5 / 10 
     302                        temperature[i][1] = -1 
     303                        dirChange[i] += 1 
     304                     
     305                    phi[i] = -temperature[i][0] * numpy.pi / 180 
     306             
      307            # stop rotating when every phi is too small to notice the rotation
      308            if max(abs(p) for p in phi) < numpy.pi / 1800:
     309                #print "breaking" 
     310                break 
     311             
     312            self.rotate_vertices(components, phi) 
     313            if callbackUpdateCanvas: callbackUpdateCanvas() 
     314            if callbackProgress : callbackProgress(min([dirChange[i] for i \ 
     315                                    in range(len(dirChange)) if M[i] != 0]), 9) 
     316            step += 1 
     317     
     318    def mds_update_data(self, components, mds, callbackUpdateCanvas): 
     319        """Translate and rotate the network components to computed positions.""" 
     320        component_props = [] 
     321        x_mds = [] 
     322        y_mds = [] 
     323        phi = [None] * len(components) 
     324        self.diag_coors = math.sqrt(( \ 
     325                    min(self.coors[0]) - max(self.coors[0]))**2 + \ 
     326                    (min(self.coors[1]) - max(self.coors[1]))**2) 
     327         
     328        if self.mdsType == MdsType.MDS: 
     329            x = [mds.points[u][0] for u in range(self.graph.number_of_nodes())] 
     330            y = [mds.points[u][1] for u in range(self.graph.number_of_nodes())] 
     331            self.coors[0][range(self.graph.number_of_nodes())] =  x 
     332            self.coors[1][range(self.graph.number_of_nodes())] =  y 
     333            if callbackUpdateCanvas: 
     334                callbackUpdateCanvas() 
     335            return 
     336         
     337        for i in range(len(components)):     
     338            component = components[i] 
     339             
     340            if len(mds.points) == len(components):  # if average linkage before 
     341                x_avg_mds = mds.points[i][0] 
     342                y_avg_mds = mds.points[i][1] 
     343            else:                                   # if not average linkage before 
     344                x = [mds.points[u][0] for u in component] 
     345                y = [mds.points[u][1] for u in component] 
     346         
     347                x_avg_mds = sum(x) / len(x)  
     348                y_avg_mds = sum(y) / len(y) 
     349                # compute rotation angle 
     350                c = [numpy.linalg.norm(numpy.cross(mds.points[u], \ 
     351                            [self.coors[0][u],self.coors[1][u]])) \ 
     352                            for u in component] 
     353                n = [numpy.vdot([self.coors[0][u], \ 
     354                                 self.coors[1][u]], \ 
     355                                 [self.coors[0][u], \ 
     356                                  self.coors[1][u]]) for u in component] 
     357                phi[i] = sum(c) / sum(n) 
     358                #print phi 
     359             
     360            x = self.coors[0][component] 
     361            y = self.coors[1][component] 
     362             
     363            x_avg_graph = sum(x) / len(x) 
     364            y_avg_graph = sum(y) / len(y) 
     365             
     366            x_mds.append(x_avg_mds)  
     367            y_mds.append(y_avg_mds) 
     368 
     369            component_props.append((x_avg_graph, y_avg_graph, \ 
     370                                    x_avg_mds, y_avg_mds, phi)) 
     371         
     372        w = max(self.coors[0]) - min(self.coors[0]) 
     373        h = max(self.coors[1]) - min(self.coors[1]) 
     374        d = math.sqrt(w**2 + h**2) 
     375        #d = math.sqrt(w*h) 
     376        e = [math.sqrt((self.coors[0][u] - self.coors[0][v])**2 +  
     377                  (self.coors[1][u] - self.coors[1][v])**2) for  
     378                  (u, v) in self.graph.edges()] 
     379         
     380        if self.scalingRatio == 0: 
     381            pass 
     382        elif self.scalingRatio == 1: 
     383            self.mdsScaleRatio = d 
     384        elif self.scalingRatio == 2: 
     385            self.mdsScaleRatio = d / sum(e) * float(len(e)) 
     386        elif self.scalingRatio == 3: 
     387            self.mdsScaleRatio = 1 / sum(e) * float(len(e)) 
     388        elif self.scalingRatio == 4: 
     389            self.mdsScaleRatio = w * h 
     390        elif self.scalingRatio == 5: 
     391            self.mdsScaleRatio = math.sqrt(w * h) 
     392        elif self.scalingRatio == 6: 
     393            self.mdsScaleRatio = 1 
     394        elif self.scalingRatio == 7: 
     395            e_fr = 0 
     396            e_count = 0 
     397            for i in range(self.graph.number_of_nodes()): 
     398                for j in range(i + 1, self.graph.number_of_nodes()): 
     399                    x1 = self.coors[0][i] 
     400                    y1 = self.coors[1][i] 
     401                    x2 = self.coors[0][j] 
     402                    y2 = self.coors[1][j] 
     403                    e_fr += math.sqrt((x1-x2)**2 + (y1-y2)**2) 
     404                    e_count += 1 
     405            self.mdsScaleRatio = e_fr / e_count 
     406        elif self.scalingRatio == 8: 
     407            e_fr = 0 
     408            e_count = 0 
     409            for i in range(len(components)): 
     410                for j in range(i + 1, len(components)): 
     411                    x_avg_graph_i, y_avg_graph_i, x_avg_mds_i, \ 
     412                    y_avg_mds_i, phi_i = component_props[i] 
     413                    x_avg_graph_j, y_avg_graph_j, x_avg_mds_j, \ 
     414                    y_avg_mds_j, phi_j = component_props[j] 
     415                    e_fr += math.sqrt((x_avg_graph_i-x_avg_graph_j)**2 + \ 
     416                                      (y_avg_graph_i-y_avg_graph_j)**2) 
     417                    e_count += 1 
     418            self.mdsScaleRatio = e_fr / e_count        
     419        elif self.scalingRatio == 9: 
     420            e_fr = 0 
     421            e_count = 0 
     422            for i in range(len(components)):     
     423                component = components[i] 
     424                x = self.coors[0][component] 
     425                y = self.coors[1][component] 
      426                for k in range(len(x)):
      427                    for m in range(k + 1, len(y)):
      428                        x1 = x[k]
      429                        y1 = y[k]
      430                        x2 = x[m]
      431                        y2 = y[m]
     432                        e_fr += math.sqrt((x1-x2)**2 + (y1-y2)**2) 
     433                        e_count += 1 
     434            self.mdsScaleRatio = e_fr / e_count 
     435         
     436        diag_mds =  math.sqrt((max(x_mds) - min(x_mds))**2 + (max(y_mds) - \ 
     437                                                              min(y_mds))**2) 
     438        e = [math.sqrt((self.coors[0][u] - self.coors[0][v])**2 +  
     439                  (self.coors[1][u] - self.coors[1][v])**2) for  
     440                  (u, v) in self.graph.edges()] 
     441        e = sum(e) / float(len(e)) 
     442         
     443        x = [mds.points[u][0] for u in range(len(mds.points))] 
     444        y = [mds.points[u][1] for u in range(len(mds.points))] 
     445        w = max(x) - min(x) 
     446        h = max(y) - min(y) 
     447        d = math.sqrt(w**2 + h**2) 
     448         
     449        if len(x) == 1: 
     450            r = 1 
     451        else: 
     452            if self.scalingRatio == 0: 
     453                r = self.mdsScaleRatio / d * e 
     454            elif self.scalingRatio == 1: 
     455                r = self.mdsScaleRatio / d 
     456            elif self.scalingRatio == 2: 
     457                r = self.mdsScaleRatio / d * e 
     458            elif self.scalingRatio == 3: 
     459                r = self.mdsScaleRatio * e 
     460            elif self.scalingRatio == 4: 
     461                r = self.mdsScaleRatio / (w * h) 
     462            elif self.scalingRatio == 5: 
     463                r = self.mdsScaleRatio / math.sqrt(w * h) 
     464            elif self.scalingRatio == 6: 
     465                r = 1 / math.sqrt(self.graph.number_of_nodes()) 
     466            elif self.scalingRatio == 7: 
     467                e_mds = 0 
     468                e_count = 0 
     469                for i in range(len(mds.points)): 
     470                    for j in range(i): 
     471                        x1 = mds.points[i][0] 
     472                        y1 = mds.points[i][1] 
     473                        x2 = mds.points[j][0] 
     474                        y2 = mds.points[j][1] 
     475                        e_mds += math.sqrt((x1-x2)**2 + (y1-y2)**2) 
     476                        e_count += 1 
     477                r = self.mdsScaleRatio / e_mds * e_count 
     478            elif self.scalingRatio == 8: 
     479                e_mds = 0 
     480                e_count = 0 
     481                for i in range(len(components)): 
     482                    for j in range(i + 1, len(components)): 
     483                        x_avg_graph_i, y_avg_graph_i, x_avg_mds_i, \ 
     484                        y_avg_mds_i, phi_i = component_props[i] 
     485                        x_avg_graph_j, y_avg_graph_j, x_avg_mds_j, \ 
     486                        y_avg_mds_j, phi_j = component_props[j] 
     487                        e_mds += math.sqrt((x_avg_mds_i-x_avg_mds_j)**2 + \ 
     488                                           (y_avg_mds_i-y_avg_mds_j)**2) 
     489                        e_count += 1 
     490                r = self.mdsScaleRatio / e_mds * e_count 
     491            elif self.scalingRatio == 9: 
     492                e_mds = 0 
     493                e_count = 0 
     494                for i in range(len(mds.points)): 
     495                    for j in range(i): 
     496                        x1 = mds.points[i][0] 
     497                        y1 = mds.points[i][1] 
     498                        x2 = mds.points[j][0] 
     499                        y2 = mds.points[j][1] 
     500                        e_mds += math.sqrt((x1-x2)**2 + (y1-y2)**2) 
     501                        e_count += 1 
     502                r = self.mdsScaleRatio / e_mds * e_count 
     503                 
     504            #r = self.mdsScaleRatio / d 
     505            #print "d", d, "r", r 
     506            #r = self.mdsScaleRatio / math.sqrt(self.graph.number_of_nodes()) 
     507             
     508        for i in range(len(components)): 
     509            component = components[i] 
     510            x_avg_graph, y_avg_graph, x_avg_mds, \ 
     511            y_avg_mds, phi = component_props[i] 
     512             
     513#            if phi[i]:  # rotate vertices 
     514#                #print "rotate", i, phi[i] 
     515#                r = numpy.array([[numpy.cos(phi[i]), -numpy.sin(phi[i])], [numpy.sin(phi[i]), numpy.cos(phi[i])]])  #rotation matrix 
     516#                c = [x_avg_graph, y_avg_graph]  # center of mass in FR coordinate system 
     517#                v = [numpy.dot(numpy.array([self.coors[0][u], self.coors[1][u]]) - c, r) + c for u in component] 
     518#                self.coors[0][component] = [u[0] for u in v] 
     519#                self.coors[1][component] = [u[1] for u in v] 
     520                 
     521            # translate vertices 
     522            if not self.rotationOnly: 
     523                self.coors[0][component] = \ 
     524                (self.coors[0][component] - x_avg_graph) / r + x_avg_mds 
     525                self.coors[1][component] = \ 
     526                (self.coors[1][component] - y_avg_graph) / r + y_avg_mds 
     527                
     528        if callbackUpdateCanvas: 
     529            callbackUpdateCanvas() 
     530     
     531    def mds_callback(self, a, b=None): 
     532        """Refresh the UI when running MDS on network components.""" 
     533        if not self.mdsStep % self.mdsRefresh: 
     534            self.mds_update_data(self.mdsComponentList,  
     535                               self.mds,  
     536                               self.callbackUpdateCanvas) 
     537             
     538            if self.mdsType == MdsType.exactSimulation: 
     539                self.mds.points = [[self.coors[0][i], \ 
     540                                    self.coors[1][i]] \ 
     541                                    for i in range(len(self.coors))] 
     542                self.mds.freshD = 0 
     543             
     544            if self.callbackProgress != None: 
     545                self.callbackProgress(self.mds.avg_stress, self.mdsStep) 
     546                 
     547        self.mdsStep += 1 
     548 
     549        if self.stopMDS: 
     550            return 0 
     551        else: 
     552            return 1 
     553             
     554    def mds_components(self, mdsSteps, mdsRefresh, callbackProgress=None, \ 
     555                       callbackUpdateCanvas=None, torgerson=0, \ 
     556                       minStressDelta=0, avgLinkage=False, rotationOnly=False,\ 
     557                       mdsType=MdsType.componentMDS, scalingRatio=0, \ 
     558                       mdsFromCurrentPos=0): 
     559        """Position the network components according to similarities among  
     560        them. 
     561         
     562        """ 
     563 
     564        if self.items_matrix == None: 
     565            self.information('Set distance matrix to input signal') 
     566            return 1 
     567         
     568        if self.graph == None: 
     569            return 1 
     570         
     571        if self.items_matrix.dim != self.graph.number_of_nodes(): 
     572            return 1 
     573         
     574        self.mdsComponentList = Orange.network.nx.algorithms.components.connected_components(self.graph) 
     575        self.mdsRefresh = mdsRefresh 
     576        self.mdsStep = 0 
     577        self.stopMDS = 0 
     578        self.items_matrix.matrixType = Orange.core.SymMatrix.Symmetric 
     579        self.diag_coors = math.sqrt((min(self.coors[0]) -  \ 
     580                                     max(self.coors[0]))**2 + \ 
     581                                     (min(self.coors[1]) - \ 
     582                                      max(self.coors[1]))**2) 
     583        self.rotationOnly = rotationOnly 
     584        self.mdsType = mdsType 
     585        self.scalingRatio = scalingRatio 
     586 
     587        w = max(self.coors[0]) - min(self.coors[0]) 
     588        h = max(self.coors[1]) - min(self.coors[1]) 
     589        d = math.sqrt(w**2 + h**2) 
     590        #d = math.sqrt(w*h) 
     591        e = [math.sqrt((self.coors[0][u] - self.coors[0][v])**2 +  
     592                  (self.coors[1][u] - self.coors[1][v])**2) for  
     593                  (u, v) in self.graph.edges()] 
     594        self.mdsScaleRatio = d / sum(e) * float(len(e)) 
     595        #print d / sum(e) * float(len(e)) 
     596         
     597        if avgLinkage: 
     598            matrix = self.items_matrix.avgLinkage(self.mdsComponentList) 
     599        else: 
     600            matrix = self.items_matrix 
     601         
     602        #if self.mds == None:  
     603        self.mds = Orange.projection.mds.MDS(matrix) 
     604         
     605        if mdsFromCurrentPos: 
     606            if avgLinkage: 
     607                for u, c in enumerate(self.mdsComponentList): 
     608                    x = sum(self.coors[0][c]) / len(c) 
     609                    y = sum(self.coors[1][c]) / len(c) 
     610                    self.mds.points[u][0] = x 
     611                    self.mds.points[u][1] = y 
     612            else: 
     613                for u in range(self.graph.number_of_nodes()): 
     614                    self.mds.points[u][0] = self.coors[0][u]  
     615                    self.mds.points[u][1] = self.coors[1][u] 
     616             
     617        # set min stress difference between 0.01 and 0.00001 
     618        self.minStressDelta = minStressDelta 
     619        self.callbackUpdateCanvas = callbackUpdateCanvas 
     620        self.callbackProgress = callbackProgress 
     621         
     622        if torgerson: 
     623            self.mds.Torgerson()  
     624 
     625        self.mds.optimize(mdsSteps, Orange.projection.mds.SgnRelStress, self.minStressDelta,\ 
     626                          progress_callback=self.mds_callback) 
     627        self.mds_update_data(self.mdsComponentList, self.mds, callbackUpdateCanvas) 
     628         
     629        if callbackProgress != None: 
     630            callbackProgress(self.mds.avg_stress, self.mdsStep) 
     631         
     632        del self.rotationOnly 
     633        del self.diag_coors 
     634        del self.mdsRefresh 
     635        del self.mdsStep 
     636        #del self.mds 
     637        del self.mdsComponentList 
     638        del self.minStressDelta 
     639        del self.callbackUpdateCanvas 
     640        del self.callbackProgress 
     641        del self.mdsType 
     642        del self.mdsScaleRatio 
     643        del self.scalingRatio 
     644        return 0 
     645 
     646    def mds_components_avg_linkage(self, mdsSteps, mdsRefresh, \ 
     647                                   callbackProgress=None, \ 
     648                                   callbackUpdateCanvas=None, torgerson=0, \ 
     649                                   minStressDelta = 0, scalingRatio=0,\ 
     650                                   mdsFromCurrentPos=0): 
     651        return self.mds_components(mdsSteps, mdsRefresh, callbackProgress, \ 
     652                                   callbackUpdateCanvas, torgerson, \ 
     653                                   minStressDelta, True, \ 
     654                                   scalingRatio=scalingRatio, \ 
     655                                   mdsFromCurrentPos=mdsFromCurrentPos) 
     656     
     657    ########################################################################## 
     658    ### BEGIN: DEPRECATED METHODS (TO DELETE IN ORANGE 3.0)                ### 
     659    ########################################################################## 
     660     
     661     
     662     
     663     
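
Note on the `scalingRatio == 7` and `scalingRatio == 9` branches above: both accumulate the Euclidean distance over every unordered pair of MDS points and then compute `self.mdsScaleRatio / e_mds * e_count`, which is the scale ratio divided by the mean pairwise distance. The sketch below restates that arithmetic in isolation. The function names `mean_pairwise_distance` and `scale_ratio` are hypothetical and not part of Orange, and the points are assumed to be 2-D.

```python
import math
from itertools import combinations


def mean_pairwise_distance(points):
    """Average Euclidean distance over all unordered pairs of 2-D points."""
    total = 0.0
    count = 0
    for (x1, y1), (x2, y2) in combinations(points, 2):
        total += math.sqrt((x1 - x2) ** 2 + (y1 - y2) ** 2)
        count += 1
    if count == 0:
        raise ValueError("need at least two points")
    return total / count


def scale_ratio(mds_scale_ratio, points):
    # Same value as `mdsScaleRatio / e_mds * e_count` in the diff,
    # written as a division by the mean distance.
    return mds_scale_ratio / mean_pairwise_distance(points)
```

If all points coincide, the mean distance is zero and the division fails; the code in the diff has the same property, so a caller may want to guard against it.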
  • orange/Orange/network/readwrite.py

    r8264 r8305  
    1 import Orange 
     1import os.path 
     2import warnings 
     3import itertools 
     4 
    25import networkx as nx 
    36import networkx.readwrite as rw 
    4 import warnings 
    5 import itertools 
     7from networkx.utils import _get_fh 
     8 
     9import orangeom 
     10import Orange 
    611import Orange.network 
    712 
    8 from networkx.utils import _get_fh 
    9  
    10 __all__ = ['generate_pajek', 'write_pajek', 'read_pajek', 'parse_pajek'] 
     13__all__ = ['read', 'generate_pajek', 'write_pajek', 'read_pajek', 'parse_pajek'] 
    1114 
    1215def _wrap(g): 
     
    1922    return g 
    2023 
    21 def read_pajek(path,encoding='UTF-8'): 
     24def read(path, encoding='UTF-8'): 
     25    #supported = ['.net', '.gml', '.gpickle', '.gz', '.bz2', '.graphml'] 
     26    supported = ['.net', '.gml', '.gpickle'] 
     27     
     28    if not os.path.isfile(path): 
     29        raise OSError('File %s does not exist.' % path) 
     30     
     31    root, ext = os.path.splitext(path) 
     32    if not ext in supported: 
     33        raise ValueError('Extension %s is not supported.' % ext) 
     34     
     35    if ext == '.net': 
     36        return read_pajek(path, encoding) 
     37     
     38    if ext == '.gml': 
     39        return read_gml(path, encoding) 
     40     
     41    if ext == '.gpickle': 
     42        return read_gpickle(path) 
     43 
     44def write(G, path, encoding='UTF-8'): 
     45    #supported = ['.net', '.gml', '.gpickle', '.gz', '.bz2', '.graphml'] 
     46    supported = ['.net', '.gml', '.gpickle'] 
     47     
     48    root, ext = os.path.splitext(path) 
     49    if not ext in supported: 
     50        raise ValueError('Extension %s is not supported. Use %s.' % (ext, ', '.join(supported))) 
     51     
     52    if ext == '.net': 
     53        write_pajek(G, path, encoding) 
     54         
     55    if ext == '.gml': 
     56        write_gml(G, path) 
     57         
     58    if ext == '.gpickle': 
     59        write_gpickle(G, path) 
     60         
     61    if G.items() is not None: 
     62        G.items().save(root + '_items.tab') 
     63         
     64    if G.links() is not None: 
     65        G.links().save(root + '_links.tab') 
     66 
     67def read_gpickle(path): 
     68    return _wrap(rw.read_gpickle(path)) 
     69 
     70def write_gpickle(G, path): 
     71    rw.write_gpickle(G, path) 
     72 
     73def read_pajek(path, encoding='UTF-8'): 
     74    """  
     75    Read Pajek file. 
    2276    """ 
    23     A copy&paste of networkx's function. Calls the local parse_pajek(). 
     77    edges, arcs, items = orangeom.GraphLayout().readPajek(path) 
     78    if len(arcs) > 0: 
     79        # directed graph 
     80        G = Orange.network.DiGraph() 
     81        G.add_nodes_from(range(len(items))) 
     82        G.add_edges_from(((u,v,{'weight':d}) for u,v,d in edges)) 
     83        G.add_edges_from(((v,u,{'weight':d}) for u,v,d in edges)) 
     84        G.add_edges_from(((u,v,{'weight':d}) for u,v,d in arcs)) 
     85        G.set_items(items) 
     86    else: 
     87        G = Orange.network.Graph() 
     88        G.add_nodes_from(range(len(items))) 
     89        G.add_edges_from(((u,v,{'weight':d}) for u,v,d in edges)) 
     90        G.set_items(items) 
     91         
     92    return G 
     93    #fh=_get_fh(path, 'rb') 
     94    #lines = (line.decode(encoding) for line in fh) 
     95    #return parse_pajek(lines) 
     96 
     97def write_pajek(G, path, encoding='UTF-8'): 
    2498    """ 
    25     fh=_get_fh(path, 'rb') 
    26     lines = (line.decode(encoding) for line in fh) 
    27     return parse_pajek(lines) 
     99    A copy&paste of networkx's function with some bugs fixed: 
     100     - call the new generate_pajek. 
     101    """ 
     102    fh=_get_fh(path, 'wb') 
     103    for line in generate_pajek(G): 
     104        line+='\n' 
     105        fh.write(line.encode(encoding)) 
    28106 
    29107def parse_pajek(lines): 
    30108    """ 
    31     A copy&paste of networkx's function with some bugs fixed: 
    32       - make it a Graph or DiGraph if there is no reason to have a Multi*, 
    33       - do not lose graph's name during its conversion. 
     109    Parse string in Pajek file format. 
    34110    """ 
    35     import shlex 
    36     from networkx.utils import is_string_like 
    37     multigraph=False 
    38     if is_string_like(lines): lines=iter(lines.split('\n')) 
    39     lines = iter([line.rstrip('\n') for line in lines]) 
    40     G=nx.MultiDiGraph() # are multiedges allowed in Pajek? assume yes 
    41     directed=True # assume this is a directed network for now 
    42     while lines: 
    43         try: 
    44             l=next(lines) 
    45         except: #EOF 
    46             break 
    47         if l.lower().startswith("*network"): 
    48             label,name=l.split(None, 1) 
    49             G.name=name 
    50         if l.lower().startswith("*vertices"): 
    51             nodelabels={} 
    52             l,nnodes=l.split() 
    53             for i in range(int(nnodes)): 
    54                 splitline=shlex.split(str(next(lines))) 
    55                 id,label=splitline[0:2] 
    56                 G.add_node(label) 
    57                 nodelabels[id]=label 
    58                 G.node[label]={'id':id} 
    59                 try:  
    60                     x,y,shape=splitline[2:5] 
    61                     G.node[label].update({'x':float(x), 
    62                                           'y':float(y), 
    63                                           'shape':shape}) 
    64                 except: 
    65                     pass 
    66                 extra_attr=zip(splitline[5::2],splitline[6::2]) 
    67                 G.node[label].update(extra_attr) 
    68         if l.lower().startswith("*edges") or l.lower().startswith("*arcs"): 
    69             if l.lower().startswith("*edge"): 
    70                # switch from multi digraph to multi graph 
    71                 G=nx.MultiGraph(G, name=G.name) 
    72             for l in lines: 
    73                 splitline=shlex.split(str(l)) 
    74                 ui,vi=splitline[0:2] 
    75                 u=nodelabels.get(ui,ui) 
    76                 v=nodelabels.get(vi,vi) 
    77                 # parse the data attached to this edge and put in a dictionary  
    78                 edge_data={} 
    79                 try: 
    80                     # there should always be a single value on the edge? 
    81                     w=splitline[2:3] 
    82                     edge_data.update({'weight':float(w[0])}) 
    83                 except: 
    84                     pass 
    85                     # if there isn't, just assign a 1 
    86 #                    edge_data.update({'value':1}) 
    87                 extra_attr=zip(splitline[3::2],splitline[4::2]) 
    88                 edge_data.update(extra_attr) 
    89                 if G.has_edge(u,v): 
    90                     multigraph=True 
    91                 G.add_edge(u,v,**edge_data) 
    92  
    93     if not multigraph: # use Graph/DiGraph if no parallel edges 
    94         if G.is_directed(): 
    95             G=nx.DiGraph(G, name=G.name) 
    96         else: 
    97             G=nx.Graph(G, name=G.name) 
    98     return _wrap(G) 
     111    return read_pajek(lines) 
    99112 
    100113def generate_pajek(G): 
     
    126139        s = ' '.join(map(make_str,(id,n,x,y,shape))) 
    127140        for k,v in na.items(): 
    128             s += ' %s %s'%(k,v) 
     141            if k != 'x' and k != 'y': 
     142                s += ' %s %s'%(k,v) 
    129143        yield s 
    130144 
     
    147161        yield s 
    148162         
    149 def write_pajek(G, path, encoding='UTF-8'): 
    150     """ 
    151     A copy&paste of networkx's function with some bugs fixed: 
    152      - call the new generate_pajek. 
    153     """ 
    154     fh=_get_fh(path, 'wb') 
    155     for line in generate_pajek(G): 
    156         line+='\n' 
    157         fh.write(line.encode(encoding)) 
     163def read_gml(path, encoding='latin-1', relabel=False): 
     164    return _wrap(rw.read_gml(path, encoding, relabel)) 
    158165 
    159 def parse_pajek_project(lines): 
    160     network_lines = [] 
    161     result = [] 
    162     for i, line in enumerate(itertools.chain(lines, ["*"])): 
    163         line_low = line.strip().lower() 
    164         if not line_low: 
    165             continue 
    166         if line_low[0] == "*" and not any(line_low.startswith(x) 
    167                                           for x in ["*vertices", "*arcs", "*edges"]): 
    168             if network_lines != []: 
    169                 result.append(parse_pajek(network_lines)) 
    170                 network_lines = [] 
    171         if line_low.startswith("*network") or network_lines != []: 
    172             network_lines.append(line) 
    173     return result 
    174  
    175 def read_pajek_project(path, encoding='UTF-8'): 
    176     fh = _get_fh(path, 'rb') 
    177     lines = (line.decode(encoding) for line in fh) 
    178     return parse_pajek_project(lines) 
     166def write_gml(G, path): 
     167    rw.write_gml(G, path) 
    179168 
    180169#read_pajek.__doc__ = rw.read_pajek.__doc__ 
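
Note on the new `read` and `write` functions above: both select a format handler from the file extension with a chain of `if ext == ...` tests. A table-driven variant is sketched here as one possible alternative, not as what the changeset does. `pick_handler` and the `READERS` table are hypothetical, and the lambdas are stand-ins for `read_pajek`, `read_gml` and `read_gpickle`.

```python
import os.path


def pick_handler(path, handlers):
    """Return the handler registered for the extension of `path`."""
    root, ext = os.path.splitext(path)
    try:
        return handlers[ext]
    except KeyError:
        raise ValueError('Extension %s is not supported. Use %s.'
                         % (ext, ', '.join(sorted(handlers))))


# Hypothetical stand-ins for the real reader functions.
READERS = {
    '.net': lambda path: ('pajek', path),
    '.gml': lambda path: ('gml', path),
    '.gpickle': lambda path: ('gpickle', path),
}
```

Because `os.path.splitext` only splits off the last suffix, a path such as `graph.net.gz` yields `.gz` and is rejected, which matches the commented-out `supported` list in the diff where compressed formats are not yet enabled.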
  • orange/OrangeCanvas/orngCanvasItems.py

    r8264 r8305  
    366366            self.instance = None 
    367367             
    368             import gc 
    369             gc.collect() 
    370              
    371368            self.scene().removeItem(self) 
    372369                 
  • orange/OrangeCanvas/orngOutput.py

    r8264 r8305  
    1616except NameError: 
    1717    __DISABLE_OUTPUT__ = False 
    18      
    19 def thread_safe_discard(func): 
    20     """  
    21     """ 
    22     from functools import wraps 
    23     @wraps(func) 
    24     def safe_wrapper(self, *args, **kwargs): 
    25         if not hasattr(self, "_thread_safe_thread"): 
    26             self._thread_safe_thread = self.thread() 
    27         if QThread.currentThread() is not self._thread_safe_thread: 
    28             print >> sys.stderr, "Calling", func, "from the wrong thread.", "Discarding the call!" 
    29         else: 
    30             return func(self, *args, **kwargs) 
    31     return safe_wrapper 
    3218 
    3319def thread_safe_queue(func): 
     
    3723    @wraps(func) 
    3824    def safe_wrapper(self, *args, **kwargs): 
    39         if not hasattr(self, "_thread_safe_queue"): 
     25        if not hasattr(self, "_thread_safe_thread"):  
    4026            self._thread_safe_thread = self.thread() 
    4127        if QThread.currentThread() is not self._thread_safe_thread: 
    42             print >> sys.stderr, "Calling", func, "from the wrong thread.", "Queuing the call!" 
     28            print >> sys.stderr, "Calling", func, "with", args, kwargs, "from the wrong thread.", "Queuing the call!" 
    4329            QMetaObject.invokeMethod(self, "queuedInvoke", Qt.QueuedConnection, Q_ARG("PyQt_PyObject", partial(safe_wrapper, self, *args, **kwargs))) 
    4430        else: 
     
    194180        for i in range(len(list)): 
    195181            (file, line, funct, code) = list[i] 
    196             if code == None: continue 
     182            if code == None: 
     183                continue 
    197184            (dir, filename) = os.path.split(file) 
    198185            text += "<nobr>" + totalSpace + "File: <b>" + filename + "</b>, line %4d" %(line) + " in <b>%s</b></nobr><br>\n" % (self.getSafeString(funct)) 
  • orange/OrangeCanvas/orngTabs.py

    r8264 r8305  
    548548        self.widgetSuggestEdit.listWidget.setIconSize(QSize(16,16))  
    549549        self.setDefaultWidget(self.widgetSuggestEdit) 
     550        self._in_callback = False 
    550551         
    551552    def callback(self): 
    552         text = str(self.widgetSuggestEdit.text()) 
    553         for action in self.actions: 
    554             if action.widgetInfo.name == text: 
    555                 self.widgetInfo = action.widgetInfo 
    556                 self.parent.setActiveAction(self) 
    557                 self.activate(QAction.Trigger) 
    558                 QApplication.sendEvent(self.widgetSuggestEdit, QKeyEvent(QEvent.KeyPress, Qt.Key_Enter, Qt.NoModifier)) 
    559                 return 
     553        if not self._in_callback: 
     554            try: 
     555                self._in_callback = True 
     556                text = str(self.widgetSuggestEdit.text()) 
     557                for action in self.actions: 
     558                    if action.widgetInfo.name == text: 
     559                        self.widgetInfo = action.widgetInfo 
     560                        self.parent.setActiveAction(self) 
     561                        self.activate(QAction.Trigger) 
     562                        QApplication.sendEvent(self.widgetSuggestEdit, QKeyEvent(QEvent.KeyPress, Qt.Key_Enter, Qt.NoModifier)) 
     563                        return 
     564            finally: 
     565                self._in_callback = False 
    560566         
    561567 
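
Note on the `_in_callback` flag added to `callback` above: the flag acts as a re-entrancy guard, presumably because sending the key event can trigger `callback` again while it is still running. A minimal, Qt-free sketch of the same pattern follows. The `Guarded` class and its `calls` counter are hypothetical and exist only to make the behaviour observable.

```python
class Guarded(object):
    def __init__(self):
        self._in_callback = False
        self.calls = 0

    def callback(self):
        if self._in_callback:
            return  # nested invocation: ignore it
        self._in_callback = True
        try:
            self.calls += 1
            # Simulates an event that re-enters the callback.
            self.callback()
        finally:
            # Always reset the flag, even if the body raises.
            self._in_callback = False
```

The `try`/`finally` matters here: without it, an exception inside the body would leave the flag set and silently disable every later call.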
  • orange/OrangeWidgets/Associate/OWAssociationRules.py

    r8264 r8305  
    1010import OWGUI 
    1111 
     12 
     13def table_sparsness(table): 
     14    """ Return the table sparseness (the ratio of unknown values in the table) 
     15    for both the regular part (attributes) and meta attributes (basket format). 
     16     
     17    :param table: Data table. 
     18    :type table: :class:`Orange.data.Table` 
     19     
     20    """ 
     21    unknown_count = 0 
     22    all_count = 0 
     23     
     24    for ex in table: 
     25        for val in ex: 
     26            if val.isSpecial(): 
     27                unknown_count += 1 
     28            all_count += 1 
     29    regular_sparseness = float(unknown_count) / (all_count or 1) 
     30     
     31    metas = table.domain.getmetas().values() 
     32    unknown_count = 0 
     33    all_count = 0 
     34     
     35    for ex in table: 
     36        for meta in metas: 
     37            val = ex[meta] 
     38            if val.isSpecial(): 
     39                unknown_count += 1 
     40            all_count += 1 
     41    meta_sparseness = float(unknown_count) / (all_count or 1) 
     42    return regular_sparseness, meta_sparseness 
     43 
     44     
    1245class OWAssociationRules(OWWidget): 
    1346    settingsList = ["useSparseAlgorithm", "classificationRules", "minSupport", "minConfidence", "maxRules"] 
     
    2962 
    3063        box = OWGUI.widgetBox(self.space, "Algorithm", addSpace = True) 
    31         self.cbSparseAlgorithm = OWGUI.checkBox(box, self, 'useSparseAlgorithm', 'Use algorithm for sparse data', tooltip="Use original Agrawal's algorithm", callback = self.checkSparse) 
    32         self.cbClassificationRules = OWGUI.checkBox(box, self, 'classificationRules', 'Induce classification rules', tooltip="Induce classification rules") 
     64        self.cbSparseAlgorithm = OWGUI.checkBox(box, self, 'useSparseAlgorithm', 
     65                                        'Use algorithm for sparse data', 
     66                                        tooltip="Use original Agrawal's algorithm", 
     67                                        callback=self.checkSparse) 
     68         
     69        self.cbClassificationRules = OWGUI.checkBox(box, self, 'classificationRules', 
     70                                        'Induce classification rules', 
     71                                        tooltip="Induce classification rules") 
    3372        self.checkSparse() 
    3473 
     
    5998    def generateRules(self): 
    6099        self.error() 
     100        self.warning(0) 
    61101        if self.dataset: 
     102            if self.dataset and self.useSparseAlgorithm and not self.datasetIsSparse: 
     103                self.warning(0, "Using algorithm for sparse data, but data does not appear to be sparse!") 
    62104            try: 
    63105                num_steps = 20 
     
    77119            self.send("Association Rules", None) 
    78120 
    79  
    80121    def checkSparse(self): 
    81122        self.cbClassificationRules.setEnabled(not self.useSparseAlgorithm) 
     
    83124            self.cbClassificationRules.setChecked(0) 
    84125 
    85  
    86     def setData(self,dataset): 
     126    def setData(self, dataset): 
    87127        self.dataset = dataset 
     128        if dataset is not None: 
     129            regular, meta = table_sparsness(dataset) 
     130            self.datasetIsSparse = regular > 0.4 or meta > 0.4 
     131             
    88132        self.generateRules() 
     133         
    89134 
    90135if __name__=="__main__": 
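
Note on `table_sparsness` and the `0.4` threshold in `setData` above: the function counts unknown values over all cells and divides by the cell count, using `(all_count or 1)` to avoid dividing by zero on an empty table. The sketch below shows the same ratio on plain Python lists. `unknown_ratio` is a hypothetical helper, and treating `None` as the unknown marker is an assumption standing in for `val.isSpecial()`.

```python
def unknown_ratio(rows, is_unknown=lambda value: value is None):
    """Fraction of cells in `rows` for which `is_unknown` is true."""
    unknown = 0
    total = 0
    for row in rows:
        for value in row:
            if is_unknown(value):
                unknown += 1
            total += 1
    # `total or 1` keeps the result at 0.0 for an empty table.
    return float(unknown) / (total or 1)


def looks_sparse(rows, threshold=0.4):
    # Mirrors the `regular > 0.4` test in setData for a single ratio.
    return unknown_ratio(rows) > threshold
```

For example, a two-by-two table with three unknown cells has a ratio of 0.75 and would be treated as sparse under the 0.4 threshold.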
  • orange/OrangeWidgets/Classify/OWCN2RulesViewer.py

    r8264 r8305  
    3434        QStyledItemDelegate.__init__(self, parent) 
    3535 
    36  
    3736    def displayText(self, value, locale): 
    3837        dist = value.toPyObject() 
    3938        if isinstance(dist, orange.Distribution): 
    4039            return QString("<" + ",".join(["%.1f" % c for c in dist]) + ">") 
    41 #            return QString("") 
    4240        else: 
    4341            return QStyledItemDelegate.displayText(value, locale) 
    4442         
    45      
    4643    def sizeHint(self, option, index): 
    4744        metrics = QFontMetrics(option.font) 
    48         height = metrics.lineSpacing() * 2 + 8 
    49         width = metrics.width(self.displayText(index.data(Qt.DisplayRole), QLocale())) 
     45        height = metrics.lineSpacing() * 2 + 8 # 4 pixel margin 
     46        width = metrics.width(self.displayText(index.data(Qt.DisplayRole), QLocale())) + 8 
    5047        return QSize(width, height) 
    51      
    5248     
    5349    def paint(self, painter, option, index): 
    5450        dist = index.data(Qt.DisplayRole).toPyObject() 
    55         rect = option.rect 
    56         rect_w = rect.width() - len([c for c in dist if c]) - 2 
    57         rect_h = rect.height() - 2 
     51        rect = option.rect.adjusted(4, 4, -4, -4) 
     52        rect_w = rect.width() - len([c for c in dist if c]) # This is for the separators in the distribution bar 
     53        rect_h = rect.height() 
    5854        colors = OWColorPalette.ColorPaletteHSV(len(dist)) 
    5955        abs = dist.abs 
     
    6157         
    6258        painter.save() 
     59        painter.setFont(option.font) 
    6360        qApp.style().drawPrimitive(QStyle.PE_PanelItemViewRow, option, painter) 
    6461         
     
    6764        drect_h = metrics.height() 
    6865        lineSpacing = metrics.lineSpacing() 
     66        leading = metrics.leading() 
    6967        distText = self.displayText(index.data(Qt.DisplayRole), QLocale()) 
     68         
     69        if option.state & QStyle.State_Selected: 
     70            color = option.palette.highlightedText().color() 
     71        else: 
     72            color = option.palette.text().color() 
     73#        painter.setBrush(QBrush(color)) 
     74        painter.setPen(QPen(color)) 
     75             
    7076        if showText: 
    71             textPos = QPoint(rect.topLeft().x(), rect.center().y() - lineSpacing) 
    72             painter.drawText(QRect(textPos, QSize(rect.width(), lineSpacing)), Qt.AlignCenter, distText) 
    73          
    74         painter.translate(QPoint(rect.topLeft().x(), rect.center().y() - (drect_h/2 if not showText else  - 2))) 
     77            textPos = rect.topLeft() 
     78            textRect = QRect(textPos, QSize(rect.width(), rect.height() / 2 - leading)) 
     79            painter.drawText(textRect, Qt.AlignHCenter | Qt.AlignBottom, distText) 
     80             
     81        painter.setPen(QPen(Qt.black)) 
     82        painter.translate(QPoint(rect.topLeft().x(), rect.center().y() - (drect_h/2 if not showText else  0))) 
    7583        for i, count in enumerate(dist): 
    7684            if count: 
     
    8088                painter.drawRect(QRect(1, 1, width, drect_h)) 
    8189                painter.translate(width, 0) 
    82          
    8390        painter.restore() 
    84          
    8591         
    8692class MultiLineStringItemDelegate(QStyledItemDelegate): 
     
    8894        metrics = QFontMetrics(option.font) 
    8995        text = index.data(Qt.DisplayRole).toString() 
    90         return metrics.size(0, text) 
    91      
     96        size = metrics.size(0, text) 
     97        return QSize(size.width() + 8, size.height() + 8) # 4 pixel margin 
    9298     
    9399    def paint(self, painter, option, index): 
     
    95101        painter.save() 
    96102        qApp.style().drawPrimitive(QStyle.PE_PanelItemViewRow, option, painter) 
    97         painter.drawText(option.rect, Qt.AlignLeft | Qt.AlignVCenter, text) 
     103        rect = option.rect.adjusted(4, 4, -4, -4) 
     104             
     105        if option.state & QStyle.State_Selected: 
     106            color = option.palette.highlightedText().color() 
     107        else: 
     108            color = option.palette.text().color() 
     109#        painter.setBrush(QBrush(color)) 
     110        painter.setPen(QPen(color)) 
     111         
     112             
     113        painter.drawText(rect, option.displayAlignment, text) 
    98114        painter.restore() 
    99115         
     
    103119        obj = _toPyObject(value) #value.toPyObject() 
    104120        return QString(str(obj)) 
    105              
    106  
     121     
     122     
     123class PyFloatItemDelegate(QStyledItemDelegate): 
     124    def displayText(self, value, locale): 
     125        obj = _toPyObject(value) 
     126        if isinstance(obj, float): 
     127            return QString("%.2f" % obj) 
     128        else: 
     129            return QString(str(obj)) 
     130         
     131def rule_to_string(rule, show_distribution = True): 
     132    """ 
     133    Return a human-readable string representation of the rule. 
     134     
     135    :param rule: rule to pretty-print. 
     136    :type rule: :class:`Orange.classification.rules.Rule` 
     137     
     138    :param show_distribution: determines whether presentation should also 
     139        contain the distribution of covered instances 
     140    :type show_distribution: bool 
     141     
     142    """ 
     143    import Orange 
     144    def selectSign(oper): 
     145        if oper == Orange.core.ValueFilter_continuous.Less: 
     146            return "<" 
     147        elif oper == Orange.core.ValueFilter_continuous.LessEqual: 
     148            return "<=" 
     149        elif oper == Orange.core.ValueFilter_continuous.Greater: 
     150            return ">" 
     151        elif oper == Orange.core.ValueFilter_continuous.GreaterEqual: 
     152            return ">=" 
     153        else: return "=" 
     154 
     155    if not rule: 
     156        return "None" 
     157    conds = rule.filter.conditions 
     158    domain = rule.filter.domain 
     159     
     160    def pprint_values(values): 
     161        if len(values) > 1: 
     162            return "[" + ",".join(values) + "]" 
     163        else: 
     164            return str(values[0]) 
     165         
     166    ret = "IF " 
     167    if len(conds)==0: 
     168        ret = ret + "TRUE" 
     169 
     170    for i,c in enumerate(conds): 
     171        if i > 0: 
     172            ret += " AND " 
     173        if type(c) == Orange.core.ValueFilter_discrete: 
     174            ret += domain[c.position].name + "=" + pprint_values( \ 
     175                   [domain[c.position].values[int(v)] for v in c.values]) 
     176        elif type(c) == Orange.core.ValueFilter_continuous: 
     177            ret += domain[c.position].name + selectSign(c.oper) + str(c.ref) 
     178    if rule.classifier and type(rule.classifier) == Orange.classification.ConstantClassifier\ 
     179            and rule.classifier.default_val: 
     180        ret = ret + " THEN "+domain.class_var.name+"="+\ 
     181        str(rule.classifier.default_value) 
     182        if show_distribution: 
     183            ret += str(rule.class_distribution) 
     184    elif rule.classifier and type(rule.classifier) == Orange.classification.ConstantClassifier\ 
     185            and type(domain.class_var) == Orange.core.EnumVariable: 
     186        ret = ret + " THEN "+domain.class_var.name+"="+\ 
     187        str(rule.class_distribution.modus()) 
     188        if show_distribution: 
     189            ret += str(rule.class_distribution) 
     190    return ret         
     191 
     192         
    107193class OWCN2RulesViewer(OWWidget): 
    108194    settingsList = ["show_Rule_length", "show_Rule_quality", "show_Coverage", 
     
    161247        self.tableView = QTableView() 
    162248        self.tableView.setItemDelegate(PyObjectItemDelegate(self)) 
     249        self.tableView.setItemDelegateForColumn(1, PyFloatItemDelegate(self)) 
     250        self.tableView.setItemDelegateForColumn(2, PyFloatItemDelegate(self)) 
    163251        self.tableView.setItemDelegateForColumn(4, DistributionItemDelegate(self)) 
    164252        self.tableView.setItemDelegateForColumn(5, MultiLineStringItemDelegate(self)) 
     
    183271        self.resize(800, 600) 
    184272         
    185          
    186273    def setRuleClassifier(self, classifier=None): 
    187274        self.classifier = classifier 
     
    190277        else: 
    191278            self.rules = [] 
    192              
    193279         
    194280    def handleNewSignals(self): 
    195281        self.updateRulesModel() 
    196282        self.commit() 
    197          
    198283     
    199284    def updateRulesModel(self): 
     
    213298            self.tableView.resizeColumnsToContents() 
    214299            self.tableView.resizeRowsToContents() 
    215              
    216300     
    217301    def ruleText(self, rule): 
    218         text = orngCN2.ruleToString(rule, showDistribution=False) 
     302        text = rule_to_string(rule, show_distribution=False) 
    219303        p = re.compile(r"[0-9]\.[0-9]+") 
    220304        text = p.sub(lambda match: "%.2f" % float(match.group(0)), text) 
     
    222306        text = text.replace("THEN", "\nTHEN") 
    223307        return text 
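
The rounding step in `ruleText` can be sketched in isolation. This is a minimal illustration, not the widget's code: the helper name `round_floats`, the `digits` parameter, and the wider pattern `[0-9]+\.[0-9]+` are assumptions made for the example. It passes the whole match (`m.group(0)`) to `float`, so the full number is rounded rather than only its first character.

```python
import re

# Hypothetical pattern: one or more digits, a dot, one or more digits.
_FLOAT_RE = re.compile(r"[0-9]+\.[0-9]+")


def round_floats(text, digits=2):
    """Replace every decimal number in `text` with a rounded version."""
    fmt = "%%.%if" % digits  # e.g. "%.2f" for digits=2
    # m.group(0) is the complete matched number as a string.
    return _FLOAT_RE.sub(lambda m: fmt % float(m.group(0)), text)


if __name__ == "__main__":
    print(round_floats("IF sepal<=0.123456 THEN class=1.987"))
```

Compiling the pattern once at module level avoids recompiling it for every rule that is rendered.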
    224          
    225308     
    226309    def updateVisibleColumns(self): 
     
    228311            self.tableView.horizontalHeader().setSectionHidden(i, not getattr(self, "show_%s" % header.replace(" ", "_"))) 
    229312     
    230      
    231313    def commitIf(self): 
    232314        if self.autoCommit: 
     
    234316        else: 
    235317            self.changedFlag = True 
    236      
    237318             
    238319    def selectedAttrsFromRules(self, rules): 
     
    242323                selected.append(rule.filter.domain[c.position]) 
    243324        return set(selected) 
    244      
    245325     
    246326    def selectedExamplesFromRules(self, rules, examples): 
  • orange/OrangeWidgets/Classify/OWClassificationTreeViewer.py

    r8264 r8305  
    315315    tree = orange.TreeLearner(data, storeExamples = 1) 
    316316    ow.setClassificationTree(tree) 
     317    ow.show() 
    317318    a.exec_() 
    318319    ow.saveSettings() 
  • orange/OrangeWidgets/Evaluate/OWTestLearners.py

    r8264 r8305  
    135135        self.testDataBtn = self.sBtns.buttons[-1] 
    136136        self.testDataBtn.setDisabled(True) 
    137  
    138 #        box = OWGUI.widgetBox(self.sBtns, orientation='vertical', addSpace=False) 
    139 #        OWGUI.separator(box) 
     137         
    140138        OWGUI.separator(self.sBtns) 
    141139        OWGUI.checkBox(self.sBtns, self, 'applyOnAnyChange', 
     
    148146            self.resampling = 3 
    149147 
    150 #        OWGUI.separator(self.controlArea) 
    151  
    152148        # statistics 
    153149        self.statLayout = QStackedLayout() 
    154 #        self.cbox = OWGUI.widgetBox(self.controlArea, spacing=8, margin=0) 
    155 #        self.cbox.layout().setSpacing(8) 
    156150        self.cbox = OWGUI.widgetBox(self.controlArea, addToLayout=False) 
    157151        self.cStatLabels = [s.name for s in self.cStatistics] 
     
    160154                                     selectionMode = QListWidget.MultiSelection, 
    161155                                     callback=self.newscoreselection) 
    162 #        OWGUI.separator(self.cbox) 
     156         
    163157        self.cbox.layout().addSpacing(8) 
    164158        self.targetCombo = OWGUI.comboBox(self.cbox, self, "targetClass", orientation=0, 
     
    177171         
    178172        self.statLayout.setCurrentWidget(self.cbox) 
    179          
    180 #        self.rstatLB.box.hide() 
    181  
    182173 
    183174        # score table 
     
    185176        self.g = OWGUI.widgetBox(self.mainArea, 'Evaluation Results') 
    186177        self.tab = OWGUI.table(self.g, selectionMode = QTableWidget.NoSelection) 
    187  
    188         #self.lab = QLabel(self.g) 
    189178 
    190179        self.resize(680,470) 
     
    230219            if i not in usestat: 
    231220                self.tab.hideColumn(i+1) 
    232  
    233221 
    234222    def sendReport(self): 
     
    277265        indices = orange.MakeRandomIndices2(p0=min(n, len(self.data)), stratified=orange.MakeRandomIndices2.StratifiedIfPossible) 
    278266        new = self.data.selectref(indices(self.data)) 
    279 #        new = self.data.selectref([1]*min(n, len(self.data)) + 
    280 #                                  [0]*(len(self.data) - min(n, len(self.data)))) 
     267         
    281268        self.warning(0) 
     269        learner_exceptions = [] 
    282270        for l in [self.learners[id] for id in ids]: 
    283271            learner = l.learner 
     
    291279                    l.scores = [] 
    292280            except Exception, ex: 
    293                 self.warning(0, "Learner %s ends with exception: %s" % (l.name, str(ex))) 
     281                learner_exceptions.append((l, ex)) 
    294282                l.scores = [] 
    295283 
     284        if learner_exceptions: 
     285            text = "\n".join("Learner %s ends with exception: %s" % (l.name, str(ex)) \ 
     286                             for l, ex in learner_exceptions) 
     287            self.warning(0, text) 
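
The change above collects the exceptions and reports them once, so that a later failure no longer overwrites the warning of an earlier one. A minimal sketch of the same pattern outside the widget, assuming a hypothetical `run_all` helper and plain callables in place of learners:

```python
def run_all(tasks):
    """Run every (name, callable) pair; return (results, warning_text).

    Failures are collected and joined into one message instead of
    being reported one at a time.
    """
    results = {}
    failures = []
    for name, task in tasks:
        try:
            results[name] = task()
        except Exception as ex:
            failures.append((name, ex))
    warning = "\n".join(
        "Learner %s ends with exception: %s" % (name, ex)
        for name, ex in failures
    )
    return results, warning


if __name__ == "__main__":
    tasks = [("ok", lambda: 1), ("bad", lambda: 1 / 0)]
    print(run_all(tasks))
```

An empty `warning` string means that every task succeeded.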
     288             
    296289        if not learners: 
    297290            return 
     
    345338            try: 
    346339                scores.append(eval("orngStat." + s.f)) 
    347                  
    348340            except Exception, ex: 
    349341                self.error(i, "An error occurred while evaluating orngStat." + s.f + " on %s due to %s" % \ 
     
    385377            self.data = orange.Filter_hasClassValue(self.data) 
    386378            self.statLayout.setCurrentWidget(self.cbox if self.isclassification() else self.rbox) 
    387 #            if self.isclassification(): 
    388 #                self.rstatLB.box.hide() 
    389 #                self.cbox.show() 
    390 #            else: 
    391 #                self.cbox.hide() 
    392 #                self.rstatLB.box.show() 
     379             
    393380            self.stat = [self.rStatistics, self.cStatistics][self.isclassification()] 
    394381             
  • orange/OrangeWidgets/OWClustering.py

    r8264 r8305  
    55class HierarchicalClusterItem(QGraphicsRectItem): 
    66    """ An object used to draw orange.HierarchicalCluster on a QGraphicsScene 
     7     
      8    .. note:: Deprecated; use DendrogramWidget instead. 
    79    """ 
    810    def __init__(self, cluster, *args, **kwargs): 
     
    1921        self.cluster = cluster 
    2022        self.branches = [] 
    21 #        if cluster.branches: 
    22 #            for branch in cluster.branches: 
    23 #                item = type(self)(branch, self) 
    24 #                item.setZValue(self.zValue()-1) 
    25 #                self.branches.append(item) 
    26 #            self.setRect(self.branches[0].rect().center().x(), 
    27 #                         0.0, #self.cluster.height, 
    28 #                         self.branches[-1].rect().center().x() - self.branches[0].rect().center().x(), 
    29 #                         self.cluster.height) 
    30 #        else: 
    31 #            self.setRect(cluster.first, 0, 0, 0) 
    3223        self.setFlags(QGraphicsItem.ItemIsSelectable) 
    3324        self.setPen(self.standardPen) 
    3425        self.setBrush(QBrush(Qt.white, Qt.SolidPattern)) 
    3526#        self.setAcceptHoverEvents(True) 
    36          
    37 #        if self.isTopLevel(): ## top level cluster 
    38 #            self.clusterGeometryReset() 
    3927             
    4028    @classmethod 
     
    153141        self.setHighlight(False) 
    154142         
     143DEBUG = False # Set to true to see widget geometries 
     144 
    155145from Orange.clustering import hierarchical 
    156  
    157146class DendrogramItem(QGraphicsRectItem): 
    158147    """ A Graphics item representing a cluster in a DendrogramWidget. 
     
    291280                path.lineTo(rect.bottomRight()) 
    292281                path.lineTo(rect.topRight()) 
    293 #        stroke = QPainterPathStroker() 
    294 #        path = stroke.createStroke(path) 
    295282        self._path = path 
    296 #        self.setPath(path) 
    297283     
    298284    def itemChange(self, change, value): 
     
    330316        else: 
    331317            raise AttributeError(name) 
     318     
    332319     
    333320class DendrogramLayout(QGraphicsLayout): 
     
    381368            for item, cluster in zip(self._items, self._clusters): 
    382369                start, center, end = self._layout_dict[cluster] 
     370                cluster_height = cluster.height 
    383371                if self.orientation == Qt.Vertical: 
    384372                    # Should this be translated so all items have positive x coordinates 
    385                     rect = QRectF(-cluster.height * height_scale, start * width_scale, 
    386                                   cluster.height * height_scale, (end - start) * width_scale) 
    387                     rect.translate(c_rect.width() + x_offset, 0 + y_offset) 
     373                    rect = QRectF(-cluster_height * height_scale, start * width_scale, 
     374                                  cluster_height * height_scale, (end - start) * width_scale) 
     375                    rect.translate(c_rect.width() + x_offset, y_offset) 
    388376                else: 
    389377                    rect = QRectF(start * width_scale, 0.0, 
    390                                   (end - start) * width_scale, cluster.height * height_scale) 
    391                     rect.translate(0 + x_offset,  y_offset) 
     378                                  (end - start) * width_scale, cluster_height * height_scale) 
     379                    rect.translate(x_offset,  y_offset) 
    392380                     
    393381                if rect.isEmpty(): 
     
    400388     
    401389    def setGeometry(self, geometry): 
     390        old = self.geometry() 
    402391        QGraphicsLayout.setGeometry(self, geometry) 
    403         self.do_layout() 
     392        if self.geometry() != old: 
     393            self.do_layout() 
    404394         
    405395    def sizeHint(self, which, constraint=QSizeF()): 
     
    415405                width = sum([hint.width() for hint in hints] + [0]) 
    416406            return QSizeF(width, height) 
     407        elif which == Qt.MinimumSize: 
     408            left, top, right, bottom = self.getContentsMargins() 
     409            return QSizeF(left + right, top + bottom) 
    417410        else: 
    418411            return QSizeF() 
     
    445438    """ 
    446439    polygon = QPolygonF() 
    447     for item in hierarchical.preorder(item): 
    448         adjusted = item.rect().adjusted(-adjust, -adjust, adjust, adjust) 
      440    # Selection spanning the item itself 
     441    adjusted = item.rect().adjusted(-adjust, -adjust, adjust, adjust) 
     442    polygon = polygon.united(QPolygonF(adjusted)) 
     443     
      444    # Collect all leftmost tree branches 
     445    current = item 
     446    while current.branches: 
     447        current = current.branches[0] 
     448        adjusted = current.rect().adjusted(-adjust, -adjust, adjust, adjust) 
    449449        polygon = polygon.united(QPolygonF(adjusted)) 
     450     
      451    # Collect all rightmost tree branches 
     452    current = item 
     453    while current.branches: 
     454        current = current.branches[-1] 
     455        adjusted = current.rect().adjusted(-adjust, -adjust, adjust, adjust) 
     456        polygon = polygon.united(QPolygonF(adjusted)) 
     457     
    450458    return polygon 
    451459 
    452460     
    453461class DendrogramWidget(QGraphicsWidget): 
    454     """ A Graphics Widget displaying a dendrogram.  
     462    """ A Graphics Widget displaying a dendrogram. 
    455463    """ 
    456464    def __init__(self, root=None, parent=None, orientation=Qt.Vertical, scene=None): 
     
    485493                for branch in cluster.branches or []: 
    486494                    branch_item = self.dendrogram_items[branch]  
    487 #                    branch_item.setParentItem(item) 
    488495                    self.cluster_parent[branch] = cluster 
    489496                items.append(GraphicsRectLayoutItem(item)) 
     
    491498                 
    492499            self.layout().setDendrogram(root, items) 
    493 #            self.dendrogram_items[root].setParentItem(self) 
    494500             
    495501            self.resize(self.layout().sizeHint(Qt.PreferredSize)) 
    496502            self.layout().activate() 
     503            self.emit(SIGNAL("dendrogramLayoutChanged()")) 
    497504             
    498505    def item(self, cluster): 
     
    511518        root_height = self.root_cluster.height 
    512519        if self.orientation == Qt.Vertical: 
    513             return  (root_height - 0) / (rect.left() - rect.right()) * point.x() + root_height 
    514         else: 
    515             return (root_height - 0) / (rect.bottom() - rect.top()) * point.y() + root_height 
     520            return  (root_height - 0) / (rect.left() - rect.right()) * (point.x() - rect.left()) + root_height 
     521        else: 
     522            return (root_height - 0) / (rect.bottom() - rect.top()) * (point.y() - rect.top()) + root_height 
     523         
     524    def pos_at_height(self, height): 
     525        """ Return a point in local coordinates for `height` (in cluster 
     526        height scale). 
     527        """ 
     528        root_item = self.item(self.root_cluster) 
     529        rect = root_item.rect() 
     530        root_height = self.root_cluster.height 
     531        if self.orientation == Qt.Vertical: 
     532            x = (rect.right() - rect.left()) / root_height * (root_height - height) + rect.left() 
     533            y = 0.0 
     534        else: 
     535            x = 0.0 
     536            y = (rect.bottom() - rect.top()) / root_height * height + rect.top() 
     537             
     538        return QPointF(x, y) 
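
For the vertical orientation, the new `pos_at_height` and the corrected `height_at` expression are inverse linear maps: the root's height maps to the left edge of the root rectangle and height 0 to its right edge. A sketch with plain floats, assuming free functions and explicit `left`/`right`/`root_height` arguments in place of the Qt rectangle and the widget state:

```python
def pos_at_height(height, left, right, root_height):
    """Map a cluster height to an x coordinate (vertical orientation)."""
    return (right - left) / root_height * (root_height - height) + left


def height_at(x, left, right, root_height):
    """Map an x coordinate back to a cluster height (inverse of the above)."""
    return root_height / (left - right) * (x - left) + root_height


if __name__ == "__main__":
    left, right, root_height = 10.0, 110.0, 4.0
    x = pos_at_height(1.0, left, right, root_height)
    print(x, height_at(x, left, right, root_height))
```

Subtracting `left` before scaling is what makes the mapping correct when the root rectangle does not start at the origin.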
    516539             
    517540    def set_labels(self, labels): 
     
    576599          
    577600        """ 
    578         for sel in list(self.selected_items): 
    579             self._remove_selection(sel) 
    580              
    581         for item in items: 
    582             self._add_selection(item, reenumerate=False) 
    583              
    584         self._re_enumerate_selections() 
     601        to_remove = set(self.selected_items) - set(items) 
     602        to_add = set(items) - set(self.selected_items) 
     603         
     604        for sel in to_remove: 
     605            self._remove_selection(sel, emit_changed=False) 
     606        for sel in to_add: 
     607            self._add_selection(sel, reenumerate=False, emit_changed=False) 
     608         
     609        if to_add or to_remove: 
     610            self._re_enumerate_selections() 
     611            self.emit(SIGNAL("selectionChanged()")) 
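
The rewritten `set_selected_items` only touches the items whose state actually changes and emits a single notification. The same idea, sketched with plain sets and a hypothetical `on_changed` callback standing in for the Qt signal:

```python
def update_selection(current, requested, on_changed):
    """Bring the set `current` in line with `requested`.

    Only the difference is applied, and `on_changed` is called at most
    once, and only if something was added or removed.
    """
    to_remove = set(current) - set(requested)
    to_add = set(requested) - set(current)

    current.difference_update(to_remove)
    current.update(to_add)

    if to_add or to_remove:
        on_changed()
    return to_add, to_remove


if __name__ == "__main__":
    selected = {"a", "b"}
    print(update_selection(selected, ["b", "c"], lambda: print("changed")))
    print(sorted(selected))
```

Selecting the same set again is a no-op, so listeners are not notified needlessly.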
    585612         
    586613    def set_selected_clusters(self, clusters): 
     
    593620         
    594621    def item_selection(self, item, select_state): 
    595         """ Update item selection. 
      622        """ Set the selection state of `item` to `select_state`. 
    596623         
    597624        :param item: DendrogramItem. 
    598625        :param select_state: New selection state for item. 
    599626        """ 
    600         modifiers = QApplication.instance().keyboardModifiers() 
    601         extended_selection = modifiers & Qt.ControlModifier 
    602          
    603         if select_state == False and item not in self.selected_items: 
    604             # Already removed 
    605             return select_state 
    606         if not extended_selection: 
    607             selected_items = list(self.selected_items) 
    608             for selected in selected_items: 
    609                 self._remove_selection(selected) 
     627        if select_state == False and item not in self.selected_items or \ 
     628           select_state == True and item in self.selected_items: 
     629            return select_state # State unchanged 
    610630             
    611631        if item in self.selected_items: 
     
    629649             
    630650        return select_state 
    631                  
     651         
    632652    def _re_enumerate_selections(self): 
    633653        """ Re enumerate the selection items and update the colors. 
    634654        """  
    635         items = sorted(self.selected_items.items(), key=lambda item: item[1][0]) 
     655        items = sorted(self.selected_items.items(), key=lambda item: item[0].cluster.first) # Order the clusters 
    636656        palette = ColorPaletteHSV(len(items)) 
    637657        for new_i, (item, (i, selection_item)) in enumerate(items): 
     
    641661            selection_item.setBrush(QColor(color)) 
    642662             
    643     def _remove_selection(self, item): 
     663    def _remove_selection(self, item, emit_changed=True): 
    644664        """ Remove selection rooted at item. 
    645665        """ 
     
    652672        item.setSelected(False) 
    653673        self._re_enumerate_selections() 
    654         self.emit(SIGNAL("selectionChanged()")) 
    655          
    656     def _add_selection(self, item, reenumerate=True): 
     674        if emit_changed: 
     675            self.emit(SIGNAL("selectionChanged()")) 
     676         
     677    def _add_selection(self, item, reenumerate=True, emit_changed=True): 
    657678        """ Add selection rooted at item 
    658679        """ 
    659680        selection_item = self.selection_item_constructor(item) 
    660681        self.selected_items[item] = len(self.selected_items), selection_item 
     682        item.setSelected(True) 
    661683        if reenumerate: 
    662684            self._re_enumerate_selections() 
    663         self.emit(SIGNAL("selectionChanged()")) 
     685        if emit_changed: 
     686            self.emit(SIGNAL("selectionChanged()")) 
    664687         
    665688    def _selected_sub_items(self, item): 
     
    692715        for item, (i, selection_item) in self.selected_items.items(): 
    693716            selection_item.setPolygon(selection_polygon_from_item(item)) 
    694          
    695 #    def paint(self, painter, options, widget=0): 
    696 #        rect =  self.geometry() 
    697 #        rect.translate(-self.pos()) 
    698 #        painter.drawRect(rect) 
    699      
    700      
     717     
     718    def setGeometry(self, geometry): 
     719        QGraphicsWidget.setGeometry(self, geometry) 
     720        self.emit(SIGNAL("dendrogramGeometryChanged(QRectF)"), geometry) 
     721         
     722    def event(self, event): 
     723        ret = QGraphicsWidget.event(self, event) 
     724        if event.type() == QEvent.LayoutRequest: 
     725            self.emit(SIGNAL("dendrogramLayoutChanged()")) 
     726        return ret 
     727     
     728    if DEBUG: 
     729        def paint(self, painter, options, widget=0): 
     730            rect =  self.geometry() 
     731            rect.translate(-self.pos()) 
     732            painter.drawRect(rect) 
     733             
     734             
    701735class CutoffLine(QGraphicsLineItem): 
    702     """ A dragable cutoff line for selection of clusters in a DendrogramWidget 
    703     based in their height. 
    704      
      736    """ A draggable cutoff line for selection of clusters in a DendrogramWidget. 
    705737    """ 
     738    class emiter(QObject): 
      739        """ An empty QObject used by CutoffLine to emit signals. 
     740        """ 
     741        pass 
     742     
    706743    def __init__(self, widget, scene=None): 
    707744        assert(isinstance(widget, DendrogramWidget)) 
    708745        QGraphicsLineItem.__init__(self, widget) 
    709746        self.setAcceptedMouseButtons(Qt.LeftButton) 
     747        self.emiter = self.emiter() 
    710748        pen = QPen(Qt.black, 2) 
    711749        pen.setCosmetic(True) 
     
    718756            self.setLine(0, geom.height(), geom.width(), geom.height()) 
    719757            self.setCursor(Qt.SizeVerCursor) 
     758        self.cutoff_height = widget.root_cluster.height 
    720759        self.setZValue(widget.item(widget.root_cluster).zValue() + 10) 
    721          
     760        widget.connect(widget, SIGNAL("dendrogramGeometryChanged(QRectF)"), self.on_geometry_changed) 
     761         
     762    def set_cutoff_at_height(self, height): 
     763        widget = self.parentWidget() 
     764        pos = widget.pos_at_height(height) 
     765        geom = widget.geometry() 
     766        if widget.orientation == Qt.Vertical: 
     767            self.setLine(pos.x(), 0, pos.x(), geom.height()) 
     768        else: 
     769            self.setLine(0, pos.y(), geom.width(), pos.y()) 
     770        self.cutoff_selection(height) 
     771             
     772    def cutoff_selection(self, height): 
     773        self.cutoff_height = height 
     774        widget = self.parentWidget() 
     775        clusters = clusters_at_height(widget.root_cluster, height) 
     776        items = [widget.item(cl) for cl in clusters] 
     777        self.emiter.emit(SIGNAL("cutoffValueChanged(float)"), height) 
     778        widget.set_selected_items(items) 
     779         
     780    def on_geometry_changed(self, geom): 
     781        widget = self.parentWidget() 
     782        height = self.cutoff_height 
     783        pos = widget.pos_at_height(height) 
     784                 
     785        if widget.orientation == Qt.Vertical: 
     786            self.setLine(pos.x(), 0, pos.x(), geom.height()) 
     787            self.setCursor(Qt.SizeHorCursor) 
     788        else: 
     789            self.setLine(0, pos.y(), geom.width(), pos.y()) 
     790            self.setCursor(Qt.SizeVerCursor) 
     791        self.setZValue(widget.item(widget.root_cluster).zValue() + 10) 
     792             
    722793    def mousePressEvent(self, event): 
    723794        pass  
     
    738809        pass 
    739810     
    740     def cutoff_selection(self, height): 
    741         widget = self.parentWidget() 
    742         clusters = clusters_at_height(widget.root_cluster, height) 
    743         items = [widget.item(cl) for cl in clusters] 
    744         widget.set_selected_items(items) 
    745          
     811    def mouseDoubleClickEvent(self, event): 
     812        pass 
    746813         
    747814def clusters_at_height(root_cluster, height): 
  • orange/OrangeWidgets/OWGUIEx.py

    r8264 r8305  
    163163        QObject.connect(self.listWidget, SIGNAL("itemClicked (QListWidgetItem *)"), self.doneCompletion) 
    164164         
     165        QObject.connect(self, SIGNAL("editingFinished()"), lambda : self.callbackOnComplete() if self.callbackOnComplete else None) 
     166         
    165167    def setItems(self, items): 
    166168        if items: 
     
    212214        if self.callbackOnComplete: 
    213215            QTimer.singleShot(0, self.callbackOnComplete) 
    214             #self.callbackOnComplete() 
    215  
    216216     
    217217    def textEdited(self): 
     
    219219        if self.getLastTextItem() == "":        # if we haven't typed anything yet we hide the list widget 
    220220            self.listWidget.hide() 
    221 #        else: 
    222              
     221            self.doneCompletion()  
    223222     
    224223    def getLastTextItem(self): 
  • orange/OrangeWidgets/OWHist.py

    r8264 r8305  
    101101 
    102102    def shadeTails(self): 
    103         if not self.xData and not self.yData: 
     103        if len(self.xData) == 0 and len(self.yData) == 0: 
    104104            return 
    105105         
  • orange/OrangeWidgets/OWNetworkHist.py

    r8264 r8305  
    170170             
    171171            if hasattr(self.matrix, "items"):                
    172                 if type(self.matrix.items) == type(orange.ExampleTable(orange.Domain(orange.StringVariable('tmp')))): 
     172                if type(self.matrix.items) == orange.ExampleTable: 
    173173                    #graph.setattr("items", self.data.items) 
    174174                    graph.items = self.matrix.items 
     
    262262         
    263263        if matrix != None: 
    264             self.matrix = matrix 
     264            matrix.items  = self.graph.items 
     265            self.graph_matrix = matrix 
    265266             
    266267        self.pconnected = nedges 
  • orange/OrangeWidgets/Prototypes/OWModelEmbedder.py

    r8264 r8305  
    44<contact>Miha Stajdohar (miha.stajdohar(@at@)gmail.com)</contact> 
    55<icon>icons/DistanceFile.png</icon> 
    6 <priority>1100</priority> 
     6<priority>1120</priority> 
    77""" 
    88 
     
    3737        self.model = None 
    3838         
    39         self.resize(800, 600) 
     39        self.loadSettings() 
     40         
    4041        self.widgets = {} 
    4142         
  • orange/OrangeWidgets/Regression/OWRegressionTreeViewer2D.py

    r8264 r8305  
    129129        self.outputs = [("Examples", ExampleTable)] 
    130130         
     131        self.NodeColorMethod = 1 
    131132        self.showNodeInfoText = False 
    132133         
  • orange/OrangeWidgets/Unsupervised/OWDistanceFile.py

    r8264 r8305  
    2222        matrix = pickle.load(pkl_file) 
    2323        data = None 
    24         #print self.matrix 
    2524        if hasattr(matrix, 'items'): 
    26             data = matrix.items 
     25            items = matrix.items 
     26            if isinstance(items, orange.ExampleTable): 
     27                data = items 
      28            elif isinstance(items, list) or hasattr(items, "__iter__"): 
     29                labels = items 
    2730        pkl_file.close() 
    28          
     31    elif type(fn) != file and os.path.splitext(fn)[1] == '.npy': 
     32        import numpy 
     33        nmatrix = numpy.load(fn) 
     34        matrix = orange.SymMatrix(len(nmatrix)) 
     35        milestones = orngMisc.progressBarMilestones(matrix.dim, 100) 
     36        for i in range(len(nmatrix)): 
     37            for j in range(i+1): 
     38                matrix[j,i] = nmatrix[i,j] 
     39                 
     40            if progress and i in milestones: 
     41                progress.advance() 
     42        #labels = [""] * len(nmatrix) 
    2943    else:     
    30         #print fn 
    3144        if type(fn) == file: 
    3245            fle = fn 
     
    4053        try: 
    4154            dim = int(spl[0]) 
    42         except: 
    43             msg = "Matrix dimension expected in the first line" 
    44             raise exceptions.Exception 
      55        except (IndexError, ValueError): 
     56            raise ValueError("Matrix dimension expected in the first line.") 
     57         
    4558        #print dim 
    4659        labeled = len(spl) > 1 and spl[1] in ["labelled", "labeled"] 
     
    5770                if not li.strip(): 
    5871                    continue 
    59                 msg = "File too long" 
    60                 raise exceptions.IndexError 
      72                raise ValueError("File too long") 
     73             
    6174            spl = lne.split("\t") 
    6275            if labeled: 
     
    6477                spl = spl[1:] 
    6578            if len(spl) > dim: 
    66                 msg = "Line %i too long" % li+2 
    67                 raise exceptions.IndexError 
      79                raise ValueError("Line %i too long" % (li+2)) 
     80             
    6881            for lj, s in enumerate(spl): 
    6982                if s: 
    7083                    try: 
    7184                        matrix[li, lj] = float(s) 
    72                     except: 
    73                         msg = "Invalid number in line %i, column %i" % (li+2, lj) 
     85                    except ValueError: 
     86                        raise ValueError("Invalid number in line %i, column %i" % (li+2, lj)) 
     87                     
    7488            if li in milestones: 
    7589                if progress: 
    7690                    progress.advance() 
    77         if progress: 
    78             progress.finish() 
    79          
    80     if msg: 
    81         raise exceptions.Exception(msg) 
     91    if progress: 
     92        progress.finish() 
    8293 
    8394    return matrix, labels, data 
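
The header handling in `readMatrix` can be sketched on its own. This is an illustration under assumptions: the helper name `parse_header` is hypothetical, and splitting the first line on tabs is assumed from the way later lines are split. It catches `ValueError` as well as `IndexError`, because `int()` raises `ValueError` on a non-numeric field.

```python
def parse_header(line):
    """Parse the first line of a distance file.

    Returns (dim, labeled), where `dim` is the matrix dimension and
    `labeled` tells whether rows start with a label.
    """
    spl = line.rstrip("\n").split("\t")  # assumed tab-separated
    try:
        dim = int(spl[0])
    except (IndexError, ValueError):
        raise ValueError("Matrix dimension expected in the first line.")
    labeled = len(spl) > 1 and spl[1] in ["labelled", "labeled"]
    return dim, labeled


if __name__ == "__main__":
    print(parse_header("3\tlabeled\n"))
    # The parentheses matter: "%i" % li + 2 would try to add an int to a str.
    li = 0
    print("Line %i too long" % (li + 2))
```

Raising `ValueError` with the message attached lets the caller report the cause directly, instead of tracking a separate `msg` variable.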
     
    8697    settingsList = ["recentFiles", "invertDistances", "normalizeMethod", "invertMethod"] 
    8798 
    88     def __init__(self, parent=None, signalManager=None, inputItems=True): 
     99    def __init__(self, parent=None, signalManager=None, name="Distance File", inputItems=True): 
    89100        self.callbackDeposit = [] # deposit for OWGUI callback functions 
    90         OWWidget.__init__(self, parent, signalManager, "Distance File", wantMainArea = 0, resizingEnabled = 0) 
     101        OWWidget.__init__(self, parent, signalManager, name, wantMainArea = 0, resizingEnabled = 0) 
    91102         
    92103        if inputItems:  
     
    178189        #self.filecombo.updateGeometry() 
    179190 
     191        self.matrix = None 
     192        self.labels = None 
     193        self.data = None 
     194        pb = OWGUI.ProgressBar(self, 100) 
     195         
    180196        self.error() 
    181          
    182197        try: 
    183             self.matrix = None 
    184             self.labels = None 
    185             self.data = None 
    186             pb = OWGUI.ProgressBar(self, 100) 
    187198            self.matrix, self.labels, self.data = readMatrix(fn, pb) 
    188             self.relabel() 
    189         except: 
    190             self.error("Error while reading the file") 
     199        except Exception, ex: 
     200            self.error("Error while reading the file: '%s'" % str(ex)) 
     201            return 
     202        self.relabel() 
    191203             
    192204    def relabel(self): 
     
    194206        self.error() 
    195207        matrix = self.matrix 
    196         if matrix and self.data: 
     208        if matrix is not None and self.data is not None: 
    197209            if self.takeAttributeNames: 
    198210                domain = self.data.domain 
     
    209221                else: 
    210222                    self.error("The number of examples doesn't match the matrix dimension") 
    211         else: 
     223        elif matrix and self.labels: 
    212224            lbl = orange.StringVariable('label') 
    213225            self.data = orange.ExampleTable(orange.Domain([lbl]),  
     
    218230         
    219231        if self.data == None and self.labels == None: 
    220             matrix.setattr("items", range(matrix.dim)) 
     232            matrix.setattr("items", [str(i) for i in range(matrix.dim)]) 
    221233         
    222234        self.matrix.matrixType = orange.SymMatrix.Symmetric 
  • orange/OrangeWidgets/Unsupervised/OWDistanceMap.py

    r8264 r8305  
    7171 
    7272    def __init__(self, parent=None, signalManager = None): 
    73 #        self.callbackDeposit = [] # deposit for OWGUI callback function 
    7473        OWWidget.__init__(self, parent, signalManager, 'Distance Map', wantGraph=True) 
    7574 
     
    9291 
    9392        #set default settings 
    94         self.CellWidth = 15; self.CellHeight = 15 
    95         self.Merge = 1; 
     93        self.CellWidth = 15 
     94        self.CellHeight = 15 
     95        self.Merge = 1 
    9696        self.savedMerge = self.Merge 
    9797        self.Gamma = 1 
    9898        self.Grid = 1 
    9999        self.savedGrid = 1 
    100         self.CutLow = 0; self.CutHigh = 0; self.CutEnabled = 0 
     100        self.CutLow = 0 
     101        self.CutHigh = 0 
     102        self.CutEnabled = 0 
    101103        self.Sort = 0 
    102104        self.SquareCells = 0 
    103         self.ShowLegend = 1; 
    104         self.ShowLabels = 1; 
    105         self.ShowBalloon = 1; 
     105        self.ShowLegend = 1 
     106        self.ShowLabels = 1 
     107        self.ShowBalloon = 1 
    106108        self.ShowItemsInBalloon = 1 
    107109        self.SendOnRelease = 1 
     
    128130                         labelWidth=38, minValue=1, maxValue=self.maxHSize, 
    129131                         step=1, precision=0, 
    130                          callback=[lambda f="CellWidth", t="CellHeight": self.adjustCellSize(f,t), self.drawDistanceMap, self.manageGrid]) 
     132                         callback=[lambda f="CellWidth", t="CellHeight": self.adjustCellSize(f,t), 
     133                                   self.drawDistanceMap, 
     134                                   self.manageGrid]) 
    131135        OWGUI.qwtHSlider(box, self, "CellHeight", label='Height: ', 
    132136                         labelWidth=38, minValue=1, maxValue=self.maxVSize, 
    133137                         step=1, precision=0, 
    134                          callback=[lambda f="CellHeight", t="CellWidth": self.adjustCellSize(f,t), self.drawDistanceMap,self.manageGrid]) 
     138                         callback=[lambda f="CellHeight", t="CellWidth": self.adjustCellSize(f,t), 
     139                                   self.drawDistanceMap, 
     140                                   self.manageGrid]) 
    135141        OWGUI.checkBox(box, self, "SquareCells", "Cells as squares", 
    136142                         callback = [self.setSquares, self.drawDistanceMap]) 
    137         self.gridChkBox = OWGUI.checkBox(box, self, "Grid", "Show grid", callback = self.createDistanceMap, disabled=lambda: min(self.CellWidth, self.CellHeight) <= c_smallcell) 
     143        self.gridChkBox = OWGUI.checkBox(box, self, "Grid", "Show grid", 
     144                                         callback = self.createDistanceMap, 
     145                                         disabled=lambda: min(self.CellWidth, self.CellHeight) <= c_smallcell) 
    138146 
    139147        OWGUI.separator(tab) 
     
    148156                         callback=self.sortItems) 
    149157        OWGUI.rubber(tab) 
    150  
    151 ##        self.tabs.insertTab(tab, "Settings") 
    152158 
    153159        # FILTER TAB 
     
    172178            self.sliderCutLow.box.setDisabled(1) 
    173179            self.sliderCutHigh.box.setDisabled(1) 
    174  
    175  
    176 ##        self.colorPalette = ColorPalette(box, self, "", 
    177 ##                         additionalColors =["Cell outline", "Selected cells"], 
    178 ##                         callback = self.setColor) 
    179 ##        box.layout().addWidget(self.colorPalette) 
     180             
    180181        box = OWGUI.widgetBox(box, "Colors", orientation="horizontal") 
    181182        self.colorCombo = OWColorPalette.PaletteSelectorComboBox(self) 
     
    192193        OWGUI.rubber(tab) 
    193194 
    194         self.setColor(self.selectedSchemaIndex)         
    195  
    196 ##        self.tabs.insertTab(tab, "Colors") 
     195        self.setColor(self.selectedSchemaIndex) 
    197196 
    198197        # INFO TAB 
     
    221220        OWGUI.checkBox(box, self, 'SendOnRelease', "Send after mouse release") 
    222221        OWGUI.rubber(tab) 
    223 ##        self.tabs.insertTab(tab, "Info") 
    224222 
    225223        self.resize(700,400) 
     
    231229        #construct selector 
    232230        self.selector = QGraphicsRectItem(0, 0, self.CellWidth, self.CellHeight, None, self.scene) 
    233 ##        color = self.colorPalette.getCurrentColorSchema().getAdditionalColors()["Cell outline"] 
    234231        color = self.cellOutlineColor 
    235232        self.selector.setPen(QPen(self.qrgbToQColor(color),v_sel_width)) 
    236233        self.selector.setZValue(20) 
    237234 
    238 ##        self.bubble = BubbleInfo(self.scene) 
    239235        self.selection = SelectionManager() 
    240236 
     
    250246        self.errorText.setPos(10,10) 
    251247         
    252 #        OWGUI.button(self.controlArea, self, "&Save Graph", lambda:OWChooseImageSizeDlg(self.scene).exec_(), debuggingEnabled = 0) 
    253248        self.connect(self.graphButton, SIGNAL("clicked()"), lambda:OWChooseImageSizeDlg(self.scene, parent=self).exec_()) 
    254249 
    255  
    256         #restore color schemas from settings 
    257 ##        if self.ColorSchemas: 
    258 ##            self.colorPalette.setColorSchemas(self.ColorSchemas) 
     250        self._clustering_cache = {} 
    259251 
    260252    def sendReport(self): 
     
    807799 
    808800    def sortClustering(self): 
    809         self.rootCluster=orange.HierarchicalClustering(self.matrix, 
     801        cluster = self._clustering_cache.get("sort clustering", None) 
     802        if cluster is None: 
     803            cluster = orange.HierarchicalClustering(self.matrix, 
    810804                linkage=orange.HierarchicalClustering.Average) 
     805            # Cache the cluster 
     806            self._clustering_cache["sort clustering"] = cluster 
     807        self.rootCluster = cluster 
    811808        self.order = list(self.rootCluster.mapping) 
    812  
     809         
    813810    def sortClusteringOrdered(self): 
    814         self.rootCluster=orange.HierarchicalClustering(self.matrix, 
     811        cluster = self._clustering_cache.get("sort ordered clustering", None) 
     812        if cluster is None: 
     813            cluster = orange.HierarchicalClustering(self.matrix, 
    815814                linkage=orange.HierarchicalClustering.Average) 
    816         import orngClustering 
    817         self.progressBarInit() 
    818         orngClustering.orderLeaves(self.rootCluster, self.matrix, self.progressBarSet) 
    819         self.progressBarFinished() 
     815            import orngClustering 
     816            self.progressBarInit() 
     817            orngClustering.orderLeaves(cluster, self.matrix, self.progressBarSet) 
     818            self.progressBarFinished() 
     819            # Cache the cluster 
     820            self._clustering_cache["sort ordered clustering"] = cluster 
     821        self.rootCluster = cluster 
    820822        self.order = list(self.rootCluster.mapping) 
    821          
    822823 
    823824    def sortItems(self): 
     
    830831        self.send("Examples", None) 
    831832        self.send("Attribute List", None) 
     833        self._clustering_cache.clear() 
    832834 
    833835        if not matrix: 
  • orange/OrangeWidgets/Unsupervised/OWHierarchicalClustering.py

    r8264 r8305  
    1010from OWWidget import * 
    1111from OWQCanvasFuncts import * 
     12import OWClustering 
    1213import OWGUI 
    1314import OWColorPalette 
    1415import math 
     16import numpy 
    1517import os 
     18 
     19import orange 
     20from Orange.clustering import hierarchical  
    1621 
    1722from OWDlgs import OWChooseImageSizeDlg 
     
    2025from PyQt4.QtGui import * 
    2126 
    22 try: 
    23     from OWDataFiles import DataFiles 
    24 except: 
    25     class DataFiles(object): 
    26         pass 
    27  
    28 class recursion_limit(object): 
    29     def __init__(self, limit=1000): 
    30         self.limit = limit 
    31          
    32     def __enter__(self): 
    33         self.old_limit = sys.getrecursionlimit() 
    34         sys.setrecursionlimit(self.limit) 
    35      
    36     def __exit__(self, exc_type, exc_val, exc_tb): 
    37         sys.setrecursionlimit(self.old_limit) 
    38  
    3927class OWHierarchicalClustering(OWWidget): 
    40     settingsList=["Linkage", "OverwriteMatrix", "Annotation", "Brightness", "PrintDepthCheck", 
    41                 "PrintDepth", "HDSize", "VDSize", "ManualHorSize","AutoResize", 
    42                 "TextSize", "LineSpacing", "ZeroOffset", "SelectionMode", "DisableHighlights", 
    43                 "DisableBubble", "ClassifySelected", "CommitOnChange", "ClassifyName", "addIdAs"] 
     28    settingsList = ["Linkage", "Annotation", "PrintDepthCheck", 
     29                    "PrintDepth", "HDSize", "VDSize", "ManualHorSize","AutoResize", 
     30                    "TextSize", "LineSpacing", "SelectionMode", 
     31                    "AppendClusters", "CommitOnChange", "ClassifyName", "addIdAs"] 
    4432     
    4533    contextHandlers={"":DomainContextHandler("", [ContextField("Annotation", DomainContextHandler.Required)])} 
    4634     
    4735    def __init__(self, parent=None, signalManager=None): 
    48         #OWWidget.__init__(self, parent, 'Hierarchical Clustering') 
    4936        OWWidget.__init__(self, parent, signalManager, 'Hierarchical Clustering', wantGraph=True) 
    50         self.inputs=[("Distance matrix", orange.SymMatrix, self.dataset)] 
    51         self.outputs=[("Selected Examples", ExampleTable), ("Unselected Examples", ExampleTable), ("Centroids", ExampleTable), ("Structured Data Files", DataFiles)] 
    52         self.linkage=[("Single linkage", orange.HierarchicalClustering.Single), 
     37         
     38        self.inputs = [("Distance matrix", orange.SymMatrix, self.set_matrix)] 
     39        self.outputs = [("Selected Examples", ExampleTable), ("Unselected Examples", ExampleTable), ("Centroids", ExampleTable)] 
     40        self.linkage = [("Single linkage", orange.HierarchicalClustering.Single), 
    5341                        ("Average linkage", orange.HierarchicalClustering.Average), 
    5442                        ("Ward's linkage", orange.HierarchicalClustering.Ward), 
    5543                        ("Complete linkage", orange.HierarchicalClustering.Complete), 
    56                      ] 
    57         self.Linkage=3 
    58         self.OverwriteMatrix=0 
    59         self.Annotation=0 
    60         self.Brightness=5 
    61         self.PrintDepthCheck=0 
    62         self.PrintDepth=100 
    63         self.HDSize=500         #initial horizontal and vertical dendrogram size 
    64         self.VDSize=800 
    65         self.ManualHorSize=0 
    66         self.AutoResize=0 
    67         self.TextSize=8 
    68         self.LineSpacing=4 
    69         self.SelectionMode=0 
    70         self.ZeroOffset=1 
    71         self.DisableHighlights=0 
    72         self.DisableBubble=0 
    73         self.ClassifySelected=0 
    74         self.CommitOnChange=0 
    75         self.ClassifyName="HC_class" 
     44                       ] 
     45        self.Linkage = 3 
     46        self.Annotation = 0 
     47        self.PrintDepthCheck = 0 
     48        self.PrintDepth = 10 
     49        self.HDSize = 500         #initial horizontal and vertical dendrogram size 
     50        self.VDSize = 800 
     51        self.ManualHorSize = 0 
     52        self.AutoResize = 0 
     53        self.TextSize = 8 
     54        self.LineSpacing = 4 
     55        self.SelectionMode = 0 
     56        self.AppendClusters = 0 
     57        self.CommitOnChange = 0 
     58        self.ClassifyName = "HC_class" 
     59        self.addIdAs = 0