Changeset 7622:2c815d70f149 in orange
 Timestamp: 02/08/11 19:23:29 (3 years ago)
 Branch: default
 Convert: fe1c1ca8485b1daaf7ee48b7af6334d598e7a021
 File: 1 edited

orange/Orange/statistics/distributions.py
r7621 r7622
  5   5
  6   6
  7     - ========================================
  8     - Basic Statistics for Continuous Features
  9     - ========================================
      7 + =========================================
      8 + Basic Statistics for Continuous Variables
      9 + =========================================
 10  10
 11  11   The are two simple classes for computing basic statistics
 12  12   for continuous features, such as their minimal and maximal value
 13     - or average: :class:`BasicStatistics` holds the statistics for a single feature
 14     - and :class:`DomainBasicStatistics` is a container storing a list of instances of
 15     - the above class for all features in the domain.
     13 + or average: :class:`BasicStatistics` holds the statistics for a single variable
     14 + and :class:`DomainBasicStatistics` behaves like a list of instances of
     15 + the above class for all variables in the domain.
 16  16
 17  17   .. class:: BasicStatistics
 18  18
 19     -     `DomainBasicStatistics` computes on-the fly statistics.
     19 +     ``BasicStatistics`` computes and stores minimal, maximal, average and
     20 +     standard deviation of a variable. It does not include the median or any
     21 +     other statistics that can be computed on the fly, without remembering the
     22 +     data; such statistics can be obtained using :obj:`ContDistribution`. !!!TODO
     23 +
     24 +     Instances of this class are seldom constructed manually; they are more often
     25 +     returned by :obj:`DomainBasicStatistics` described below.
 20  26
 21  27       .. attribute:: variable
 22  28
 23     -         Descriptor for the variable to which the data applies.
 24     -
 25     -     .. attribute:: min, max
 26     -
 27     -         Minimal and maximal variable value encountered.
 28     -
 29     -     .. attribute:: avg, dev
 30     -
 31     -         Average value and standard deviation.
     29 +         The variable to which the data applies.
     30 +
     31 +     .. attribute:: min
     32 +
     33 +         Minimal value encountered
     34 +
     35 +     .. attribute:: max
     36 +
     37 +         Maximal value encountered
     38 +
     39 +     .. attribute:: avg
     40 +
     41 +         Average value
     42 +
     43 +     .. attribute:: dev
     44 +
     45 +         Standard deviation
 32  46
 33  47       .. attribute:: n
 34  48
 35     -         Number of instances for which the value was defined
 36     -         (and used in the statistics). If instances were weighted,
 37     -         ``n`` is the sum of weights of those instances.
 38     -
 39     -     .. attribute:: sum, sum2
 40     -
 41     -         Weighted sum of values and weighted sum of
 42     -         squared values of this feature.
     49 +         Number of instances for which the value was defined.
     50 +         If instances were weighted, :obj:`n` holds the sum of weights
     51 +
     52 +     .. attribute:: sum
     53 +
     54 +         Weighted sum of values
     55 +
     56 +     .. attribute:: sum2
     57 +
     58 +         Weighted sum of squared values
 43  59
 44  60       ..
  …   …
 49  65       .. method:: add(value[, weight=1])
 50  66
 51     -         Add a value to the statistics.
     67 +         Add a value to the statistics: adjust :obj:`min` and :obj:`max` if
     68 +         necessary, increase :obj:`n` and recompute :obj:`sum`, :obj:`sum2`,
     69 +         :obj:`avg` and :obj:`dev`.
 52  70
 53  71           :param value: Value to be added to the statistics
  …   …
 61  79           Recompute the average and deviation.
 62  80
 63     -     The class works as follows. Values are added by :obj:`add`, for each value
 64     -     it checks and, if necessary, adjusts :obj:`min` and :obj:`max`, adds the value to
 65     -     :obj:`sum` and its square to :obj:`sum2`. The weight is added to :obj:`n`.
 66     -
 67     -
 68     -     The statistics does not include the median or any other statistics that can be computed on the fly, without remembering the data. Quantiles can be computed
 69     -     by :obj:`ContDistribution`. !!!TODO
 70     -
 71     -     Instances of this class are seldom constructed manually; they are more often
 72     -     returned by :obj:`DomainBasicStatistics` described below.
 73     -
 74  81   .. class:: DomainBasicStatistics
 75  82
 76  83       ``DomainBasicStatistics`` behaves like a ordinary list, except that its
 77     -     elements can also be indexed by feature descriptors or feature names.
     84 +     elements can also be indexed by variable names or descriptors.
 78  85
 79  86       .. method:: __init__(data[, weight=None])
 80  87
 81     -         Compute the statistics for all continuous features in the
 82     -         give data, and put `None` to the places corresponding to features of other types.
     88 +         Compute the statistics for all continuous features in the data, and put
     89 +         :obj:`None` to the places corresponding to variables of other types.
 83  90
 84  91           :param data: A table of instances
  …   …
 89  96       .. method:: purge()
 90  97
 91     -         Remove the ``None``'s corresponding to noncontinuous features.
     98 +         Remove the :obj:`None`'s corresponding to noncontinuous features; this
     99 +         truncates the list, so the indices do not respond to indices of
    100 +         variables in the domain.
 92 101
 93 102       part of `distributionsbasicstat.py`_ (uses monks1.tab)
  …   …
118 127
119 128
    129 + ==================
120 130   Contingency Matrix
121 131   ==================
122 132
123     - Contingency matrix contains conditional distributions. When initialized, they
124     - will typically contain absolute frequencies, that is, the number of instances
125     - with a particular combination of two variables' values. If they are normalized
126     - by dividing each cell by the row sum, the represent conditional probabilities
127     - of the column variable (here denoted as ``innerVariable``) conditioned by the
128     - row variable (``outerVariable``).
129     -
130     - Contingencies work with both, discrete and continuous variables.
    133 + Contingency matrix contains conditional distributions. Unless explicitly
    134 + 'normalized', they contain absolute frequencies, that is, the number of
    135 + instances with a particular combination of two variables' values. If they are
    136 + normalized by dividing each cell by the row sum, the represent conditional
    137 + probabilities of the column variable (here denoted as ``innerVariable``)
    138 + conditioned by the row variable (``outerVariable``).
    139 +
    140 + Contingency matrices are usually constructed for discrete variables. Matrices
    141 + for continuous variables have certain limitations described in a :ref:`separate
    142 + section <contcont>`.
    143 +
    144 + The example below loads the monks1 data set and prints out the conditional
    145 + class distribution given the value of `e`.
131 146
132 147   .. _distributionscontingency: code/distributionscontingency.py
  …   …
144 159       4 <72.000, 36.000>
145 160
146     - Contingencies behave like lists of distributions (in this case, class distributions) indexed by values (of `e`, in this example). Distributions are, in turn indexed
147     - by values (class values, here). The variable `e` from the above example is called
148     - the outer variable, and the class is the inner. This can also be reversed, and it
149     - is also possible to use features for both, outer and inner variable, so the
150     - matrix shows distributions of one variable's values given the value of another.
151     - There is a corresponding hierarchy of classes for handling hierarchies: :obj:`Contingency` is a base class for :obj:`ContingencyVarVar` (both variables
152     - are attribtes) and :obj:`ContingencyClass` (one variable is the class).
153     - The latter is the base class for :obj:`ContingencyVarClass` and :obj:`ContingencyClassVar`.
    161 + Contingencies behave like lists of distributions (in this case, class
    162 + distributions) indexed by values (of `e`, in this
    163 + example). Distributions are, in turn indexed by values (class values,
    164 + here). The variable `e` from the above example is called the outer
    165 + variable, and the class is the inner. This can also be reversed. It is
    166 + also possible to use features for both, outer and inner variable, so
    167 + the matrix shows distributions of one variable's values given the
    168 + value of another. There is a corresponding hierarchy of classes:
    169 + :obj:`Contingency` is a base class for :obj:`ContingencyVarVar` (both
    170 + variables are attribtes) and :obj:`ContingencyClass` (one variable is
    171 + the class). The latter is the base class for
    172 + :obj:`ContingencyVarClass` and :obj:`ContingencyClassVar`.
154 173
155 174   The most commonly used of the above classes is :obj:`ContingencyVarClass` which
156 175   can compute and store conditional probabilities of classes given the feature value.
157 176
158     - .. class:: Orange.statistics.distribution.Contingency
    177 + Classes for storing contingency matrices
    178 + ========================================
    179 +
    180 + .. class:: Contingency
    181 +
    182 +     Provides a base class for storing and manipulating contingency
    183 +     matrices. Although it is not abstract, it is seldom used directly but rather
    184 +     through more convenient derived classes described below.
159 185
160 186       .. attribute:: outerVariable
161 187
162     -         Descriptor (:class:`Orange.data.feature.Feature`) of the outer variable.
    188 +         Outer variable (:class:`Orange.data.feature.Feature`) whose values are
    189 +         used as the first, outer index.
163 190
164 191       .. attribute:: innerVariable
165 192
166     -         Descriptor (:class:`Orange.data.feature.Feature`) of the inner variable.
    193 +         Inner variable(:class:`Orange.data.feature.Feature`), whose values are
    194 +         used as the second, inner index.
167 195
168 196       .. attribute:: outerDistribution
  …   …
176 204       .. attribute:: innerDistributionUnknown
177 205
178     -         The distribution (:class:`Distribution`) of the inner variable for
179     -         instances for which the outer variable was undefined.
180     -         This is the difference between the ``innerDistribution``
181     -         and unconditional distribution of inner variable.
    206 +         The distribution (:class:`Distribution`) of the inner variable for
    207 +         instances for which the outer variable was undefined. This is the
    208 +         difference between the ``innerDistribution`` and (unconditional)
    209 +         distribution of inner variable.
182 210
183 211       .. attribute:: varType
184 212
185     -         The type of the outer feature (:obj:`Orange.data.Type`, usually
186     -         :obj:`Orange.data.feature.Discrete` or
187     -         :obj:`Orange.data.feature.Continuous`). ``varType`` equals ``outerVariable.varType`` and ``outerDistribution.varType``.
188     -
189     -     .. method:: __init__(outerVariable, innerVariable)
    213 +         The type of the outer variable (:obj:`Orange.data.Type`, usually
    214 +         :obj:`Orange.data.feature.Discrete` or
    215 +         :obj:`Orange.data.feature.Continuous`); equals
    216 +         ``outerVariable.varType`` and ``outerDistribution.varType``.
    217 +
    218 +     .. method:: __init__(outer_variable, inner_variable)
190 219
191 220           Construct an instance of ``Contingency`` for the given pair of
192 221           variables.
193 222
194     -         :param outerVariable: Descriptor of the outer variable
195     -         :type outerVariable: Orange.data.feature.Feature
196     -         :param outerVariable: Descriptor of the inner variable
197     -         :type innerVariable: Orange.data.feature.Feature
    223 +         :param outer_variable: Descriptor of the outer variable
    224 +         :type outer_variable: Orange.data.feature.Feature
    225 +         :param outer_variable: Descriptor of the inner variable
    226 +         :type inner_variable: Orange.data.feature.Feature
198 227
199 228       .. method:: add(outer_value, inner_value[, weight=1])
200 229
201     -         Add an element to the contingency matrix by adding
202     -         ``weight`` to the corresponding cell.
    230 +         Add an element to the contingency matrix by adding ``weight`` to the
    231 +         corresponding cell.
203 232
204 233           :param outer_value: The value for the outer variable
  …   …
211 240       .. method:: normalize()
212 241
213     -         Normalize all distributions (rows) in the contingency to sum to ``1``::
    242 +         Normalize all distributions (rows) in the matrix to sum to ``1``::
214 243
215 244               >>> cont.normalize()
  …   …
226 255           .. note::
227 256
228     -             This method doesn't change the ``innerDistribution`` or
    257 +             This method does not change the ``innerDistribution`` or
229 258               ``outerDistribution``.
230 259
231 260       With respect to indexing, contingency matrix is a cross between dictionary
232 261       and a list. It supports standard dictionary methods ``keys``, ``values`` and
233     -     ``items``.::
    262 +     ``items``. ::
234 263
235 264           >> print cont.keys()
  …   …
241 270            ('3', <72.000, 36.000>), ('4', <72.000, 36.000>)]
242 271
243     -     Although keys returned by the above functions are strings, contingency
244     -     can be indexed with anything that converts into values
245     -     of the outer variable: strings, numbers or instances of ``Orange.data.Value``.::
    272 +     Although keys returned by the above functions are strings, contingency can
    273 +     be indexed by anything that can be converted into values of the outer
    274 +     variable: strings, numbers or instances of ``Orange.data.Value``. ::
246 275
247 276           >>> print cont[0]
  …   …
253 282       The length of ``Contingency`` equals the number of values of the outer
254 283       variable. However, iterating through contingency
255     -     doesn't return keys, as with dictionaries, but distributions.::
    284 +     does not return keys, as with dictionaries, but distributions. ::
256 285
257 286           >>> for i in cont:
  …   …
264 293
265 294
266     - .. class:: Orange.statistics.distribution.ContingencyClass
267     -
268     -     ``ContingencyClass`` is an abstract base class for contingency matrices
269     -     that contain the class, either as the inner or the outer
270     -     variable.
    295 + .. class:: ContingencyClass
    296 +
    297 +     An abstract base class for contingency matrices that contain the class,
    298 +     either as the inner or the outer variable.
271 299
272 300       .. attribute:: classVar (read only)
273 301
274     -         The class attribute descriptor.
275     -         This is always equal either to :obj:`Contingency.innerVariable` or
276     -         ``outerVariable``.
    302 +         The class attribute descriptor; always equal to either
    303 +         :obj:`Contingency.innerVariable` or :obj:``Contingency.outerVariable``.
277 304
278 305       .. attribute:: variable
279 306
280     -         The class attribute descriptor.
281     -         This is always equal either to innerVariable or outerVariable
282     -
283     -     .. method:: add_attrclass(variable_value, class_value[, weight])
284     -
285     -         Adds an element to contingency. The difference between this and
286     -         Contigency.add is that the variable value is always the first
287     -         argument and class value the second, regardless of what is inner and
288     -         outer.
    307 +         Variable; always equal either to either innerVariable or outerVariable
    308 +
    309 +     .. method:: add_attrclass(variable_value, class_value[, weight=1])
    310 +
    311 +         Add an element to contingency by increasing the corresponding count. The
    312 +         difference between this and :obj:`Contigency.add` is that the variable
    313 +         value is always the first argument and class value the second,
    314 +         regardless of which one is inner and which one is outer.
289 315
290 316           :param attribute_value: Variable value
  …   …
296 322
297 323
298     -
299     - .. class:: Orange.statistics.distribution.ContingencyVarClass
300     -
301     -     A class derived from :obj:`ContingencyVarClass`, which uses a given feature
302     -     as the :obj:`Contingency.outerVariable` and class as the
303     -     :obj:`Contingency.innerVariable` to provide a form suitable for computation
304     -     of conditional class probabilities given the variable value.
305     -
306     -     Calling :obj:`ContingencyVarClass.add_attrclass(v, c)` is equivalent
307     -     to calling :obj:`Contingency.add(v, c)`. Similar as :obj:`Contingency`,
    324 + .. class:: ContingencyVarClass
    325 +
    326 +     A class derived from :obj:`ContingencyVarClass` in which the variable is
    327 +     used as :obj:`Contingency.outerVariable` and class as the
    328 +     :obj:`Contingency.innerVariable`. This form is a form suitable for
    329 +     computation of conditional class probabilities given the variable value.
    330 +
    331 +     Calling :obj:`ContingencyVarClass.add_attrclass(v, c)` is equivalent to
    332 +     :obj:`Contingency.add(v, c)`. Similar as :obj:`Contingency`,
308 333       :obj:`ContingencyVarClass` can compute contingency from instances.
309 334
310     -     .. method:: __init__(feature, class_attribute)
    335 +     .. method:: __init__(feature, class_variable)
311 336
312 337           Construct an instance of :obj:`ContingencyVarClass` for the given pair of
313 338           variables. Inherited from :obj:`Contingency`.
314 339
315     -         :param outerVariable: Descriptor of the outer variable
316     -         :type outerVariable: Orange.data.feature.Feature
317     -         :param outerVariable: Descriptor of the inner variable
318     -         :type innerVariable: Orange.data.feature.Feature
    340 +         :param feature: Outer variable
    341 +         :type feature: Orange.data.feature.Feature
    342 +         :param class_attribute: Class variable; used as ``innerVariable``
    343 +         :type class_attribute: Orange.data.feature.Feature
319 344
320 345       .. method:: __init__(feature, data[, weightId])
321 346
322     -         Compute the contingency from the given instances.
    347 +         Compute the contingency from data.
    348 +
    349 +         :param feature: Outer variable
    350 +         :type feature: Orange.data.feature.Feature
    351 +         :param data: A set of instances
    352 +         :type data: Orange.data.Table
    353 +         :param weightId: meta attribute with weights of instances
    354 +         :type weightId: int
    355 +
    356 +     .. method:: p_class(value)
    357 +
    358 +         Return the probability distribution of classes given the value of the
    359 +         variable.
    360 +
    361 +         :param value: The value of the variable
    362 +         :type value: int, float, string or :obj:`Orange.data.Value`
    363 +         :rtype: Orange.statistics.distribution.Distribution
    364 +
    365 +
    366 +     .. method:: p_class(value, class_value)
    367 +
    368 +         Returns the conditional probability of the class_value given the
    369 +         feature value, p(class_value|value) (note the order of arguments!)
    370 +
    371 +         :param value: The value of the variable
    372 +         :type value: int, float, string or :obj:`Orange.data.Value`
    373 +         :param class_value: The class value
    374 +         :type value: int, float, string or :obj:`Orange.data.Value`
    375 +         :rtype: float
    376 +
    377 +     .. _distributionscontingency3.py: code/distributionscontingency3.py
    378 +
    379 +     part of `distributionscontingency3.py`_ (uses monks1.tab)
    380 +
    381 +     .. literalinclude:: code/distributionscontingency3.py
    382 +         :lines: 125
    383 +
    384 +     The inner and the outer variable and their relations to the class are
    385 +     as follows::
    386 +
    387 +         Inner variable: y
    388 +         Outer variable: e
    389 +
    390 +         Class variable: y
    391 +         Feature: e
    392 +
    393 +     Distributions are normalized, and probabilities are elements from the
    394 +     normalized distributions. Knowing that the target concept is
    395 +     y := (e=1) or (a=b), distributions are as expected: when e equals 1, class 1
    396 +     has a 100% probability, while for the rest, probability is one third, which
    397 +     agrees with a probability that two three-valued independent features
    398 +     have the same value. ::
    399 +
    400 +         Distributions:
    401 +           p(.|1) = <0.000, 1.000>
    402 +           p(.|2) = <0.662, 0.338>
    403 +           p(.|3) = <0.659, 0.341>
    404 +           p(.|4) = <0.669, 0.331>
    405 +
    406 +         Probabilities of class '1'
    407 +           p(1|1) = 1.000
    408 +           p(1|2) = 0.338
    409 +           p(1|3) = 0.341
    410 +           p(1|4) = 0.331
    411 +
    412 +         Distributions from a matrix computed manually:
    413 +           p(.|1) = <0.000, 1.000>
    414 +           p(.|2) = <0.662, 0.338>
    415 +           p(.|3) = <0.659, 0.341>
    416 +           p(.|4) = <0.669, 0.331>
    417 +
    418 +
    419 + .. class:: ContingencyClassVar
    420 +
    421 +     :obj:`ContingencyClassVar` is similar to :obj:`ContingencyVarClass` except
    422 +     that the class is outside and the variable is inside. This form of
    423 +     contingency matrix is suitable for computing conditional probabilities of
    424 +     variable given the class. All methods get the two arguments in the same
    425 +     order as :obj:`ContingencyVarClass`.
    426 +
    427 +     .. method:: __init__(feature, class_variable)
    428 +
    429 +         Construct an instance of :obj:`ContingencyVarClass` for the given pair of
    430 +         variables. Inherited from :obj:`Contingency`, except for the reversed
    431 +         order of arguments.
    432 +
    433 +         :param feature: Outer variable
    434 +         :type feature: Orange.data.feature.Feature
    435 +         :param class_variable: Class variable
    436 +         :type class_variable: Orange.data.feature.Feature
    437 +
    438 +     .. method:: __init__(feature, data[, weightId])
    439 +
    440 +         Compute contingency from the data.
323 441
324 442           :param feature: Descriptor of the outer variable
  …   …
329 447           :type weightId: int
330 448
331     -     .. method:: p_class(value)
332     -
333     -         Return the probability distribution of classes given the value of the
334     -         variable. Equivalent to `self[value]`, except for normalization.
335     -
336     -         :param value: The value of the variable
337     -         :type value: int, float, string or :obj:`Orange.data.Value`
338     -
339     -     .. method:: p_class(value, class_value)
340     -
341     -         Returns the conditional probability of the class_value given the
342     -         feature value, p(class_value|value) (note the order of arguments!)
343     -         Equivalent to `self[values][class_value]`, except for normalization.
344     -
345     -         :param value: The value of the variable
346     -         :type value: int, float, string or :obj:`Orange.data.Value`
347     -         :param class_value: The class value
348     -         :type value: int, float, string or :obj:`Orange.data.Value`
349     -
350     -     .. _distributionscontingency3.py: code/distributionscontingency3.py
351     -
352     -     part of `distributionscontingency3.py`_ (uses monks1.tab)
353     -
354     -     .. literalinclude:: code/distributionscontingency3.py
355     -         :lines: 125
356     -
357     -     The inner and the outer variable and their relations to the class are
358     -     as follows::
359     -
360     -         Inner variable: y
361     -         Outer variable: e
362     -
363     -         Class variable: y
364     -         Feature: e
365     -
366     -     Distributions are normalized and probabilities are elements from the
367     -     normalized distributions. Knowing that the target concept is
368     -     y := (e=1) or (a=b), distributions are as expected: when e equals 1, class 1
369     -     has a 100% probability, while for the rest, probability is one third, which
370     -     agrees with a probability that two three-valued independent features
371     -     have the same value. ::
372     -
373     -         Distributions:
374     -           p(.|1) = <0.000, 1.000>
375     -           p(.|2) = <0.662, 0.338>
376     -           p(.|3) = <0.659, 0.341>
377     -           p(.|4) = <0.669, 0.331>
378     -
379     -         Probabilities of class '1'
380     -           p(1|1) = 1.000
381     -           p(1|2) = 0.338
382     -           p(1|3) = 0.341
383     -           p(1|4) = 0.331
384     -
385     -         Distributions from a matrix computed manually:
386     -           p(.|1) = <0.000, 1.000>
387     -           p(.|2) = <0.662, 0.338>
388     -           p(.|3) = <0.659, 0.341>
389     -           p(.|4) = <0.669, 0.331>
390     -
391     -
392     - .. class:: Orange.statistics.distribution.ContingencyClassVar
393     -
394     -     :obj:`ContingencyClassVar` is similar to :obj:`ContingencyVarClass` except
395     -     that here the class is outside and the variable is inside. This form of
396     -     contingency matrix is suitable for computing conditional probabilities of
397     -     variable given the class. All methods get the two arguments in the same
398     -     order as in :obj:`ContingencyVarClass`.
399     -
400     -     .. method:: __init__(feature, class_attribute)
401     -
402     -         Construct an instance of :obj:`ContingencyVarClass` for the given pair of
403     -         variables. Inherited from :obj:`Contingency`, except for the reversed
404     -         order.
405     -
406     -         :param outerVariable: Descriptor of the outer variable
407     -         :type outerVariable: Orange.data.feature.Feature
408     -         :param outerVariable: Descriptor of the inner variable
409     -         :type innerVariable: Orange.data.feature.Feature
410     -
411     -     .. method:: __init__(feature, instances[, weightId])
412     -
413     -         Compute the contingency from the given instances.
414     -
415     -         :param feature: Descriptor of the outer variable
416     -         :type feature: Orange.data.feature.Feature
417     -         :param data: A set of instances
418     -         :type data: Orange.data.Table
419     -         :param weightId: meta attribute with weights of instances
420     -         :type weightId: int
421     -
422 449       .. method:: p_attr(class_value)
423 450
424 451           Return the probability distribution of variable given the class.
425     -         Equivalent to `self[class_value]`, except for normalization.
426 452
427 453           :param class_value: The value of the variable
428 454           :type class_value: int, float, string or :obj:`Orange.data.Value`
    455 +         :rtype: Orange.statistics.distribution.Distribution
429 456
430 457       .. method:: p_attr(value, class_value)
  …   …
434 461           Equivalent to `self[class][value]`, except for normalization.
435 462
436     -         :param value: The value of the variable
    463 +         :param value: Value of the variable
437 464           :type value: int, float, string or :obj:`Orange.data.Value`
438     -         :param class_value: The class value
    465 +         :param class_value: Class value
439 466           :type value: int, float, string or :obj:`Orange.data.Value`
    467 +         :rtype: float
440 468
441 469       .. _distributionscontingency4.py: code/distributionscontingency4.py
  …   …
443 471       part of the output from `distributionscontingency4.py`_ (uses monk1.tab)
444 472
445     -     The inner and the outer variable and their relations to the class
446     -     and the features are exactly the reverse from :obj:`ContingencyClassVar`::
    473 +     The role of the feature and the class are reversed compared to
    474 +     :obj:`ContingencyClassVar`::
447 475
448 476           Inner variable: e
  …   …
463 491             p(.|1) = <0.500, 0.167, 0.167, 0.167>
464 492
465     -     If the class value is '0', the attribute e cannot be '1' (the first value),
466     -     while distribution across other values is uniform.
467     -     If the class value is '1', e is '1' in exactly half of examples, and
468     -     distribution of other values is again uniform.
469     -
470     - .. class:: Orange.statistics.distribution.ContingencyVarVar
471     -
472     -     Contingency matrices in which none of the variables is the class.
473     -     The class is similar to the parent class :obj:`Contingency`, except for
474     -     an additional constructor and method for getting conditional probabilities.
    493 +     If the class value is '0', the attribute `e` cannot be `1` (the first
    494 +     value), while distribution across other values is uniform. If the class
    495 +     value is `1`, `e` is `1` for exactly half of instances, and distribution of
    496 +     other values is again uniform.
    497 +
    498 + .. class:: ContingencyVarVar
    499 +
    500 +     Contingency matrices in which none of the variables is the class. The class
    501 +     is derived from :obj:`Contingency`, and adds an additional constructor and
    502 +     method for getting conditional probabilities.
475 503
476 504       .. method:: ContingencyVarVar(outer_variable, inner_variable)
  …   …
482 510           Compute the contingency from the given instances.
483 511
484     -         :param outer_variable: Descriptor of the outer variable
    512 +         :param outer_variable: Outer variable
485 513           :type outer_variable: Orange.data.feature.Feature
486     -         :param inner_variable: Descriptor of the inner variable
    514 +         :param inner_variable: Inner variable
487 515           :type inner_variable: Orange.data.feature.Feature
488 516           :param data: A set of instances
  …   …
493 521       .. method:: p_attr(outer_value)
494 522
495     -         Return the probability distribution of the inner
496     -         variable given the outer variable value.
    523 +         Return the probability distribution of the inner variable given the
    524 +         outer variable value.
497 525
498 526           :param outer_value: The value of the outer variable
499 527           :type outer_value: int, float, string or :obj:`Orange.data.Value`
    528 +         :rtype: Orange.statistics.distribution.Distribution
500 529
501 530       .. method:: p_attr(outer_value, inner_value)
  …   …
508 537           :param inner_value: The value of the inner variable
509 538           :type inner_value: int, float, string or :obj:`Orange.data.Value`
    539 +         :rtype: float
510 540
511 541       The following example investigates which material is used for
  …   …
519 549           :lines: 119
520 550
521     -     Short bridges are mostly wooden or iron,
522     -     and the longer (and the most of middle sized) are made from steel::
    551 +     Short bridges are mostly wooden or iron, and the longer (and most of the
    552 +     middle sized) are made from steel::
523 553
524 554           SHORT:
  …   …
540 570
541 571
542     - Contingency matrices for continuous variables
543     - ---------------------------------------------
544     -
545     - The described classes can also be used for continuous values.
546     -
547     - If the outer feature is continuous, the index must be one
548     - of the values that do exist in the contingency matrix. Using other values
549     - triggers an exception::
550     -
551     - .. _distributionscontingency6: code/distributionscontingency6.py
552     -
553     - part of `distributionscontingency6`_ (uses monks1.tab)
554     -
555     - .. literalinclude:: code/distributionscontingency6.py
556     -     :lines: 15,18,19
557     -
558     - Since even rounding is a problem, the keys should generally come from the
559     - contingencies `keys`.
560     -
561     - Contingencies with discrete outer variable continuous inner variables are
562     - more useful, since methods :obj:`ContingencyClassVar.p_class` and
563     - :obj:`ContingencyVarClass.p_attr` use the primitive density estimation
564     - provided by :obj:`Distribution`.
565     -
566     - For example, :obj:`ContingencyClassVar` on the iris dataset,
567     - you can enquire about the probability of the sepal length 5.5::
568     -
569     - .. _distributionscontingency7: code/distributionscontingency7.py
570     -
571     - part of `distributionscontingency7`_ (uses iris.tab)
572     -
573     - .. literalinclude:: code/distributionscontingency7.py
574     -
575     - The script outputs::
576     -
577     -     Estimated frequencies for e=5.5
578     -       f(5.5|Iris-setosa) = 2.000
579     -       f(5.5|Iris-versicolor) = 5.000
580     -       f(5.5|Iris-virginica) = 1.000
581     -
582     -
583 572   Contingency matrices for the entire domain
584     - ------------------------------------------
585     -
586     - DomainContingency is basically a list of contingencies,
587     - either :obj:`ContingencyVarClass` or :obj:`ContingencyClassVar`.
    573 + ==========================================
    574 +
    575 + A list of contingencies, either :obj:`ContingencyVarClass` or
    576 + :obj:`ContingencyClassVar`.
588 577
589 578   .. class:: DomainContingency
  …   …
613 602           Contains the distribution of class values on the entire dataset.
614 603
615     -     .. method:: normalize
    604 +     .. method:: normalize()
616 605
617 606           Call normalize for all contingencies.
  …   …
636 625   .. literalinclude:: code/distributionscontingency8.py
637 626       :lines: 13
    627 +
    628 +
    629 + .. _contcont:
    630 +
    631 + Contingency matrices for continuous variables
    632 + =============================================
    633 +
    634 + If the outer variable is continuous, the index must be one of the values that do
    635 + exist in the contingency matrix. Using other values raises an exception::
    636 +
    637 + .. _distributionscontingency6: code/distributionscontingency6.py
    638 +
    639 + part of `distributionscontingency6`_ (uses monks1.tab)
    640 +
    641 + .. literalinclude:: code/distributionscontingency6.py
    642 +     :lines: 15,18,19
    643 +
    644 + Since even rounding can be a problem, the only safe way to get the key is to
    645 + take it from from the contingencies' ``keys``.
    646 +
    647 + Contingencies with discrete outer variable and continuous inner variables are
    648 + more useful, since methods :obj:`ContingencyClassVar.p_class` and
    649 + :obj:`ContingencyVarClass.p_attr` use the primitive density estimation
    650 + provided by :obj:`Orange.statistics.distribution.Distribution`.
    651 +
    652 + For example, :obj:`ContingencyClassVar` on the iris dataset can return the
    653 + probability of the sepal length 5.5 for different classes::
    654 +
    655 + .. _distributionscontingency7: code/distributionscontingency7.py
    656 +
    657 + part of `distributionscontingency7`_ (uses iris.tab)
    658 +
    659 + .. literalinclude:: code/distributionscontingency7.py
    660 +
    661 + The script outputs::
    662 +
    663 +     Estimated frequencies for e=5.5
    664 +       f(5.5|Iris-setosa) = 2.000
    665 +       f(5.5|Iris-versicolor) = 5.000
    666 +       f(5.5|Iris-virginica) = 1.000
638 667
639 668   """
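
The docstring in this changeset says that a contingency matrix holds absolute frequencies until it is normalized, and that dividing each cell by its row sum turns every row into the conditional distribution of the inner variable given the outer value. The sketch below illustrates only that arithmetic in plain Python. ``ToyContingency`` and its ``rows`` attribute are hypothetical names chosen for this illustration and are not part of Orange; the counts are the ``<72.000, 36.000>`` rows printed for values ``3`` and ``4`` of ``e`` in the diff above.

```python
from collections import defaultdict


class ToyContingency:
    """Hypothetical stand-in for a contingency matrix; not Orange's class."""

    def __init__(self):
        # outer value -> {inner value -> accumulated weight}
        self.rows = defaultdict(dict)

    def add(self, outer_value, inner_value, weight=1.0):
        # Add ``weight`` to the cell for this (outer, inner) pair.
        row = self.rows[outer_value]
        row[inner_value] = row.get(inner_value, 0.0) + weight

    def normalize(self):
        # Divide each cell by its row sum so that every row sums to 1.
        for row in self.rows.values():
            total = sum(row.values())
            if total > 0:
                for inner_value in row:
                    row[inner_value] /= total


# Fill two rows with the absolute frequencies shown in the diff.
cont = ToyContingency()
for outer_value in ("3", "4"):
    cont.add(outer_value, "0", weight=72)
    cont.add(outer_value, "1", weight=36)

cont.normalize()
# After normalization, a cell is the conditional probability
# p(inner value | outer value) for this toy data.
print(cont.rows["4"]["1"])
```

A dictionary of dictionaries keeps the sketch short and mirrors the "list of distributions indexed by values" description; it makes no attempt to model ``innerDistribution``, ``outerDistribution`` or continuous variables.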