Changeset 9385:fd37d2ce5541 in orange


Timestamp: 12/17/11 23:55:02
Author: mitar
Branch: default
Convert: 7994caed127abeaf7fdc1dbfc2abc4cd02dc70c3
Message: Cleaned up tutorial.
Location: docs
Files: 2 added, 18 edited

  • docs/reference/Makefile

    r9372 → r9385

     clean:
    -    -rm -rf _* *.html html .doctrees .buildinfo searchindex.js *.epub epub
    +    -rm -rf _* *.html html .doctrees .buildinfo searchindex.js *.epub epub rst/_build

     epub:
  • docs/reference/README.txt

    r9373 → r9385

     Python modules in orange/Orange.

    -Regression testing and scripts are in docs/reference/rst/code. Additional
    -files, such as images, are in docs/reference/rst/files.
    +Example scripts and datasets are in docs/reference/rst/code. Additional files,
    +such as images, are in docs/reference/rst/files.
  • docs/reference/rst/Orange.statistics.rst

    r9372 → r9385

    +.. py:module:: Orange.statistics
    +
     ###########################
     Statistics (``statistics``)
  • docs/reference/rst/index.rst

    r9372 → r9385

    -.. test documentation master file, created by
    -   sphinx-quickstart on Wed Nov 17 12:52:23 2010.
    -   You can adapt this file completely to your liking, but it should at least
    -   contain the root `toctree` directive.
    -
     ##########################
     Orange Scripting Reference
    …
     * :ref:`modindex`
     * :ref:`search`
    -
  • docs/sphinx-ext/themes/orange_theme/static/orange.css

    r9372 → r9385

     img {
         border: 0;
    +    max-width: 100%;
     }
  • docs/tutorial/Makefile

    r9374 → r9385

     clean:
    -    -rm -rf _* *.html html .doctrees .buildinfo searchindex.js *.epub epub
    +    -rm -rf _* *.html html .doctrees .buildinfo searchindex.js *.epub epub rst/_build

     epub:
  • docs/tutorial/rst/association_rules.rst

    r9372 → r9385

    -.. _assoc1.py: code/assoc1.py
    -.. _assoc2.py: code/assoc2.py
    -.. _assoc3.py: code/assoc3.py
    -.. _imports-85.tab: code/imports-85.tab
    -.. _orngAssoc.htm: ../modules/orngAssoc.htm
    -
    -
     .. index:: association rules
    …
     tabular data).  For number of reasons (but mostly for convenience)
     association rules should be constructed and managed through the
    -interface provided by `orngAssoc.htm`_.  As implemented in Orange,
    +interface provided by :py:mod:`Orange.associate`.  As implemented in Orange,
     association rules construction procedure does not handle continuous
     attributes, so make sure that your data is categorized. Also, class
     variables are treated just like attributes.  For examples in this
    -tutorial, we will use data from the data set `imports-85.tab`_, which
    +tutorial, we will use data from the data set :download:`imports-85.tab <code/imports-85.tab>`, which
     surveys different types of cars and lists their characteristics. We
     will use only first ten attributes from this data set and categorize
    …
     will have support of at least 0.4. Next, we select a subset of first
     five rules, print them out, delete first three rules and repeat the
    -printout. The script that does this is (part of `assoc1.py`_, uses
    -`imports-85.tab`_)::
    +printout. The script that does this is (part of :download:`assoc1.py <code/assoc1.py>`, uses
    +:download:`imports-85.tab <code/imports-85.tab>`)::

        rules = orange.AssociationRulesInducer(data, support=0.4)
    …
     arguments.

    -Here goes the code (part of `assoc2.py`_, uses `imports-85.tab`_)::
    +Here goes the code (part of :download:`assoc2.py <code/assoc2.py>`, uses :download:`imports-85.tab <code/imports-85.tab>`)::

        rules = orange.AssociationRulesInducer(data, support = 0.4)
    …
     confidence, and then print out few best rules. We have also lower
     required minimal support, just to see how many rules we obtain in this
    -way (`assoc3.py`_, `imports-85.tab`_)::
    +way (:download:`assoc3.py <code/assoc3.py>`, :download:`imports-85.tab <code/imports-85.tab>`)::

        minSupport = 0.2
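The recurring edit in this changeset is always the same pattern: a reStructuredText named hyperlink target defined at the top of the page is replaced by Sphinx's inline `:download:` role, which copies the referenced file into the build output and links to it, so no separate target line is needed. A minimal before/after sketch (the file name is illustrative):

```rst
.. Old style: a named target plus a reference to it.
.. _assoc1.py: code/assoc1.py

See `assoc1.py`_ for the full script.

.. New style: one self-contained :download: role, no target line.

See :download:`assoc1.py <code/assoc1.py>` for the full script.
```

A side effect of the old style is that every page must keep its target list in sync with the prose; the `:download:` role removes that bookkeeping, which is why the changeset deletes the target blocks wholesale.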
  • docs/tutorial/rst/basic_exploration.rst

    r9372 → r9385

    -.. _adult.tab: ../datasets/adult.tab
    -.. _adult_sample.tab: ../datasets/adult_sample.tab
    -.. _classification.htm: code/classification.htm
    -.. _data_characteristics.py: code/data_characteristics.py
    -.. _data_characteristics2.py: code/data_characteristics2.py
    -.. _data_characteristics3.py: code/data_characteristics3.py
    -.. _data_characteristics4.py: code/data_characteristics4.py
    -.. _load_data.htm: code/load_data.htm
    -.. _regression.htm: code/regression.htm
    -.. _report_missing.py: code/report_missing.py
    -.. _sample_adult.py: code/sample_adult.py
    -
    -
     Basic data exploration
     ======================
    …
     related to this data set is to determine whether a person
     characterized by 14 attributes like education, race, occupation, etc.,
    -makes over $50K/year. Because of the original set `adult.tab`_ is
    +makes over $50K/year. Because of the original set :download:`adult.tab <code/adult.tab>` is
     rather big (32561 data instances, about 4 MBytes), we will first
     create a smaller sample of about 3% of instances and use it in our
     examples. If you are curious how we do this, here is the code
    -(`sample_adult.py`_ )::
    +(:download:`sample_adult.py <code/sample_adult.py>`)::

        import orange
    …
     are nominal and continuous), information if data contains missing
     values, and class distribution. Below is the script that does all
    -this (`data_characteristics.py`_, `adult_sample.tab`_)::
    +this (:download:`data_characteristics.py <code/data_characteristics.py>`, :download:`adult_sample.tab <code/adult_sample.tab>`)::

        import orange
    …
     each class in the data sets, then the last part of the script needs
     to be slightly changed. This time, we have used string formatting
    -with print as well (part of `data_characteristics2.py`_)::
    +with print as well (part of :download:`data_characteristics2.py <code/data_characteristics2.py>`)::

        # obtain class distribution
    …
     attribute, and means for continuous attribute (we will leave the
     computation of standard deviation and other statistics to you). Let's
    -compute means of continuous attributes first (part of `data_characteristics3.py`_)::
    +compute means of continuous attributes first (part of :download:`data_characteristics3.py <code/data_characteristics3.py>`)::

        print "Continuous attributes:"
    …
     class. Instead, we used a build-in method DomainContingency, which
     does just that. All that our script will do is, mainly, to print it
    -out in a readable form (part of `data_characteristics3.py`_)::
    +out in a readable form (part of :download:`data_characteristics3.py <code/data_characteristics3.py>`)::

        print "\nNominal attributes (contingency matrix for classes:", data.domain.classVar.values, ")"
    …
     determine if for specific instances and attribute the value is not
     defined. Let us use this function to compute the proportion of missing
    -values per each attribute (`report_missing.py`_, uses `adult_sample.tab`_)::
    +values per each attribute (:download:`report_missing.py <code/report_missing.py>`, uses :download:`adult_sample.tab <code/adult_sample.tab>`)::

        import orange
    …
     frequencies for discrete attributes, and for both number of instances
     where specific attribute has a missing value.  The use of this object
    -is exemplified in the following script (data_characteristics4.py`_,
    -uses `adult_sample.tab`_)::
    +is exemplified in the following script (:download:`data_characteristics4.py <code/data_characteristics4.py>`,
    +uses :download:`adult_sample.tab <code/adult_sample.tab>`)::

        import orange
  • docs/tutorial/rst/classification.rst

    r9372 → r9385

    -.. _classifier.py: code/classifier.py
    -.. _classifier2.py: code/classifier2.py
    -.. _voting.tab: code/voting.tab
    -.. _dot: http://graphviz.org/
    -.. _handful.py: code/handful.py
    -.. _tree.py: code/tree.py
    -
     Classification
     ==============
    …
     for classification, or supervised data mining. These methods start
     from the data that incorporates class-labeled instances, like
    -`voting.tab`_::
    +:download:`voting.tab <code/voting.tab>`::

        >>> data = orange.ExampleTable("voting.tab")
    …
     construct a naive Bayesian classifier from voting data set, and
     will use it to classify the first five instances from this data set
    -(`classifier.py`_, uses `voting.tab`_)::
    +(:download:`classifier.py <code/classifier.py>`, uses :download:`voting.tab <code/voting.tab>`)::

        import orange
    …
     additional parameter ``orange.GetProbabilities``. Also, note that the
     democrats have a class index 1. We find this out with print
    -``data.domain.classVar.values`` (`classifier2.py`_, uses `voting.tab`_)::
    +``data.domain.classVar.values`` (:download:`classifier2.py <code/classifier2.py>`, uses :download:`voting.tab <code/voting.tab>`)::

        import orange
    …
     a wrapper (module) called ``orngTree`` was build around it to simplify
     the use of classification trees and to assemble the learner with
    -some usual (default) components. Here is a script with it (`tree.py`_,
    -uses `voting.tab`_)::
    +some usual (default) components. Here is a script with it (:download:`tree.py <code/tree.py>`,
    +uses :download:`voting.tab <code/voting.tab>`)::

        import orange, orngTree
    …
     .. note::
        The script for classification tree is almost the same as the one
    -   for naive Bayes (`classifier2.py`_), except that we have imported
    +   for naive Bayes (:download:`classifier2.py <code/classifier2.py>`), except that we have imported
        another module (``orngTree``) and used learner
        ``orngTree.TreeLearner`` to build a classifier called ``tree``.
    …
     compiled to PNG using program called `dot`_.

    -.. image:: tree.*
    +.. image:: files/tree.png
        :alt: A graphical presentation of a classification tree
    +
    +.. _dot: http://graphviz.org/

     Nearest neighbors and majority classifiers
    …
     have already learned), majority and k-nearest neighbors classifier
     (new ones) and prints prediction for first 10 instances of voting data
    -set (`handful.py`_, uses `voting.tab`_)::
    +set (:download:`handful.py <code/handful.py>`, uses :download:`voting.tab <code/voting.tab>`)::

        import orange, orngTree
  • docs/tutorial/rst/conf.py

    r9376 → r9385

     # Example configuration for intersphinx: refer to the Python standard library.
    -intersphinx_mapping = {'http://docs.python.org/': None}
    +intersphinx_mapping = {
    +    'python': ('http://docs.python.org/', None),
    +    'reference': ('http://orange.biolab.si/doc/reference/', 'http://orange.biolab.si/doc/reference/_objects/'),
    +}
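The conf.py hunk above switches intersphinx to its named-key form: each key names an external project, and each value is a `(base URL, inventory)` pair, where `None` tells Sphinx to fetch the default `objects.inv` under the base URL and an explicit second element points at an alternative inventory location. A sketch of the resulting fragment, using the values from the hunk:

```python
# intersphinx_mapping in named-key form: name -> (base_url, inventory).
# inventory=None means "use <base_url>/objects.inv"; a non-None value
# gives an explicit inventory location instead.
intersphinx_mapping = {
    'python': ('http://docs.python.org/', None),
    'reference': ('http://orange.biolab.si/doc/reference/',
                  'http://orange.biolab.si/doc/reference/_objects/'),
}

# Sanity-check the shape: every entry must be a (URL, inventory) pair.
for name, (url, inventory) in intersphinx_mapping.items():
    assert url.startswith('http://')
```

The names matter for cross-references: with this mapping, roles such as ``:py:mod:`Orange.associate``` used elsewhere in the changeset can be resolved against the `reference` project's inventory.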
  • docs/tutorial/rst/discretization.rst

    r9372 → r9385

    -.. _disc.py: code/disc.py
    -.. _disc2.py: code/disc2.py
    -.. _disc3.py: code/disc3.py
    -.. _disc4.py: code/disc4.py
    -.. _disc5.py: code/disc5.py
    -.. _disc6.py: code/disc6.py
    -.. _disc7.py: code/disc7.py
    -.. _domain.htm: code/domain.htm
    -.. _iris.tab: code/iris.tab
    -
     .. index:: discretization
     .. index::
    …

     Here is a script which demonstraters the basics of discretization in
    -Orange (`disc.py`_, uses `iris.tab`_)::
    +Orange (:download:`disc.py <code/disc.py>`, uses :download:`iris.tab <code/iris.tab>`)::

        import orange
    …
     separately, and them use newly crafter attributes to form your new
     domain for the new data set. We have not told you anything on working
    -with example domains, so if you want to learn more on this, jump to
    -`domain.htm`_ section of this tutorial, and then come back. For those
    -of you that trust us in what we are doing, just read on.
    +with example domains, but trust us in what we are doing, and just read on.

     In Orange, when converting examples (transforming one data set to
    …
     discretized using quartiles (``sl``) and using Fayyad-Irani's
     algorithm (``sl_ent``). We shall also keep the original (continuous)
    -attribute ``sepal width`` (from `disc2.py`_, uses `iris.tab`_)::
    +attribute ``sepal width`` (from :download:`disc2.py <code/disc2.py>`, uses :download:`iris.tab <code/iris.tab>`)::

        def printexamples(data, inxs, msg="First %i examples"):
    …

     to our code after the introduction of this two attributes (the new script is in
    -`disc3.py`_), following is the second part of the output::
    +:download:`disc3.py <code/disc3.py>`), following is the second part of the output::

        5 examples before discretization
    …
     Both, ``EquiNDiscretization`` and ``EntropyDiscretization`` construct
     transformer objects of type ``IntervalDiscretizer``. It's cut-off
    -points are stored in a list points (`disc4.py`_, uses `iris.tab`_)::
    +points are stored in a list points (:download:`disc4.py <code/disc4.py>`, uses :download:`iris.tab <code/iris.tab>`)::

        import orange
    …
     change anything the discretization will actually do to the data. In
     the following example, we have rounded the cut-off points for the
    -attribute ``pl`` (`disc5.py`_, uses  `iris.tab`_)::
    +attribute ``pl`` (:download:`disc5.py <code/disc5.py>`, uses :download:`iris.tab <code/iris.tab>`)::

        import orange
    …

     Let's now discretize Iris' attribute pl using three intervals with
    -cut-off points 2.0 and 4.0 (`disc6.py`_, uses  `iris.tab`_)::
    +cut-off points 2.0 and 4.0 (:download:`disc6.py <code/disc6.py>`, uses :download:`iris.tab <code/iris.tab>`)::

        import orange
    …
     from their original continuous versions, so you need only to convert
     the testing examples to a new (discretized) domain. Following code
    -shows how (`disc7.py`_, uses  `iris.tab`_)::
    +shows how (:download:`disc7.py <code/disc7.py>`, uses :download:`iris.tab <code/iris.tab>`)::

        import orange
  • docs/tutorial/rst/ensembles.rst

    r9372 → r9385

    -.. _c_bagging.htm: code/c_bagging.htm
    -.. _ensemble2.py: code/ensemble2.py
    -.. _ensemble3.py: code/ensemble3.py
    -.. _orngEnsemble.htm: ../modules/orngEnsemble.htm
    -.. _promoters.tab: code/promoters.tab
    -
    -
     .. index:: ensembles
     .. index::
    …
     wrappers behave exactly like other Orange learners/classifiers. We
     will here first show how to use a module for bagging and boosting that
    -is included in Orange distribution (`orngEnsemble.htm`_ module), and
    +is included in Orange distribution (:py:mod:`Orange.ensemble` module), and
     then, for a somehow more advanced example build our own ensemble
     learner. Using this module, using it is very easy: you have to define
     a learner, give it to bagger or booster, which in turn returns a new
    -(boosted or bagged) learner. Here goes an example (`ensemble3.py`_,
    -uses `promoters.tab`_)::
    +(boosted or bagged) learner. Here goes an example (:download:`ensemble3.py <code/ensemble3.py>`,
    +uses :download:`promoters.tab <code/promoters.tab>`)::

        import orange, orngTest, orngStat, orngEnsemble
  • docs/tutorial/rst/evaluation.rst

    r9372 → r9385

    -.. _accuracy.py: code/accuracy.py
    -.. _accuracy2.py: code/accuracy2.py
    -.. _accuracy3.py: code/accuracy3.py
    -.. _accuracy4.py: code/accuracy4.py
    -.. _accuracy5.py: code/accuracy5.py
    -.. _accuracy6.py: code/accuracy6.py
    -.. _accuracy7.py: code/accuracy7.py
    -.. _accuracy8.py: code/accuracy8.py
    -.. _orngStat.htm: ../modules/orngStat.htm
    -.. _orngTest.htm: ../modules/orngTest.htm
    -.. _roc.py: code/roc.py
    -.. _voting.tab: code/voting.tab
    -
    -
     Testing and evaluating your classifiers
     =======================================
    …
     In this lesson you will learn how to estimate the accuracy of
     classifiers. The simplest way to do this is to use Orange's
    -`orngTest.htm`_ and `orngStat.htm`_ modules. This is probably how you
    +:py:mod:`Orange.evaluation.testing` and :py:mod:`Orange.statistics` modules. This is probably how you
     will perform evaluation in your scripts, and thus we start with
     examples that uses these two modules. You may as well perform testing
    …
     classifiers, do cross-validation, leave-one-out and random
     sampling. While all of this functionality is available in
    -`orngTest.htm`_ and `orngStat.htm`_ modules, these example scripts may
    +:py:mod:`Orange.evaluation.testing` and :py:mod:`Orange.statistics` modules, these example scripts may
     still be useful for those that want to learn more about Orange's
     learner/classifier objects and the way to use them in combination with
    …
     script reports on four different scores: classification accuracy,
     information score, Brier score and area under ROC curve
    -(`accuracy7.py`_, uses `voting.tab`_)::
    +(:download:`accuracy7.py <code/accuracy7.py>`, uses :download:`voting.tab <code/voting.tab>`)::

        import orange, orngTest, orngStat, orngTree
    …
     and ``AUC``).

    -Apart from statistics that we have mentioned above, `orngStat.htm`_
    +Apart from statistics that we have mentioned above, :py:mod:`Orange.statistics`,
     has build-in functions that can compute other performance metrics, and
    -`orngTest.htm`_ includes other testing schemas. If you need to test
    +:py:mod:`Orange.evaluation.testing` includes other testing schemas. If you need to test
     your learners with standard statistics, these are probably all you
     need. Compared to the script above, we below show the use of some
     other statistics, with perhaps more modular code as above (part of
    -`accuracy8.py`_)::
    +:download:`accuracy8.py <code/accuracy8.py>`)::

        data = orange.ExampleTable("voting")
    …
     Let us continue with a line of exploration of voting data set, and
     build a naive Bayesian classifier from it, and compute the
    -classification accuracy on the same data set (`accuracy.py`_, uses
    -`voting.tab`_)::
    +classification accuracy on the same data set (:download:`accuracy.py <code/accuracy.py>`, uses
    +:download:`voting.tab <code/voting.tab>`)::

        import orange
    …
     computes the classification accuracies for each of the classifier. By
     this means, let us compare naive Bayes and classification trees
    -(`accuracy2.py`_, uses `voting.tab`_)::
    +(:download:`accuracy2.py <code/accuracy2.py>`, uses :download:`voting.tab <code/voting.tab>`)::

        import orange, orngTree
    …
     first half of the data for training and the rest for testing. The
     script is similar to the one above, with a part which is different
    -shown below (part of `accuracy3.py`_, uses `voting.tab`_)::
    +shown below (part of :download:`accuracy3.py <code/accuracy3.py>`, uses :download:`voting.tab <code/voting.tab>`)::

        # set up the classifiers
    …

     Our script, without accuracy function, which is exactly like the
    -one we have defined in `accuracy2.py`_, is (part of `accuracy4.py`_)::
    +one we have defined in :download:`accuracy2.py <code/accuracy2.py>`, is (part of :download:`accuracy4.py <code/accuracy4.py>`)::

        def test_rnd_sampling(data, learners, p=0.7, n=10):
    …
     repetitive random sampling above. We define a function called
     ``cross_validation`` and use it to compute the accuracies (part of
    -`accuracy5.py`_)::
    +:download:`accuracy5.py <code/accuracy5.py>`)::

        def cross_validation(data, learners, k=10):
    …
     cycle, a single instance is used for testing, while the classifier is
     build on all other instances. One can define leave-one-out test
    -through a single Python function (part of `accuracy6.py`_)::
    +through a single Python function (part of :download:`accuracy6.py <code/accuracy6.py>`)::

        def leave_one_out(data, learners):
    …
     --------------

    -Going back to the data set we use in this lesson (`voting.tab`_), let
    +Going back to the data set we use in this lesson (:download:`voting.tab <code/voting.tab>`), let
     us say that at the end of 1984 we met on a corridor two members of
     congress. Somebody tells us that they are for a different party. We
    …
     implementation of this measure.

    -We will use a script similar to `accuracy5.py`_ (k-fold cross
    +We will use a script similar to :download:`accuracy5.py <code/accuracy5.py>` (k-fold cross
     validation) and will replace the accuracy() function with a function
     that computes area under ROC for a given data set and set of
    …
     would be counted as 0.5 instead of 1. The code for function that
     computes the area under ROC using this method is coded in Python as
    -(part of `roc.py`_)::
    +(part of :download:`roc.py <code/roc.py>`)::

        def aroc(data, classifiers):
    …
     cross-validation computing area under ROC is rather fast (below 3s),
     there exist a better algorithm with complexity O(n log n) instead of
    -O(n^2). Anyway, running `roc.py`_ shows that naive Bayes is better in
    +O(n^2). Anyway, running :download:`roc.py <code/roc.py>` shows that naive Bayes is better in
     terms of discrimination using area under ROC::
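The counting scheme this tutorial file describes (a point when the classifier ranks the true positive higher, half a point on a tie) is the pairwise O(n^2) formulation of area under ROC that the hunk's `aroc` function implements. A generic sketch, independent of Orange's data structures (the function name and scores are illustrative, not part of roc.py):

```python
def pairwise_auc(pos_scores, neg_scores):
    # Compare every positive-class score with every negative-class score:
    # a correct ranking scores 1, a tie scores 0.5, as described above.
    points = 0.0
    for p in pos_scores:
        for n in neg_scores:
            if p > n:
                points += 1.0
            elif p == n:
                points += 0.5
    return points / (len(pos_scores) * len(neg_scores))

# Perfect separation gives 1.0; one tied pair out of four gives 0.875.
print(pairwise_auc([0.9, 0.8], [0.3, 0.1]))  # -> 1.0
print(pairwise_auc([0.9, 0.8], [0.1, 0.8]))  # -> 0.875
```

The O(n log n) alternative the text mentions sorts all scores once and accumulates ranks instead of comparing every pair.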
  • docs/tutorial/rst/feature_subset_selection.rst

    r9372 → r9385

    -.. _adult_sample.tab: ../datasets/adult_sample.tab
    -.. _fss6.py: code/fss6.py
    -.. _fss7.py: code/fss7.py
    -.. _orngFSS.htm: ../modules/orngFSS.htm
    -
    -
     .. index::
        single: feature subset selection
    …
     While the core Orange provides mechanisms to estimate relevance of
     attributes that describe classified instances, a module called
    -`orngFSS.htm`_ provides functions and wrappers that simplify feature
    +:py:mod:`Orange.feature.selection` provides functions and wrappers that simplify feature
     subset selection. For instance, the following code loads the data,
     sets-up a filter that will use Relief measure to estimate the
     relevance of attributes and remove attribute with relevance lower than
    -0.01, and in this way construct a new data set (`fss6.py`_, uses
    -`adult_sample.tab`_)::
    +0.01, and in this way construct a new data set (:download:`fss6.py <code/fss6.py>`, uses
    +:download:`adult_sample.tab <code/adult_sample.tab>`)::

        import orange, orngFSS
    …
     particular wrapper from orngDisc). The code is quite short since we
     will also use a wrapper called FilteredLearner from orngFSS module
    -(part of `fss7.py`_, uses `adult_sample.tab`_)::
    +(part of :download:`fss7.py <code/fss7.py>`, uses :download:`adult_sample.tab <code/adult_sample.tab>`)::

        import orange, orngDisc, orngTest, orngStat, orngFSS
    …

     The code that computes this statistics, as well as determines which
    -are those features that were used, is shown below (from `fss7.py`_)::
    +are those features that were used, is shown below (from :download:`fss7.py <code/fss7.py>`)::

        # how many attributes did each classifier use?
  • docs/tutorial/rst/learners_in_python.rst

    r9372 → r9385

    -.. _adult_sample.tab: ../datasets/adult_sample.tab
    -.. _bagging.py: code/bagging.py
    -.. _bagging_test.py: code/bagging_test.py
    -.. _bayes.py: code/bayes.py
    -.. _bayes_test.py: code/bayes_test.py
    -.. _c_nb_disc.htm: code/c_nb_disc.htm
    -.. _iris.tab: code/iris.tab
    -.. _nbdisc.py: code/nbdisc.py
    -.. _nbdisc_test.py: code/nbdisc_test.py
    -.. _o_categorization.htm: code/o_categorization.htm
    -.. _orngEnsemble.htm: ../modules/orngEnsemble.htm
    -.. _voting.tab: code/voting.tab
    -
    -
     Build your own learner
     ======================
    …
     implementation of bagging.

    +.. _naive bayes with discretization:
    +
     Naive Bayes with discretization
     -------------------------------
    …
     Let us build a learner/classifier that is an extension of build-in
     naive Bayes and which before learning categorizes the data. We will
    -define a module `nbdisc.py`_ that will implement two classes, Learner
    +define a module :download:`nbdisc.py <code/nbdisc.py>` that will implement two classes, Learner
     and Classifier. Following is a Python code for a Learner class (part
    -of `nbdisc.py`_)::
    +of :download:`nbdisc.py <code/nbdisc.py>`)::

        class Learner(object):
    …
     finally pass it to a class ``Classifier``. You may expect that at its
     first invocation the ``Classifier`` will just remember the model we
    -have called it with (part of `nbdisc.py`_)::
    +have called it with (part of :download:`nbdisc.py <code/nbdisc.py>`)::

        class Classifier:
    …
     For a more elaborate test that also shows the use of a learner (that
     is not given the data at its initialization), here is a script that
    -does 10-fold cross validation (`nbdisc_test.py`_, uses `iris.tab`_ and
    -`nbdisc.py`_)::
    +does 10-fold cross validation (:download:`nbdisc_test.py <code/nbdisc_test.py>`, uses :download:`iris.tab <code/iris.tab>` and
    +:download:`nbdisc.py <code/nbdisc.py>`)::

        import orange, orngEval, nbdisc
    …
     We will develop a module called bayes.py that will implement our naive
     Bayes learner and classifier. The structure of the module will be as
    -with `c_nb_disc.htm`_.  Again, we will implement two classes, one for
    +with `naive bayes with discretization`_.  Again, we will implement two classes, one for
     learning and the other on for classification. Here is a ``Learner``:
    -class (part of `bayes.py`_)::
    +class (part of :download:`bayes.py <code/bayes.py>`)::

        class Learner_Class:
    …
     data set, computes class and conditional probabilities and calls
     classifiers, passing the probabilities along with some other variables
    -required for classification (part of `bayes.py`_)::
    +required for classification (part of :download:`bayes.py <code/bayes.py>`)::

        class Classifier:
    …
     The following script tests our naive Bayes, and compares it to
     10-nearest neighbors. Running the script (do you it yourself) reports
    -classification accuracies just about 90% (`bayes_test.py`_, uses
    -`bayes.py`_ and `voting.tab`_)::
    +classification accuracies just about 90% (:download:`bayes_test.py <code/bayes_test.py>`, uses
    +:download:`bayes.py <code/bayes.py>` and :download:`voting.tab <code/voting.tab>`)::

        import orange, orngEval, bayes
    …
     Here we show how to use the schema that allows us to build our own
     learners/classifiers for bagging. While you can find bagging,
    -boosting, and other ensemble-related stuff in `orngEnsemble.htm`_ module, we thought
    +boosting, and other ensemble-related stuff in :py:mod:`Orange.ensemble` module, we thought
     explaining how to code bagging in Python may provide for a nice
     example. The following pseudo-code (from
    …

     The code for the ``Learner_Class`` is therefore (part of
    -`bagging.py`_)::
    +:download:`bagging.py <code/bagging.py>`)::

        class Learner_Class:
    …
     examples (``example.getitems``). Finally, a ``Classifier`` is called
     with a list of classifiers, name and domain information (part of
    -`bagging.py`_)::
    +:download:`bagging.py <code/bagging.py>`)::

        class Classifier:
    …
     Here is the code that tests our bagging we have just implemented. It
     compares a decision tree and its bagged variant.  Run it yourself to
    -see which one is better (`bagging_test.py`_, uses `bagging.py`_ and
    -`adult_sample.tab`_)::
    +see which one is better (:download:`bagging_test.py <code/bagging_test.py>`, uses :download:`bagging.py <code/bagging.py>` and
    +:download:`adult_sample.tab <code/adult_sample.tab>`)::

        import orange, orngTree, orngEval, bagging
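The bagging scheme this tutorial file builds (draw k bootstrap samples with replacement, train one model per sample, classify by majority vote) can be sketched generically. This is not the bagging.py from the diff; `majority_learner` and the toy data are hypothetical stand-ins for any learner/classifier pair:

```python
import random
from collections import Counter

def bagged_classify(train, x, learner, k=10, rng=None):
    # Bagging sketch: k bootstrap samples (with replacement), one model
    # per sample, majority vote over the models' predictions for x.
    rng = rng or random.Random(0)
    models = [learner([rng.choice(train) for _ in train]) for _ in range(k)]
    votes = Counter(model(x) for model in models)
    return votes.most_common(1)[0][0]

# A trivial stand-in learner: its classifier always predicts the
# majority class of the bootstrap sample it was trained on.
def majority_learner(sample):
    labels = [label for _, label in sample]
    prediction = Counter(labels).most_common(1)[0][0]
    return lambda x: prediction
```

With a one-class training set the ensemble can only answer that class; the point of bagging shows up with unstable learners (like the decision trees above), whose models differ across bootstrap samples.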
  • docs/tutorial/rst/load_data.rst

    r9372 r9385  
    11Data input 
    22========== 
    3  
    4 .. _lenses.tab: code/lenses.tab 
    5 .. _lenses.py: code/lenses.py 
    63 
    74.. index:: data input 
     
    1714 
    1815Let us start with example and Orange native data format. Let us 
    19 consider an artificial data set `lenses.tab`_ on prescription of eye 
     16consider an artificial data set :download:`lenses.tab <code/lenses.tab>` on prescription of eye 
    2017lenses [CJ1987]. The data set has four attributes (age of the patient, 
    2118spectacle prescription, notion on astigmatism, and information on tear 
     
    2623on the four attributes, prescribes the right lenses. But before we do 
    2724that, let us see how the data set file is composed and how to read it 
    28 in Orange by displaying first few lines of `lenses.tab`_:: 
     25in Orange by displaying first few lines of :download:`lenses.tab <code/lenses.tab>`:: 
    2926 
    3027   age       prescription  astigmatic    tear_rate     lenses 
     
    6360Attribute values are separated with tabulators (<TAB>).  This is 
    6461rather hard to see above (it looks like spaces were used), so to 
    65 verify that check the original `lenses.tab`_ data set in 
     62verify that check the original :download:`lenses.tab <code/lenses.tab>` data set in 
    6663your favorite text editor.  Alternatively, authors of this text like 
    6764best to edit these files in a spreadsheet program (and use 
     
    6966as edited in Excel can look like this: 
    7067 
    71 .. image:: excel.* 
     68.. image:: files/excel.png 
    7269   :alt: Data in Excel 
    7370 
    74 Now create a directory, save `lenses.tab`_ in 
     71Now create a directory, save :download:`lenses.tab <code/lenses.tab>` in 
    7572it (right click on the link and choose "Save Target As 
    7673..."). Open a terminal (cmd shell in Windows, Terminal on Mac OS X), 
     
    103100>>> 
    104101 
    105 Now let's put together a script file `lenses.py`_ that 
     102Now let's put together a script file :download:`lenses.py <code/lenses.py>` that 
    106103reads lenses data, prints out names of the attributes and class, and 
    107 lists first 5 data instances (`lenses.py`_):: 
     104lists first 5 data instances (:download:`lenses.py <code/lenses.py>`):: 
    108105 
    109106   import orange 
     
    129126statements that are within each loop). 
    130127 
    131 Save `lenses.py`_ in your working directory. There 
     128Save :download:`lenses.py <code/lenses.py>` in your working directory. There 
    132129should now be both files lenses.py and lenses.tab. Now let's see if we 
    133130run the script we have just written:: 
     
    151148described in a separate file ".names".  Instead of going into how 
    152149exactly these files are formed, we show just an example that Orange 
    153 can handle them. For this purpose, load `car.data <car.data>`_ and 
    154 `car.names <car.names>`_ and run the following code:: 
     150can handle them. For this purpose, load :download:`car.data <code/car.data>` and 
     151:download:`car.names <code/car.names>` and run the following code:: 
    155152 
    156153   > python 
  • docs/tutorial/rst/regression.rst

    r9372 r9385  
    1 .. _housing.tab: code/housing.tab 
    2 .. _orngStat.htm: ../modules/orngStat.htm 
    3 .. _regression1.py: code/regression1.py 
    4 .. _regression2.py: code/regression2.py 
    5 .. _regression3.py: code/regression3.py 
    6 .. _regression4.py: code/regression4.py 
    7  
    8  
    91.. index:: regression 
    102 
     
    2517 
    2618Let us start with regression trees. Below is an example script that builds 
    27 the tree from `housing.tab`_ data set and prints 
    28 out the tree in textual form (`regression1.py`_):: 
     19the tree from :download:`housing.tab <code/housing.tab>` data set and prints 
     20out the tree in textual form (:download:`regression1.py <code/regression1.py>`):: 
    2921 
    3022   import orange, orngTree 
     
    5547regression trees and k-nearest neighbors, and also uses a majority 
    5648learner which for regression simply returns an average value from 
    57 learning data set (`regression2.py`_):: 
     49learning data set (:download:`regression2.py <code/regression2.py>`):: 
    5850 
    5951   import orange, orngTree, orngTest, orngStat 
     
    108100For our third and last example for regression, let us see how we can 
    109101use cross-validation testing and for a score function use 
    110 (`regression3.py`_, uses  `housing.tab`_):: 
     102(:download:`regression3.py <code/regression3.py>`, uses :download:`housing.tab <code/housing.tab>`):: 
    111103 
    112104   import orange, orngTree, orngTest, orngStat 
     
    144136implementation where a list of scoring techniques is defined 
    145137independently from the code that reports on the results (part of 
    146 `regression4.py`_):: 
     138:download:`regression4.py <code/regression4.py>`):: 
    147139 
    148140   lr = orngRegression.LinearRegressionLearner(name="lr") 
     
    177169* R2 - coefficient of determination, also referred to as R-squared. 
    178170 
    179 For precise definition of these measures, see `orngStat.htm`_. Running 
     171For precise definitions of these measures, see :py:mod:`Orange.statistics`. Running 
    180172the script above yields:: 
    181173 
  • docs/tutorial/rst/start.rst

    r9372 r9385  
    33 
    44To start Orange scripting, you will need to `download 
    5 <http://orange.biolab.si/download.html>`_ and install Orange. Python can 
     5<http://orange.biolab.si/download/>`_ and install Orange. Python can 
    66be run in a window with a terminal, special integrated environments 
    77(like PythonWin), or shells like `iPython 
     
    1414with this feature: 
    1515 
    16 .. image:: python_win.* 
     16.. image:: files/python_win.png 
    1717   :alt: Orange in PythonWin 
    1818 
     
    2020completion): 
    2121 
    22 .. image:: ipython.* 
     22.. image:: files/ipython.png 
    2323   :alt: Orange in iPython 
    2424 