Timestamp: 02/27/13 15:02:50 (14 months ago)
Author: Ales Erjavec <ales.erjavec@…>
Branch: default
Message:

Cleanup of 'Widget catalog' documentation.

Fixed rst text formatting, replaced dead hardcoded reference links (now using
:ref:), etc.
File: 1 edited

  • docs/widgets/rst/data/rank.rst

r11050 → r11359

   - ExampleTable Attributes (ExampleTable)
        Data set in which each example corresponds to an attribute from the
        original set, and the attributes correspond to the selected
        attribute evaluation measures.


Description
-----------

This widget computes a set of measures for evaluating the quality/usefulness
of attributes: ReliefF, information gain, gain ratio and gini index.
Besides providing this information, it also allows the user to select a subset
of attributes, or it can automatically select the specified number of
best-ranked attributes.

.. image:: images/Rank.png

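For readers working from scripts, the same measures can also be computed outside
the widget. The snippet below is a minimal sketch assuming the Orange 2.x
scripting interface of this period; the module and class names (e.g.
``Orange.feature.scoring.InfoGain``) are assumptions and may differ between
versions::

    # Minimal sketch (assumed Orange 2.x API; names may differ between versions).
    import Orange

    data = Orange.data.Table("voting")   # any data set with a discrete class

    # Scorers corresponding to three of the measures listed above.
    scorers = [("InfoGain", Orange.feature.scoring.InfoGain()),
               ("GainRatio", Orange.feature.scoring.GainRatio()),
               ("Gini", Orange.feature.scoring.Gini())]

    for attr in data.domain.attributes:
        scores = ", ".join("%s=%.3f" % (name, scorer(attr, data))
                           for name, scorer in scorers)
        print attr.name, scores
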
The right-hand side of the widget presents the computed quality of the
attributes. The first line shows the attribute name and the second the
number of its values (or a "C", if the attribute is continuous). The remaining
columns show different measures of quality.

The user is able to select the measures (s)he wants computed and presented.
:obj:`ReliefF` requires setting two arguments: the number of :obj:`Neighbours`
taken into account and the number of randomly chosen reference :obj:`Examples`.
The former should be higher if there is a lot of noise; the latter generally
makes the computation less reliable if set too low, while higher values
make it slow.

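In scripting terms, these two settings roughly correspond to the ``k``
(neighbours) and ``m`` (reference examples) parameters of the ReliefF scorer.
A hedged sketch, again assuming the Orange 2.x API and its parameter names::

    # Sketch only; the parameter names k and m are assumed from Orange 2.x.
    import Orange

    data = Orange.data.Table("voting")

    # More neighbours -> more robust to noise; more reference examples ->
    # more reliable but slower.
    relief = Orange.feature.scoring.Relief(k=20, m=50)

    for attr in data.domain.attributes:
        print "%-20s %.3f" % (attr.name, relief(attr, data))
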
The order in which the attributes are presented can be set either in the
list below the measures or by clicking the table's column headers. Attributes
can also be sorted by a measure not printed in the table.

Measures that cannot handle continuous attributes (the impurity
measures - information gain, gain ratio and gini index) are run on
discretized attributes. For the sake of simplicity we always split the
continuous attributes into intervals with (approximately) equal numbers of
examples, but the user can set the number of :obj:`Intervals`.

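To make the idea concrete, the cut points of such an equal-frequency split are
simply quantiles of the attribute's observed values. This is only a conceptual
illustration in plain Python/NumPy, not the widget's actual implementation::

    # Conceptual illustration of equal-frequency discretization (not the
    # widget's code): cut points are quantiles of the observed values.
    import numpy as np

    values = np.array([1.2, 3.4, 2.2, 8.0, 5.5, 4.1, 7.3, 6.6, 2.9, 9.1])
    n_intervals = 4

    # Interior cut points at the 25th, 50th and 75th percentiles.
    cuts = [np.percentile(values, 100.0 * i / n_intervals)
            for i in range(1, n_intervals)]

    print cuts                       # three thresholds -> four intervals
    print np.digitize(values, cuts)  # interval index (0..3) for each value
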
It is also possible to set the number of decimals
(:obj:`No. of decimals`) in the printout. Using too high a number may
exaggerate the accuracy of the computation; many decimals may only be
useful when the computed numbers are really small.

The widget outputs two example tables. The one whose corresponding signal
is named :code:`ExampleTable Attributes` looks pretty much like the one
shown in the Rank widget, except that the second column is split into two
columns, one giving the attribute type (D for discrete and C for continuous),
and the other giving the number of distinct values if the attribute is
discrete and undefined if it's continuous.

The second, more interesting table has the same examples as the original,
but with a subset of the attributes. To select/unselect attributes, click
the corresponding rows in the table. This way, the widget can be used for
manual selection of attributes. Something similar can also be done with
a :ref:`Select Attributes` widget, except that the Rank widget can be used
for selecting the attributes according to their quality, while Select
Attributes offers more in terms of changing the order of attributes,
picking another class attribute and similar.

The widget can also be used to automatically select a feature subset.
If :obj:`Best ranked` is selected in the :obj:`Select Attributes` box, the
widget will output a data set where examples are described by the
specified number of best-ranked attributes. The data set is changed
whenever the order of attributes changes for any reason (a different
measure is selected for sorting, the ReliefF or discretization settings
are changed, and so on).

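The same :obj:`Best ranked` behaviour can be approximated in a script by
scoring the attributes, keeping the top ``n``, and converting the data to the
reduced domain. A sketch, under the same Orange 2.x API assumptions as above
(``classVar`` is the Orange 2.x spelling)::

    # Sketch: keep the n best attributes by information gain (assumed
    # Orange 2.x names; adjust to the installed version).
    import Orange

    data = Orange.data.Table("voting")
    n = 5

    gain = Orange.feature.scoring.InfoGain()
    ranked = sorted(data.domain.attributes,
                    key=lambda a: gain(a, data), reverse=True)

    best = ranked[:n]
    reduced_domain = Orange.data.Domain(best + [data.domain.classVar])
    reduced_data = Orange.data.Table(reduced_domain, data)

    print [a.name for a in best]
    print len(reduced_data), "examples,", len(best), "attributes"
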
The first two options in the :obj:`Select Attributes` box can be used to
clear the selection (:obj:`None`) or to select all attributes (:obj:`All`).

The :obj:`Commit` button sends the data set with the selected attributes.
If :obj:`Commit automatically` is set, the data set is committed on any change.

Examples
--------

One typical use of the widget is to put it immediately after the :ref:`File`
widget to reduce the attribute set. The snapshot below shows this as a part of
a bit more complicated schema.

.. image:: images/Rank-after-file-Schema.png

The examples in the file are put through :ref:`Data Sampler`, which splits the
data set into two subsets: one, containing 70% of the examples (signal
:code:`Classified Examples`), will be used for training a
:ref:`Naive Bayes <Naive Bayes>` classifier, and the other 30% (signal
:code:`Remaining Classified Examples`) for testing. Attribute subset selection
based on information gain was performed on the training set only, and the five
most informative attributes were selected for learning. A data set with all
other attributes removed (signal :code:`Reduced Example Table`) is fed into
:ref:`Test Learners`. The Test Learners widget also gets the
:code:`Remaining Classified Examples` to use them as test examples (don't
forget to set :code:`Test on Test Data` in that widget!).

To verify how the subset selection affects the classifier's performance, we
added another :ref:`Test Learners`, but connected it to the
:code:`Data Sampler` so that the two subsets emitted by the latter are used
for training and testing without any feature subset selection.

Running this schema on the heart disease data set shows quite considerable
improvements in all respects on the reduced attribute subset.

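A rough scripting counterpart of this schema is sketched below. It is not the
schema itself; the helper names used here (``Orange.data.sample.SubsetIndices2``,
``Orange.classification.bayes.NaiveLearner``, ``getclass``) are assumptions
about the Orange 2.x API of this period::

    # Sketch only: a scripting approximation of the schema above.
    import Orange

    data = Orange.data.Table("heart_disease")

    # 70% / 30% split, as done by the Data Sampler widget.
    indices = Orange.data.sample.SubsetIndices2(data, 0.7)
    train, test = data.select(indices, 0), data.select(indices, 1)

    # Rank attributes on the training set only; keep the five best by
    # information gain (what the Rank widget does in the schema).
    gain = Orange.feature.scoring.InfoGain()
    best = sorted(train.domain.attributes,
                  key=lambda a: gain(a, train), reverse=True)[:5]
    reduced_train = Orange.data.Table(
        Orange.data.Domain(best + [train.domain.classVar]), train)

    learner = Orange.classification.bayes.NaiveLearner()

    def accuracy(classifier, examples):
        # Fraction of examples whose predicted class matches the true one.
        correct = sum(1 for ex in examples if classifier(ex) == ex.getclass())
        return float(correct) / len(examples)

    print "all attributes:    %.3f" % accuracy(learner(train), test)
    print "5 best attributes: %.3f" % accuracy(learner(reduced_train), test)
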
In another, much simpler example, we connected a
:ref:`Classification Tree Viewer` to the Rank widget to observe different
attribute quality measures at different nodes. This can give us some picture
of how important the choice of measure is in tree construction: the more
the measures agree about the attribute ranking, the less crucial the choice
of measure is.

.. image:: images/Rank-Tree.png

A variation of the above is using the Rank widget after the
:ref:`Interactive Tree Builder`: the sorted attributes may help us decide
which attribute to use at a certain node.

.. image:: images/Rank-ITree.png