..
   Source: orange/docs/extend-widgets/rst/channels.rst @ 11881:99bec0d8a70d
   Revision 11881:99bec0d8a70d, 8.7 KB, checked in by Ales Erjavec
   <ales.erjavec@…>, 5 weeks ago: "More fixes to widget development
   manual code snippets."

###################
Channels and Tokens
###################

Our data sampler widget was, as far as channels are concerned, rather
simple and linear: the widget was designed to receive a token from one
widget and send an output token to another widget, just like in the
example schema below:

.. image:: schemawithdatasamplerB.png

There is quite a bit more to channels and the management of tokens. In
this section we will go over most of what you need to know to build
more complex widgets.

********************
Multi-Input Channels
********************

The basic idea behind "multi-input" channels is that a single input
channel can be connected to several output channels. That is, if a
widget supports such a channel, several widgets can feed their output
to that widget simultaneously.

Say we want to build a widget that takes a data set and tests
various predictive modeling techniques on it. The widget has to have an
input data channel, and this we know how to deal with from our
:doc:`previous <settings>` lesson. But, somewhat differently, we
want to connect any number of widgets which define learners to our
testing widget, just like in the schema below, where three different
learners are used:

.. image:: learningcurve.png

Here we will take a look at how we define the channels for a learning
curve widget and how we manage its input tokens. But before we do so,
a brief explanation: a learning curve is something you can use to test
a machine learning algorithm and see how its performance
depends on the size of the training set. For this, one can draw a
smaller subset of the data, train the classifier on it, and test it on
the remaining data. To do this in a fair way (by Salzberg, 1997), we perform
k-fold cross validation but use only a proportion of the data for
training. The output of the widget should then look something
like:

.. image:: learningcurve-output.png

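
The procedure described above can be sketched in plain Python, without
Orange. This is only an illustration under simplifying assumptions:
``learning_curve_splits`` is a hypothetical helper, not a function of the
widget or of Orange, and instances are assigned to folds by index rather
than by random or stratified sampling.

.. code-block:: python

   def learning_curve_splits(n_instances, folds, proportions):
       """Yield (proportion, fold, train_indices, test_indices) tuples.

       For every fold of a k-fold cross validation the test part is kept
       whole, while only the first `proportion` of the training part is
       used for learning.
       """
       indices = list(range(n_instances))
       for fold in range(folds):
           # Every `folds`-th instance, starting at `fold`, is a test instance.
           test = indices[fold::folds]
           train = [i for i in indices if i % folds != fold]
           for proportion in proportions:
               n_train = int(round(proportion * len(train)))
               yield proportion, fold, train[:n_train], test


   splits = list(learning_curve_splits(20, folds=5, proportions=[0.5, 1.0]))

Each tuple describes one train/test run; averaging the scores of the runs
that share a proportion would give one point of the learning curve.
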
Now back to channels and tokens. Input and output channels for our
widget are defined by

.. literalinclude:: OWLearningCurveA.py
   :start-after: start-snippet-1
   :end-before: end-snippet-1


Notice that everything is pretty much the same as it was with
widgets from previous lessons, the only difference being
``Multiple + Default`` (importable from the OWWidget namespace)
as the last value in the list that defines the :obj:`Learner`
channel. This ``Multiple + Default`` says that this
is a multi-input channel and is the default input for its type.
If it were left unspecified, the default value of
``Single + NonDefault`` would be used. That would mean that the
channel can receive input from only one widget and is not the default input
channel for its type (more on default channels later).

.. note::
   The :obj:`Default` flag is used here for illustration. Since the *"Learner"*
   channel is the only channel for the :class:`Orange.classification.Learner`
   type, it is also the default.

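
To make the role of the flags more tangible, here is a small
self-contained sketch. It does not import Orange: the numeric flag values,
the ``Table`` and ``Learner`` placeholder classes, the *"Data"* channel
name and the handler names given as strings are all assumptions made for
illustration. In a real widget the flags come from the OWWidget namespace
and the handlers are methods of the widget.

.. code-block:: python

   # Stand-in flag values, for illustration only.
   Single, Multiple, Default, NonDefault = 2, 4, 8, 16


   class Table(object):
       """Hypothetical placeholder for the data table type."""


   class Learner(object):
       """Hypothetical placeholder for Orange.classification.Learner."""


   def channel_flags(spec):
       """Return the flags of a channel specification.

       A specification is (name, type, handler) with an optional last
       value holding the flags; without it Single + NonDefault is assumed.
       """
       return spec[3] if len(spec) > 3 else Single + NonDefault


   inputs = [
       ("Data", Table, "set_dataset"),                        # flags omitted
       ("Learner", Learner, "set_learner", Multiple + Default),
   ]

   # Map each channel name to (is multi-input, is default).
   summary = dict(
       (spec[0], (bool(channel_flags(spec) & Multiple),
                  bool(channel_flags(spec) & Default)))
       for spec in inputs
   )

Here ``summary["Learner"]`` is ``(True, True)``, while the *"Data"*
channel falls back to a single, non-default input.
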
How does the widget know which widget a token came from?
In Orange, tokens are sent around together with the id of the widget that
is sending them. Declaring a multi-input channel tells Orange to
call the receiving function with two arguments: the token and the id of
the sending widget. For our *"Learner"*
channel the receiving function is :func:`set_learner`, which looks
like this:

.. literalinclude:: OWLearningCurveA.py
   :pyobject: OWLearningCurveA.set_learner


OK, this looks like one long and complicated function. But be
patient! The learning curve is not the simplest widget there is, so
there is some extra code in the function above to manage the
information it handles in the appropriate way. To understand the
signals, though, you only need to understand the following. We store
the learners (objects that learn from data) in the list
:obj:`self.learners`. The list contains tuples with the id of the
widget that has sent the learner, and the learner itself. We could
store this information in a dictionary as well, but for this
particular widget the order of learners is important, and we thought
that a list is a more appropriate structure.

The function above first checks if the learner sent is empty
(:obj:`None`). Remember that sending an empty learner
essentially means that the link with the sending widget was removed,
hence we need to remove that learner from our list. If a non-empty
learner was sent, then it is either a new learner (say, from a widget
we have just linked to our learning curve widget), or an updated
version of a previously sent learner. If the latter is the case, then
its id is already in the learners list, and we
need to replace the previous information on that learner. If a new learner
was sent, the case is somewhat simpler, and we just add this learner
and its learning curve to the corresponding variables that hold this
information.

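
Stripped of the GUI updates and learning curve computation, the three
cases just described can be sketched as follows. This is a simplified
illustration and not the widget's actual code: ``LearnerStore`` is a
hypothetical class, and strings stand in for learner objects.

.. code-block:: python

   class LearnerStore(object):
       """Hypothetical, GUI-free stand-in for the learner bookkeeping."""

       def __init__(self):
           self.learners = []  # (sender id, learner) tuples, in arrival order

       def set_learner(self, learner, id=None):
           ids = [sender for sender, _ in self.learners]
           if learner is None:
               # The link was removed: forget the learner from this sender.
               if id in ids:
                   del self.learners[ids.index(id)]
           elif id in ids:
               # An updated learner from a known sender: replace it in place.
               self.learners[ids.index(id)] = (id, learner)
           else:
               # A learner from a newly connected sender: append it.
               self.learners.append((id, learner))


   store = LearnerStore()
   store.set_learner("naive bayes", id=1)
   store.set_learner("tree", id=2)
   store.set_learner("pruned tree", id=2)  # update from sender 2
   store.set_learner(None, id=1)           # sender 1 was disconnected
   print(store.learners)                   # [(2, 'pruned tree')]

Replacing the tuple in place keeps the position of the learner in the
list, which is the reason a list was preferred over a dictionary.
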
The function that handles :obj:`learners` as shown above is
the most complicated function in our learning curve widget. In fact,
the rest of the widget does some simple GUI management, and calls
learning curve routines from testing and performance
scoring functions from stats. I rather like
the ease with which new scoring functions are added to the widget, since
all that is needed is to augment the list

.. literalinclude:: OWLearningCurveA.py
   :start-after: start-snippet-2
   :end-before: end-snippet-2


which is defined in the initialization part of the widget. The
other useful trick in this widget is that evaluation (k-fold cross
validation) is carried out just once for a given learner, data set and
set of evaluation parameters, and scores are then derived from the class
probability estimates obtained from the evaluation procedure. This
essentially means that switching from one scoring function to another
(and displaying the result in the table) takes only a fraction of a
second. To see the rest of the widget, check out
:download:`its code <OWLearningCurveA.py>`.

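
The idea of evaluating once and scoring many times can be illustrated
without Orange. In the sketch below the layout of the list as
``(label, function)`` pairs, the two scoring functions and the
``cached_results`` values are assumptions made up for the example; they
are not taken from the widget.

.. code-block:: python

   def accuracy(results):
       """Fraction of instances whose most probable class is the true one."""
       hits = sum(1 for actual, probs in results
                  if max(range(len(probs)), key=probs.__getitem__) == actual)
       return hits / float(len(results))


   def brier_score(results):
       """Mean squared difference of predicted and actual class vectors."""
       total = 0.0
       for actual, probs in results:
           total += sum((p - (1.0 if cls == actual else 0.0)) ** 2
                        for cls, p in enumerate(probs))
       return total / len(results)


   # Adding a score means adding one (label, function) pair to this list.
   scoring = [("Classification Accuracy", accuracy),
              ("Brier Score", brier_score)]

   # Pretend these (actual class, class probabilities) pairs came out of a
   # single, expensive cross-validation run; the numbers are made up.
   cached_results = [(0, [0.9, 0.1]), (1, [0.4, 0.6]), (1, [0.7, 0.3])]

   # Every score is derived from the same cached results.
   scores = dict((label, score(cached_results)) for label, score in scoring)

Because the scoring functions only read the cached probability estimates,
switching between them never triggers a new evaluation.
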
*****************************
Using Several Output Channels
*****************************

There is nothing new here; we only need a widget that has
several output channels of the same type to illustrate the idea of
default channels in the next section. For this purpose, we will modify
our sampling widget as defined in previous lessons such that it will
send the sampled data to one channel, and all other data to
another channel. The corresponding channel definition of this widget
is

.. literalinclude:: OWDataSamplerC.py
   :start-after: start-snippet-1
   :end-before: end-snippet-1


We used this in the third incarnation of the :download:`data sampler widget <OWDataSamplerC.py>`,
where essentially the only other change in the code is in the :func:`selection` and
:func:`commit` functions:

.. literalinclude:: OWDataSamplerC.py
   :pyobject: OWDataSamplerC.selection

.. literalinclude:: OWDataSamplerC.py
   :pyobject: OWDataSamplerC.commit


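
For readers without the source at hand, the division of work between the
two functions can be sketched in plain Python. This is a hedged
illustration, not the widget's code: ``SamplerSketch`` is a hypothetical
class, its ``send`` method merely records tokens, and the channel names
*"Sampled Data"* and *"Other Data"* are assumed here.

.. code-block:: python

   import random


   class SamplerSketch(object):
       """Hypothetical stand-in for the sampler; records what it sends."""

       def __init__(self, data, proportion=0.5, seed=0):
           self.data = list(data)
           self.proportion = proportion
           self.rng = random.Random(seed)
           self.sent = {}

       def send(self, channel, value):
           # A real widget's send() passes the token to connected widgets.
           self.sent[channel] = value

       def selection(self):
           # Choose the sampled instances and keep the remaining ones too.
           indices = list(range(len(self.data)))
           self.rng.shuffle(indices)
           n_sample = int(round(self.proportion * len(indices)))
           chosen = set(indices[:n_sample])
           self.sample = [x for i, x in enumerate(self.data) if i in chosen]
           self.other = [x for i, x in enumerate(self.data) if i not in chosen]

       def commit(self):
           # Two output channels of the same type, one token for each.
           self.send("Sampled Data", self.sample)
           self.send("Other Data", self.other)


   sampler = SamplerSketch(range(10), proportion=0.3)
   sampler.selection()
   sampler.commit()

After ``commit`` the two recorded tokens are disjoint and together contain
every input instance.
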
If a widget that has multiple channels of the same type is
connected to a widget that accepts such tokens, Orange Canvas opens a
window asking the user to confirm which channels to connect. Hence,
if we have just connected the *Data Sampler (C)* widget to a Data Table
widget as in the schema below:

.. image:: datasampler-totable.png

we would get the following window, asking the user
which channels to connect:

.. image:: datasampler-channelquerry.png

*************************************************************
Default Channels (When Using Input Channels of the Same Type)
*************************************************************

Now, let's say we want to extend our learning curve widget such
that it does the learning the same way as it used to, but can
(provided that such a data set is given) always test the
learners on the same, external data set. That is, besides the channel for
the training data set, we need another channel of the same type but used
for the test data set. Notice, however, that most often we will only
provide the training data set, so we would not like to be bothered (in
Orange Canvas) with the dialog asking which channel to connect to, as the
training data set channel will be the default one.

When listing several input channels of the same type, the default
channel is marked with a special flag in the channel specification list. So for
our new :download:`learning curve <OWLearningCurveB.py>` widget, the
channel specification is

.. literalinclude:: OWLearningCurveB.py
   :start-after: start-snippet-1
   :end-before: end-snippet-1


That is, the :obj:`Train Data` channel is a single-token
channel which is the default one (the last value in its specification).
Note that the flags can
be added (or OR-ed) together, so ``Default + Multiple`` is a valid flag.
To test how this works, connect a File widget to a learning curve widget and
see that nothing special happens:

.. image:: file-to-learningcurveb.png

That is, no window asking which channels to connect will
open, as the default *"Train Data"* channel was selected.
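
The behaviour described in this section can be mimicked with a few lines
of Python. The sketch below is not how Orange Canvas is implemented; the
flag values, the ``Table`` placeholder, the *"Test Data"* channel name,
the handler names and the ``pick_input`` helper are all hypothetical and
only illustrate the rule that a single default channel makes the dialog
unnecessary.

.. code-block:: python

   # Stand-in flag values, for illustration only.
   Single, Multiple, Default, NonDefault = 2, 4, 8, 16


   class Table(object):
       """Hypothetical placeholder for the data table type."""


   def pick_input(inputs, token_type):
       """Return the input channel to connect, or None if the user must choose."""
       candidates = [spec for spec in inputs if spec[1] is token_type]
       if len(candidates) == 1:
           return candidates[0][0]
       defaults = [spec for spec in candidates
                   if len(spec) > 3 and spec[3] & Default]
       # With exactly one default among the candidates there is nothing to ask.
       return defaults[0][0] if len(defaults) == 1 else None


   inputs = [("Train Data", Table, "trainset", Default),
             ("Test Data", Table, "testset")]

   chosen = pick_input(inputs, Table)  # "Train Data", so no dialog is needed

Without the ``Default`` flag on either channel, ``pick_input`` would
return ``None``, which corresponds to the canvas asking the user.
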