.. _Merge Data:

Merge Data
==========

.. image:: ../icons/MergeData.png

Merges two data sets based on the values of selected attributes.

Signals
-------

Inputs:


   - Examples A (ExampleTable)
      Attribute-valued data set.
   - Examples B (ExampleTable)
      Attribute-valued data set.


Outputs:


   - Merged Examples A+B (ExampleTable)
      Attribute-valued data set composed of the instances from input data A,
      to which the attributes from input data B are appended. Their values
      are determined by matching the values of the selected attributes.
   - Merged Examples B+A (ExampleTable)
      Attribute-valued data set composed of the instances from input data B,
      to which the attributes from input data A are appended. Their values
      are determined by matching the values of the selected attributes.


Description
-----------

The Merge Data widget horizontally merges two data sets based on the values
of selected attributes. It requires two input data sets, A and B. The widget
lets you select one attribute from each domain to be used for the merging.
Once these are selected, the widget produces two outputs, A+B and B+A. The
first output (A+B) consists of the instances from input data A with the
attributes from B appended, and the second output (B+A) of the instances
from B with the attributes from A appended.

Instances are matched by the values of the selected (merging) attributes.
For example, instances in A+B are constructed as follows. For each instance
from A, the value of its merging attribute is taken and B is searched for
instances with a matching value of B's merging attribute. If more than one
instance from B is found, the first one is taken and horizontally merged
with the instance from A. If no instance from B matches, the appended
attributes are assigned unknown values. B+A is constructed analogously.

.. image:: images/MergeData1.png
   :alt: Merge Data

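The matching rule described above can be sketched in plain Python. This is
only an illustration of the described behaviour, not the widget's
implementation: ``merge_rows``, its parameters, the use of dictionaries for
instances and of ``None`` for unknown values are all assumptions made for
this example, and the sample values are made up. The sketch also assumes that
the two data sets share no attribute names other than the merging attributes.

```python
def merge_rows(rows_a, rows_b, key_a, key_b, unknown=None):
    """Append B's attributes to each row of A, matching key_a against key_b.

    Hypothetical helper for illustration only; not part of Orange.
    """
    # Index B by its merging attribute, keeping only the FIRST row per value.
    first_match = {}
    for row in rows_b:
        first_match.setdefault(row[key_b], row)

    # Attributes contributed by B (everything except its merging attribute),
    # in order of first appearance.
    appended = []
    for row in rows_b:
        for name in row:
            if name != key_b and name not in appended:
                appended.append(name)

    merged = []
    for row in rows_a:
        match = first_match.get(row[key_a])
        out = dict(row)  # copy, so the input rows are left untouched
        for name in appended:
            # No matching row in B: the appended attribute stays unknown.
            out[name] = match.get(name, unknown) if match is not None else unknown
        merged.append(out)
    return merged


# Made-up sample data: two measurements of one spot, plus a control spot
# that has no annotation.
intensities = [
    {"Spot ID": "ST_Hs_001", "Signal": 10.0},
    {"Spot ID": "ST_Hs_001", "Signal": 12.0},
    {"Spot ID": "ST_Cr_048", "Signal": 3.0},
]
annotations = [
    {"Spot ID": "ST_Hs_001", "Gene": "geneX"},
]

a_plus_b = merge_rows(intensities, annotations, "Spot ID", "Spot ID")
b_plus_a = merge_rows(annotations, intensities, "Spot ID", "Spot ID")
```

Here ``a_plus_b`` keeps all three intensity rows and leaves ``Gene`` unknown
for the control spot, while ``b_plus_a`` keeps the single annotation row and,
because only the first match is taken, picks up the first of the two
measurements.
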
Examples
--------

Below is an example that loads spot intensity data from microarray
measurements and spot annotation data. The microarray data contains
measurements of several spots that represent the same DNA material (denoted
by the same :obj:`Spot ID`), whereas the annotation data contains a single
line (instance) for each spot.

Merging the two data sets results in annotations appended to each spot
intensity datum. The :obj:`Spot intensities` data is connected to the
:obj:`Examples A` input of the :ref:`Merge Data` widget, and the
:obj:`Spot annotations` data to the :obj:`Examples B` input. Both outputs
of the :ref:`Merge Data` widget are then connected to the :ref:`Data Table`
widget, which here shows the :obj:`Merged Examples A+B`.
The attributes from :obj:`Spot ID` to :obj:`BG {Ref}`, inclusive, come from
the :obj:`Spot intensities` data set (:obj:`Examples A`), while the last
three come from the :obj:`Spot annotations` data set (:obj:`Examples B`).
Only instances representing non-control DNA (those with :obj:`Spot ID`
equal to :obj:`ST_Hs_???`) received annotations. For the others
(:obj:`Spot ID = ST_Cr_048`), no annotation data exists in the
:obj:`Spot annotations` data, so unknown values were assigned to the
appended attributes.

.. image:: images/MergeData2s.png
   :alt: Schema with Merge Data

Hint
----

If the two data sets contain equally named attributes (other than the ones
used for merging), Orange will by default check that the values of these
attributes are consistent and report an error if they do not match. To avoid
the consistency check, make sure that new attributes are created for each
data set: you may use the "... Always create a new attribute" option in the
:ref:`File` widget when loading the data.
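
The kind of check described in this hint can be pictured as follows. This is
a minimal sketch, assuming two already matched instances are represented as
plain dictionaries. ``find_conflicts`` is a hypothetical helper, not part of
Orange, and Orange's actual check may differ in detail. The sample values are
made up.

```python
def find_conflicts(row_a, row_b, merge_keys=()):
    """Return the shared attribute names whose values differ in two rows.

    Hypothetical helper for illustration only; not part of Orange.
    """
    # Attributes present in both rows, excluding the merging attributes.
    shared = (set(row_a) & set(row_b)) - set(merge_keys)
    return sorted(name for name in shared if row_a[name] != row_b[name])


# Both rows carry a "Name" attribute, but with different values.
row_a = {"Spot ID": "ST_Cr_048", "Name": "control", "Signal": 3.0}
row_b = {"Spot ID": "ST_Cr_048", "Name": "blank"}

conflicts = find_conflicts(row_a, row_b, merge_keys=["Spot ID"])
```

A non-empty ``conflicts`` list corresponds to the situation in which the
hint says an error is reported. Creating new attributes for each data set
avoids it because the two ``Name`` attributes are then no longer treated as
the same attribute.
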