source: orange/docs/widgets/rst/visualize/mosaicdisplay.rst @ 11359:8d54e79aa135

Revision 11359:8d54e79aa135, 3.9 KB checked in by Ales Erjavec <ales.erjavec@…>, 14 months ago

Cleanup of 'Widget catalog' documentation.

Fixed rst text formating, replaced dead hardcoded reference links (now using
:ref:), etc.

.. _Mosaic Display:

Mosaic Display
==============

.. image:: ../icons/MosaicDisplay.png

Shows a mosaic display of n-way tables.

Signals
-------

Inputs:
   - Examples (ExampleTable)
      Input data set.
   - Example Subset (ExampleTable)
      A subset of data instances from Examples.

Outputs:
   - Selected Examples (ExampleTable)
      A subset of examples belonging to manually selected cells in the mosaic
      display.


Description
-----------

The mosaic display is a graphical method for visualizing the counts in n-way
`contingency tables <http://en.wikipedia.org/wiki/Contingency_table>`_, that
is, tables where each cell corresponds to a distinct value combination of n
attributes. The method was proposed by Hartigan & Kleiner
([HartiganKleiner81]_) and extended by Friendly ([Friendly94]_). Each cell in
the mosaic display corresponds to a single cell in the contingency table. If
the data contains a class attribute, the mosaic display also shows the class
distribution.

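The counting behind such a display can be sketched in a few lines of Python.
This is a minimal illustration, not the widget's code: the records are made-up
toy values (not the real Titanic data), and the helper names
``contingency_table`` and ``class_distribution`` are hypothetical.

```python
from collections import Counter

# Toy records (hypothetical, NOT the real Titanic data):
# (sex, status, survived), with the class value in the last column.
records = [
    ("female", "first", "yes"),
    ("female", "first", "yes"),
    ("female", "third", "no"),
    ("male", "first", "no"),
    ("male", "third", "no"),
    ("male", "third", "no"),
    ("male", "third", "yes"),
    ("female", "third", "yes"),
]


def contingency_table(rows, n_attributes):
    """Count rows per value combination of the first n_attributes columns."""
    return Counter(row[:n_attributes] for row in rows)


def class_distribution(rows, n_attributes):
    """Map each cell to a Counter of the class values (last column) in it."""
    cells = {}
    for row in rows:
        cells.setdefault(row[:n_attributes], Counter())[row[-1]] += 1
    return cells


table = contingency_table(records, 2)
total = sum(table.values())

# In a mosaic display, the area of each tile is proportional to its count.
areas = {cell: count / total for cell, count in table.items()}

# The class distribution inside each tile is what the colors encode.
by_class = class_distribution(records, 2)

for cell in sorted(table):
    print(cell, table[cell], round(areas[cell], 3), dict(by_class[cell]))
```

Each key of ``table`` is one cell of the two-way contingency table, and the
matching entry of ``by_class`` holds the class counts shown inside that cell.
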
Orange's implementation of the mosaic display lets you observe the
interactions of up to four variables in a single visualization. The snapshot
below shows a mosaic display for the Titanic data set, observing three
variables (sex, status, and age) and their association with a class
(survived). The diagram shows that survival (red color) was highest for women
traveling in the first class, and lowest for men traveling in the second and
third class.

.. image:: images/MosaicDisplay-Titanic.png
   :alt: Mosaic Display on the Titanic data set

This visualization becomes slightly more complex, but also more informative
once you get used to it, if the expected class distribution is shown in the
same visualization. To do so, enable
:obj:`Use sub-boxes on the left to show...` and, below it, choose
:obj:`Apriori class distribution`. This plots a bar on top of every displayed
cell, which makes it possible to observe the difference between the actual and
the expected distribution for each cell. Change
:obj:`Apriori class distribution` to :obj:`Expected class distribution` to
compare the actual distributions to those computed by assuming the
independence of attributes.

.. image:: images/MosaicDisplay-Titanic-Apriori.png

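The comparison between a cell's actual class distribution and the apriori one
can be sketched as follows. This is only an illustration under stated
assumptions: the rows are hypothetical toy values, the function names are made
up, and the apriori distribution is taken to be the class proportions over the
whole data set. The :obj:`Expected class distribution` variant is not covered,
since its exact computation is not spelled out here.

```python
from collections import Counter

# Toy (cell, class) pairs; hypothetical values chosen for illustration only.
rows = [
    ("female/first", "yes"), ("female/first", "yes"),
    ("female/first", "yes"), ("female/first", "no"),
    ("male/third", "no"), ("male/third", "no"),
    ("male/third", "no"), ("male/third", "yes"),
]


def apriori_distribution(rows):
    """Class proportions over the whole data set (assumed meaning of 'apriori')."""
    counts = Counter(cls for _, cls in rows)
    total = sum(counts.values())
    return {cls: n / total for cls, n in counts.items()}


def cell_distributions(rows):
    """Class proportions inside each cell."""
    cells = {}
    for cell, cls in rows:
        cells.setdefault(cell, Counter())[cls] += 1
    return {
        cell: {cls: n / sum(counts.values()) for cls, n in counts.items()}
        for cell, counts in cells.items()
    }


apriori = apriori_distribution(rows)
actual = cell_distributions(rows)

# Difference between actual and apriori proportion, per cell and class.
deviation = {
    cell: {cls: dist.get(cls, 0.0) - p for cls, p in apriori.items()}
    for cell, dist in actual.items()
}

for cell in sorted(deviation):
    print(cell, deviation[cell])
```

A positive deviation for a class means that the class is over-represented in
that cell relative to the data set as a whole.
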
The degree of deviation from the apriori class distribution for each cell can
be visualized directly using the :obj:`Standard Pearson residuals` option
(from the :obj:`Colors in cells represent ...` box, see the snapshot below). On
the Titanic data set, this visualization clearly shows for which combinations
of attributes the chances of survival were highest or lowest.

.. image:: images/MosaicDisplay-Titanic-Residuals.png
   :alt: Mosaic Display - Pearson residuals

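For reference, the textbook Pearson residual of a cell is the difference
between the observed and the expected count, divided by the square root of the
expected count. The sketch below uses that definition; whether the widget
computes its residuals in exactly this way is an assumption, and the numbers
are hypothetical.

```python
import math


def pearson_residual(observed, expected):
    """Textbook Pearson residual: (observed - expected) / sqrt(expected)."""
    if expected <= 0:
        raise ValueError("expected count must be positive")
    return (observed - expected) / math.sqrt(expected)


# Hypothetical cell: 40 instances, 30 of them survivors, while the apriori
# survival rate over the whole data set is assumed to be 0.4.
cell_size = 40
observed_survivors = 30
apriori_rate = 0.4

# Expected number of survivors if the cell followed the apriori distribution.
expected_survivors = cell_size * apriori_rate

residual = pearson_residual(observed_survivors, expected_survivors)
print(residual)  # positive: more survivors than the apriori rate predicts
```

Residuals far from zero, in either direction, mark the cells whose class
distribution deviates most from the expectation.
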
If there are many attributes, finding subsets that yield interesting mosaic
displays is cumbersome at best. Orange's implementation includes
:obj:`VizRank` (:obj:`Main` tab), which can provide a list of the most
interesting subsets of a chosen cardinality. Various measures of
interestingness are implemented, but in principle all of them favor displays
in which at least some cells exhibit a high deviation from the apriori class
distribution.

.. image:: images/MosaicDisplay-Titanic-VizRank.png
   :alt: Mosaic Display with VizRank

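The general idea of ranking attribute subsets can be sketched as below. This
is not VizRank's implementation: the score used here (the sum of squared
Pearson residuals, a chi-square-style statistic) is a stand-in chosen for
illustration, and the function names and toy data are hypothetical.

```python
from collections import Counter
from itertools import combinations


def chi_square_score(rows, attrs, class_index=-1):
    """Sum of squared Pearson residuals over all (cell, class) pairs.

    Stand-in interestingness measure; an assumption, not VizRank's own.
    """
    total = len(rows)
    class_counts = Counter(row[class_index] for row in rows)
    cell_sizes = Counter(tuple(row[i] for i in attrs) for row in rows)
    observed = Counter(
        (tuple(row[i] for i in attrs), row[class_index]) for row in rows
    )
    score = 0.0
    for cell, size in cell_sizes.items():
        for cls, cls_count in class_counts.items():
            # Count expected if the cell followed the apriori distribution.
            expected = size * cls_count / total
            score += (observed[(cell, cls)] - expected) ** 2 / expected
    return score


def rank_subsets(rows, n_attributes, cardinality):
    """Score every attribute subset of the given size, best first."""
    subsets = combinations(range(n_attributes), cardinality)
    scored = [(chi_square_score(rows, attrs), attrs) for attrs in subsets]
    return sorted(scored, reverse=True)


# Toy data: attribute 0 determines the class, attribute 1 is unrelated to it.
rows = [
    ("x", "p", "yes"),
    ("x", "q", "yes"),
    ("y", "p", "no"),
    ("y", "q", "no"),
]

ranked = rank_subsets(rows, n_attributes=2, cardinality=1)
for score, attrs in ranked:
    print(attrs, score)
```

On this toy data the subset containing attribute 0 ranks first, because its
cells deviate strongly from the apriori class distribution, while attribute 1
produces no deviation at all.
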
Instead of comparing each cell's class distribution to the apriori one, it can
be compared to the distribution in a subset of instances from the same data
domain. The widget uses a separate input channel for this purpose. Note also
that individual cells can be selected or deselected (by clicking the cell with
the left or right mouse button), which sends out the instances from the
selected cells through the :obj:`Selected Examples` channel.

References
----------

.. [HartiganKleiner81] Hartigan, J. A., and Kleiner, B. (1981). Mosaics for
   contingency tables. In W. F. Eddy (Ed.), Computer Science and Statistics:
   Proceedings of the 13th Symposium on the Interface. New York:
   Springer-Verlag.

.. [Friendly94] Friendly, M. (1994). Mosaic displays for multi-way contingency
   tables. Journal of the American Statistical Association, 89, 190-200.