.. Source: orange/docs/widgets/rst/visualize/mosaicdisplay.rst, revision 11050:e3c4699ca155,
   checked in by Miha Stajdohar ("Widget docs From HTML to Sphinx.").

.. _Mosaic Display:

Mosaic Display
==============

.. image:: ../icons/MosaicDisplay.png

Shows a mosaic display of n-way tables.

Signals
-------

Inputs:
   - Examples (ExampleTable)
      Input data set.
   - Example Subset (ExampleTable)
      A subset of data instances from Examples.

Outputs:
   - Selected Examples (ExampleTable)
      A subset of examples belonging to manually selected cells in the mosaic display.


Description
-----------

The mosaic display is a graphical method for visualizing the counts in n-way `contingency tables <http://en.wikipedia.org/wiki/Contingency_table>`_, that is, tables where each cell corresponds to a distinct combination of values of n attributes. The method was proposed by Hartigan & Kleiner (1981) and extended by Friendly (1994). Each cell in the mosaic display corresponds to a single cell in the contingency table. If the data contains a class attribute, the mosaic display also shows the class distribution.

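As a minimal sketch of what an n-way contingency table contains, the snippet below counts value combinations for a handful of made-up records. The rows, the attribute order and the helper ``contingency_table`` are illustrative assumptions and are not part of Orange.

```python
from collections import Counter

# Hypothetical toy records (not the real Titanic data): (sex, status, survived).
rows = [
    ("female", "first", "yes"),
    ("female", "first", "yes"),
    ("female", "third", "no"),
    ("male", "first", "no"),
    ("male", "third", "no"),
    ("male", "third", "yes"),
]


def contingency_table(records, attribute_indices):
    """Count how often each value combination of the chosen attributes occurs."""
    return Counter(tuple(r[i] for i in attribute_indices) for r in records)


# Two-way table over sex and status; each key is one cell of the table.
table = contingency_table(rows, (0, 1))
for cell, count in sorted(table.items()):
    print(cell, count)
```

Passing more indices, for example ``(0, 1, 2)``, gives the corresponding three-way table; a mosaic display would draw one tile per key.
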
Orange's implementation of the mosaic display can show the interactions of up to four variables in a single visualization. The snapshot below shows a mosaic display for the Titanic data set, with three variables (sex, status, and age) and their association with the class (survived). The diagram shows that survival (red) was highest for women traveling in first class and lowest for men traveling in second and third class.

.. image:: images/MosaicDisplay-Titanic.png

The visualization becomes slightly more complex, but also more informative once you get used to it, when the expected class distribution is shown alongside the actual one. To do this, check :obj:`Use sub-boxes on the left to show...` and choose :obj:`Apriori class distribution` below it. This plots a bar on top of every displayed cell, so the difference between the actual and the expected distribution can be observed for each cell. Change :obj:`Apriori class distribution` to :obj:`Expected class distribution` to compare the actual distributions with those computed under the assumption that the attributes are independent.

.. image:: images/MosaicDisplay-Titanic-Apriori.png

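The comparison between a cell's actual class distribution and the a priori one can be sketched as follows. This assumes that the a priori distribution is simply the class frequencies over the whole data set; the toy rows and the helper ``distribution`` are hypothetical and do not reflect the widget's internals.

```python
from collections import Counter, defaultdict

# Hypothetical toy records (not the real Titanic data): (sex, survived).
rows = [
    ("female", "yes"), ("female", "yes"), ("female", "yes"), ("female", "no"),
    ("male", "yes"), ("male", "no"), ("male", "no"), ("male", "no"),
]


def distribution(labels):
    """Return the relative frequency of each label."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: n / total for label, n in counts.items()}


# A priori class distribution: class frequencies over the whole data set.
apriori = distribution(cls for _, cls in rows)

# Actual class distribution inside each cell (here, one cell per value of sex).
by_cell = defaultdict(list)
for sex, cls in rows:
    by_cell[sex].append(cls)
actual = {cell: distribution(labels) for cell, labels in by_cell.items()}

for cell in sorted(actual):
    print(cell, "actual:", actual[cell], "a priori:", apriori)
```
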
The degree of deviation from the a priori class distribution can be visualized directly for each cell with the :obj:`Standard Pearson residuals` option (in the :obj:`Colors in cells represent ...` box, see the snapshot below). On the Titanic data set, this visualization clearly shows for which combinations of attribute values the chances of survival were highest or lowest.

.. image:: images/MosaicDisplay-Titanic-Residuals.png

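For reference, a Pearson residual is commonly defined as (observed - expected) / sqrt(expected). The sketch below applies that textbook definition to a made-up table, taking the expected count of a cell to be its size multiplied by the a priori class probability. Whether the widget computes its residuals in exactly this way is an assumption, and the counts are invented for illustration.

```python
import math

# Hypothetical 2x2 table of counts (made up for illustration):
# the first key element is an attribute value, the second a class value.
observed = {
    ("female", "yes"): 30, ("female", "no"): 10,
    ("male", "yes"): 20, ("male", "no"): 40,
}

total = sum(observed.values())
row_totals, col_totals = {}, {}
for (row, col), n in observed.items():
    row_totals[row] = row_totals.get(row, 0) + n
    col_totals[col] = col_totals.get(col, 0) + n

residuals = {}
for (row, col), n in observed.items():
    # Count expected if the class were distributed a priori within the row.
    expected = row_totals[row] * col_totals[col] / total
    # Pearson residual: signed deviation in units of sqrt(expected).
    residuals[(row, col)] = (n - expected) / math.sqrt(expected)

for cell in sorted(residuals):
    print(cell, round(residuals[cell], 2))
```

Positive residuals mark cells with more instances of a class than expected, negative residuals mark cells with fewer.
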
If there are many attributes, finding subsets that yield interesting mosaic displays by hand is cumbersome at best. Orange's implementation includes :obj:`VizRank` (:obj:`Main` tab), which can provide a list of the most interesting subsets of a chosen cardinality. Various measures of interestingness are implemented, but in principle all of them favor displays in which at least some cells deviate strongly from the a priori class distribution.

.. image:: images/MosaicDisplay-Titanic-VizRank.png

Instead of comparing each cell's class distribution with the a priori one, it can be compared with the distribution in a subset of instances from the same data domain. The widget uses a separate input channel (:obj:`Example Subset`) for this purpose. Individual cells can also be selected and deselected (by clicking the cell with the left or right mouse button); the instances from the selected cells are sent out on the :obj:`Selected Examples` channel.

References
----------

   - Hartigan, J. A., and Kleiner, B. (1981). Mosaics for contingency tables. In W. F. Eddy (Ed.), Computer Science and Statistics: Proceedings of the 13th Symposium on the Interface. New York: Springer-Verlag.
   - Friendly, M. (1994). Mosaic displays for multi-way contingency tables. Journal of the American Statistical Association, 89, 190-200.