Plots a sieve diagram for a pair of attributes.
- Examples (ExampleTable)
Input data set.
A sieve diagram is a graphical method for visualizing the frequencies in a two-way contingency table and comparing them to the frequencies expected under the assumption of independence. The sieve diagram was proposed by Riedwyl and Schüpbach in a technical report in 1983 and later called a parquet diagram ([Riedwy1994]). In this display, the area of each rectangle is proportional to the expected frequency, and the observed frequency is shown by the number of squares in each rectangle. The difference between the observed and expected frequency (proportional to the standard Pearson residual) appears as the density of shading, with color indicating whether the deviation from independence is positive (blue) or negative (red).
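The quantities behind each rectangle can be sketched in a few lines of plain Python. This is a minimal illustration, not the widget's actual code: the helper names `sieve_cells` and `shade_color` are hypothetical, and the counts are an assumption taken from the commonly distributed 2201-passenger Titanic table (sex by survived), not from this page.

```python
from math import sqrt

def sieve_cells(observed):
    """Return (expected, residuals) for a two-way table of counts.

    expected[i][j] is the count implied by independence,
    row_total[i] * col_total[j] / n, and residuals[i][j] is the
    Pearson residual (observed - expected) / sqrt(expected).
    Assumes every row and column total is positive.
    """
    n = sum(sum(row) for row in observed)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    expected = [[r * c / n for c in col_totals] for r in row_totals]
    residuals = [
        [(o - e) / sqrt(e) for o, e in zip(obs_row, exp_row)]
        for obs_row, exp_row in zip(observed, expected)
    ]
    return expected, residuals

def shade_color(residual):
    """Color convention described above: blue if positive, red if negative."""
    if residual > 0:
        return "blue"
    if residual < 0:
        return "red"
    return "none"

# Assumed counts: rows are sex (male, female), columns are survived (no, yes).
observed = [[1364, 367],
            [126, 344]]
n = sum(sum(row) for row in observed)
expected, residuals = sieve_cells(observed)

# Relative frequencies for the cell "female, did not survive".
observed_share = observed[1][0] / n
expected_share = expected[1][0] / n
print(f"observed {observed_share:.3f}, expected {expected_share:.3f}, "
      f"shading {shade_color(residuals[1][0])}")
```

With these assumed counts, the cell for female passengers who did not survive has an observed share well below its expected share, so it would be shaded red.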
The snapshot below shows a sieve diagram for the Titanic data set and the attributes sex and survived (the latter is actually the class attribute in this data set). The plot shows that the two variables are highly associated, as there are substantial differences between the observed and expected frequencies in all four quadrants. For example, as highlighted in the balloon, the chance of not surviving the accident was much lower for female passengers than expected (0.05 vs. 0.14).
Orange can help identify pairs of attributes with interesting associations. On request (Calculate Chi Squares), such attribute pairs are listed in Interesting attribute pair. As it turns out, the most interesting attribute pair in the Titanic data set is indeed the one shown in the snapshot above. For contrast, the sieve diagram of the least interesting pair (age vs. survival) is shown below.
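One plausible way to rank attribute pairs is by the Pearson chi-square statistic of each pair's contingency table. The sketch below assumes that reading of Calculate Chi Squares; how the widget actually scores pairs is not stated here. The function name `chi_square` and the counts (from the commonly distributed 2201-passenger Titanic table) are likewise assumptions.

```python
def chi_square(observed):
    """Pearson chi-square statistic of a two-way table of counts.

    Sums (observed - expected)^2 / expected over all cells, where
    expected is the count implied by independence of rows and columns.
    Assumes every row and column total is positive.
    """
    n = sum(sum(row) for row in observed)
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    stat = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = row_totals[i] * col_totals[j] / n
            stat += (o - e) ** 2 / e
    return stat

# Assumed counts; each table is attribute value by survived (no, yes).
tables = {
    "sex":    [[1364, 367], [126, 344]],                         # male, female
    "status": [[122, 203], [167, 118], [528, 178], [673, 212]],  # first, second, third, crew
    "age":    [[52, 57], [1438, 654]],                           # child, adult
}

# Rank attributes by the strength of their association with survival.
ranking = sorted(
    ((name, chi_square(table)) for name, table in tables.items()),
    key=lambda pair: pair[1],
    reverse=True,
)
for name, stat in ranking:
    print(f"{name} vs. survived: chi-square = {stat:.1f}")
```

Raw chi-square values from tables of different sizes are not strictly comparable, so a fuller treatment would compare p-values or a normalized measure. With the assumed counts, the ordering still puts sex first and age last.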