source: orange/orange/doc/datasets/echocardiogram.htm @ 1760:9d4bb141fb0e

Revision 1760:9d4bb141fb0e, 5.1 KB checked in by blaz <blaz.zupan@…>, 9 years ago

data info file

<html>
<head>
<title>Echocardiogram Data</title>
</head>
<body>
<h1>Info on Echocardiogram Data</h1>
<pre>
1. Title: Echocardiogram Data

2. Source Information:
   -- Donor: Steven Salzberg (salzberg@cs.jhu.edu)
   -- Collector:
      -- Dr. Evlin Kinney
      -- The Reed Institute
      -- P.O. Box 402603
      -- Miami, FL 33140-0603
   -- Date Received: 28 February 1989

3. Past Usage:
   -- 1. Salzberg, S. (1988).  Exemplar-based learning: Theory and
         implementation (Technical Report TR-10-88).  Harvard University,
         Center for Research in Computing Technology, Aiken Computation
         Laboratory (33 Oxford Street; Cambridge, MA 02138).
      -- Salzberg applied his EACH program to predict survival (i.e., life
         or death) without using the wall-motion attribute, and recorded 87
         correct and 29 incorrect in an incremental application to this
         database.  He also showed that, by tuning EACH to this domain,
         EACH was able to derive (non-incrementally) a set of 28
         hyper-rectangles that could perfectly classify 119 instances.
   -- 2. Kan, G., Visser, C., Kooler, J., & Dunning, A. (1986).  Short
         and long term predictive value of wall motion score in acute
         myocardial infarction.  British Heart Journal, 56, 422-427.
      -- They predicted the same variable (whether patients will live
         one year after a heart attack) using a different set of 345
         instances.  Their statistical test recorded a 61% accuracy
         in predicting that a patient will die (post-hoc fit).
   -- 3. Evlin Kinney (in communication with Steven Salzberg) reported
         that a Cox regression application recorded a 60% accuracy
         in predicting that a patient will die.

4. Relevant Information:
  -- All the patients suffered heart attacks at some point in the past.
     Some are still alive and some are not.  The survival and still-alive
     variables, when taken together, indicate whether a patient survived
     for at least one year following the heart attack.

     The problem addressed by past researchers was to predict from the
     other variables whether or not the patient will survive at least
     one year.  The most difficult part of this problem is correctly
     predicting that the patient will NOT survive.  (Part of the difficulty
     seems to be the size of the data set.)
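
     The combination of the survival and still-alive variables described
     above can be sketched as follows.  This is only an illustration: the
     function name, the 12-month threshold expressed in months, and the
     use of None for patients whose one-year outcome is not yet known are
     assumptions of this sketch, not part of the original description.

```python
from typing import Optional


def one_year_label(survival_months: float, still_alive: int) -> Optional[int]:
    """Hypothetical helper: derive a one-year survival label.

    Returns 1 if the patient survived at least 12 months, 0 if the patient
    died before 12 months, and None if the patient is still alive but has
    been followed for less than 12 months (outcome unknown).
    """
    if survival_months >= 12:
        return 1      # lived at least one year, whatever happened later
    if still_alive == 0:
        return 0      # died within the first year
    return None       # alive, but followed for less than a year


print(one_year_label(26.0, 0))   # died after 26 months
print(one_year_label(5.0, 0))    # died after 5 months
print(one_year_label(5.0, 1))    # alive at 5 months, outcome not yet known
```

     Returning None keeps the patients with short follow-up visibly
     separate from the patients known to have died within a year.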

5. Number of Instances: 132

6. Number of Attributes: 13 (all numeric-valued)

7. Attribute Information:
   1. survival -- the number of months patient survived (has survived,
          if patient is still alive).  Because all the patients
          had their heart attacks at different times, it is
          possible that some patients have survived less than
          one year but they are still alive.  Check the second
          variable to confirm this.  Such patients cannot be
          used for the prediction task mentioned above.
   2. still-alive -- a binary variable.  0=dead at end of survival period,
             1=still alive
   3. age-at-heart-attack -- age in years when heart attack occurred
   4. pericardial-effusion -- binary. Pericardial effusion is fluid
                  around the heart.  0=no fluid, 1=fluid
   5. fractional-shortening -- a measure of contractility around the heart;
                   lower numbers are increasingly abnormal
   6. epss -- E-point septal separation, another measure of contractility.
          Larger numbers are increasingly abnormal.
   7. lvdd -- left ventricular end-diastolic dimension.  This is
          a measure of the size of the heart at end-diastole.
          Large hearts tend to be sick hearts.
   8. wall-motion-score -- a measure of how the segments of the left
               ventricle are moving
   9. wall-motion-index -- equals wall-motion-score divided by number of
               segments seen.  Usually 12-13 segments are seen
               in an echocardiogram.  Use this variable INSTEAD
               of the wall motion score.
   10. mult -- a derived variable which can be ignored
   11. name -- the name of the patient (I have replaced them with "name")
   12. group -- meaningless, ignore it
   13. alive-at-1 -- Boolean-valued. Derived from the first two attributes.
                     0 means patient was either dead after 1 year or had
                     been followed for less than 1 year.  1 means patient
                     was alive at 1 year.
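
   The notes in the attribute list suggest which columns are usable as
   predictors.  A minimal sketch of that selection follows; the column
   names are taken from the list above, while the constant names and the
   decision to treat survival, still-alive and alive-at-1 as target-related
   columns rather than inputs are assumptions of this sketch.

```python
# Column order follows the attribute list above.
COLUMNS = [
    "survival", "still-alive", "age-at-heart-attack", "pericardial-effusion",
    "fractional-shortening", "epss", "lvdd", "wall-motion-score",
    "wall-motion-index", "mult", "name", "group", "alive-at-1",
]

# Attributes the notes say to ignore, or to replace with wall-motion-index.
IGNORED = {"wall-motion-score", "mult", "name", "group"}

# Assumption: these columns describe the outcome, so they are not inputs.
TARGET_RELATED = {"survival", "still-alive", "alive-at-1"}

FEATURES = [c for c in COLUMNS
            if c not in IGNORED and c not in TARGET_RELATED]

print(FEATURES)
```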

8. Missing Attribute Values: (denoted by "?")
   Attribute #:    Number of Missing Values: (total: 132)
   ------------    -------------------------
              1    2
              2    1
              3    5
              4    1
              5    8
              6    15
              7    11
              8    4
              9    1
             10    4
             11    0
             12    22
             13    58
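
   Because missing values are written as "?", a reader has to map them to
   an explicit missing marker before doing arithmetic.  The sketch below
   shows one way to do that and to tally missing values per attribute.
   It assumes comma-separated records (the file layout is not stated
   here), the function names are hypothetical, and the two sample records
   are invented for illustration only; they are not rows of the data set.

```python
from typing import List, Optional


def parse_row(line: str) -> List[Optional[str]]:
    """Split one comma-separated record and map "?" to None."""
    return [None if field.strip() == "?" else field.strip()
            for field in line.split(",")]


def count_missing(rows: List[List[Optional[str]]]) -> List[int]:
    """Count missing values per column position (0-based)."""
    width = max(len(row) for row in rows)
    counts = [0] * width
    for row in rows:
        for i, value in enumerate(row):
            if value is None:
                counts[i] += 1
    return counts


# Invented records, 13 fields each, in the attribute order listed above.
sample = [
    "24,0,60,0,0.25,10,5.0,12,1,1,name,1,1",
    "?,1,55,?,0.30,?,4.5,13,1,1,name,2,?",
]
rows = [parse_row(s) for s in sample]
print(count_missing(rows))
```

   Run against the real file, the per-column counts from count_missing
   could be compared with the table above as a sanity check.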

9. Distribution of attribute number 2: still-alive
   Value   Number of instances with this value
    ----   -----------------------------------
      0    88 (dead)
      1    43 (alive)
      ?    1
    Total  132


10. Distribution of attribute number 13: alive-at-1
   Value   Number of instances with this value
    ----   -----------------------------------
      0    50
      1    24
      ?    58
    Total  132
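
   As a small worked example of reading the alive-at-1 distribution, the
   sketch below computes how many instances carry a known label and what
   accuracy a trivial predictor would reach by always answering with the
   more frequent known value.  The counts are copied from the table above;
   the idea of using this majority-class figure as a reference point is an
   addition of this sketch, not something the original notes prescribe.

```python
# Counts copied from the alive-at-1 distribution above.
counts = {"0": 50, "1": 24, "?": 58}

total = sum(counts.values())                  # all instances
labeled = total - counts["?"]                 # instances with a known label
majority = max(counts["0"], counts["1"])      # size of the larger known class
baseline = majority / labeled                 # accuracy of always guessing it

print(total, labeled, round(baseline, 3))     # prints: 132 74 0.676
```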
</pre>
</body>
</html>