<html>
<head>
<title>Water Treatment Data Base</title>
</head>
<body>
<h1>Info on Water Treatment Data Base</h1>
<pre>
1. Title: Faults in an urban waste water treatment plant

2. Source Information:
   -- Creators: Manel Poch (igte2@cc.uab.es)
         Unitat d'Enginyeria Quimica
         Universitat Autonoma de Barcelona. Bellaterra. Barcelona; Spain
   -- Donor: Javier Bejar and Ulises Cortes (bejar@lsi.upc.es)
         Dept. Llenguatges i Sistemes Informatics;
         Universitat Politecnica de Catalunya. Barcelona; Spain
   -- Date: June, 1993

3. Past Usage:
   1. J. De Gracia.
      "Avaluacio de tecniques de classificacio per a la gestio de
      Bioprocessos: Aplicacio a un reactor de fangs activats".
      Master's thesis. Dept. de Quimica. Unitat d'Enginyeria Quimica.
      Universitat Autonoma de Barcelona. Bellaterra (Barcelona). 1993.
      -- Results:
         Comparison between the classification of plant situations using
         cluster analysis and conceptual clustering. The induced classes
         are presented and contrasted.

   2. J. Bejar, U. Cort&eacute;s and M. Poch.
      "LINNEO+: A Classification Methodology for Ill-structured Domains".
      Research report RT-93-10-R. Dept. Llenguatges i Sistemes Informatics.
      Barcelona. 1993.
      -- Results:
         The conceptual clustering algorithm used in the first reference
         is described. Some results are given on the use of a priori
         expert knowledge to bias the classification process in the plant
         domain.

   3. Ll. Belanche, U. Cortes and M. S&agrave;nchez.
      "A knowledge-based system for the diagnosis of waste-water treatment
      plant". Proceedings of the 5th International Conference on Industrial
      and Engineering Applications of AI and Expert Systems IEA/AIE-92. Ed.
      Springer-Verlag. Paderborn, Germany, June 1992.
      -- Results:
         Explanation of the waste water treatment plant diagnosis problems.
         Not directly related to the dataset.

4. Relevant Information:

   This dataset comes from the daily sensor measurements in an urban waste
   water treatment plant. The objective is to classify the operational
   state of the plant in order to predict faults through the state
   variables of the plant at each of the stages of the treatment process.
   This domain has been characterized as ill-structured.


5. Number of instances: 527

6. Number of Attributes: 38

   Some values are missing; all of them represent unknown information.

7. Attribute Information:

   All attributes are numeric and continuous.

 N.  Attrib.
 1  Q-E        (input flow to plant)
 2  ZN-E       (input zinc to plant)
 3  PH-E       (input pH to plant)
 4  DBO-E      (input biological demand of oxygen to plant)
 5  DQO-E      (input chemical demand of oxygen to plant)
 6  SS-E       (input suspended solids to plant)
 7  SSV-E      (input volatile suspended solids to plant)
 8  SED-E      (input sediments to plant)
 9  COND-E     (input conductivity to plant)
10  PH-P       (input pH to primary settler)
11  DBO-P      (input biological demand of oxygen to primary settler)
12  SS-P       (input suspended solids to primary settler)
13  SSV-P      (input volatile suspended solids to primary settler)
14  SED-P      (input sediments to primary settler)
15  COND-P     (input conductivity to primary settler)
16  PH-D       (input pH to secondary settler)
17  DBO-D      (input biological demand of oxygen to secondary settler)
18  DQO-D      (input chemical demand of oxygen to secondary settler)
19  SS-D       (input suspended solids to secondary settler)
20  SSV-D      (input volatile suspended solids to secondary settler)
21  SED-D      (input sediments to secondary settler)
22  COND-D     (input conductivity to secondary settler)
23  PH-S       (output pH)
24  DBO-S      (output biological demand of oxygen)
25  DQO-S      (output chemical demand of oxygen)
26  SS-S       (output suspended solids)
27  SSV-S      (output volatile suspended solids)
28  SED-S      (output sediments)
29  COND-S     (output conductivity)
30  RD-DBO-P   (performance input biological demand of oxygen in primary settler)
31  RD-SS-P    (performance input suspended solids to primary settler)
32  RD-SED-P   (performance input sediments to primary settler)
33  RD-DBO-S   (performance input biological demand of oxygen to secondary settler)
34  RD-DQO-S   (performance input chemical demand of oxygen to secondary settler)
35  RD-DBO-G   (global performance input biological demand of oxygen)
36  RD-DQO-G   (global performance input chemical demand of oxygen)
37  RD-SS-G    (global performance input suspended solids)
38  RD-SED-G   (global performance input sediments)


-- Statistics:

 N.  Attrib.     min      max       mean      st-dev
 1  Q-E        10000    60081     37226.56  6571.46
 2  ZN-E           0.1     33.5       2.36     2.74
 3  PH-E           6.9      8.7       7.81     0.24
 4  DBO-E         31      438       188.71    60.69
 5  DQO-E         81      941       406.89   119.67
 6  SS-E          98     2008       227.44   135.81
 7  SSV-E         13.2     85.0      61.39    12.28
 8  SED-E          0.4     36         4.59     2.67
 9  COND-E       651     3230      1478.62   394.89
10  PH-P           7.3      8.5       7.83     0.22
11  DBO-P         32      517       206.20    71.92
12  SS-P         104     1692       253.95   147.45
13  SSV-P          7.1     93.5      60.37    12.26
14  SED-P          1.0     46.0       5.03     3.27
15  COND-P       646     3170      1496.03   402.58
16  PH-D           7.1      8.4       7.81     0.19
17  DBO-D         26      285       122.34    36.02
18  DQO-D         80      511       274.04    73.48
19  SS-D          49      244        94.22    23.94
20  SSV-D         20.2    100        72.96    10.34
21  SED-D          0.0      3.5       0.41     0.37
22  COND-D        85     3690      1490.56   399.99
23  PH-S           7.0      9.7       7.70     0.18
24  DBO-S          3      320        19.98    17.20
25  DQO-S          9      350        87.29    38.35
26  SS-S           6      238        22.23    16.25
27  SSV-S         29.2    100        80.15     9.00
28  SED-S          0.0      3.5       0.03     0.19
29  COND-S       683     3950      1494.81   387.53
30  RD-DBO-P       0.6     79.1      39.08    13.89
31  RD-SS-P        5.3     96.1      58.51    12.75
32  RD-SED-P       7.7    100        90.55     8.71
33  RD-DBO-S       8.2     94.7      83.44     8.4
34  RD-DQO-S       1.4     96.8      67.67    11.61
35  RD-DBO-G      19.6     97        89.01     6.78
36  RD-DQO-G      19.2     98.1      77.85     8.67
37  RD-SS-G       10.3     99.4      88.96     8.15
38  RD-SED-G      36.4    100        99.08     4.32


8. Missing Attribute Values:

 N.  Attrib.   N. of Missings
 1  Q-E:        18
 2  ZN-E:        3
 3  PH-E:        0
 4  DBO-E:      23
 5  DQO-E:       6
 6  SS-E:        1
 7  SSV-E:      11
 8  SED-E:      25
 9  COND-E:      0
10  PH-P:        0
11  DBO-P:      40
12  SS-P:        0
13  SSV-P:      11
14  SED-P:      24
15  COND-P:      0
16  PH-D:        0
17  DBO-D:      28
18  DQO-D:       9
19  SS-D:        2
20  SSV-D:      13
21  SED-D:      25
22  COND-D:      0
23  PH-S:        1
24  DBO-S:      23
25  DQO-S:      18
26  SS-S:        5
27  SSV-S:      17
28  SED-S:      28
29  COND-S:      1
30  RD-DBO-P:   62
31  RD-SS-P:     4
32  RD-SED-P:   27
33  RD-DBO-S:   40
34  RD-DQO-S:   26
35  RD-DBO-G:   36
36  RD-DQO-G:   25
37  RD-SS-G:     8
38  RD-SED-G:   31

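   As a small illustration, the counts in the table above can be related
   to the 527 instances to see which attributes are most affected. The
   sketch below is not part of the original data description: the
   dictionary holds only a few rows copied from the table, and the helper
   name missing_rate is made up for this example.

```python
# Missing-value rates for a few attributes, using counts copied from the
# table above and the instance count given in section 5.
N_INSTANCES = 527

# Only a handful of rows are reproduced here for illustration.
MISSING_COUNTS = {
    "RD-DBO-P": 62,
    "DBO-P": 40,
    "RD-DBO-S": 40,
    "PH-E": 0,
}


def missing_rate(count, total=N_INSTANCES):
    """Return the share of instances with a missing value, in percent."""
    return 100.0 * count / total


if __name__ == "__main__":
    # Print the selected attributes from most to least affected.
    for name, count in sorted(MISSING_COUNTS.items(), key=lambda kv: -kv[1]):
        print(f"{name:10s} {count:3d}  {missing_rate(count):5.1f}%")
```

   Extending the dictionary to all 38 rows would give the full ranking.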

9. Class Distribution

   These are the classes induced by our conceptual clustering algorithm:

 -- Class 1: Normal situation

   - Objects (275 days):

    D-1/3/90 to D-12/3/90, D-16/3/90 to D-30/3/90, D-1/2/90 to D-19/2/90, D-21/2/90 to D-28/2/90,
    D-1/1/90 to D-26/1/90, D-29/1/90 to D-31/1/90, D-1/6/90 to D-4/6/90, D-6/6/90 to D-8/6/90,
    D-24/6/90, D-25/6/90, D-28/6/90, D-29/6/90, D-1/5/90 to D-6/5/90, D-8/5/90 to D-20/5/90,
    D-24/5/90, D-25/5/90, D-29/5/90, D-1/4/90, D-4/4/90 to D-8/4/90, D-10/4/90 to D-20/4/90,
    D-27/4/90, D-2/7/90, D-4/7/90 to D-8/7/90, D-12/7/90 to D-15/7/90, D-19/7/90, D-23/7/90,
    D-26/7/90, D-4/9/90, D-5/9/90, D-23/9/90, D-28/9/90, D-30/9/90, D-17/8/90, D-21/8/90 to D-25/8/90,
    D-29/8/90, D-30/8/90, D-3/12/90, D-9/12/90, D-16/12/90 to D-20/12/90, D-23/12/90, D-24/12/90,
    D-27/12/90 to D-30/12/90, D-6/11/90 to D-8/11/90, D-14/11/90, D-16/11/90, D-18/11/90,
    D-20/11/90, D-21/11/90, D-27/11/90, D-10/10/90, D-18/10/90, D-29/10/90, D-30/10/90,
    D-3/3/91 to D-6/3/91, D-10/3/91 to D-12/3/91, D-18/3/91, D-20/3/91, D-27/3/91, D-29/3/91,
    D-3/2/91, D-5/2/91, D-8/2/91, D-14/2/91, D-17/2/91, D-18/2/91, D-21/2/91 to D-24/2/91,
    D-1/1/91, D-2/1/91, D-6/1/91, D-8/1/91, D-10/1/91 to D-20/1/91, D-25/1/91, D-2/5/91, D-3/5/91,
    D-7/5/91, D-14/5/91, D-15/5/91, D-17/5/91, D-19/5/91, D-21/5/91 to D-23/5/91, D-1/4/91 to D-3/4/91,
    D-5/4/91 to D-12/4/91, D-15/4/91 to D-21/4/91, D-23/4/91, D-1/7/91, D-3/7/91, D-4/7/91, D-7/7/91,
    D-10/7/91 to D-12/7/91, D-15/7/91, D-16/7/91, D-22/7/91 to D-25/7/91, D-28/7/91, D-30/7/91, D-31/7/91,
    D-2/6/91 to D-4/6/91, D-6/6/91, D-7/6/91, D-13/6/91, D-16/6/91 to D-21/6/91, D-25/6/91 to D-30/6/91,
    D-4/10/91, D-6/10/91, D-17/10/91 to D-30/10/91, D-1/8/91, D-2/8/91, D-27/8/91, D-29/8/91.


 -- Class 2: Secondary settler problems-1

   - Objects (1 day): D-13/3/90

 -- Class 3: Secondary settler problems-2

   - Objects (1 day): D-14/3/90

 -- Class 4: Secondary settler problems-3

   - Objects (4 days): D-15/3/90, D-17/7/91 to D-19/7/91

 -- Class 5: Normal situation with performance over the mean

   - Objects (116 days):

    D-28/1/90, D-10/6/90 to D-22/6/90, D-26/6/90, D-27/6/90, D-7/5/90, D-21/5/90 to D-23/5/90,
    D-27/5/90, D-28/5/90, D-30/5/90, D-2/4/90, D-3/4/90, D-9/4/90, D-22/4/90 to D-26/4/90, D-1/7/90,
    D-3/7/90, D-9/7/90 to D-11/7/90, D-16/7/90 to D-18/7/90, D-20/7/90, D-22/7/90, D-24/7/90, D-25/7/90,
    D-27/7/90 to D-31/7/90, D-2/9/90, D-3/9/90, D-6/9/90 to D-13/9/90, D-16/9/90 to D-21/9/90,
    D-24/9/90 to D-27/9/90, D-1/8/90 to D-7/8/90, D-16/8/90, D-28/8/90, D-31/8/90, D-7/12/90,
    D-2/11/90, D-5/11/90, D-9/11/90, D-12/11/90, D-13/11/90, D-1/10/90 to D-5/10/90, D-24/10/90,
    D-25/10/90, D-1/3/91, D-8/3/91, D-17/3/91, D-26/3/91, D-31/3/91, D-9/1/91, D-10/5/91, D-16/5/91,
    D-20/5/91, D-29/5/91, D-30/5/91, D-14/4/91, D-22/4/91, D-24/4/91, D-25/4/91, D-5/7/91, D-8/7/91,
    D-9/7/91, D-21/7/91, D-26/7/91, D-5/6/91, D-10/6/91, D-12/6/91, D-14/6/91, D-2/10/91, D-8/10/91,
    D-9/10/91, D-11/10/91, D-13/10/91, D-16/10/91.

 -- Class 6: Solids overload-1

   - Objects (3 days): D-5/6/90, D-28/5/91, D-31/5/91

 -- Class 7: Secondary settler problems-4

   - Objects (1 day): D-29/4/90

 -- Class 8: Storm-1

   - Objects (1 day): D-14/9/90

 -- Class 9: Normal situation with low influent

   - Objects (69 days):

    D-8/8/90 to D-10/8/90, D-13/8/90, D-15/8/90, D-19/8/90, D-20/8/90, D-27/8/90, D-1/11/90,
    D-4/11/90, D-11/11/90, D-19/11/90, D-7/10/90 to D-9/10/90, D-12/10/90 to D-17/10/90,
    D-21/10/90, D-23/10/90, D-26/10/90, D-28/10/90, D-7/3/91, D-24/3/91, D-25/3/91,
    D-1/5/91, D-5/5/91, D-8/5/91, D-9/5/91, D-12/5/91, D-13/5/91, D-26/5/91, D-27/5/91,
    D-26/4/91, D-28/4/91, D-29/4/91, D-2/7/91, D-14/7/91, D-29/7/91, D-9/6/91, D-24/6/91,
    D-1/10/91, D-3/10/91, D-5/10/91, D-12/10/91, D-15/10/91, D-4/8/91, D-9/8/91 to D-26/8/91,
    D-28/8/91, D-30/8/91.

 -- Class 10: Storm-2

   - Objects (1 day): D-12/8/90

 -- Class 11: Normal situation

   - Objects (53 days):

    D-2/12/90, D-4/12/90, D-6/12/90, D-10/12/90 to D-14/12/90, D-21/12/90, D-26/12/90,
    D-15/11/90, D-22/11/90 to D-26/11/90, D-28/11/90 to D-30/11/90, D-19/10/90,
    D-13/3/91 to D-15/3/91, D-19/3/91, D-21/3/91, D-22/3/91, D-1/2/91, D-4/2/91,
    D-6/2/91, D-7/2/91, D-10/2/91 to D-13/2/91, D-15/2/91, D-19/2/91,
    D-25/2/91 to D-28/2/91, D-3/1/91, D-4/1/91, D-7/1/91, D-21/1/91 to D-24/1/91,
    D-27/1/91 to D-31/1/91, D-6/5/91, D-4/4/91.

 -- Class 12: Storm-3

   - Objects (1 day): D-22/10/90

 -- Class 13: Solids overload-2

   - Objects (1 day): D-24/5/91

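   The day lists above mix single days with inclusive "X to Y" ranges, so
   checking a class size by eye is tedious. The following is a minimal
   sketch of how such a list could be expanded and counted. It assumes
   that labels are day/month/year, that two-digit years belong to the
   1900s, and that entries are separated by commas; the names parse_day
   and expand_days are hypothetical helpers, not part of the dataset.

```python
from datetime import date, timedelta


def parse_day(label):
    """Turn a label such as 'D-13/3/90' into a date (day/month/year)."""
    label = label.strip()
    if not label.startswith("D-"):
        raise ValueError(f"unexpected day label: {label!r}")
    day, month, year = (int(part) for part in label[2:].split("/"))
    # Assumption: two-digit years belong to the 1900s (the lists cover 1990-1991).
    return date(1900 + year, month, day)


def expand_days(spec):
    """Expand a comma-separated list of days and 'X to Y' ranges."""
    days = []
    for item in spec.split(","):
        item = item.strip().rstrip(".")
        if not item:
            continue
        if " to " in item:
            first, last = (parse_day(part) for part in item.split(" to "))
            span = (last - first).days
            # Ranges are inclusive at both ends.
            days.extend(first + timedelta(days=offset) for offset in range(span + 1))
        else:
            days.append(parse_day(item))
    return days


if __name__ == "__main__":
    # Start of the Class 1 list: two inclusive ranges, 12 + 15 days.
    sample = "D-1/3/90 to D-12/3/90, D-16/3/90 to D-30/3/90"
    print(len(expand_days(sample)))  # 27
```

   Applying expand_days to each class list and summing the lengths gives
   a quick consistency check against the number of instances.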

-- Comments to the data file:

   The first element of each line is the day of the data;
   the rest are the attribute values.
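
   The record layout described above can be read with a few lines of
   Python. This is only a sketch: the comma separator and the "?" marker
   for missing values are assumptions about the data file (neither is
   stated in this description), and parse_record is a name made up for
   the example.

```python
import math

N_ATTRIBUTES = 38        # from section 6
MISSING_MARKER = "?"     # assumption: how unknown values are written
SEPARATOR = ","          # assumption: field separator in the data file


def parse_record(line, sep=SEPARATOR):
    """Split one data line into (day label, list of 38 floats).

    Missing values are returned as NaN so that the list keeps its length.
    """
    fields = [field.strip() for field in line.strip().split(sep)]
    day, raw_values = fields[0], fields[1:]
    if len(raw_values) != N_ATTRIBUTES:
        raise ValueError(
            f"expected {N_ATTRIBUTES} attribute values, got {len(raw_values)}"
        )
    values = [
        math.nan if value == MISSING_MARKER else float(value)
        for value in raw_values
    ]
    return day, values


if __name__ == "__main__":
    # Synthetic line for illustration only, not a real record.
    line = "D-1/3/90," + ",".join(["?"] + ["1.0"] * 37)
    day, values = parse_record(line)
    print(day, len(values))
```

   Keeping missing entries as NaN preserves the attribute positions, so
   value i always corresponds to attribute i+1 in section 7.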
</pre>
</body>
</html>