..
   Source: orange/docs/widgets/rst/data/file.rst @ 11785:f21218ba8f91
   Revision 11785:f21218ba8f91, checked in by blaz <blaz.zupan@…>.
   Updated documentation for File widget (use of stamper.py to render images).

.. _File:

File
====

.. image:: ../../../../Orange/OrangeWidgets/Data/icons/File.svg
   :alt: File widget icon

Signals
-------

Inputs:
   - None

Outputs:
   - Data
         Attribute-valued data set read from the input file.

Description
-----------

This is the widget you will probably use in every schema. It reads the input data file
(data table with examples) and sends the data set to the output channel. It maintains
a history of the most recently used data files. For convenience, the history also includes
a directory with the sample data sets that come with Orange.

File can read data from simple tab-delimited or comma-separated files, as well as
files in Weka's .arff format.

.. image:: images/File-stamped.png
   :alt: File widget with loaded Iris data set
   :align: right

1. Browse for a data file.
#. Browse through previously opened data files, or load any of the sample data
   files.
#. Reload the currently selected data file.
#. Information on the loaded data set (data set size, number and types of
   data features).
#. Open a sub-window with advanced settings.
#. Add a report on data set info (size, features).

.. container:: clearer

    .. image :: images/spacer.png

Advanced Options
----------------

.. image:: images/File-Advanced-stamped.png
   :alt: Advanced options of File widget
   :align: right

1. Symbol for "don't care" data entries.
#. Symbol for "don't know" data entries.
#. Settings for the treatment of feature names in the feature space of Orange.

.. container:: clearer

    .. image :: images/spacer.png

Tab-delimited data files can include user-defined symbols for undefined values. The symbols for
"don't care" and "don't know" values can be specified in the corresponding edit lines.
The default values for "don't know" and "don't care" depend on the format. Most users will
use tab-delimited files: leave the field empty or put a question mark in it, and that's
it. Most algorithms do not distinguish between "don't know" and "don't care" values, so consider
them both to mean undefined.
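
To make the idea of user-defined symbols concrete, here is a minimal sketch in plain
Python. It is not Orange code: the ``Missing`` enum, the ``read_tab_rows`` function and the
chosen symbols (an empty field or ``?`` for "don't know", ``~`` for "don't care") are
assumptions made for illustration, and the sketch assumes a single header row of feature names.

.. code-block:: python

    import csv
    import io
    from enum import Enum


    class Missing(Enum):
        """Hypothetical markers for the two kinds of undefined value."""
        DONT_KNOW = "don't know"
        DONT_CARE = "don't care"


    def read_tab_rows(text, dont_know_symbols=("", "?"), dont_care_symbols=("~",)):
        """Parse tab-delimited text, replacing the given symbols with markers.

        The symbols are parameters, mirroring the edit lines in the widget;
        the defaults here are illustrative, not necessarily Orange's defaults.
        """
        reader = csv.reader(io.StringIO(text), delimiter="\t")
        header = next(reader)  # assumption: one header row with feature names
        rows = []
        for record in reader:
            row = []
            for cell in record:
                if cell in dont_care_symbols:
                    row.append(Missing.DONT_CARE)
                elif cell in dont_know_symbols:
                    row.append(Missing.DONT_KNOW)
                else:
                    row.append(cell)
            rows.append(row)
        return header, rows


    def is_undefined(value):
        """Most algorithms would treat both markers simply as undefined."""
        return isinstance(value, Missing)


    sample = "sepal length\tpetal length\tiris\n5.1\t?\tIris-setosa\n~\t1.4\t\n"
    header, rows = read_tab_rows(sample)
    undefined_count = sum(is_undefined(v) for row in rows for v in row)

Keeping the two markers distinct while offering a single ``is_undefined`` check reflects the
advice above: the distinction is preserved for the few methods that need it, and everything
else can ignore it.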

Orange will usually treat attributes that have the same name
but appear in different files as the same attribute, so a classifier which uses the
attribute "petal length" from the first file will use the attribute of the same name from
the second. In cases when attributes from different files just accidentally bear the same
name, one can instruct Orange to either always construct a new attribute or construct one only when
the attributes differ in their domains. Use the options on dealing with new attributes
with great care (if at all).
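
The three behaviours described above can be sketched as a small name-based registry.
This is an illustration only, not Orange's implementation: ``Attribute``,
``AttributeRegistry`` and the policy constants are hypothetical names, and "domain" is
interpreted here, as an assumption, as the attribute's tuple of possible values.

.. code-block:: python

    # Hypothetical policy names for this sketch; they are not Orange constants.
    REUSE = "reuse"                        # same name means same attribute
    NEW_IF_DIFFERENT = "new-if-different"  # new attribute only if values differ
    ALWAYS_NEW = "always-new"              # never share attributes between files


    class Attribute:
        """A minimal stand-in for an attribute descriptor."""

        def __init__(self, name, values):
            self.name = name
            self.values = tuple(values)


    class AttributeRegistry:
        """Remembers the most recent attribute created under each name."""

        def __init__(self):
            self._known = {}

        def get(self, name, values, policy=REUSE):
            existing = self._known.get(name)
            if existing is not None and policy != ALWAYS_NEW:
                # Simplification: under REUSE the new values are ignored.
                if policy == REUSE or existing.values == tuple(values):
                    return existing
            attribute = Attribute(name, values)
            self._known[name] = attribute
            return attribute


    registry = AttributeRegistry()
    # "petal length" read from a first and then from a second file.
    first = registry.get("petal length", ["short", "long"])
    second = registry.get("petal length", ["short", "long"])
    shared = first is second  # both files refer to one attribute object

With the default policy a classifier built on ``first`` would also recognise ``second``,
since they are one object. Switching to ``ALWAYS_NEW`` breaks that link, which is why the
options deserve care.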

Example
-------

Most Orange workflows would probably start with the :ref:`File` widget. In the schema below,
the widget is used to read the data that is sent to both the :ref:`Data Table` widget and
the widget that displays :ref:`Attribute Statistics`.

.. image:: images/File_schema.png
   :alt: Example schema with File widget