Changeset 10069:fdfa1e75f7a1 in orange


Timestamp:
02/08/12 12:44:21 (2 years ago)
Author:
janezd <janez.demsar@…>
Branch:
default
Message:

Added conversion to and from numpy to documentation on Orange.data.Table

File:
1 edited

  • docs/reference/rst/Orange.data.table.rst

    r10040 r10069  
    1616Data tables can also be created programmatically, as in the :ref:`code 
    1717below <example-table-prog1>`. 
    18 s 
     18 
    1919:obj:`Table` supports most list-like operations: getting, setting, 
    2020removing data instances, as well as methods :obj:`append` and 
     
    129129        :obj:`Table` (if domains do not match, they are converted), 
    130130        as a list containing either instances of 
    131         :obj:`Orange.data.Instance` or lists, or as a numpy array. 
     131        :obj:`Orange.data.Instance` or lists. 
     132 
     133        This constructor can also be used for conversion from numpy 
     134        arrays. The argument ``instances`` can be a numpy array. The number 
     135        of variables in the domain must match the number of columns. 
    132136 
    133137        :param domain: domain descriptor 
     
    439443            :rtype: :obj:`Orange.data.Table` 
    440444 
     445    .. method:: to_numpy(content, weightID, multinominal) 
     446 
     447        Convert a data table to numpy array. Raises an exception if the data 
     448        contains undefined values. :obj:`to_numpyMA` converts to a masked 
     449        array where the mask denotes the defined values. (For conversion 
     450        from numpy, see the constructor.) 
     451 
     452        The function returns a tuple with the array and, depending on 
     453        the arguments, some vectors. The argument ``content`` is a string 
     454        split into two parts by a slash. The part to the left of the slash 
     455        describes the content of the array; the part to the right 
     456        lists the vectors. The content is described with the following 
     457        characters: 
     458 
     459        ``a`` 
     460            features (without the class); can only appear on the left 
     461 
     462        ``A`` 
     463            like ``a``, but raises an exception if there are no features 
     464 
     465        ``c`` 
     466            class value represented as an index of the value (0, 1, 2...); 
     467            if the data has no class, the column is omitted (if ``c`` is to 
     468            the left of the slash) or the tuple will contain ``None`` 
     469            instead of the vector. 
     470 
     471        ``C`` 
     472            like ``c``, but raises an exception if the data has no class 
     473 
     474        ``w`` 
     475            instance weight; as with ``c``, the column is omitted or 
     476            ``None`` is returned instead of the vector if the argument 
     477            ``weightID`` is missing. 
     478 
     479        ``W`` 
     480            instance weight; raises an exception if ``weightID`` 
     481            is missing. 
     482 
     483        ``0`` 
     484            a vector of zeros 
     485 
     486        ``1`` 
     487            a vector of ones 
     488 
     489    The default content is ``a/cw``: an array with feature values and 
     490    separate vectors with classes and weights. Specifying an empty string 
     491    has the same effect. If the elements to the right of the slash repeat, 
     492    the function returns the same Python object, e.g. in ``acc000/cwww`` the 
     493    three weight vectors are one and the same Python object, so modifying 
     494    one will change all three of them. 
     495 
     496        This is the default behaviour on data set iris with 150 data 
     497        instances described by four features and a class value:: 
     498 
     499        >>> data = orange.ExampleTable("../datasets/iris") 
     500        >>> a, c, w = data.toNumpy() 
     501        >>> a.shape 
     502        (150, 4) 
     503        >>> c.shape 
     504        (150,) 
     505        >>> print w 
     506        None 
     507        >>> a[0] 
     508        array([ 5.0999999 ,  3.5       ,  1.39999998,  0.2       ]) 
     509        >>> c[0] 
     510        0.0 
     511 
     512        For a more complicated example, the array will contain a column with 
     513        class, features, a vector of ones, two vectors with classes and 
     514        another vector of zeroes:: 
     515 
     516        >>> a, = data.toNumpy("ca1cc0") 
     517        >>> a[0] 
     518        array([ 0., 5.0999999, 3.5       , 1.39999998, 0.2       , 1., 0., 0., 0.]) 
     519        >>> a[130] 
     520        array([ 2., 7.4000001, 2.79999995, 6.0999999 , 1.89999998, 1., 2., 2., 0.]) 
     521        >>> c[120] 
     522        2.0 
     523 
     524    The third argument specifies the treatment of non-continuous 
     525    non-binary values (binary values are always translated to 0.0 or 
     526    1.0). The argument's value can be 
     527    :obj:`Orange.data.Table.Multinomial_Ignore` (such features are 
     528    omitted), :obj:`Orange.data.Table.Multinomial_AsOrdinal` (the 
     529    values' indices are treated as continuous numbers) or 
     530    :obj:`Orange.data.Table.Multinomial_Error` (an exception is raised 
     531    if such features are encountered). The default treatment is 
     532    :obj:`Orange.data.Table.Multinomial_AsOrdinal`. 
     533 
     534    When the class attribute is discrete and has more than two values, 
     535    an exception is raised unless multinomial attributes are treated as 
     536    ordinal. More options for treating multinomial values are available 
     537    in :obj:`Orange.data.continuization`. 
     538 
     539    .. method:: to_numpyMA(content, weightID, multinominal) 
     540 
     541        Similar to :obj:`to_numpy` except that it returns a masked array 
     542        with mask representing the (un)defined values. 
     543 
    441544    .. method:: checksum() 
    442545 
     
    498601 
    499602            Remove a meta attribute from all data instances. 
     603 
     604 
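The ``content`` string documented in this changeset can be easier to follow with a small stand-alone sketch. The code below is not Orange's implementation: ``assemble`` is a hypothetical helper that works on plain numpy arrays and only mimics the rules described above (the left part builds the array, the right part lists the vectors, an empty string means ``a/cw``, and repeated characters on the right return the same object). Treating an empty left part as a zero-column array is an assumption, since the documentation does not cover that case.

```python
import numpy as np


def assemble(content, features, classes=None, weights=None):
    """Hypothetical stand-in for the documented content string (not Orange code)."""
    features = np.asarray(features, dtype=float)
    n_rows = features.shape[0]
    # An empty string has the same effect as the default "a/cw".
    left, _, right = (content or "a/cw").partition("/")
    cache = {}

    def vector(ch):
        # Build each vector once, so repeated characters share one object.
        if ch not in cache:
            if ch in "cC":
                if classes is None and ch == "C":
                    raise ValueError("data has no class")
                cache[ch] = None if classes is None else np.asarray(classes, dtype=float)
            elif ch in "wW":
                if weights is None and ch == "W":
                    raise ValueError("no weights given")
                cache[ch] = None if weights is None else np.asarray(weights, dtype=float)
            elif ch == "0":
                cache[ch] = np.zeros(n_rows)
            elif ch == "1":
                cache[ch] = np.ones(n_rows)
            else:
                raise ValueError("unexpected character %r" % ch)
        return cache[ch]

    columns = []
    for ch in left:
        if ch in "aA":
            if features.shape[1] == 0 and ch == "A":
                raise ValueError("data has no features")
            columns.append(features)
        else:
            v = vector(ch)
            if v is not None:  # a missing class or weight column is omitted
                columns.append(v.reshape(-1, 1))
    # Assumption: an empty left part yields an array with zero columns.
    array = np.hstack(columns) if columns else np.empty((n_rows, 0))
    return (array,) + tuple(vector(ch) for ch in right)


# Two made-up rows shaped like the iris example (four features, one class).
feats = np.array([[5.1, 3.5, 1.4, 0.2],
                  [7.4, 2.8, 6.1, 1.9]])
cls = [0, 2]

a, c, w = assemble("", feats, cls)        # default: array, classes, weights
(wide,) = assemble("ca1cc0", feats, cls)  # everything packed into one array
shared = assemble("a/cww", feats, cls, [1.0, 2.0])
```

With the default string the weights slot is ``None`` because no weights were passed, and ``shared[2] is shared[3]`` holds, which is the aliasing the documentation warns about.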
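For readers unfamiliar with masked arrays, the following sketch uses only numpy to show what kind of object ``to_numpyMA`` is documented to return. It does not call Orange. The sample values and the use of NaN to stand for undefined values are assumptions made for illustration. In numpy's own convention a ``True`` entry in the mask marks a value as masked (undefined); whether Orange's mask follows exactly that convention should be checked against the library.

```python
import numpy as np

# Made-up rows in which NaN stands for an undefined value.
rows = np.array([[5.1, 3.5, np.nan, 0.2],
                 [4.9, np.nan, 1.4, 0.2]])

# masked_invalid masks every NaN or infinite entry.
masked = np.ma.masked_invalid(rows)

# numpy convention: True in the mask means "masked", i.e. undefined here.
undefined = np.ma.getmaskarray(masked)
defined = ~undefined

# Reductions skip the masked entries instead of propagating NaN.
column_means = masked.mean(axis=0)
```

A plain array would give NaN for the second and third column means, while the masked array averages only the defined entries.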