source: orange/docs/development/rst/c.rst @ 11587:fd03cbe41685

Revision 11587:fd03cbe41685, 47.1 KB checked in by janezd <janez.demsar@…>, 11 months ago

Fixes in documentation for writing interfaces between C++ and Python

################################
Writing Orange Extensions in C++
################################

This page gives an introduction to extending Orange in C++, with emphasis on
how to define interfaces to Python. Besides reading this page, we recommend
studying some of the existing extension modules, such as orangeom, and Orange's
own interface.

We shall first present the general picture and then focus on specific parts of
the interface.

Instead of general tools for creating interfaces between C++ and Python
(SWIG, SIP, Boost.Python...), Orange uses its own specific set of tools.

To expose a C++ class to Python, we need to mark it as exportable, select a
general constructor template or program a specific one, mark the attributes to
be exported, and provide the interfaces for the C++ member functions. When we
expose the C++ code mostly as it is, the interface functions have only a few
lines. When we want to make the exported function friendlier, e.g. by allowing
various types of arguments or by fitting the default arguments to the given
ones, these functions are longer.

To define a non-member function, we write the function itself as described in
the Python manual (see the first chapter of "Extending and Embedding the
Python Interpreter") and then mark it with a specific keyword.
Pyxtract will recognize the keyword and add the function to the list of
exported functions.

To define a special method, one needs to provide a function with the appropriate
name constructed from the class name and the special method's name, which is the
same as in Python's ``PyTypeObject``.

For instance, the elements of ``ExampleTable`` (examples) can be accessed
through indexing because we defined a C function that gets an index (and the
table, of course) and returns the corresponding example. Here is the function
(with error detection removed for the sake of clarity). ::

    PyObject *ExampleTable_getitem_sq(PyObject *self, int idx)
    {
        CAST_TO(TExampleTable, table);
        return Example_FromExampleRef((*table)[idx], EXAMPLE_LOCK(PyOrange_AsExampleTable(self)));
    }

Also, ``ExampleTable`` has a non-special method ``sort([list-of-attributes])``.
This is implemented through a C function that gets a list of attributes and
calls the C++ class' method
``TExampleTable::sort(const vector<int> order)``. To illustrate, this is a
slightly simplified function (we've removed some flexibility regarding the
parameters and the exception handling). ::

    PyObject *ExampleTable_sort(PyObject *self, PyObject *args) PYARGS(METH_VARARGS, "() -> None")
    {
        CAST_TO(TExampleTable, table);

        if (!args || !PyTuple_Size(args)) {
            table->sort();
            RETURN_NONE;
        }

        TVarList attributes;
        varListFromDomain(PyTuple_GET_ITEM(args, 0), table->domain, attributes, true, true);
        vector<int> order;
        for(TVarList::reverse_iterator vi(attributes.rbegin()), ve(attributes.rend()); vi!=ve; vi++) {
            order.push_back(table->domain->getVarNum(*vi));
        }
        table->sort(order);
        RETURN_NONE;
    }

The function casts the ``PyObject *`` into the corresponding C++ object, reads
the arguments, calls the C++ function and returns the result (``None``, in this
case).

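The argument-reading step of the wrapper can be tried out without Python or Orange. The following is a minimal sketch in plain C++, assuming a hypothetical ``Domain`` that is just a list of attribute names; ``varNum`` and ``sortOrder`` are illustrative names, not part of Orange. It shows how the attribute list is walked back to front and translated into column indices before the C++ ``sort`` is called.

```cpp
#include <algorithm>
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for a domain: attribute names in column order.
using Domain = std::vector<std::string>;

// Return the column index of `name` in `domain`, or throw if it is unknown
// (this plays the role of getVarNum in the wrapper above).
int varNum(const Domain &domain, const std::string &name)
{
    auto it = std::find(domain.begin(), domain.end(), name);
    if (it == domain.end())
        throw std::invalid_argument("unknown attribute: " + name);
    return static_cast<int>(it - domain.begin());
}

// Translate the user-supplied attribute list into column indices,
// walking the list back to front as the wrapper does.
std::vector<int> sortOrder(const Domain &domain,
                           const std::vector<std::string> &attributes)
{
    std::vector<int> order;
    for (auto vi = attributes.rbegin(); vi != attributes.rend(); ++vi)
        order.push_back(varNum(domain, *vi));
    return order;
}
```

With a domain ``{"age", "sex", "height"}`` and the attribute list ``{"sex", "age"}``, the sketch produces the indices of ``age`` and ``sex``, in that order.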
Interfacing with Python requires a lot of manual work, but this gives the
programmer the opportunity to provide a function that accepts many different
forms of arguments. The above function, for instance, accepts a list in
which attributes are specified by indices, names or descriptors, all
corresponding to the ``ExampleTable`` that is being sorted. Inheritance of
methods, on the other hand, ensures that only the methods that are truly
specific to a class need to be coded.

The part of the interface that is built automatically is taken care of by
two scripts. The first, ``pyprops``, parses all of Orange's header files and
extracts the built-in properties of all classes. The second, ``pyxtract``, goes
through the C++ files that contain the interface functions such as those above.
It recognizes the functions that implement special or member methods and
constructs the corresponding ``PyTypeObject``\ s.

*******
pyprops
*******

Pyprops scans each hpp file for classes we want to export to Python. Properties
can be ``bool``, ``int``, ``float``, ``string``, ``TValue`` or a wrapped Orange
type.

The class definition needs to look as follows. ::

    class [ORANGE_API] <classname>; [: public <parentclass> ]

This should be on a single line. To mark the class for export, this should be
followed by ``__REGISTER_CLASS`` or ``__REGISTER_ABSTRACT_CLASS`` before any
properties or components are defined. The difference between the two, as far as
pyprops is concerned, is that abstract classes do not define the ``clone``
method.

To export a property, it should be defined like this. ::

    <type> <name> //P[R|O] [>|+<alias>] <description>

Pyprops doesn't check the type and won't object if you use types other than
those listed above. The error will be discovered later, during linking. ``//P``
signals that we want to export the property. If followed by ``R`` or ``O``, the
property is read-only or obsolete, respectively. The property can also have an
alias name; ``>`` renames it and ``+`` adds an alias.

Each property needs to be declared on a separate line, e.g. ::

    int x; //P;
    int y; //P;

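To make the property syntax concrete, here is a minimal sketch of how such a line could be recognized. This is not the actual pyprops code; the ``Property`` structure, the ``parseProperty`` function and the regular expression are assumptions made for illustration, based only on the syntax given above.

```cpp
#include <cassert>
#include <regex>
#include <string>

// One exported property, as described by a single header line (hypothetical).
struct Property {
    std::string type;
    std::string name;
    bool readOnly = false;           // set by //PR
    bool obsolete = false;           // set by //PO
    std::string alias;               // empty if no alias is given
    bool aliasReplacesName = false;  // true for '>', false for '+'
    std::string description;
};

// Parse "<type> <name>; //P[R|O] [>|+<alias>] <description>".
// Returns false for lines that carry no //P mark.
bool parseProperty(const std::string &line, Property &out)
{
    static const std::regex pattern(
        R"(^\s*(.+?)\s+(\w+)\s*;\s*//P([RO]?)\s*(?:([>+])(\w+))?\s*(.*)$)");
    std::smatch m;
    if (!std::regex_match(line, m, pattern))
        return false;

    out = Property();
    out.type = m[1].str();
    out.name = m[2].str();
    out.readOnly = (m[3].str() == "R");
    out.obsolete = (m[3].str() == "O");
    out.alias = m[5].str();
    out.aliasReplacesName = (m[4].str() == ">");
    out.description = m[6].str();
    return true;
}
```

A line without the ``//P`` mark is simply not matched, which corresponds to a member that is not exported.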
If we don't want to export a certain property, we omit the ``//P`` mark. An
exception to this are wrapped Orange objects: for instance, if a class has a
(wrapped) pointer to the domain, ``PDomain``, and doesn't export it, pyxtract
should still know about it for the purpose of garbage collection. You
should mark such members with ``//C`` so that they are put into the list of
objects that need to be counted. Failing to do so would cause a memory leak.

If a class directly or indirectly holds references to any wrapped objects that
are neither properties nor components, it needs to declare ``traverse`` and
``clear`` as described in the Python documentation.

Pyprops creates a ppp file for each hpp file. It contains the extracted
information in the form of C++ structures that compile into the interface.
The ppp file needs to be included in the corresponding cpp file. For
instance, domain.ppp is included in domain.cpp.

********
pyxtract
********

Pyxtract's job is to detect the functions that define special methods (such as
printing, conversion, sequence and arithmetic related operations...) and member
functions. Based on what it finds for each specific class, it constructs the
corresponding ``PyTypeObject``\ s. For the functions to be recognized, they must
follow a specific syntax.

There are two basic mechanisms for marking the functions to export. Special
functions are recognized by their definition (they need to return
``PyObject *``, ``void`` or ``int`` and their name must be of the form
``<classname>_<functionname>``). Member functions,
inheritance relations, constants etc. are marked by macros such as ``PYARGS``
in the above definition of ``ExampleTable_sort``. Most of these macros don't do
anything except mark things for pyxtract.

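Because both class names and method names may contain underscores, a name such as ``TreeStopCriteria_Python_call`` cannot be split at a fixed underscore. The following sketch shows one way to resolve this, assuming the class names are already known from the class declarations; it is not pyxtract's actual algorithm, and ``SpecialMethod`` and ``splitSpecialMethod`` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Result of splitting a function name of the form <classname>_<methodname>.
struct SpecialMethod {
    std::string className;
    std::string methodName;
};

// Find the longest known class name that prefixes `functionName`
// (followed by an underscore) and treat the rest as the method name.
// `out` is only modified when a match is found.
bool splitSpecialMethod(const std::string &functionName,
                        const std::vector<std::string> &knownClasses,
                        SpecialMethod &out)
{
    std::size_t best = 0;
    bool found = false;
    for (const std::string &cls : knownClasses) {
        const std::string prefix = cls + "_";
        // The name must start with "<classname>_" and have a method part.
        if (functionName.size() > prefix.size()
            && functionName.compare(0, prefix.size(), prefix) == 0
            && prefix.size() > best) {
            best = prefix.size();
            out.className = cls;
            out.methodName = functionName.substr(prefix.size());
            found = true;
        }
    }
    return found;
}
```

Preferring the longest matching class name is a design choice of this sketch: it keeps ``TreeStopCriteria_Python_call`` from being read as method ``Python_call`` of ``TreeStopCriteria``.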
Class declaration
=================

Each class needs to be declared as exportable. If it's a base class, pyxtract
needs to know the data structure for the instances of this class. As for all
Python objects, the structure must be "derived" from ``PyObject`` (Python is
written in C, so the subclasses are not derived in the C++ sense but extend the
C structure instead). Most objects are derived from ``Orange``; the only
exceptions are ``orange.Example``, ``orange.Value`` and ``orange.DomainDepot``.

Pyxtract should also know how the class is constructed - it can have a specific
constructor, one of the general constructors or no constructor at all.

The class is declared in one of the following ways (here are some examples from
actual Orange code).

``BASED_ON(EFMDataDescription, Orange)``
    This tells pyxtract that ``EFMDataDescription`` is derived from ``Orange`` and is abstract as far as Python is concerned: there is no constructor for this class in Python, but the C++ class itself is not abstract and its instances can appear and be used in Python. For example, when we construct an instance of ``ClassifierByLookupTable`` with more than three attributes, an instance of ``EFMDataDescription`` will appear in one of its fields.

``ABSTRACT(ClassifierFD, Classifier)``
    This defines an abstract class, which will never be constructed in the C++ code. The only difference between ``BASED_ON`` and ``ABSTRACT`` is that the former can have a pickle interface, while the latter doesn't need one.

Abstract C++ classes are not necessarily defined as ``ABSTRACT`` in the Python
interface. For example, ``TClassifier`` is an abstract C++ class, but you can
seemingly construct an instance of ``Classifier`` in Python. What happens is
that there is an additional C++ class ``TClassifierPython``, which poses as
Python's class ``Classifier``. So the Python class ``Classifier`` is not defined
as ``ABSTRACT`` or ``BASED_ON`` but using the ``Classifier_new`` function, as
described below.


``C_NAMED(EnumVariable, Variable, "([name=, values=, autoValues=, distributed=, getValueFrom=])")``
    ``EnumVariable`` is derived from ``Variable``. Pyxtract will also create a constructor which will accept the object's name as an optional argument. The third argument is a string that describes the constructor, e.g. gives a list of arguments. IDEs for Python, such as PythonWin, will show this string in a balloon help while the programmer is typing.

``C_UNNAMED(RandomGenerator, Orange, "() -> RandomGenerator")``
    This is similar to ``C_NAMED``, except that the constructor accepts no name. This form is rather rare since all Orange objects can be named.

``C_CALL(BayesLearner, Learner, "([examples], [weight=, estimate=] -/-> Classifier")``
    ``BayesLearner`` is derived from ``Learner``. It will have a peculiar constructor. It will, as usual, first construct an instance of ``BayesLearner``. If no arguments are given (except for, possibly, keyword arguments), it will return the constructed instance. Otherwise, it will call the ``Learner``'s call operator and return its result instead of ``BayesLearner``.

``C_CALL3(MakeRandomIndices2, MakeRandomIndices2, MakeRandomIndices, "[n | gen [, p0]], [p0=, stratified=, randseed=] -/-> [int]")``
    ``MakeRandomIndices2`` is derived from ``MakeRandomIndices`` (the third argument). In contrast to the ``C_CALL`` above, the corresponding constructor won't call the call operator of ``MakeRandomIndices``, but the call operator of ``MakeRandomIndices2`` (the second argument). This constructor is often used when the parent class doesn't provide a suitable call operator.

``HIDDEN(TreeStopCriteria_Python, TreeStopCriteria)``
    ``TreeStopCriteria_Python`` is derived from ``TreeStopCriteria``, but we would like to hide this class from the user. We use this definition when it is elegant for us to have some intermediate class or a class that implements some specific functionality, but we don't want to bother the user with it. The class is not completely hidden - the user can reach it through the ``type`` operator on an instance of it. This is thus very similar to ``BASED_ON``.

``DATASTRUCTURE(Orange, TPyOrange, orange_dict)``
    This is for the base classes. ``Orange`` has no parent class. The C++ structure that stores it is ``TPyOrange``; ``TPyOrange`` is essentially ``PyObject`` (again, the structure always has to be based on ``PyObject``) but with several additional fields, among them a pointer to an instance of ``TOrange`` (the C++ base class for all of Orange's classes). ``orange_dict`` is the name of the ``TPyOrange`` field that points to a Python dictionary; when you have an instance ``bayesClassifier`` and you type, in Python, ``bayesClassifier.someMyData=15``, this gets stored in ``orange_dict``. The actual mechanism behind this is rather complicated and you most probably won't need to use it. If you happen to need to define a class with ``DATASTRUCTURE``, you can simply give a 0 in place of the last argument.

Even if the class is defined by ``DATASTRUCTURE``, you can still specify a
different constructor, most probably the last form of it (the ``_new``
function). In this case, specify the keyword ``ROOT`` as the parent and pyxtract
will understand that this is the base class.

Object construction in Python is divided between two methods. The constructors
we discussed above construct the essential part of the object - they allocate
the necessary memory and initialize the fields far enough that the object is
valid to enter garbage collection. The second part is handled by the
``init`` method. It is, however, not forbidden to organize things so that
``new`` does the whole job. This is also the case in Orange. The only task left
for ``init`` is to set any attributes that the user gave as keyword arguments to
the constructor.

For instance, Python's statement
``orange.EnumVariable("a", values=["a", "b", "c"])`` is executed so that ``new``
constructs the variable and gives it the name, while ``init`` sets the
``values`` field.

The ``new`` operator can also accept keyword arguments. For
instance, when constructing an ``ExampleTable`` by reading the data from a file,
you can specify a domain (using the keyword argument ``domain``) or a list of
attributes to reuse if possible (``use``), and you can tell it not to reuse the
stored domain or not to store the newly constructed domain (``dontCheckStored``,
``dontStore``). After the ``ExampleTable`` is constructed, ``init`` is called to
set the attributes. To tell it to ignore the keyword arguments that the
constructor might have used, we write the following. ::

    CONSTRUCTOR_KEYWORDS(ExampleTable, "domain use useMetas dontCheckStored dontStore filterMetas")

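The effect of ``CONSTRUCTOR_KEYWORDS`` can be pictured as a filter on the keyword arguments before ``init`` turns them into attributes. The sketch below illustrates that idea in plain C++ under simplifying assumptions: keyword arguments are modelled as a map from names to strings, and ``splitKeywords`` and ``attributesForInit`` are hypothetical helpers, not functions from Orange.

```cpp
#include <cassert>
#include <map>
#include <set>
#include <sstream>
#include <string>

// Split a space-separated keyword list, as given to CONSTRUCTOR_KEYWORDS,
// into a set of names.
std::set<std::string> splitKeywords(const std::string &keywords)
{
    std::set<std::string> result;
    std::istringstream in(keywords);
    for (std::string word; in >> word; )
        result.insert(word);
    return result;
}

// Return the keyword arguments that are left for init to set as attributes,
// i.e. those that are not reserved for the constructor.
std::map<std::string, std::string> attributesForInit(
    const std::map<std::string, std::string> &kwargs,
    const std::set<std::string> &constructorKeywords)
{
    std::map<std::string, std::string> remaining;
    for (const auto &kw : kwargs)
        if (constructorKeywords.count(kw.first) == 0)
            remaining.insert(kw);
    return remaining;
}
```

For a call that passes ``domain``, ``dontStore`` and ``name``, only ``name`` would remain to be set as an attribute in this sketch.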
There's another macro related to attributes. Let ``ba`` be an Orange object, say
an instance of ``orange.BayesLearner``. If you assign new attributes directly,
as usual, e.g. ``ba.myAttribute = 12``, you will get a warning (you should use
the object's method ``setattr(name, value)`` to avoid it). Some objects have
attributes that cannot be implemented in C++ code, yet they are common and
useful. For instance, ``Graph`` can use attributes ``objects``, ``forceMapping``
and ``returnIndices``, which can only be set from Python (if you take a look at
the documentation on ``Graph`` you will see why these cannot be implemented in
C++). Yet, since users are allowed to set these attributes and will do so often,
we don't want to give warnings. We achieve this by ::

    RECOGNIZED_ATTRIBUTES(Graph, "objects forceMapping returnIndices")


Special methods
===============

Special methods act as the class built-in methods. They define what the type can
do: if it, for instance, supports multiplication, it should define the operator
that gets the object itself and another object and returns the product (or
raises an exception). If it allows for indexing, it defines an operator that
gets the object itself and the index, and returns the element. These operators
are low-level; most can be called from Python scripts but they are also called
internally by Python. For instance, if ``table`` is an ``ExampleTable``, then
``for e in table:`` or ``reduce(f, table)`` will both work by calling the
indexing operator for each of the table's elements.
For more details, consult the Python manual, chapter "Extending and
Embedding the Python Interpreter", section "Defining New Types".

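The way a loop can be driven by the indexing operator alone may be easier to see in a small model. The sketch below mimics, in plain C++, how Python iterates over a sequence that defines indexing but no iterator: it asks for items 0, 1, 2, ... until the indexing function reports an out-of-range index. ``collectByIndexing`` is a hypothetical name, and ``std::out_of_range`` stands in for Python's ``IndexError``.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <stdexcept>
#include <vector>

// Visit items 0, 1, 2, ... through an indexing function until it reports
// an out-of-range index, and return everything that was visited.
std::vector<int> collectByIndexing(
    const std::function<int(std::size_t)> &getitem)
{
    std::vector<int> items;
    for (std::size_t idx = 0; ; ++idx) {
        try {
            items.push_back(getitem(idx));
        }
        catch (const std::out_of_range &) {
            break;  // plays the role of IndexError: the sequence has ended
        }
    }
    return items;
}
```

This is why an indexing function must signal an invalid index properly instead of returning an arbitrary element: the error is what terminates the loop.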
To define a method for an Orange class, you need to define a function named
``<classname>_<methodname>``; the function should return either
``PyObject *``, ``int`` or ``void``. The function's head has to be written on a
single line. Regarding the arguments and the result, it should conform to
Python's specifications. Pyxtract will detect the methods and set the pointers
in ``PyTypeObject`` correspondingly.

Here is a list of methods: the left column gives the method name that
triggers pyxtract (these names generally correspond to special method names of
Python classes as a programmer in Python sees them) and the second is the
name of the field in ``PyTypeObject`` or its subordinate structures. See the
Python documentation for a description of the functions' arguments and results.
Not all methods can be directly defined; for those that can't, it is because we
either use an alternative method (e.g. ``setattro`` instead of ``setattr``) or
pyxtract gets or computes the data for this field in some other way.

General methods
---------------

+--------------+-----------------------+-----------------------------------------------------------+
| pyxtract     | PyTypeObject          |                                                           |
+==============+=======================+===========================================================+
| ``dealloc``  | ``tp_dealloc``        | Frees the memory occupied by the object. You will need to |
|              |                       | define this for the classes with a new ``DATASTRUCTURE``; |
|              |                       | if you only derive a class from some Orange class, this   |
|              |                       | has been taken care of. If you have a brand new object,   |
|              |                       | copy the code of one of Orange's deallocators.            |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_getattr``        | Can't be redefined since we use ``tp_getattro`` instead.  |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_setattr``        | Can't be redefined since we use ``tp_setattro`` instead.  |
+--------------+-----------------------+-----------------------------------------------------------+
| ``cmp``      | ``tp_compare``        |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``repr``     | ``tp_repr``           |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_as_number``      | (pyxtract will initialize this field if you give any of   |
|              |                       | the methods from the number protocol; you needn't care    |
|              |                       | about this field)                                         |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_as_sequence``    | (pyxtract will initialize this field if you give any of   |
|              |                       | the methods from the sequence protocol)                   |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_as_mapping``     | (pyxtract will initialize this field if you give any of   |
|              |                       | the methods from the mapping protocol)                    |
+--------------+-----------------------+-----------------------------------------------------------+
| ``hash``     | ``tp_hash``           | Class ``Orange`` computes a hash value from the pointer;  |
|              |                       | you don't need to overload it if your object inherits the |
|              |                       | function. If you write an independent class, just copy the|
|              |                       | code.                                                     |
+--------------+-----------------------+-----------------------------------------------------------+
| ``call``     | ``tp_call``           |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``str``      | ``tp_str``            |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``getattr``  | ``tp_getattro``       |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``setattr``  | ``tp_setattro``       |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_as_buffer``      | Pyxtract doesn't support the buffer protocol.             |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_flags``          | Flags are set by pyxtract.                                |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_doc``            | Documentation is read from the constructor definition     |
|              |                       | (see above).                                              |
+--------------+-----------------------+-----------------------------------------------------------+
| ``traverse`` | ``tp_traverse``       | Traverse is tricky (as is garbage collection in general). |
|              |                       | There's something on it in a comment in root.hpp; besides |
|              |                       | that, study the examples. In general, if a wrapped member |
|              |                       | is exported to Python (just as, for instance,             |
|              |                       | ``Classifier`` contains a ``Variable`` named              |
|              |                       | ``classVar``), you don't need to care about it. You should|
|              |                       | manually take care of any wrapped objects not exported to |
|              |                       | Python. You probably won't come across such cases.        |
+--------------+-----------------------+-----------------------------------------------------------+
| ``clear``    | ``tp_clear``          |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``richcmp``  | ``tp_richcompare``    |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_weaklistoffset`` |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``iter``     | ``tp_iter``           |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``iternext`` | ``tp_iternext``       |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_methods``        | Set by pyxtract if any methods are given.                 |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_members``        |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_getset``         | Pyxtract initializes this by a pointer to manually        |
|              |                       | written getters/setters (see below).                      |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_base``           | Set by pyxtract to a class specified in constructor       |
|              |                       | (see above).                                              |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_dict``           | Used for class constants (eg. ``Classifier.GetBoth``)     |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_descr_get``      |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_descr_set``      |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_dictoffset``     | Set by pyxtract to the field given in ``DATASTRUCTURE``   |
|              |                       | (if there is any).                                        |
+--------------+-----------------------+-----------------------------------------------------------+
| ``init``     | ``tp_init``           |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_alloc``          | Set to ``PyType_GenericAlloc``                            |
+--------------+-----------------------+-----------------------------------------------------------+
| ``new``      | ``tp_new``            |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_free``           | Set to ``_PyObject_GC_Del``                               |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_is_gc``          |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_bases``          |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_mro``            |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_cache``          |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_subclasses``     |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+
| ``.``        | ``tp_weaklist``       |                                                           |
+--------------+-----------------------+-----------------------------------------------------------+

Numeric protocol
----------------

+------------+------------------+-------------+-----------------+------------+---------------+-----------+--------------+
| ``add``    | ``nb_add``       | ``pow``     | ``nb_power``    | ``lshift`` | ``nb_lshift`` | ``int``   | ``nb_int``   |
+------------+------------------+-------------+-----------------+------------+---------------+-----------+--------------+
| ``sub``    | ``nb_subtract``  | ``neg``     | ``nb_negative`` | ``rshift`` | ``nb_rshift`` | ``long``  | ``nb_long``  |
+------------+------------------+-------------+-----------------+------------+---------------+-----------+--------------+
| ``mul``    | ``nb_multiply``  | ``pos``     | ``nb_positive`` | ``and``    | ``nb_and``    | ``float`` | ``nb_float`` |
+------------+------------------+-------------+-----------------+------------+---------------+-----------+--------------+
| ``div``    | ``nb_divide``    | ``abs``     | ``nb_absolute`` | ``or``     | ``nb_or``     | ``oct``   | ``nb_oct``   |
+------------+------------------+-------------+-----------------+------------+---------------+-----------+--------------+
| ``mod``    | ``nb_remainder`` | ``nonzero`` | ``nb_nonzero``  | ``coerce`` | ``nb_coerce`` | ``hex``   | ``nb_hex``   |
+------------+------------------+-------------+-----------------+------------+---------------+-----------+--------------+
| ``divmod`` | ``nb_divmod``    | ``inv``     | ``nb_invert``   |            |               |           |              |
+------------+------------------+-------------+-----------------+------------+---------------+-----------+--------------+

Sequence protocol
-----------------

+----------------+---------------+----------------+------------------+
| ``len_sq``     | ``sq_length`` | ``getslice``   | ``sq_slice``     |
+----------------+---------------+----------------+------------------+
| ``concat``     | ``sq_concat`` | ``setitem_sq`` | ``sq_ass_item``  |
+----------------+---------------+----------------+------------------+
| ``repeat``     | ``sq_repeat`` | ``setslice``   | ``sq_ass_slice`` |
+----------------+---------------+----------------+------------------+
| ``getitem_sq`` | ``sq_item``   | ``contains``   | ``sq_contains``  |
+----------------+---------------+----------------+------------------+

Mapping protocol
----------------

+-------------+----------------------+
| ``len``     | ``mp_length``        |
+-------------+----------------------+
| ``getitem`` | ``mp_subscript``     |
+-------------+----------------------+
| ``setitem`` | ``mp_ass_subscript`` |
+-------------+----------------------+

[11587]432For example, here is what gets called when you want to know the length of an
433example table. ::
[11577]434
435    int ExampleTable_len_sq(PyObject *self)
436    {
437        PyTRY
438            return SELF_AS(TExampleGenerator).numberOfExamples();
439        PyCATCH_1
440    }
441
[11587]442``PyTRY`` and ``PyCATCH`` take care of C++ exceptions. ``SELF_AS`` is a macro
443for casting, ie unwrapping the points (this is an alternative to ``CAST_TO``).
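To make the role of such macros concrete, here is a minimal sketch of the same idea without Python or Orange. The names ``SKETCH_TRY``, ``SKETCH_CATCH_INT``, ``last_error``, ``Table`` and ``Table_len`` are hypothetical stand-ins, not Orange's definitions. The sketch assumes that the ``int`` variant signals failure by returning -1, and it stores the message in a string where the real macros would presumably report the error to the Python interpreter.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the interpreter's error indicator.
static std::string last_error;

// Opens a try block; the matching catch macro closes it.
#define SKETCH_TRY try {

// Closes the try block and turns any C++ exception into the error code -1.
#define SKETCH_CATCH_INT                          \
    }                                             \
    catch (const std::exception &err) {           \
        last_error = err.what();                  \
        return -1;                                \
    }                                             \
    catch (...) {                                 \
        last_error = "unknown C++ exception";     \
        return -1;                                \
    }

// Hypothetical stand-in for a wrapped C++ object.
struct Table {
    std::vector<int> rows;
    bool valid;

    int numberOfExamples() const
    {
        if (!valid)
            throw std::runtime_error("table is not initialized");
        return static_cast<int>(rows.size());
    }
};

// Interface function: C++ exceptions never escape to the caller.
int Table_len(const Table &self)
{
    SKETCH_TRY
        return self.numberOfExamples();
    SKETCH_CATCH_INT
}
```

The point of wrapping every interface function this way is that a C caller, such as the Python interpreter, cannot handle a C++ exception, so it has to be converted into an error return value at the boundary.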


Getting and Setting Class Attributes
====================================

Exporting of most C++ class fields is already taken care of by the lists that
are compiled by pyprops. There are only a few cases in the entire Orange where
we needed to manually write specific handlers for setting and getting the
attributes. This needs to be done if setting needs some special processing or
when simulating an attribute that does not exist in the underlying C++ class.

An example of this is the class ``HierarchicalCluster``. It contains results of
a general, not necessarily binary clustering, so each node in the tree has a
list ``branches`` with all the node's children. Yet, as the usual clustering is
binary, it would be nice if the node also supported attributes ``left`` and
``right``. They are not present in C++, but we can write a function that checks
the number of branches: if there are none, it returns ``None``; if there are
more than two, it complains; otherwise it returns the first branch. ::

    PyObject *HierarchicalCluster_get_left(PyObject *self)
    {
        PyTRY
            CAST_TO(THierarchicalCluster, cluster);

            // no branches: the attribute is None
            if (!cluster->branches)
                RETURN_NONE

            // more than two branches: 'left' makes no sense
            if (cluster->branches->size() > 2)
                PYERROR(PyExc_AttributeError,
                        "'left' not defined (cluster has more than two subclusters)",
                        NULL);

            return WrapOrange(cluster->branches->front());
        PyCATCH
    }

As you can see from the example, the function needs to accept a ``PyObject *``
(the object itself, ``self``) and return a ``PyObject *`` (the attribute value).
The function name needs to be ``<classname>_get_<attributename>``.
Setting an attribute is similar: the function name should be
``<classname>_set_<attributename>``, and the function should accept two Python
objects (the object and the attribute value) and return an ``int``, where 0
signifies success and -1 a failure.

If you define only one of the two handlers, you'll get a read-only or write-only
attribute.
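The conventions above can be sketched without Python as follows. Everything in this sketch (``Cluster``, ``Value``, ``AttrHandlers``, ``cluster_attrs``, ``set_attr`` and the handler functions) is a hypothetical stand-in rather than Orange's API. Unlike the real getters, which return a ``PyObject *``, the getters here write their result through a reference and return a status code; the sketch only illustrates the 0 / -1 convention and how a missing setter yields a read-only attribute.

```cpp
#include <cstring>
#include <vector>

// Hypothetical stand-in for the wrapped C++ class.
struct Cluster {
    std::vector<long> branches;  // ids of the child clusters
    long height;
};

// Attribute value; is_none plays the role of Python's None.
struct Value {
    bool is_none;
    long number;
};

// In this sketch both kinds of handler return 0 on success and -1 on failure.
typedef int (*Getter)(const Cluster &self, Value &result);
typedef int (*Setter)(Cluster &self, long value);

// Simulated attribute: the first branch of a cluster with at most two branches.
int Cluster_get_left(const Cluster &self, Value &result)
{
    if (self.branches.empty()) {
        result = Value{true, 0};
        return 0;
    }
    if (self.branches.size() > 2)
        return -1;  // 'left' is not defined for such clusters
    result = Value{false, self.branches.front()};
    return 0;
}

int Cluster_get_height(const Cluster &self, Value &result)
{
    result = Value{false, self.height};
    return 0;
}

// Setting needs special processing here: negative heights are rejected.
int Cluster_set_height(Cluster &self, long value)
{
    if (value < 0)
        return -1;
    self.height = value;
    return 0;
}

struct AttrHandlers {
    const char *name;
    Getter get;
    Setter set;
};

// "left" has no setter, which makes it read-only.
static const AttrHandlers cluster_attrs[] = {
    {"left", Cluster_get_left, nullptr},
    {"height", Cluster_get_height, Cluster_set_height},
};

// Dispatches an assignment to the setter registered under the given name.
int set_attr(Cluster &self, const char *name, long value)
{
    for (const AttrHandlers &handlers : cluster_attrs)
        if (std::strcmp(handlers.name, name) == 0)
            return handlers.set ? handlers.set(self, value) : -1;
    return -1;  // unknown attribute
}
```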


Member functions
================

We have already shown an example of a member function: the ``ExampleTable``'s
method ``sort``. The general template is
``PyObject *<classname>_<methodname>(<arguments>) PYARGS(<arguments-keyword>, <documentation-string>)``.
In the case of the ``ExampleTable``'s ``sort``, it looks like this. ::

    PyObject *ExampleTable_sort(PyObject *self, PyObject *args) PYARGS(METH_VARARGS, "() -> None")

The arguments keyword can be any of the usual Python constants stating the
number and the kind of arguments, such as ``METH_VARARGS`` or ``METH_O``. This
constant gets copied to the corresponding list of methods (see the Python
documentation for ``PyMethodDef``).


Class constants
===============

Orange classes, as seen from Python, can also have constants, such as
``orange.Classifier.GetBoth``. Classifier's ``GetBoth`` is visible as a member
of the class, the derived classes and all their instances (e.g.
``BayesClassifier.GetBoth`` and ``bayes.GetBoth``).

There are several ways to define such constants. If they are simple integers or
floats, you can use ``PYCLASSCONSTANT_INT`` or ``PYCLASSCONSTANT_FLOAT``, like
in ::

    PYCLASSCONSTANT_INT(Classifier, GetBoth, 2)

You can also use the enums from the class, like ::

    PYCLASSCONSTANT_INT(C45TreeNode, Leaf, TC45TreeNode::Leaf)

Pyxtract will convert the given constant to a Python object (using
``PyInt_FromLong`` or ``PyFloat_FromDouble``).

When the constant is an object of some other type, use ``PYCLASSCONSTANT``. In
this form (not used in Orange so far), the third argument can be either an
instance of ``PyObject *`` or a function call. In either case, the object or
function must be known at the point where the pyxtract generated file is
included.


Pickling
========

Pickling is taken care of automatically if the class provides a Python
constructor that can construct the object without arguments (it may *accept*
arguments, but should be able to do without them). If there is no such
constructor, the class should provide a ``__reduce__`` method or it should
explicitly declare that it cannot be pickled. If it doesn't, pyxtract will issue
a warning that the class will not be picklable.

Here are the rules:

* Classes that provide a ``__reduce__`` method (details follow below) are
  pickled through that method.

* Class ``Orange``, the base class, already provides a ``__reduce__`` method,
  which is only useful if the constructor accepts empty arguments. So, if the
  constructor is declared as ``C_NAMED``, ``C_UNNAMED``, ``C_CALL`` or
  ``C_CALL3``, the class will be picklable. See the warning below.

* If the constructor is defined by a ``_new`` method, and the ``BASED_ON``
  definition is followed by ``ALLOWS_EMPTY``, this signifies that it accepts
  empty arguments, so the class will be picklable just as in the above point.
  For example, the constructor for the class ``DefaultClassifier`` is defined
  like this ::

    PyObject *DefaultClassifier_new(PyTypeObject *tpe, PyObject *args)
        BASED_ON(Classifier, "([defaultVal])") ALLOWS_EMPTY

  and is picklable through ``Orange.__reduce__``. But again, see the warning
  below.

* If the constructor is defined as ``ABSTRACT``, there cannot be any instances
  of this class, so pyxtract will give no warning that it is not picklable.

* The class can be explicitly defined as not picklable by the ``NO_PICKLE``
  macro, as in ::

    NO_PICKLE(TabDelimExampleGenerator)

  Such classes won't be picklable even if they define the appropriate
  constructors. This effectively defines a ``__reduce__`` method which raises an
  exception; if you manually provide a ``__reduce__`` method for such a class,
  pyxtract will detect that the method is multiply defined.

* If there are no suitable constructors, no ``__reduce__`` method and no
  ``ABSTRACT`` or ``NO_PICKLE`` flag, pyxtract gives a warning about that.

When the constructor is used, as in the second and third rule above, pickling
will only work if all fields of the C++ class can be set "manually" from Python,
are set through the constructor, or are set when assigning other fields. In
other words, if there are fields that are not marked as ``//P`` for pyprops, you
will most probably need to manually define a ``__reduce__`` method, as in the
first rule.

The details of what the ``__reduce__`` method must do are described in the
Python documentation. In our circumstances, it can be implemented in two ways
which differ in what function is used for unpickling: it can either use the
class's constructor or we can define a special function for unpickling.

The former usually happens when the class has a read-only property (``//PR``),
which is set by the constructor. For instance, ``AssociationRule`` has read-only
fields ``left`` and ``right``, which need to be given to the constructor.
This is the ``__reduce__`` method for the class. ::

    PyObject *AssociationRule__reduce__(PyObject *self)
    {
        PyTRY
            CAST_TO(TAssociationRule, arule);
            // (unpickling function, (its arguments), dictionary)
            return Py_BuildValue("O(NN)N", self->ob_type,
                                       Example_FromWrappedExample(arule->left),
                                       Example_FromWrappedExample(arule->right),
                                       packOrangeDictionary(self));
        PyCATCH
    }

As described in the Python documentation, ``__reduce__`` should return a
tuple in which the first element is the function that will do the unpickling,
and the second element is a tuple of arguments for that function. Our unpickling
function is simply the class's type (calling a type corresponds to calling a
constructor) and the arguments for the constructor are the left- and right-hand
side of the rule. The third element of the tuple is the object's dictionary.

When unpickling is more complicated (usually when the class has no constructor
and contains fields of type ``float *`` or similar) we need a special
unpickling function. The function needs to be directly in the module's namespace
(it cannot be a static method of a class), so we name such functions
``__pickleLoader<classname>``. Search for examples of such functions in
the source code; note that the instance's true class needs to be pickled, too.
Also, check how we use ``TCharBuffer`` throughout the code to store and pickle
binary data as Python strings.

Be careful when manually writing the unpickler: if a C++ class derived from that
class inherits its ``__reduce__``, the corresponding unpickler will construct an
instance of the wrong class (unless the unpickler works through Python's
constructor, ``ob_type->tp_new``). Hence, classes derived from a class which
defines an unpickler have to define their own ``__reduce__``, too.
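This pitfall can be reproduced in a few lines of plain C++. In the sketch below, ``Base``, ``Derived``, ``FixedDerived``, ``Reduced``, ``load_base``, ``load_fixed`` and ``round_trip`` are all hypothetical stand-ins: ``reduce()`` plays the role of ``__reduce__`` and the function pointer it returns plays the role of the hand-written unpickler.

```cpp
#include <memory>
#include <string>

struct Base;

// What "pickling" produces here: the unpickler to call and its argument.
struct Reduced {
    std::unique_ptr<Base> (*loader)(int);
    int value;
};

struct Base {
    int value;
    explicit Base(int v) : value(v) {}
    virtual ~Base() {}
    virtual std::string class_name() const { return "Base"; }
    // Plays the role of __reduce__; derived classes inherit it unless they
    // override it.
    virtual Reduced reduce() const;
};

// Hand-written unpickler: it always constructs a Base.
std::unique_ptr<Base> load_base(int v)
{
    return std::unique_ptr<Base>(new Base(v));
}

Reduced Base::reduce() const { return Reduced{&load_base, value}; }

// Inherits reduce(), so it is restored through load_base, as a Base.
struct Derived : Base {
    explicit Derived(int v) : Base(v) {}
    std::string class_name() const override { return "Derived"; }
};

std::unique_ptr<Base> load_fixed(int v);

// Defines its own reduce() with a matching unpickler.
struct FixedDerived : Base {
    explicit FixedDerived(int v) : Base(v) {}
    std::string class_name() const override { return "FixedDerived"; }
    Reduced reduce() const override { return Reduced{&load_fixed, value}; }
};

std::unique_ptr<Base> load_fixed(int v)
{
    return std::unique_ptr<Base>(new FixedDerived(v));
}

// "Pickles" an object and immediately "unpickles" it.
std::unique_ptr<Base> round_trip(const Base &obj)
{
    Reduced reduced = obj.reduce();
    return reduced.loader(reduced.value);
}
```

A ``Derived`` object survives the round trip with its data intact but comes back as a ``Base``, while ``FixedDerived`` comes back as itself because it supplies its own ``reduce()``.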

Non-member functions and constants
==================================

Non-member functions are defined in the same way as member functions except
that their names do not start with the class name. Here is how ``newmetaid``
is implemented ::

    PyObject *newmetaid(PyObject *, PyObject *) PYARGS(0,"() -> int")
    {
        PyTRY
            return PyInt_FromLong(getMetaID());
        PyCATCH
    }

Orange also defines some non-member constants. These are defined in a similar
fashion as the class constants.
``PYCONSTANT_INT(<constant-name>, <integer>)`` defines an integer
constant and ``PYCONSTANT_FLOAT`` would be used for a continuous one.
``PYCONSTANT`` is used for objects of other types, as shown by the example below
that defines an (obsolete) constant ``MeasureAttribute_splitGain``. ::

    PYCONSTANT(MeasureAttribute_splitGain, (PyObject *)&PyOrMeasureAttribute_gainRatio_Type)

Class constants from the previous section are put in a pyxtract generated file
that is included at the end of the file in which the constant definitions and
the corresponding classes are. Global constants are put in another file, far
away from their actual definitions. For this reason, ``PYCONSTANT`` cannot refer
to any functions (the above example is an exception: all class types are
declared in this same file and are thus available at the moment the above code
is used). Therefore, if the constant is defined by a function call, you need to
use another keyword, ``PYCONSTANTFUNC``::

    PYCONSTANTFUNC(globalRandom, stdRandomGenerator)

Pyxtract will generate code which will, prior to calling
``stdRandomGenerator``, declare it as a function with no arguments that returns
``PyObject *``. Of course, you will have to define the function somewhere in
your code, like this::

    PyObject *stdRandomGenerator()
    {
        return WrapOrange(globalRandom);
    }

Another example is ``VarTypes``. ``VarTypes`` is a tiny module inside Orange
that contains nothing but five constants, representing various attribute types.
From pyxtract's perspective, ``VarTypes`` is a constant. This is the complete
definition. ::

    PyObject *VarTypes()
    {
        PyObject *vartypes = PyModule_New("VarTypes");
        PyModule_AddIntConstant(vartypes, "None", (int)TValue::NONE);
        PyModule_AddIntConstant(vartypes, "Discrete", (int)TValue::INTVAR);
        PyModule_AddIntConstant(vartypes, "Continuous", (int)TValue::FLOATVAR);
        PyModule_AddIntConstant(vartypes, "Other", (int)TValue::FLOATVAR+1);
        PyModule_AddIntConstant(vartypes, "String", (int)STRINGVAR);
        return vartypes;
    }

    PYCONSTANTFUNC(VarTypes, VarTypes)

If you want to understand the constants completely, check Orange's pyxtract
generated file initialization.px.
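As a rough picture of why a forward declaration is all that ``PYCONSTANTFUNC`` needs, consider the sketch below. It is not pyxtract's actual output: ``Object``, ``ConstantEntry``, ``constant_functions`` and ``module_dict`` are hypothetical stand-ins, and the bodies of ``stdRandomGenerator`` and ``VarTypes`` are dummies. It only shows that a table of functions taking no arguments and returning an object pointer can be built before the functions themselves are defined.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for PyObject.
struct Object {
    std::string repr;
};

// Forward declarations only: each function takes no arguments and returns an
// object pointer, so nothing else needs to be known at this point.
Object *stdRandomGenerator();
Object *VarTypes();

struct ConstantEntry {
    const char *name;
    Object *(*make)();
};

// One entry per constant that is defined by a function call.
static const ConstantEntry constant_functions[] = {
    {"globalRandom", stdRandomGenerator},
    {"VarTypes", VarTypes},
};

// Hypothetical stand-in for the module's dictionary.
static std::map<std::string, Object *> module_dict;

// Called once at module initialization: calls every function and stores the
// result under the constant's name.
void addConstants()
{
    for (const ConstantEntry &entry : constant_functions)
        module_dict[entry.name] = entry.make();
}

// The definitions can live anywhere else in the program.
Object *stdRandomGenerator()
{
    static Object generator = {"<random generator>"};
    return &generator;
}

Object *VarTypes()
{
    static Object module = {"<module VarTypes>"};
    return &module;
}
```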

How does it all fit together
============================

We will finish the section with a description of the files generated by the two
scripts. Understanding these may be needed for debugging purposes.

File specific px files
----------------------

For each compiled cpp file, pyxtract creates a px file with the same name. The
file starts with externs declaring the base classes for the classes whose types
are defined later on. Then follow class type definitions:

* Method definitions (``PyMethodDef``). Nothing exotic here, just a table with
  the member functions that is pointed to by ``tp_methods`` of the
  ``PyTypeObject``.

* GetSet definitions (``PyGetSetDef``). Similar to methods, a list to be pointed
  to by ``tp_getset``, which includes the attributes for which special handlers
  were written.

* Definitions of doc strings for call operator and constructor.

* Constants. If the class has any constants, there will be a function named
  ``void <class-name>_addConstants()``. The function will create a class
  dictionary in the type's ``tp_dict``, if there is none yet. Then it will store
  the constants in it. The function is called at module initialization, from
  file initialization.px.

* Constructors. If the class uses generic constructors (i.e., if it's defined by
  ``C_UNNAMED``, ``C_NAMED``, ``C_CALL`` or ``C_CALL3``), they will need to call
  a default object constructor, like the one below for ``FloatVariable``.
  (This supposes the object is derived from ``TOrange``! We will need to get rid
  of this if we want pyxtract to be more general. Maybe an additional argument
  in ``DATASTRUCTURE``?) ::

    POrange FloatVariable_default_constructor(PyTypeObject *type)
    {
        return POrange(mlnew TFloatVariable(), type);
    }

  If the class is abstract, pyxtract defines a constructor that will call
  ``PyOrType_GenericAbstract``. ``PyOrType_GenericAbstract`` checks the type
  that the caller wishes to construct; if it is a type derived from this type,
  it permits it, otherwise it complains that the class is abstract.

* Aliases. A list of renamed attributes.

* ``PyTypeObject`` and the numeric, sequence and mapping protocols.
  ``PyTypeObject`` is named ``PyOr<classname>_Type_inh``.

* Definition of conversion functions. This is done by macro
  ``DEFINE_cc(<classname>)`` which defines
  ``int cc_<classname>(PyObject *obj, void *ptr)`` and
  ``int ccn_<classname>(PyObject *obj, void *ptr)``, functions that can
  be used in ``PyArg_ParseTuple`` for converting an argument (given as
  ``PyObject *``) to an instance of ``<classname>``. Nothing needs to be
  programmed for the conversion, it is just a
  cast: ``*(GCPtr< T##type > *)(ptr) = PyOrange_As##type(obj);``. The
  difference between ``cc`` and ``ccn`` is that the latter accepts null
  pointers.

* ``TOrangeType`` that (essentially) inherits ``PyTypeObject``. The new
  definition also includes the RTTI used for wrapping (this way Orange knows
  which C++ class corresponds to which Python class), a pointer to the default
  constructor (used by generic constructors), a pointer to the list of
  constructor keywords (``CONSTRUCTOR_KEYWORDS``, keyword arguments that should
  be ignored in a later call to ``init``) and recognized attributes
  (``RECOGNIZED_ATTRIBUTES``, attributes that don't yield warnings when set), a
  list of aliases, and pointers to ``cc_`` and ``ccn_`` functions. The latter
  are not used by Orange, since it can call the converters directly. They are
  here because ``TOrangeType`` is exported in a DLL while ``cc_`` and ``ccn_``
  are not (for the sake of limiting the number of exported symbols).
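The converter functions from the list above follow the convention of ``PyArg_ParseTuple``'s ``O&`` format: the converter receives the object and a ``void *`` pointing to the destination, and returns 1 on success and 0 on failure. The sketch below mimics that convention with hypothetical stand-in types (``Object``, ``Domain``, ``DomainPtr``) instead of ``PyObject`` and ``GCPtr``. It assumes that "accepts null pointers" means that ``None`` is converted to an empty pointer, which is our reading of the text rather than something taken from the generated code.

```cpp
#include <memory>

// Hypothetical stand-ins: a wrapped C++ class and a smart pointer to it.
struct Domain {
    int n_attributes;
};
typedef std::shared_ptr<Domain> DomainPtr;

// Hypothetical stand-in for a Python object.
struct Object {
    bool is_none;       // plays the role of Python's None
    DomainPtr wrapped;  // empty if the object does not wrap a Domain
};

// Converter that rejects None: returns 1 on success and 0 on failure.
int cc_Domain(Object *obj, void *ptr)
{
    if (!obj || obj->is_none || !obj->wrapped)
        return 0;
    *static_cast<DomainPtr *>(ptr) = obj->wrapped;
    return 1;
}

// Converter that accepts None and stores an empty pointer for it.
int ccn_Domain(Object *obj, void *ptr)
{
    if (obj && obj->is_none) {
        static_cast<DomainPtr *>(ptr)->reset();
        return 1;
    }
    return cc_Domain(obj, ptr);
}
```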


initialization.px
-----------------

Initialization.px contains the module-level definitions.

First comes a list of all ``TOrangeType`` structures. The list is used for
checking whether some Python object is of Orange's type or derived from one, and
for finding a Python class corresponding to a C++ class (based on C++'s RTTI).
Orange also exports the list as ``orange._orangeClasses``; this is a
``PyCObject`` so it can only be used by other Python extensions written in C.
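Finding the Python class that corresponds to a C++ object through RTTI can be pictured as a lookup keyed by the object's dynamic type. The sketch below is only an illustration of that idea: ``TypeRecord``, ``registry`` and ``find_type`` are hypothetical names, the three classes are empty stand-ins for Orange's classes, and a hash table is used here for brevity where the text describes a list.

```cpp
#include <string>
#include <typeindex>
#include <typeinfo>
#include <unordered_map>

// Hypothetical stand-ins for wrapped C++ classes; the virtual destructor makes
// them polymorphic, so typeid reports the dynamic type.
struct TOrange { virtual ~TOrange() {} };
struct TClassifier : TOrange {};
struct TDefaultClassifier : TClassifier {};

// Hypothetical record describing the Python-side class.
struct TypeRecord {
    std::string python_name;
};

// One entry per exported class, keyed by the C++ type.
static const std::unordered_map<std::type_index, TypeRecord> registry = {
    {std::type_index(typeid(TOrange)), {"Orange"}},
    {std::type_index(typeid(TClassifier)), {"Classifier"}},
    {std::type_index(typeid(TDefaultClassifier)), {"DefaultClassifier"}},
};

// Returns the record for the object's dynamic type, or nullptr if the class
// was never registered.
const TypeRecord *find_type(const TOrange &obj)
{
    auto it = registry.find(std::type_index(typeid(obj)));
    return it == registry.end() ? nullptr : &it->second;
}
```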

Then come declarations of all non-member functions, followed by a
``PyMethodDef`` structure with them.

Finally, there are declarations of functions that return manually constructed
constants (e.g. ``VarTypes``) and declarations of functions that add class
constants (e.g. ``Classifier_addConstants``). The latter functions were
generated by pyxtract and reside in the individual px files. Then follows a
function, ``addConstants``, that calls all the constant related functions
declared above. This function also adds all class types to the Orange module.

The main module now only needs to call ``addConstants``.

externs.px
----------

Externs.px declares symbols for all Orange classes, for instance ::

    extern ORANGE_API TOrangeType PyOrDomain_Type;
    #define PyOrDomain_Check(op) PyObject_TypeCheck(op, (PyTypeObject *)&PyOrDomain_Type)
    int cc_Domain(PyObject *, void *);
    int ccn_Domain(PyObject *, void *);
    #define PyOrange_AsDomain(op) (GCPtr< TDomain >(PyOrange_AS_Orange(op)))

**************************
What and where to include?
**************************

As already mentioned, ppp files should be included at the beginning of the
corresponding cpp files, instead of the hpp file. For instance, domain.ppp is
included in domain.cpp. Each ppp should be compiled only once; all other files
needing the definition of ``TDomain`` should still include domain.hpp as usual.

File-specific px files are included in the corresponding cpp files.
lib_kernel.px is included at the end of lib_kernel.cpp, from which it was
generated. initialization.px should preferably be included in the file that
initializes the module (function ``initorange`` needs to call ``addConstants``,
which is declared in initialization.px). These px files contain definitions and
must be compiled only once. externs.px contains declarations and can be included
wherever needed.

For Microsoft Visual Studio, create a new, blank workspace. Specify the
directory with orange sources as "Location". Add a new project of type "Win 32
Dynamic-Link Library"; change the
location back to ``d:\ai\orange\source``. Make it an empty DLL project.

Whatever names you give your module, make sure that the .cpp and .hpp files you
create as you go on are in ``orange\source\something`` (replace "something" with
a name of your choice), since the further instructions will suppose it.