Ticket #1150 (new wish)

Opened 3 years ago

Last modified 3 years ago

Multi-dimensional class variables

Reported by: anze Owned by:
Milestone: 3.0 Component: library
Severity: minor Keywords:
Cc: janez, lanz, matija Blocking:
Blocked By:

Description

An Orange 2.5 domain can contain a single class variable (class_var) and/or multiple class variables (class_vars). It is possible (and common) to construct a domain that has class_var set and class_vars empty, or vice versa. In Orange 3.0, we should merge the concepts of single-class and multi-class domains, since both represent the dependent variables.

  • If a domain contains one or more class variables, class_vars should contain all of them. class_var should point to the first variable in class_vars.
  • data.Instance get_class should return the value of the first dependent variable or raise an exception if the domain is classless.
  • Indexing instances with index -1 should raise a deprecation warning. If the domain has a single class, it should return its value. If the domain has more than one class, it should raise an error.
  • to_numpy should default to "am/w". Most of the methods that use it should use as_numpy anyway.
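
The first three points could be sketched roughly as below. This is only an illustration of the proposed semantics, not Orange code: the Domain and Instance classes here are toy stand-ins, the exception types (TypeError, IndexError) are assumptions, and so is the choice to let other indices address the feature values. The ticket also does not say what index -1 should do on a classless domain; the sketch raises an error in that case.

```python
import warnings


class Domain:
    """Toy stand-in for a domain (hypothetical, not Orange's class)."""

    def __init__(self, features, class_vars=()):
        self.features = list(features)
        # class_vars holds every dependent variable, even if there is only one.
        self.class_vars = list(class_vars)

    @property
    def class_var(self):
        # Points to the first class variable, or None for a classless domain.
        return self.class_vars[0] if self.class_vars else None


class Instance:
    """Toy stand-in for data.Instance (hypothetical)."""

    def __init__(self, domain, x, y=()):
        self.domain = domain
        self.x = list(x)  # values of the independent variables
        self.y = list(y)  # values of the dependent variables

    def get_class(self):
        # Value of the first dependent variable; error if classless.
        if not self.domain.class_vars:
            raise TypeError("domain is classless")  # exception type is assumed
        return self.y[0]

    def __getitem__(self, index):
        if index == -1:
            warnings.warn(
                "indexing with -1 is deprecated; use get_class()",
                DeprecationWarning,
                stacklevel=2,
            )
            # Only unambiguous when the domain has exactly one class.
            if len(self.domain.class_vars) != 1:
                raise IndexError("index -1 requires a single-class domain")
            return self.y[0]
        # Assumption: any other index addresses the feature values.
        return self.x[index]
```

With a single-class domain, inst[-1] and inst.get_class() return the same value; with two or more classes, only get_class() keeps working.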

Change History

comment:1 Changed 3 years ago by lanz

I very much agree with the first two points.

For the third, I thought we were going to break compatibility with orange3? In that case, no deprecation warning, just do what we think is right. IMHO the right thing is one of two options. Either an instance is considered as [independent vars] + [dependent vars] and indexing refers to just the first vector (in this case -1 is the last feature), or it is the concatenation of features and labels, in which case -1 is the last class (this might be more practical).

Of course, whichever option is chosen, -2, -3, ... are also legal indexing options, and meta attributes are handled differently (?). At least that is my view of orange3, where instances and tables behave as much as possible like (numpy) vectors and matrices.
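
To make the difference between the two indexing options concrete, here is a small sketch using plain Python lists. The function names and the sample values are made up for illustration; nothing here is Orange API.

```python
# Sample instance: three feature values and two class (label) values.
features = [5.1, 3.5, 1.4]
labels = [0, 2]


def index_features_only(features, labels, i):
    # Option 1: indexing refers only to the vector of independent variables,
    # so negative indices count back from the last feature.
    return features[i]


def index_concatenated(features, labels, i):
    # Option 2: the instance is the concatenation of features and labels,
    # so negative indices count back from the last class value.
    return (features + labels)[i]


print(index_features_only(features, labels, -1))  # last feature: 1.4
print(index_concatenated(features, labels, -1))   # last class: 2
print(index_concatenated(features, labels, -3))   # last feature: 1.4
```

Under the second option, reaching the last feature with a negative index requires knowing how many class variables the domain has.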

Don't know much about the last point but the part "should use as_numpy" sounds about right.
