Ticket #986 (closed wish: fixed)

Opened 3 years ago

Last modified 3 years ago

Orange.classification.tree check

Reported by: marko Owned by: matija
Milestone: 2.5 Component: other
Severity: minor Keywords:
Cc: matija Blocking:
Blocked By:

Description

Matija, could you provide some suggestions what I need to change in Orange.classification.tree documentation? Thanks.

Change History

comment:1 Changed 3 years ago by matija

Whoa, that's some long document. I understand, the module has to be thoroughly documented, but it's hard to read-proof such a long text. :)

Those are some things I noticed, but I sure didn't read it all, so many problems could remain unnoticed. If you'd like me to take another look at it, say so here. If you fix those problems and feel the documentation is OK now, close this ticket.

  • Typo: :class:SplitConstructor_Combined and :obj:C45Learner.commandline are written in the documentation (literally).
  • Why's there camelcase (worstAcceptable, treeSize, ...)?
  • "Don’t waste time on studying formatting tricks ( ‘s etc.)" – probably a typo, there should probably be %s?
  • I think it'd be better if the classes were explained later in the text, thus first providing all examples and other explanations. But that's not just a matter of moving thing higher up; some paragraphs would need to be rewritten. A simpler solution would be adding an additional paragraph just after the first example (iris with depth limit 3), explaining the learner(s) support pre- and post-pruning, different feature scoring, binarization, and node splitting algorithms. (Possibly this could be augmented with links to appropriate sections.)
  • There are (lots of) references to the (lower-case) orange module.
  • Typoed sentence: "Anyway, after doing this check that the built C4.5 gives the same results as the original." – probably, "the build script checks ..."

comment:2 Changed 3 years ago by matija

  • Status changed from new to assigned
  • Cc matija added
  • Owner changed from matija to marko

comment:3 Changed 3 years ago by marko

  • Owner changed from marko to matija

Could you please check the new version? :)

comment:4 Changed 3 years ago by matija

Major:

  • The fourth bulletpoint from my previous comment is still valid. I'm leaning towards the idea of simply adding a paragraph just after the first example, explaining what all can be configured and maybe explaining the difference between SplitConstructor, Splitter, Descender etc. This would then be duplicated in documentation, but because those explanations are short, I wouldn't mind reading them twice, each time in a slightly different form.
  • SimpleTreeLearner has to be pointed out in the introduction, or so at least I believe. It's probably what most people would end up using, had they been informed of its existence. (Probably noone will read that far down the documentation by himself, looking for a fast, simple tree...)
  • "The learning instances are copied to a table, unless store_instances is False and they already are in table." – copied to what table, already in which table?

Minor:

  • I had usually been writing :obj:`None` instead of just None; it's what I believe should be used in this document too.
  • "Decide how to divide learning instances." -> "Decide how to divide learning instances, ie. define branching criteria."
  • Some strings are not formatted as fixed-width, and also the quotes are "pretty-printed"; for example: “%V (%^.2m%)” and “%V”.
  • "it print x" -> "it prints x"
  • "[<val1>, <val2>, ...<valn>]" -> "[<val1>, <val2>, ..., <valn>]"
  • Orange.classification.tree.SplitConstructor_OneAgainstOthers is undocumented. :)
  • Missing ending bracket in the first sentence of the section "Splitters".
  • In C45Learner, it should be noted that it constructs C45Classifier.
  • In C45Classifier, you say that it is "a faithful reimplementation of Quinlan's function from C4.5" – which function? You mean a datastructure/class or what? (I don't know the original implementation of C4.5, so this may be a stupid question.)
  • In SimpleTreeLearner: "... and mse for regression.": "mse"->"MSE"

A suggestion for improving implementation (maybe): TreeLearner could figure out that its parameters are set equivalently to SimpleTreeLearner, and it could run that algorithm itself to speed things up. This should of course be reflected in the documentation.

comment:5 Changed 3 years ago by marko

  • Status changed from assigned to closed
  • Resolution set to fixed

In [12352]:

Classification tree documentation updates. Fixes #986.

Note: See TracTickets for help on using tickets.