Orange Forum • View topic - Problem with Orange Canvas

Postby Sam_Nav » Wed Mar 22, 2006 12:22

Hello everyone,

I am new to data-mining and I found Orange Canvas while searching the internet about data-miners. I found Orange Canvas to be an excellent data-mining platform for beginners. Keep the good work going!!

My problem: I started playing with Orange Canvas, trying to classify some classic datasets (e.g. the Iris dataset). I used a Classification Tree algorithm to classify the dataset, and everything went fine up to that point. Then I wanted to view the Classification Tree that I had just built, so I tried to add a "Classification Tree Viewer" after the Classification Tree. The problem is that I can't add the "Classification Tree Viewer". When I click on the Classification Tree Viewer in the widget tab, instead of the viewer being placed on the canvas, a new window opens with the following error:


Unhandled exception os type exceptions.AttributeError occured at 12:13:05:

Traceback:

File: orngDoc.py in line 254
Function Name: addWidget
Code: newwidget = orngCanvasItems.CanvasWidget(self.signalManager, self.canvas, self.canvasView, widget, self.canvasDlg.defaultPic, self.canvasDlg)
File: orngCanvasItems.py in line 352
Function name: __init__
Code: self.instance = eval(code)
Exception type: exceptions.AttributeError
Exception value: 'module' object has no attribute 'OWClassificationTreeViewer'
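
The last two traceback frames suggest that the canvas creates a widget by evaluating a string that names a class inside an already imported module. Below is a minimal Python 3 sketch of that failure mode. The module `fake_widgets`, the helper `make_widget` and the placeholder class are hypothetical stand-ins, not Orange code, and the real cause in Orange may differ.

```python
import types

# Hypothetical stand-in for a widget module that was shipped without one class.
fake_widgets = types.ModuleType("fake_widgets")


class OWClassificationTreeViewer2D:
    """Placeholder for a widget class that is present in the module."""


fake_widgets.OWClassificationTreeViewer2D = OWClassificationTreeViewer2D


def make_widget(class_name):
    """Instantiate a widget by evaluating 'module.ClassName()' given as a string."""
    code = "fake_widgets.%s()" % class_name
    return eval(code, {"fake_widgets": fake_widgets})


# The class exists in the module, so the lookup and the call both succeed.
viewer_2d = make_widget("OWClassificationTreeViewer2D")

# The class is missing, so the attribute lookup inside eval() fails.
try:
    make_widget("OWClassificationTreeViewer")
    error_text = None
except AttributeError as exc:
    error_text = str(exc)
```

If this is what happens, the error appears only when the widget is instantiated, which would fit the observation that every other widget keeps working.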



I really don't understand what the problem is. And it happens just when I click on the widget on the tab menu.

However, all the other widgets are working just fine. Even the "Classification Tree Viewer 2D" is working just perfect. Actually I don't understand the difference between these two viewers.

Any help in this matter would be highly appreciated.

Thank you very much again.

Sam.

Postby Blaz » Wed Mar 22, 2006 18:29

Classification Tree Viewer 2D, which has an unfortunate name that will change to Classification Tree Graph, is a graphical presentation of the classification tree, whereas Classification Tree Viewer has a list-based presentation (something like File Explorer in Windows). I could not replicate your error; it may be a bug that was recently fixed. You may consider downloading the latest snapshot for MS Windows (http://www.ailab.si/orange/download/orange-snapshot.exe).

We have just started to write the documentation for widgets. Most of it is still empty, but as it turns out, the one for Classification Tree Graph (that is, Classification Tree Viewer 2D) is there already:
http://www.ailab.si/orange/doc/widgets/catalog/Classify/ClassificationTreeGraph.htm

Postby Janez » Wed Mar 22, 2006 22:26

Sam, thanks for telling us. We couldn't notice it ourselves, it worked on our machines (I tried to explain why in one sentence, but I'm giving up, and you probably don't really mind anyway :-).

It should work in the next snapshot.

Janez

Thanks for your help

Postby Sam_Nav » Thu Mar 23, 2006 14:08

Blaz & Janez,

Thank you very much for helping me out! Really appreciate it!

I just downloaded the latest Orange snapshot available on the website and it's still not working. It might be because I don't have administrator privileges on the computer where I installed Orange. I remember an error while installing Python, but the installation finished well.

Anyway, everything is working at the moment except Interactive Tree Builder & Classification Tree Viewer.

I'll try to install Orange on another computer and see how it works. Will keep you posted.

I am also trying to build C45 dll, which in itself is quite a task!!

Thanks again.

Sam.

Postby Janez » Thu Mar 23, 2006 14:42

I distinctly remember committing the file to the CVS, but it seems I remember wrong. I did it now and ran the script that builds the snapshot. You can download and try it, just make sure that the browser doesn't give you the cached file.

Administrative privileges are needed for some Python stuff, but it shouldn't matter to Orange.

Regarding c45: I agree that this is really really annoying, but we just didn't get any response when we asked the Univ of New South Wales for the permission to distribute the binary files. Send me an email to janez [dot ]demsar /AT/ fri.uni-lj.si and I'll send you the file.

Thanks!!!

Postby Sam_Nav » Fri Mar 24, 2006 10:39

Janez,

You guys just rock!! Thank you very much again. Everything is working just perfectly (including Interactive Trees & Classification Tree Viewer).

It took me quite some time to compile C45.dll, but finally I managed to do it.

I don't understand why Univ of New South Wales isn't willing to share the binaries with you; it might be because RuleQuest is now commercializing C5.0. (To be honest, and it's my personal opinion, I don't see much difference between C4.5 and C5.0 apart from the processing speed.)

Anyway, thanks again and keep the good work going.

Finally one last question (and I know it's really stupid). I've been able to classify the Iris dataset using all the algorithms available in the Classify tab and it's working just fine. Now I just wanted to classify a new set from the model I built. Should I use the Classification widget in the Evaluate tab to do automatic classification on a new set?

Thanks again.

Sam.

Postby Janez » Fri Mar 24, 2006 10:54

You'd like to see how examples from the new data sets are classified?

Feed the data from a file widget to one or more learners and connect all learners to the "Classifications" widget. Classifications will thus get a bunch of models. Now take a new file widget, load some new data and feed it to Classifications. So Classifications will show you the predictions of all models for examples from the second file widget.
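
The same flow can be sketched in plain Python 3. This is only an illustration of the learner/model/prediction pattern and does not use Orange's API: `majority_learner` and `nearest_neighbor_learner` are hypothetical stand-ins for the learner widgets, and the data values are made up.

```python
from collections import Counter


def majority_learner(train):
    """Return a classifier that always predicts the most common training class."""
    most_common = Counter(label for _, label in train).most_common(1)[0][0]
    return lambda features: most_common


def nearest_neighbor_learner(train):
    """Return a classifier that predicts the class of the closest training example."""
    def classify(features):
        def sq_dist(example):
            return sum((a - b) ** 2 for a, b in zip(example[0], features))
        return min(train, key=sq_dist)[1]
    return classify


# First "File" widget: training data as (features, class) pairs (made-up values).
train = [
    ((5.1, 1.4), "setosa"),
    ((4.9, 1.5), "setosa"),
    ((6.4, 4.5), "versicolor"),
    ((6.9, 4.9), "versicolor"),
    ((5.5, 4.0), "versicolor"),
]

# Learners turn the training data into models.
models = {
    "majority": majority_learner(train),
    "nearest neighbor": nearest_neighbor_learner(train),
}

# Second "File" widget: new examples to classify.
new_examples = [(5.0, 1.3), (6.5, 4.6)]

# "Classifications": one row per new example, one prediction per model.
predictions = [
    {name: model(features) for name, model in models.items()}
    for features in new_examples
]
```

The point of the pattern is that the models are built once from the first data set and then only applied to the second one.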

Postby Sam_Nav » Fri Mar 24, 2006 11:19

Janez,

Thank you very much for such a quick reply.

I just connected a file widget to a data sampler widget. I then connected the data sampler widget to various learners (Naive Bayes, Classification Trees, SVM, CN2 etc.). I finally connected the output of all learner widgets to the Classifications widget. For this try I connected the File widget output to the Classifications widget as well.

For this case, I used the Iris database. The classifications went well and the Classification widget shows me the true class with the predicted class of all the widgets. Just perfect!

However, I've a few questions regarding the test file.

1) I understand that the test file needs to have the same columns as the training file. So does the test file need to have exactly the same characteristics as the training file?

2) Secondly, what about the first three rows (first line: attribute names line; second line: types of attributes; third line: flags)? Do I need to have the first three lines for the test file as well?

3) Now this is stupid (I hope you would forgive me); but can I leave the target column out of the test file? So for the Iris test file, I would just have a file with four columns and no target column.

4) Finally, can I make an XML file of the model? (Just wondering)

Hope you would help me out here. And I apologize again for these really stupid questions, but I just want to clarify a few ideas.

Thanks again.

Sam.

Postby Janez » Fri Mar 24, 2006 11:52

You can find a more technical answer at http://www.ailab.si/orange/doc/reference/fileformats.htm#samedomain, but basically, when you load multiple files, Orange treats them as having the same domain only if everything is exactly the same. So answers to your first three questions are yes, yes, no.
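
A rough way to check this before loading two files is to compare their three header lines (names, types, flags). The Python 3 sketch below assumes tab-delimited files laid out as described in this thread; `read_header` and `same_domain` are hypothetical helpers, not Orange functions, and the type and flag codes in the sample text are only illustrative.

```python
import io


def read_header(stream):
    """Return (name, type, flag) triples from the three header lines."""
    names, types_, flags = (
        stream.readline().rstrip("\r\n").split("\t") for _ in range(3)
    )
    # Pad short lines so that every column has a type and a flag entry.
    types_ += [""] * (len(names) - len(types_))
    flags += [""] * (len(names) - len(flags))
    return list(zip(names, types_, flags))


def same_domain(stream_a, stream_b):
    """True only if names, types and flags match column by column."""
    return read_header(stream_a) == read_header(stream_b)


train_text = (
    "sepal length\tsepal width\tpetal length\tpetal width\tiris\n"
    "c\tc\tc\tc\td\n"
    "\t\t\t\tclass\n"
    "5.1\t3.5\t1.4\t0.2\tIris-setosa\n"
)

# Same attributes, but the class column has been left out.
test_text = (
    "sepal length\tsepal width\tpetal length\tpetal width\n"
    "c\tc\tc\tc\n"
    "\t\t\t\n"
    "5.0\t3.4\t1.5\t0.2\n"
)

matches = same_domain(io.StringIO(train_text), io.StringIO(train_text))
missing_class = same_domain(io.StringIO(train_text), io.StringIO(test_text))
```

Here `matches` is true and `missing_class` is false, which mirrors the "yes, yes, no" above: a file without the target column does not have the same header as the training file.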

There is a workaround. Say you have a dataset with a complete set of attributes and another file in which some of them are missing. Open the first data set in File Widget and the second data set in "Extended File Widget" (it's just next to the File widget in the toolbar). If you connect the two widgets, the second one will use the same attributes as the former for as long as they have the same name and type.

But I never tried what happens in Classifications if the class is missing in the test dataset. Now I did, and Classifications throws an exception. I wrote that bug down, so we'll fix it some day.


There was a widget for saving a naive Bayes model in XML, but we temporarily removed it since it was useless due to certain canvas limitations. We had nothing of the kind for any other model, but we shall surely add this in the future - soon, I hope.

Thanks!!

Postby Sam_Nav » Fri Mar 24, 2006 11:56

Janez,

Thank you very much for such a detailed reply. Needless to say, I really appreciate it.

I guess that means I can't use Orange Canvas for prediction purposes once a model has been created on the training set. I was thinking that I could put random classes in an unseen test file and let Orange do the classifications.

What do you think?

Thanks a lot again (and I really mean it).

Regards,
Sam.

Postby Janez » Fri Mar 24, 2006 13:32

Yes, you can use random classes (or assign all examples to the same class or leave the classes undefined) until we fix the problem in Classifications widget.
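
A small script can add such a placeholder column to a test file that has no class. The Python 3 sketch below assumes a tab-delimited file with three header lines and assumes that `?` is read as an undefined value; `add_placeholder_class` is a hypothetical helper, not part of Orange. Pass a real class name as the placeholder to assign all examples to the same class instead.

```python
def add_placeholder_class(test_text, class_name, class_type, placeholder="?"):
    """Append a class column to tab-delimited text that has three header lines."""
    lines = test_text.splitlines()
    header, data = lines[:3], lines[3:]
    # One extra cell per header line: attribute name, type code, "class" flag.
    extras = [class_name, class_type, "class"]
    out = [line + "\t" + extra for line, extra in zip(header, extras)]
    # Every non-empty data row gets the same placeholder value.
    out += [line + "\t" + placeholder for line in data if line.strip()]
    return "\n".join(out) + "\n"


# Made-up test file with two attributes and no class column.
test_text = "a\tb\nc\tc\n\t\n1.0\t2.0\n3.0\t4.0\n"

with_class = add_placeholder_class(test_text, "iris", "d")
same_class = add_placeholder_class(test_text, "iris", "d", "Iris-setosa")
```

The class name and type passed in have to match the training file exactly, otherwise the two files will still not share a domain.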

Janez

Postby Sam_Nav » Fri Mar 24, 2006 13:51

Janez,

Thanks again for helping me out.

One last question (and I'll stop bothering you after this one): Are there any neural network widgets available at the moment in Orange Canvas?

Best regards,
Sam.

Postby Janez » Fri Mar 24, 2006 14:00

No. We don't like them, we prefer models which can be drawn or at least printed out in a human-interpretable form.

Thanks and another question

Postby Sam_Nav » Mon Mar 27, 2006 13:50

Janez,

Thank you very much for your reply. I totally agree with you about neural networks; for me they are black boxes, and one has to really trust them to use them (and at the same time they aren't really intuitive).

One small question: I am trying to use the CN2 algorithm and unfortunately the catalog for this widget isn't ready yet. That's why I am bothering you yet again with another question. For getting to know Orange, I've chosen a slightly difficult dataset with 30 variables and 1200 example sets. I just plugged in the CN2 widget and it gave me pretty good results. Now when I opened up the Properties window of the CN2 widget, I discovered that max rule length has been set to zero by default! Although in the CN2 results I do see rule lengths of 4. How's that possible if max rule length is set to zero?

And finally, to do a very exhaustive search of rules using all my 30 variables, do you agree that I should set rule length very high (let's say 99) and at the same time reduce beamwidth (let's say 1 or 2)? Any advice would be highly appreciated as I would really like to know what settings I need to use to get a very exhaustive rule search.

Thank you very much again.

Best regards,
Sam.

Postby Martin » Mon Mar 27, 2006 14:24

Sam,

I am responsible for the CN2 widget and algorithm, so it might be best that I answer you. If max rule length is set to zero, this means that the max. allowed length of a rule is infinite. I agree that it is a bit confusing, and hence I shall add a checkbox for specifying whether rule length should be restricted or not.

Regarding your second question (about the most exhaustive search), there are several possibilities. Setting minimum coverage (min. number of examples that a rule must cover) and max. rule length define how long the learner will keep adding conditions, so the most exhaustive variation is setting both to zero. The second option is to use a higher beamwidth - it will try more rules at each rule length. And the last option is to select weighted covering, which only partially removes examples and allows the algorithm to learn more rules describing the examples with different patterns.

The selection of one of those strategies depends on the data used. If you suspect that concepts in the data are described with longer rules, you should use the first variant, otherwise you might want to go with the second. Then again, if you believe that there are many "dependent" concepts, you should probably also try the third option.
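
To make the interplay of these settings concrete, here is a simplified Python 3 sketch of a beam search for a single rule. It is not the Orange CN2 implementation: the Laplace quality estimate, the function names and the toy data are assumptions made for illustration, and weighted covering is left out.

```python
def covers(rule, attrs):
    """A rule is a tuple of (attribute, value) conditions that must all hold."""
    return all(attrs.get(a) == v for a, v in rule)


def laplace(rule, examples, target, n_classes):
    """Return (Laplace accuracy estimate for the target class, coverage)."""
    covered = [label for attrs, label in examples if covers(rule, attrs)]
    hits = sum(1 for label in covered if label == target)
    return (hits + 1) / (len(covered) + n_classes), len(covered)


def find_rule(examples, target, beam_width=5, min_coverage=0, max_length=0):
    """Beam search for one rule; max_length=0 means no length limit."""
    n_classes = len({label for _, label in examples})
    conditions = sorted({(a, v) for attrs, _ in examples for a, v in attrs.items()})
    best_rule = ()
    best_quality, _ = laplace(best_rule, examples, target, n_classes)
    beam, length = [best_rule], 0
    while beam and (max_length == 0 or length < max_length):
        length += 1
        candidates = {}
        for rule in beam:
            used = {a for a, _ in rule}
            for cond in conditions:
                if cond[0] in used:
                    continue  # each attribute is tested at most once per rule
                new_rule = tuple(sorted(rule + (cond,)))
                if new_rule in candidates:
                    continue
                quality, coverage = laplace(new_rule, examples, target, n_classes)
                if coverage == 0 or coverage < min_coverage:
                    continue  # rule covers too few examples
                candidates[new_rule] = quality
        # Keep only the beam_width best rules of the current length.
        ranked = sorted(candidates.items(), key=lambda item: (-item[1], item[0]))
        beam = [rule for rule, _ in ranked[:beam_width]]
        if ranked and ranked[0][1] > best_quality:
            best_rule, best_quality = ranked[0]
    return best_rule, best_quality


# Made-up toy data: (attribute values, class).
examples = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "no"}, "stay"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
]

unlimited_rule, unlimited_quality = find_rule(examples, "play", max_length=0)
short_rule, _ = find_rule(examples, "play", max_length=1)
strict_rule, _ = find_rule(examples, "play", min_coverage=4)
```

With no length limit the search finds the two-condition rule, while `max_length=1` or `min_coverage=4` stops it at a single condition. A narrow beam discards alternatives early, so lowering the beam width makes the search less exhaustive, not more.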

Martin
