Get Started

Download and Install

Download the Orange distribution package from the Orange web page and run the installation file on your local computer.

Run

Locate the Orange program icon. It is probably on your desktop (Windows, Linux) or in the Applications folder (macOS). Double click on the icon to run Orange.

Orange Icon

Welcome to Orange

Running Orange opens a welcome screen. From here you may create new data mining workflows or browse through the ones you have already created. If you are running Orange for the first time, start by clicking on the Tutorial icon to browse through tutorial workflows.

welcome.png

Tutorials

From the tutorials window, select any of the preloaded data mining workflows. Here, we will choose the one with hierarchical clustering.

tutorials.png

Choosing a tutorial opens its workflow in the Orange canvas. In Orange, data mining workflows consist of computational components called widgets. Widgets do all the work and exchange information through communication channels. In the workflow below, the File widget sends its data to the Data Table and the Distance widget; the latter, in turn, passes the computed distances to two other widgets in the workflow.

hierarhical-clustering-workflow.png
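The idea of widgets passing data through channels can be sketched in a few lines of Python. This is only a toy model and not Orange's actual implementation: the `Widget` class, its methods, and the two sample points are all invented for illustration.

```python
class Widget:
    """A toy workflow node: receives a value, processes it, forwards the result."""

    def __init__(self, name, func):
        self.name = name
        self.func = func      # the computation this widget performs
        self.outputs = []     # downstream widgets connected to this one
        self.result = None    # last value this widget produced

    def connect(self, other):
        """Create a channel from this widget to `other` and return `other`."""
        self.outputs.append(other)
        return other

    def receive(self, value):
        """Process an incoming value and propagate the result downstream."""
        self.result = self.func(value)
        for widget in self.outputs:
            widget.receive(self.result)


def pairwise_distances(rows):
    """Euclidean distance between every pair of rows."""
    return [
        [sum((a - b) ** 2 for a, b in zip(r, s)) ** 0.5 for s in rows]
        for r in rows
    ]


# Hypothetical stand-ins for the widgets in the tutorial workflow.
file_widget = Widget("File", lambda rows: rows)
data_table = Widget("Data Table", lambda rows: [list(r) for r in rows])
distance = Widget("Distance", pairwise_distances)
# Placeholder: a real clustering widget would build a dendrogram here.
clustering = Widget("Hierarchical Clustering", lambda matrix: len(matrix))

file_widget.connect(data_table)
file_widget.connect(distance).connect(clustering)

# Feeding data into the first widget updates everything downstream.
file_widget.receive([(0.0, 0.0), (3.0, 4.0)])  # made-up sample points
print(distance.result)
```

Changing the input of the first widget is enough to refresh every connected widget, which mirrors how changes propagate through an Orange workflow.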

Any data mining starts with data. In our hierarchical clustering workflow, the File widget reads the data from a file on your computer and sends it to other widgets.

file-widget-icon.png

Double click on the File widget icon to open it. Select "Browse documentation data sets..." and, from the list of data files, choose iris.tab.

file-widget.png

The File widget will now read the famous data set on 150 Iris flowers and send it to the workflow. The changes will propagate through the workflow, updating its widgets. Close the window of the File widget, and double click on the Data Table widget to open it. This displays the data that we have just read.

data-table.png
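The same data set can also be inspected from Python. The sketch below assumes that the Orange scripting library is installed and importable as `Orange`; the helper name `iris_summary` is made up for this example, and it simply returns `None` when the library is not available.

```python
try:
    import Orange  # assumption: the Orange scripting library is installed
except ImportError:
    Orange = None


def iris_summary():
    """Return basic facts about the bundled iris data, or None without Orange."""
    if Orange is None:
        return None
    data = Orange.data.Table("iris")  # loads the iris data shipped with Orange
    return {
        "rows": len(data),
        "features": [attribute.name for attribute in data.domain.attributes],
        "class": data.domain.class_var.name,
    }


summary = iris_summary()
if summary is None:
    print("Orange is not installed; nothing to show.")
else:
    print(summary["rows"], "rows")
    print("features:", summary["features"])
    print("class variable:", summary["class"])
```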

Open and close other widgets to see what they do. In this workflow, the most interesting widget is Hierarchical Clustering, which displays the clustering results. Scroll through the dendrogram, the tree-based rendering of the clustering results, to check whether the hierarchical clustering correctly identified the three species of Iris.

hierarhical-clustering.png
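To get a feel for what a dendrogram encodes, here is a minimal sketch of agglomerative clustering in plain Python. It uses single linkage (the distance between two clusters is the distance between their closest members), chosen only because it is the simplest to write down; the tutorial workflow may use a different linkage. The four points are made up for illustration.

```python
import math
from itertools import combinations


def cluster_distance(points, a, b):
    """Single linkage: smallest distance between any member of a and any of b."""
    return min(math.dist(points[i], points[j]) for i in a for j in b)


def single_linkage(points):
    """Repeatedly merge the two closest clusters; return the merges in order."""
    clusters = [(i,) for i in range(len(points))]  # start with singletons
    merges = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_distance(points, pair[0], pair[1]),
        )
        height = cluster_distance(points, a, b)
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)  # the merged cluster replaces its two parts
        merges.append((a, b, height))
    return merges


# Made-up sample points: two tight pairs that lie far from each other.
points = [(0.0, 0.0), (0.0, 1.0), (5.0, 0.0), (5.0, 2.0)]
merges = single_linkage(points)
for a, b, height in merges:
    print(a, "+", b, "at distance", height)
```

Each merge corresponds to one junction in the dendrogram, and the merge distance is the height at which the two branches join.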

You may now open other tutorials (from the Help menu, choose the Welcome tab to open Orange's welcome screen) and play with other workflows and widgets, or create a workflow of your own.

Your Own Workflow

We first need to start with an empty canvas. Click on New on Orange's welcome screen, or, if Orange is already running, choose New from the File menu.

We will explore the data on passengers of the RMS Titanic and develop a model to predict the probability of survival based on the passenger's traveling class, gender and age. Let us start by placing the File and Data Table widgets on the canvas.

canvas-titanic.png

We would like the File widget to read the data and send it to the Data Table for inspection. We need to connect these two widgets to establish the communication between them. Double click on the dashed line beside the File widget and drag the emerging connection onto the Data Table.

file-data-table-connection.png

To load the data, open the File widget (double click on its icon), select "Browse documentation data sets" from the Data File box and choose titanic.tab.

titanic-file.png

The data has been loaded and automatically transferred to all the connected widgets. Check this by opening the Data Table widget.

titanic-data-table.png

Our aim is to develop a predictive model that can estimate the probability of survival from passengers' data. Place the Naive Bayes and Nomogram widgets on the canvas and connect them as shown in the workflow below.

titanic-nbc-workflow.png

The workflow reads the data, builds a naive Bayesian classifier and renders it in a nomogram. Double click on the Nomogram widget to visualize the classifier, and choose the settings according to the following screenshot.

titanic-nomogram.png

Change the values of the input features (status, age, sex) by moving the blue points along the horizontal lines in the middle section of the nomogram, and observe the change in survival probabilities. The lowest survival probability is estimated for adult males traveling in the third class. How about the crew? Who had the highest probability of survival? The naive Bayesian classifier is one of the simplest classification algorithms, yet its rendering may already reveal interesting patterns in the data.
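One way to understand what a nomogram of a naive Bayesian classifier shows is to compute it by hand: each feature value contributes a log odds ratio, the contributions are added to the prior log odds, and the sum is converted to a probability. The sketch below follows this idea in plain Python. It is not the code Orange uses, the function names are invented, and the eight records are made up for illustration (they are not the Titanic data). Laplace smoothing is added as a design choice so that unseen combinations do not produce zero probabilities.

```python
import math
from collections import Counter

# Invented records (feature values, outcome), for illustration only.
records = [
    ({"group": "a", "size": "small"}, "yes"),
    ({"group": "a", "size": "small"}, "yes"),
    ({"group": "a", "size": "large"}, "yes"),
    ({"group": "b", "size": "large"}, "yes"),
    ({"group": "a", "size": "small"}, "no"),
    ({"group": "b", "size": "small"}, "no"),
    ({"group": "b", "size": "large"}, "no"),
    ({"group": "b", "size": "large"}, "no"),
]


def fit(records, target):
    """Return the prior log odds and a log odds ratio for every feature value."""
    n_pos = sum(1 for _, outcome in records if outcome == target)
    n_neg = len(records) - n_pos
    prior = math.log(n_pos / n_neg)
    pos, neg, values = Counter(), Counter(), {}
    for features, outcome in records:
        counter = pos if outcome == target else neg
        for name, value in features.items():
            counter[(name, value)] += 1
            values.setdefault(name, set()).add(value)
    points = {}
    for name, seen in values.items():
        k = len(seen)  # number of distinct values, used for Laplace smoothing
        for value in seen:
            p_pos = (pos[(name, value)] + 1) / (n_pos + k)
            p_neg = (neg[(name, value)] + 1) / (n_neg + k)
            points[(name, value)] = math.log(p_pos / p_neg)
    return prior, points


def predict(prior, points, features):
    """Sum the contributions and map the total log odds to a probability."""
    score = prior + sum(points[item] for item in features.items())
    return 1 / (1 + math.exp(-score))


prior, points = fit(records, "yes")
for key in sorted(points):
    print(key, round(points[key], 3))
print(predict(prior, points, {"group": "a", "size": "small"}))
```

Moving a point in the nomogram corresponds to swapping one term in the sum: a value with a positive log odds ratio raises the predicted probability, and one with a negative ratio lowers it.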

You have now learned how to place widgets on the canvas, connect them to make workflows, read data, build a simple model and visualize it. You may now consider exploring other widgets and their combinations, or load some data of your own and see how Orange can help in the analysis.