Orange Blog

By: AJDA, Jun 5, 2017

Nomogram

One more exciting visualization has been introduced to Orange - a Nomogram. In general, nomograms are graphical devices that can approximate the calculation of some function. A Nomogram widget in Orange visualizes Logistic Regression and Naive Bayes classification models, and compute the class probabilities given a set of attributes values. In the nomogram, we can check how changing of the attribute values affect the class probabilities, and since the widget (like widgets in Orange) is interactive, we can do this on the fly.


By: BLAZ, Apr 25, 2017

Outliers in Traffic Signs

Say I am given a collection of images of traffic signs, and would like to find which signs stick out. That is, which traffic signs look substantially different from the others. I would assume that the traffic signs are not equally important and that some were designed to be noted before the others. I have assembled a small set of regulatory and warning traffic signs and stored the references to their images in a traffic-signs-w.


By: BLAZ, Dec 22, 2016

The Beauty of Random Forest

It is the time of the year when we adore Christmas trees. But these are not the only trees we, at Orange team, think about. In fact, through almost life-long professional deformation of being a data scientist, when I think about trees I would often think about classification and regression trees. And they can be beautiful as well. Not only for their elegance in explaining the hidden patterns, but aesthetically, when rendered in Orange.


By: AJDA, Dec 12, 2016

Dimensionality Reduction by Manifold Learning

The new Orange release (v. 3.3.9) welcomed a few wonderful additions to its widget family, including Manifold Learning widget. The widget reduces the dimensionality of the high-dimensional data and is thus wonderful in combination with visualization widgets. Manifold Learning widget has a simple interface with powerful features. Manifold Learning widget offers five embedding techniques based on scikit-learn library: t-SNE, MDS, Isomap, Locally Linear Embedding and Spectral Embedding. They each handle the mapping differently and also have a specific set of parameters.


By: AJDA, Jul 29, 2016

Pythagorean Trees and Forests

Classification Trees are great, but how about when they overgrow even your 27” screen? Can we make the tree fit snugly onto the screen and still tell the whole story? Well, yes we can. Pythagorean Tree widget will show you the same information as Classification Tree, but way more concisely. Pythagorean Trees represent nodes with squares whose size is proportionate to the number of covered training instances. Once the data is split into two subsets, the corresponding new squares form a right triangle on top of the parent square.


By: AJDA, Jul 18, 2016

Network Analysis with Orange

Visualizing relations between data instances can tell us a lot about our data. Let’s see how this works in Orange. We have a data set on machine learning and data mining conferences and journals, with the number of shared authors for each publication venue reported. We can estimate similarity between two conferences using the author profile of a conference: two conference would be similar if they attract the same authors. The data set is already 9 years old, but obviously, it’s about the principle.


By: AJDA, Apr 14, 2016

Univariate GSoC Success

Google Summer of Code application period has come to an end. We’ve received 34 applications, some of which were of truly high quality. Now it’s upon us to select the top performing candidates, but before that we wanted to have an overlook of the candidate pool. We’ve gathered data from our Google Form application and gave it a quick view in Orange. First, we needed to preprocess the data a bit, since it came in a messy form of strings.


By: AJDA, Mar 23, 2016

All I See is Silhouette

Silhouette plot is such a nice method for visually assessing cluster quality and the degree of cluster membership that we simply couldn’t wait to get it into Orange3. And now we did. What this visualization displays is the average distance between instances within the cluster and instances in the nearest cluster. For a given data instance, the silhouette close to 1 indicates that the data instance is close to the center of the cluster.


By: AJDA, Dec 28, 2015

Color it!

Holiday season is upon us and even the Orange team is in a festive mood. This is why we made a Color widget! This fascinating artsy widget will allow you to play with your data set in a new and exciting way. No more dull visualizations and default color schemes! Set your own colors the way YOU want it to! Care for some magical cyan-to-magenta? Or do you prefer a more festive red-to-green?


By: AJDA, Nov 27, 2015

Mining our own data

Recently we’ve made a short survey that was, upon Orange download, asking people how they found out about Orange, what was their data mining level and where do they work. The main purpose of this is to get a better insight into our user base and to figure out what is the profile of people interested in trying Orange. Here we have some preliminary results that we’ve managed to gather in the past three weeks or so.