Developer utilities (utils)

Orange.utils contains developer utilities.

Reporting progress

class Orange.utils.ConsoleProgressBar(title=, charwidth=40, step=1, output=None)

A class for printing progress bar reports to the console.

Example

>>> import sys, time
>>> progress = ConsoleProgressBar("Example", output=sys.stdout)
>>> for i in range(100):
...    progress.advance()
...    # Or progress.set_state(i)
...    time.sleep(0.01)
...
...
Example ===================================>100%
__call__(newstate=None)

Set the newstate as the current state of the progress bar. newstate must be in the interval [0, 100].

Note

set_state is the preferred way to set a new state.

Parameters:newstate (float) – The new state of the progress bar.
__init__(title=, charwidth=40, step=1, output=None)

Initialize the progress bar.

Parameters:
  • title (str) – The title for the progress bar.
  • charwidth (int) –

    The maximum progress bar width in characters.

  • step (int) – The default step used if advance is called without any arguments.
  • output (file-like object) – The file-like object to print the progress report to. If None (default), sys.stderr is used.
advance(step=None)

Advance the current state by step. If step is None use the default step as set at class initialization.

clear(i=-1)

Clear the current progress line indicator string.

finish()

Finish the progress bar (i.e. set the state to 100 and print the final newline to the output file).

getstring()

Return the progress indicator string.

printline(string)

Print the string to the output file.

set_state(newstate)

Set the newstate as the current state of the progress bar. newstate must be in the interval [0, 100].

Parameters:newstate (float) – The new state of the progress bar.
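
The kind of line such a progress bar prints can be sketched in plain Python. This is only an illustration and not the Orange implementation: the function name render_progress and the exact layout (title, bar, arrow head, percentage) are assumptions modelled on the example output above.

```python
def render_progress(title, state, charwidth=40):
    """Return a one-line progress string for a state in [0, 100].

    Hypothetical helper; the layout is assumed, not taken from Orange.
    """
    if not 0 <= state <= 100:
        raise ValueError("state must be in the interval [0, 100]")
    percent = "%3i%%" % state
    # Width left for the bar after the title, one space, the arrow head
    # and the percentage.
    barwidth = max(charwidth - len(title) - len(percent) - 2, 1)
    filled = int(barwidth * state / 100.0)
    return title + " " + "=" * filled + ">" + " " * (barwidth - filled) + percent
```

Writing "\r" followed by this string to the output file on every state change redraws the bar in place; a final newline then corresponds to what finish() is described as printing.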

Deprecation utility functions

Orange.utils.deprecation_warning(old, new, stacklevel=-2)

Raise a deprecation warning of an obsolete attribute access.

Parameters:
  • old – Old attribute name (used in warning message).
  • new – New attribute name (used in warning message).
Orange.utils.deprecated_members(name_map, wrap_methods=all, in_place=True)

Decorate a class with properties for accessing attributes, and methods with deprecated names. In addition methods from the wrap_methods list will be wrapped to receive mapped keyword arguments.

Parameters:
  • name_map (dict) – A dictionary mapping old into new names.
  • wrap_methods (list) – A list of method names to wrap. Wrapped methods will be called with mapped keyword arguments (by default all methods will be wrapped).
  • in_place (bool) – If True the class will be modified in place, otherwise it will be subclassed (default True).

Example

>>> class A(object):
...     def __init__(self, foo_bar="bar"):
...         self.set_foo_bar(foo_bar)
...     
...     def set_foo_bar(self, foo_bar="bar"):
...         self.foo_bar = foo_bar
...
>>> A = deprecated_members(
...     {"fooBar": "foo_bar",
...      "setFooBar": "set_foo_bar"},
...     wrap_methods=["set_foo_bar", "__init__"])(A)
...
>>> a = A(fooBar="foo")
__main__:1: DeprecationWarning: 'fooBar' is deprecated. Use 'foo_bar' instead!
>>> print a.fooBar, a.foo_bar
foo foo
>>> a.setFooBar("FooBar!")
__main__:1: DeprecationWarning: 'setFooBar' is deprecated. Use 'set_foo_bar' instead!

Note

This decorator does nothing if Orange.utils.environ.orange_no_deprecated_members environment variable is set to True.

Orange.utils.deprecated_keywords(name_map)

Deprecates the keyword arguments of the function.

Example

>>> @deprecated_keywords({"myArg": "my_arg"})
... def my_func(my_arg=None):
...     print my_arg
...
...
>>> my_func(myArg="Arg")
__main__:1: DeprecationWarning: 'myArg' is deprecated. Use 'my_arg' instead!
Arg

Note

This decorator does nothing if Orange.utils.environ.orange_no_deprecated_members environment variable is set to True.
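
How such a keyword-mapping decorator can work is sketched below in plain Python. It is a minimal illustration under assumptions, not the Orange source: the name deprecated_keywords_sketch is hypothetical, and the real decorator may differ in details such as the stack level or the handling of the environment switch.

```python
import functools
import warnings


def deprecated_keywords_sketch(name_map):
    """Return a decorator that renames deprecated keyword arguments.

    Hypothetical stand-in for illustration only.
    """
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            mapped = {}
            for name, value in kwargs.items():
                if name in name_map:
                    new_name = name_map[name]
                    warnings.warn(
                        "'%s' is deprecated. Use '%s' instead!" % (name, new_name),
                        DeprecationWarning, stacklevel=2)
                    name = new_name
                mapped[name] = value
            # Call the wrapped function with the new keyword names only.
            return func(*args, **mapped)
        return wrapper
    return decorator


@deprecated_keywords_sketch({"myArg": "my_arg"})
def my_func(my_arg=None):
    return my_arg
```

Calling my_func(myArg="Arg") then issues one DeprecationWarning and forwards the value as my_arg.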

Orange.utils.deprecated_attribute(old_name, new_name)

Return a property object that accesses an attribute named new_name and raises a deprecation warning when doing so.

>>> sys.stderr = sys.stdout

Example

>>> class A(object):
...     def __init__(self):
...         self.my_attr = "123"
...     myAttr = deprecated_attribute("myAttr", "my_attr")
...
...
>>> a = A()
>>> print a.myAttr
...:1: DeprecationWarning: 'myAttr' is deprecated. Use 'my_attr' instead!
123

Note

This decorator does nothing and returns None if Orange.utils.environ.orange_no_deprecated_members environment variable is set to True.
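
A property of this kind can be sketched as follows. The sketch is an assumption-laden illustration rather than the Orange code: deprecated_attribute_sketch is a hypothetical name, and whether the real property also supports assignment is not stated above.

```python
import warnings


def deprecated_attribute_sketch(old_name, new_name):
    """Return a property that forwards old_name to new_name with a warning."""
    def warn():
        warnings.warn(
            "'%s' is deprecated. Use '%s' instead!" % (old_name, new_name),
            DeprecationWarning, stacklevel=3)

    def fget(self):
        warn()
        return getattr(self, new_name)

    def fset(self, value):
        # Assumption: writes are forwarded too.
        warn()
        setattr(self, new_name, value)

    return property(fget, fset)


class A(object):
    def __init__(self):
        self.my_attr = "123"
    myAttr = deprecated_attribute_sketch("myAttr", "my_attr")
```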

Orange.utils.deprecated_function_name(func)

Return a wrapped function that raises a deprecation warning when called. This should be used for deprecation of module level function names.

Example

>>> def func_a(arg):
...    print "This is func_a  (used to be named funcA) called with", arg
...
...
>>> funcA = deprecated_function_name(func_a)
>>> funcA(None)

Note

This decorator does nothing if Orange.utils.environ.orange_no_deprecated_members environment variable is set to True.

Submodules

Orange environment configuration (environ)

This module contains some basic customization options for Orange (for now mostly changing directories where orange settings are saved).

How it works

When this module is imported it will first load and parse a global configuration orangerc.cfg (located in the root directory of the orange installation). Further, it will look for and try to load a user specific configuration file located in $(HOME)/.orangerc.cfg or application_dir/orangerc.cfg where application_dir is a variable defined in the global configuration file.

Note

In the configuration files, all OS-defined environment variables (e.g. $HOME, $USER, ...) are available.

After all the parsing has taken place, selected variables defined in the configuration are made available as top-level module variables.

Example

To change the location where settings are saved for Orange Canvas on Windows edit the global orangerc.cfg file and in the [directories win32] section change the application_dir variable:

[directories win32]

application_dir = D:/SharedAppData/orange/

In this way we can hard-code the path instead of using the system default (defined in the %APPDATA% variable).
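
To check what a configuration fragment like the one above contains, it can be read with Python's standard configparser module (Python 3 spelling). This is only a sketch: the helper name read_application_dir is hypothetical, and Orange's own loading code may parse the file differently, for example when expanding environment variables.

```python
import configparser

# The fragment from the example above, inlined as a string.
SAMPLE_CFG = """\
[directories win32]
application_dir = D:/SharedAppData/orange/
"""


def read_application_dir(text, section="directories win32"):
    """Return the application_dir value from configuration text."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    return parser.get(section, "application_dir")
```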

Variables

The following variables are exposed as top level module members

install_dir:
Directory where Orange is installed.
canvas_install_dir:
Directory where Orange Canvas is installed.
widget_install_dir:
Directory where Orange Widgets are installed.
icons_install_dir:
Directory where icons for widgets are installed.
doc_install_dir:
Directory with Orange documentation.
dataset_install_dir:
Directory with example data sets.
add_ons_dir:
Directory where system-wide add-ons are installed.
add_ons_dir_user:
Directory where user add-ons are installed.
application_dir:
Directory where applications can save their data.
output_dir:
Directory where Orange saves settings/data.
default_reports_dir:
Directory where Orange Canvas saves the reports.
orange_settings_dir:
Directory where Orange settings are saved.
canvas_settings_dir:
Directory where Orange Canvas settings are saved.
widget_settings_dir:
Directory where Orange Widgets settings are saved.
buffer_dir:
Directory where Orange.utils.serverfiles downloads are stored.
orange_no_deprecated_members:
If True all deprecated members in Orange 2.5 will not be available.

Counters (counters)

Orange.utils.counters contains a bunch of classes that generate sequences of various kinds.

class Orange.utils.counters.BooleanCounter(bits)

A class which represents a boolean counter. The constructor is given the number of bits, and during iteration the counter returns a list of that length containing 0s and 1s.

One way to use the counter is within a for-loop:

>>> for r in BooleanCounter(3):
...    print r
[0, 0, 0]
[0, 0, 1]
[0, 1, 0]
[0, 1, 1]
[1, 0, 0]
[1, 0, 1]
[1, 1, 0]
[1, 1, 1]

You can also call it manually.

>>> r = BooleanCounter(3)
>>> r.next()
[0, 0, 0]
>>> r.next()
[0, 0, 1]
>>> r.next()
[0, 1, 0]
state

The current counter state (the last result of a call to next) is also stored in this attribute.

__init__(bits)
Parameters:bits (int) – Number of bits.
next()

Return the next state of the counter.

class Orange.utils.counters.CanonicFuncCounter(places)

Returns all sequences of a given length in which no numbers are skipped (see below) and no generated sequence is equal to another up to a relabeling. For instance, [0, 2, 2, 1] and [1, 0, 0, 2] are considered equivalent: if we take the former and replace 0 by 1, 2 by 0 and 1 by 2, we get the latter.

The generated sequences are equivalent to all possible functions from a set whose cardinality equals the length of the sequences.

>>> for t in CanonicFuncCounter(4):
...     print t
...
[0, 0, 0, 0]
[0, 0, 0, 1]
[0, 0, 1, 0]
[0, 0, 1, 1]
[0, 0, 1, 2]
[0, 1, 0, 0]
[0, 1, 0, 1]
[0, 1, 0, 2]
[0, 1, 1, 0]
[0, 1, 1, 1]
[0, 1, 1, 2]
[0, 1, 2, 0]
[0, 1, 2, 1]
[0, 1, 2, 2]
[0, 1, 2, 3]
state

The current counter state (the last result of a call to next) is also stored in this attribute.

__init__(places)
Parameters:places (int) – Number of places.
next()

Return the next state of the counter.
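
The sequences listed above are what is often called restricted growth sequences: the first element is 0 and every later element is at most one larger than the largest element before it. The following plain-Python generator is a sketch of that idea, not the Orange implementation; the name canonic_sequences is hypothetical.

```python
def canonic_sequences(places):
    """Yield restricted growth sequences of the given length.

    Sequences come out in lexicographic order.
    """
    def extend(prefix, largest):
        if len(prefix) == places:
            yield list(prefix)
            return
        # The next label may repeat any label used so far or introduce
        # the next unused one (largest + 1).
        for label in range(largest + 2):
            for seq in extend(prefix + [label], max(largest, label)):
                yield seq
    return extend([], -1)
```

For four places this yields 15 sequences, in the same order as the listing above.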

class Orange.utils.counters.LimitedCounter(limits)

This class is similar to BooleanCounter except that the digits do not count from 0 to 1, but to the limits that are specified individually for each digit.

>>> for t in LimitedCounter([3, 5]):
...     print t
[0, 0]
[0, 1]
[0, 2]
[0, 3]
[0, 4]
[1, 0]
[1, 1]
[1, 2]
[1, 3]
[1, 4]
[2, 0]
[2, 1]
[2, 2]
[2, 3]
[2, 4]
state

The current counter state (the last result of a call to next) is also stored in this attribute.

__init__(limits)
Parameters:limits (list) – Domain size per bit position.
next()

Return the next state of the counter.
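
For comparison, the same sequences can be produced with the standard library. The sketch below uses itertools.product and is independent of Orange; limited_sequences is a hypothetical name. Passing [2] * n as limits gives the sequences of a boolean counter with n bits.

```python
import itertools


def limited_sequences(limits):
    """Yield lists whose i-th digit runs over range(limits[i])."""
    # product varies the rightmost digit fastest, like an odometer.
    for digits in itertools.product(*[range(limit) for limit in limits]):
        yield list(digits)
```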

class Orange.utils.counters.MofNCounter(m, n)

Counter returns all consecutive subset lists of length m out of n where m <= n.

>>> for t in MofNCounter(3,7):
...     print t
...
[0, 1, 2]
[1, 2, 3]
[2, 3, 4]
[3, 4, 5]
[4, 5, 6]
state

The current counter state (the last result of a call to next) is also stored in this attribute.

__init__(m, n)
Parameters:
  • m (int) – Length of subset list.
  • n (int) – Total length.
next()

Return the next state of the counter.
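
A plain-Python sketch that produces the same consecutive index windows is shown below. It is an illustration only; consecutive_subsets is a hypothetical name, and raising ValueError for m > n is an assumption about how invalid arguments might be handled.

```python
def consecutive_subsets(m, n):
    """Yield every run of m consecutive indices taken from range(n)."""
    if m > n:
        raise ValueError("m must not exceed n")
    for start in range(n - m + 1):
        yield list(range(start, start + m))
```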

class Orange.utils.counters.NondecreasingCounter(places)

The nondecreasing counter generates all non-decreasing integer sequences in which no numbers are skipped, that is, if n is in the sequence, the sequence also includes all numbers between 0 and n. For instance, [0, 0, 1, 0] is illegal since it decreases, and [0, 0, 2, 2] is illegal since it has 2 without having 1 first. For example:

>>> for t in NondecreasingCounter(4):
...     print t
...
[0, 0, 0, 0]
[0, 0, 0, 1]
[0, 0, 1, 1]
[0, 0, 1, 2]
[0, 1, 1, 1]
[0, 1, 1, 2]
[0, 1, 2, 2]
[0, 1, 2, 3]
state

The current counter state (the last result of a call to next) is also stored in this attribute.

__init__(places)
Parameters:places (int) – Number of places.
next()

Return the next state of the counter.
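
Such sequences start at 0 and each following element either repeats the previous one or increases it by one. Under that reading, they can be sketched with itertools.product; this is not the Orange implementation and nondecreasing_sequences is a hypothetical name.

```python
import itertools


def nondecreasing_sequences(places):
    """Yield non-decreasing sequences that start at 0 and skip no number."""
    # Each position after the first either keeps the value (step 0)
    # or raises it by one (step 1).
    for steps in itertools.product((0, 1), repeat=max(places - 1, 0)):
        seq = [0]
        for step in steps:
            seq.append(seq[-1] + step)
        yield seq[:places]
```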

Render (render)

Selection (selection)

Many machine learning techniques generate a set of different solutions or have to choose, as for instance in classification tree induction, between different features. The most trivial solution is to iterate through the candidates, compare them and remember the optimal one. The problem occurs, however, when there are multiple candidates that are equally good, and the naive approaches would select the first or the last one, depending upon the formulation of the if-statement.

Orange.utils.selection provides a class that makes a random choice in such cases. Each new candidate is compared with the currently optimal one; it replaces the optimal one if it is better, while if they are equal, one is chosen at random. The number of competing optimal candidates is stored, so in this random choice the probability of selecting the new candidate (over the current one) is 1/w, where w is the current number of equal candidates, including the present one. One can easily verify that this gives equal chances to all candidates, independent of the order in which they are presented.
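
The 1/w rule can be sketched in a few lines of plain Python. This is a simplified illustration and not the Orange class: the names BestOnTheFlySketch and pick_index are hypothetical, candidates are compared with the built-in operators instead of a compare function, and the standard random module is assumed as the source of randomness.

```python
import random


class BestOnTheFlySketch(object):
    """Remember the best candidate seen so far, breaking ties at random."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.best = None
        self.best_index = -1
        self.index = -1
        self.ties = 0  # number of equally good best candidates seen so far

    def candidate(self, x):
        self.index += 1
        if self.ties == 0 or x > self.best:
            # A strictly better candidate resets the tie count.
            self.best, self.best_index, self.ties = x, self.index, 1
        elif x == self.best:
            self.ties += 1
            # The newcomer replaces the winner with probability 1/w.
            if self.rng.randrange(self.ties) == 0:
                self.best, self.best_index = x, self.index

    def winner(self):
        return self.best

    def winner_index(self):
        return self.best_index


def pick_index(values, seed=0):
    """Feed all values to a fresh sketch and return the winning index."""
    chooser = BestOnTheFlySketch(seed)
    for value in values:
        chooser.candidate(value)
    return chooser.winner_index()
```

With w tied candidates, the i-th of them survives with probability (1/i) * (i/(i+1)) * ... * ((w-1)/w) = 1/w, which is why the order of presentation does not matter.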

Example

The following snippet loads a data set (zoo, although the variable is named lymphography) and prints out the feature with the highest gain ratio.

part of misc-selection-bestonthefly.py

import Orange

lymphography = Orange.data.Table("zoo")

find_best = Orange.utils.selection.BestOnTheFly(call_compare_on_1st=True)

for attr in lymphography.domain.attributes:
    find_best.candidate((Orange.feature.scoring.GainRatio(attr, lymphography), attr))

print "%5.3f: %s" % find_best.winner()

Our candidates are tuples of gain ratios and features, so we set call_compare_on_1st to make the compare function compare the first element (gain ratios). We could achieve the same by initializing the object like this:

part of misc-selection-bestonthefly.py

find_best = Orange.utils.selection.BestOnTheFly(Orange.utils.selection.compare_first_bigger)

The other way to do it is through indices.

misc-selection-bestonthefly.py

find_best = Orange.utils.selection.BestOnTheFly()

for attr in lymphography.domain.attributes:
    find_best.candidate(Orange.feature.scoring.GainRatio(attr, lymphography))

best_index = find_best.winner_index()
print "%5.3f: %s" % (find_best.winner(), lymphography.domain[best_index])

Here we only give gain ratios to BestOnTheFly, so we don’t have to specify a special compare operator. After checking all features we get the index of the optimal one by calling winner_index.

class Orange.utils.selection.BestOnTheFly(compare=<built-in function cmp>, seed=0, call_compare_on_1st=False)

Finds the optimal object in a sequence of objects. The class is fed the candidates one by one, and remembers the winner. It can thus be used by methods that generate different solutions to a problem and need to select the optimal one, but do not want to store them all.

Parameters:
  • compare – compare function.
  • seed (int) – Random seed. If not given, a seed of 0 is used to ensure that the same experiment always gives the same results, despite pseudo-randomness.
  • call_compare_on_1st (bool) – If set, BestOnTheFly will assume that the candidates are lists or tuples, and it will call compare with the first element of the tuple.
Orange.utils.selection.select_best(x, compare=<built-in function cmp>, seed=0, call_compare_on_1st=False)

Return the optimal object from list x. The function is used if the candidates are already in the list, so using the more complicated BestOnTheFly directly is not needed.

To demonstrate the use of BestOnTheFly see the implementation of select_best:

def select_best(x, compare=cmp, seed = 0, call_compare_on_1st = False):
    bs=BestOnTheFly(compare, seed, call_compare_on_1st)
    for i in x:
        bs.candidate(i)
    return bs.winner()
Parameters:
  • x (list) – list of existing candidates.
  • compare – compare function.
  • seed (int) – Random seed. If not given, a seed of 0 is used to ensure that the same experiment always gives the same results, despite pseudo-randomness.
  • call_compare_on_1st (bool) – If set, BestOnTheFly will assume that the candidates are lists or tuples, and it will call compare with the first element of the tuple.
Return type:

object

Orange.utils.selection.select_best_index(x, compare=<built-in function cmp>, seed=0, call_compare_on_1st=False)

Similar to select_best except that it doesn’t return the best object but its index in the list x.

Orange.utils.selection.compare_first_bigger(x, y)

Function takes two lists and compares first elements.

Parameters:
  • x (list) – list of values.
  • y (list) – list of values.
Return type:

cmp(x[0], y[0])

Orange.utils.selection.compare_first_smaller(x, y)

Function takes two lists and compares first elements.

Parameters:
  • x (list) – list of values.
  • y (list) – list of values.
Return type:

-cmp(x[0], y[0])

Orange.utils.selection.compare_last_bigger(x, y)

Function takes two lists and compares last elements.

Parameters:
  • x (list) – list of values.
  • y (list) – list of values.
Return type:

cmp(x[-1], y[-1])

Orange.utils.selection.compare_last_smaller(x, y)

Function takes two lists and compares last elements.

Parameters:
  • x (list) – list of values.
  • y (list) – list of values.
Return type:

-cmp(x[-1], y[-1])

Orange.utils.selection.compare_bigger(x, y)

Function takes and compares two numbers.

Parameters:
  • x (int) – value.
  • y (int) – value.
Return type:

cmp(x, y)

Orange.utils.selection.compare_smaller(x, y)

Function takes and compares two numbers.

Parameters:
  • x (int) – value.
  • y (int) – value.
Return type:

-cmp(x, y)
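
The comparison helpers rely on the built-in cmp of Python 2, which newer Python versions no longer provide. The sketch below shows equivalent helpers in plain Python. It assumes, as the names and descriptions suggest, that the "last" variants look at the final element and the "smaller" variants negate the result; the function names are shortened on purpose to mark them as illustrations rather than the Orange functions.

```python
def cmp(x, y):
    """Three-way comparison: -1, 0 or 1, as the Python 2 built-in did."""
    return (x > y) - (x < y)


def first_bigger(x, y):
    return cmp(x[0], y[0])


def first_smaller(x, y):
    return -cmp(x[0], y[0])


def last_bigger(x, y):
    return cmp(x[-1], y[-1])


def last_smaller(x, y):
    return -cmp(x[-1], y[-1])


def bigger(x, y):
    return cmp(x, y)


def smaller(x, y):
    return -cmp(x, y)
```

A positive result means that the first argument is the preferred candidate, so the "smaller" variants make the lower value win.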

Server files (serverfiles)

Server files allows users to download files from a common repository residing on the Orange server. It was designed to simplify the download and updates of external data sources for Orange Genomics add-on. Furthermore, an authenticated user can also manage the repository files with this module.

The Orange server file repository was created to store large files that do not come with the Orange installation, but may be required by the user when running specific Orange functions. A typical example is the Orange Bioinformatics package, which relies on large data files storing genome information. These do not come pre-installed, but are rather downloaded from the server when needed and stored in the local repository. The module provides low-level functionality to manage these files, and is used by Orange modules to locate the files in the local repository and update or download them when and if needed.

Each managed file is described by a domain and a file name. Domains are like directories: places where files are put.

Domain names should consist of fewer than 255 alphanumeric ASCII characters, whereas filenames can be arbitrarily long and can contain any ASCII character (including “” ~ . / { }). Please refrain from using non-ASCII characters in both domain names and filenames. Files can be protected or not. Protected files can only be accessed by authenticated users.

Local file management

The files are saved under Orange’s settings directory, subdirectory buffer/bigfiles. Each domain is a subdirectory. A corresponding info file bearing the same name and an extension ”.info” is created with every download. Info files contain title, tags, size and date and time of the file.

Orange.utils.serverfiles.allinfo(domain)

Goes through all files in a domain on a local repository and returns a dictionary, where keys are names of the files and values are their information.

Orange.utils.serverfiles.download(domain, filename, *args, **kwargs)

Downloads file from the repository to local orange installation. To download files as an authenticated user you should also pass an instance of ServerFiles class. Callback can be a function without arguments. It will be called once for each downloaded percent of file: 100 times for the whole file.
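
The callback convention described above (one call per downloaded percent) can be illustrated with a generic chunked copy loop. This is a sketch over file-like objects and not the Orange download code; copy_with_progress and its parameters are hypothetical.

```python
def copy_with_progress(source, target, size, callback=None, chunksize=1024):
    """Copy source to target, calling callback once per completed percent."""
    copied = 0
    reported = 0  # whole percents already reported through the callback
    while True:
        chunk = source.read(chunksize)
        if not chunk:
            break
        target.write(chunk)
        copied += len(chunk)
        percent = copied * 100 // size if size else 100
        # A single chunk may complete several percents at once.
        while callback is not None and reported < percent:
            callback()
            reported += 1
```

A callback without arguments fits naturally with a progress bar whose advance method uses a default step of 1.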

Orange.utils.serverfiles.info(domain, filename)

Returns info of a file in a local repository.

Orange.utils.serverfiles.listdomains()

List all file domains in the local repository.

Orange.utils.serverfiles.listfiles(domain)

List all files from a domain in a local repository.

Orange.utils.serverfiles.localpath(domain=None, filename=None)

Return a path for the domain in the local repository. If filename is given, return a path to corresponding file.

Orange.utils.serverfiles.localpath_download(domain, filename, *args, **kwargs)

Return local path for the given domain and file. If file does not exist, download it. Additional arguments are passed to the download function.

Orange.utils.serverfiles.needs_update(domain, filename, serverfiles=None)

True if a file does not exist in the local repository or if there is a newer version on the server.
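
The decision can be sketched as a comparison of the datetime entries of the local and the server info. The sketch assumes that both are dictionaries with a "datetime" string in the format shown in the example outputs further below, and that a missing local file is represented by None; needs_update_sketch and parse_stamp are hypothetical names.

```python
import datetime


def parse_stamp(text):
    """Parse a timestamp such as '2011-03-15 13:18:53.029000'."""
    return datetime.datetime.strptime(text, "%Y-%m-%d %H:%M:%S.%f")


def needs_update_sketch(local_info, server_info):
    """Return True if there is no local copy or the server copy is newer."""
    if local_info is None:
        return True
    return parse_stamp(server_info["datetime"]) > parse_stamp(local_info["datetime"])
```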

Orange.utils.serverfiles.remove(domain, filename, *args, **kwargs)

Remove a file from local repository.

Orange.utils.serverfiles.remove_domain(domain, force=False)

Remove a domain. If force is True, domain is removed even if it is not empty (contains files).

Orange.utils.serverfiles.search(sstrings, **kwargs)

Search for files in the local repository where all substrings in a list are contained in at least one chosen field (tag, title, name). Return a list of tuples: first tuple element is the domain of the file, second its name.
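
One possible reading of this matching rule is that every search string must occur in at least one of the fields of a file, where different strings may match different fields. The sketch below implements that reading over an in-memory dictionary. The name search_sketch, the case-insensitive default and the sample data are assumptions made for illustration.

```python
def search_sketch(infos, sstrings, case_sensitive=False):
    """Return (domain, filename) pairs whose fields contain all substrings.

    infos maps (domain, filename) to an info dict with 'title' and 'tags'.
    """
    def normalize(text):
        return text if case_sensitive else text.lower()

    found = []
    for (domain, filename), info in sorted(infos.items()):
        fields = [filename, info.get("title", "")] + list(info.get("tags", []))
        fields = [normalize(field) for field in fields]
        if all(any(normalize(s) in field for field in fields) for s in sstrings):
            found.append((domain, filename))
    return found


# Made-up sample data for illustration.
SAMPLE_INFOS = {
    ("demo", "titanic.tab"): {"title": "A sample .tab file",
                              "tags": ["basic", "data set"]},
    ("demo", "notes.txt"): {"title": "Release notes", "tags": ["text"]},
}
```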

Orange.utils.serverfiles.update(domain, filename, serverfiles=None, **kwargs)

Downloads the corresponding file from the server and places it in the local repository, but only if the server copy of the file is newer or the local copy does not exist. An optional ServerFiles object can be passed for authenticated access.

Remote file management
class Orange.utils.serverfiles.ServerFiles(username=None, password=None, server=None, access_code=None)

To work with the repository, you need to create an instance of ServerFiles object. To access the repository as an authenticated user, a username and password should be passed to the constructor. All password protected operations and transfers are secured by SSL; this secures both password and content.

Repository files are set as protected when first uploaded: only authenticated users can see them. They need to be unprotected for public use.

__init__(username=None, password=None, server=None, access_code=None)

Creates a ServerFiles instance. Pass your username and password to use the repository as an authenticated user. If you want to use your access code (as a non-authenticated user), pass it as well.

allinfo(domain)

Go through all accessible files in a given domain and return a dictionary, where key is file’s name and value its info.

create_domain(domain)

Create a server domain.

download(domain, filename, target, callback=None)

Downloads file from the repository to a given target name. Callback can be a function without arguments. It will be called once for each downloaded percent of file: 100 times for the whole file.

downloadFH(domain, filename)

Return a file handle to the file that we would like to download.

info(domain, filename)

Return a dictionary containing repository file info. Keys: title, tags, size, datetime.

listdomains()

List all domains on repository.

listfiles(domain)

List all files in a repository domain.

protect(domain, filename, access_code=1)

Hide file from non-authenticated users. If an access code (string) is passed, the file will be available to authenticated users and non-authenticated users with that access code.

protection(domain, filename)

Return file protection. Legend: “0” - public use, “1” - for authenticated users only, anything else represents a specific access code.

remove(domain, filename)

Remove a file from the server repository.

remove_domain(domain, force=False)

Remove a domain. If force is True, domain is removed even if it is not empty (contains files).

search(sstrings, **kwargs)

Search for files on the repository where all substrings in a list are contained in at least one chosen field (tag, title, name). Return a list of tuples: the first tuple element is the file’s domain, the second its name. For now the search is performed locally, so information on the files in the repository is transferred on the first call of this function.

unprotect(domain, filename)

Put a file into public use.

upload(domain, filename, file, title=, tags=[])

Uploads a file “file” to the domain where it is saved with filename “filename”. If file does not exist yet, set it as protected. Parameter file can be a file handle open for reading or a file name.

Examples

Listing local files, files from the repository and downloading all available files from domain “demo” (serverfiles1.py).

import Orange
sf = Orange.utils.serverfiles

repository = sf.ServerFiles() 

print "My files (in demo)", sf.listfiles('demo') 
print "Repository files", repository.listfiles('demo') 
print "Downloading all files in domain 'demo'" 

for file in repository.listfiles('demo'): 
    print "Datetime for", file, repository.info('demo', file)["datetime"] 
    sf.download('demo', file) 
    
print "My files (in demo) after download", sf.listfiles('demo') 
print "My domains", sf.listdomains() 

A possible output (it depends on the current repository state):

My files (in demo) []
Repository files ['orngServerFiles.py', 'urllib2_file.py']
Downloading all files in domain 'demo'
Datetime for orngServerFiles.py 2008-08-20 12:25:54.624000
Downloading orngServerFiles.py
progress: ===============>100%  10.7 KB       47.0 KB/s    0:00 ETA
Datetime for urllib2_file.py 2008-08-20 12:25:54.827000
Downloading urllib2_file.py
progress: ===============>100%  8.5 KB       37.4 KB/s    0:00 ETA
My files (in demo) after download ['urllib2_file.py', 'orngServerFiles.py']
My domains ['KEGG', 'gene_sets', 'dictybase', 'NCBI_geneinfo', 'GO', 'miRNA', 'demo', 'Taxonomy', 'GEO']

A domain with a simple file can be built as follows (serverfiles2.py). Of course, the username and password should be valid.

import Orange, sys
sf = Orange.utils.serverfiles

ordinary = sf.ServerFiles()
authenticated = sf.ServerFiles(sys.argv[1], sys.argv[2])
#authentication is provided as command line arguments

try: 
    authenticated.remove_domain('demo2', force=True)
except: 
    pass 
    
authenticated.create_domain('demo2')
authenticated.upload('demo2', 'titanic.tab', 'titanic.tab', \
    title="A sample .tab file", tags=["basic", "data set"])
print "Uploaded."
print "Non-authenticated users see:", ordinary.listfiles('demo2')
print "Authenticated users see:", authenticated.listfiles('demo2')
authenticated.unprotect('demo2', 'titanic.tab')
print "Non-authenticated users now see:", ordinary.listfiles('demo2')
print "orngServerFiles.py file info:"
import pprint; pprint.pprint(ordinary.info('demo2', 'titanic.tab')) 

A possible output:

Uploaded.
Non-authenticated users see: ['']
Authenticated users see: ['titanic.tab']
Non-authenticated users now see: ['titanic.tab']
orngServerFiles.py file info:
{'datetime': '2011-03-15 13:18:53.029000',
 'size': '45112',
 'tags': ['basic', 'data set'],
 'title': 'A sample .tab file'}