source: orange-bioinformatics/docs/modules/obiTaxonomy.htm @ 1623:186d828e3699

Revision 1623:186d828e3699, 4.2 KB checked in by mitar, 2 years ago (diff)

Renamed documentation directory.

<html>

<head>
<title>obiTaxonomy: NCBI Taxonomy</title>
<link rel=stylesheet href="../style.css" type="text/css">
<link rel=stylesheet href="style-print.css" type="text/css" media=print>
</head>

<body>
<h1>obiTaxonomy: Organism Taxonomy</h1>
<p>The module obiTaxonomy provides access to <a href="http://www.ncbi.nlm.nih.gov/Taxonomy/">NCBI's organism taxonomy information</a>.
The taxonomy information is pre-loaded, that is, it becomes available with installation. To keep response times short, it is updated on the Orange server
rather than through direct calls to NCBI's site from your local machine. The list of organism names is also extended with
information from <a href="http://www.genome.jp/kegg/">KEGG</a> and the <a href="http://www.geneontology.org/">Gene Ontology</a>,
so that a search for a database-specific organism code such as "sgd" or "mmu" returns the expected organism.</p>

<p>The module is also used throughout Orange Bioinformatics to unify organism names across different modules.</p>

<p class=section>Functions</p>
<dl class=attributes>
    <dt>name(taxid)</dt>
    <dd>Return the scientific name of the organism with the given <code>taxid</code>.</dd>

    <dt>other_names(taxid)</dt>
    <dd>Return a list of (name, name_type) tuples for the organism, excluding the scientific name.</dd>

    <dt>search(string, onlySpecies=True, exact=False)</dt>
    <dd>Search the NCBI taxonomy database for an organism by <code>string</code>, returning a list of
    taxonomy IDs for which any of the organism's names includes <code>string</code>. If <code>onlySpecies</code> is
    True, search only the species and subspecies nodes of the taxonomy. If <code>exact=True</code>, the entire
    name has to match <code>string</code>.</dd>

    <dt>lineage(taxid)</dt>
    <dd>Return a list of taxids ordered from the topmost node (root) to taxid.</dd>

    <dt>to_taxid(code)</dt>
    <dd>Check whether <code>code</code> is a valid organism code (codes are obtained from, for instance,
    the KEGG and GO databases) and return a set of its taxids.</dd>

    <dt>taxids()</dt>
    <dd>Return a list of all NCBI taxonomy IDs (about half a million).</dd>

    <dt>common_taxids()</dt>
    <dd>Return a list of taxonomy IDs for common organisms (see the list of <a href="http://www.ncbi.nlm.nih.gov/Taxonomy/">common organisms from NCBI</a>).
    These are also the organisms for which information files, such as Gene Ontology annotations and KEGG pathways, will be
    pre-loaded on the Orange server and made available through the Orange Bioinformatics database update. If there is
    an organism you wish to include in this list, please contact the authors or post your request on <a href="">Orange's Forum</a>.</dd>

    <dt>essential_taxids()</dt>
    <dd>Return a set of taxonomy IDs that is a subset of those returned by <code>common_taxids()</code>. These are the organisms for which annotation
    information will be pre-loaded on your computer upon installation of Orange Bioinformatics.</dd>

</dl>
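
<p>To make the matching rules of <code>search</code> and the ordering of <code>lineage</code> concrete, here is a minimal, self-contained sketch over a tiny in-memory table. It does not call obiTaxonomy and is not its implementation: the tables <code>TOY_NAMES</code> and <code>TOY_PARENT</code> and the helpers <code>toy_search</code> and <code>toy_lineage</code> are hypothetical stand-ins. Case-insensitive matching and taxids written as strings are assumptions, the parent links are heavily abbreviated, and the <code>onlySpecies</code> filter is left out.</p>

```python
# Toy stand-in data; NOT the real obiTaxonomy database.
# Each taxid maps to a few of its names, scientific name first.
TOY_NAMES = {
    "9606": ["Homo sapiens", "human"],
    "10090": ["Mus musculus", "house mouse"],
    "4932": ["Saccharomyces cerevisiae", "baker's yeast"],
}

# Child -> parent links, heavily abbreviated (intermediate ranks skipped).
TOY_PARENT = {
    "9606": "2759",   # Eukaryota
    "10090": "2759",
    "4932": "2759",
    "2759": "1",      # root
}


def toy_search(string, exact=False):
    """Return taxids with a name that contains (or, if exact, equals) string."""
    needle = string.lower()  # assumption: matching ignores case
    hits = []
    for taxid, names in TOY_NAMES.items():
        lowered = [n.lower() for n in names]
        if exact:
            matched = needle in lowered
        else:
            matched = any(needle in n for n in lowered)
        if matched:
            hits.append(taxid)
    return sorted(hits)


def toy_lineage(taxid):
    """Return taxids from the root down to taxid, root first."""
    path = [taxid]
    while path[-1] in TOY_PARENT:
        path.append(TOY_PARENT[path[-1]])
    path.reverse()
    return path


print(toy_search("mus"))              # substring match
print(toy_search("mus", exact=True))  # no name is exactly "mus"
print(toy_lineage("9606"))            # root first, queried taxid last
```

<p>With this toy data the substring search for "mus" finds only the mouse entry, while the exact search finds nothing because no complete name equals "mus".</p>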

<p class=section>Examples</p>

<p>The following script takes the list of common taxonomy IDs and prints the name of each:</p>

<p class="header"><a href="taxonomy1.py">taxonomy1.py</a></p>
<xmp class=code>import obiTaxonomy

# Print each common taxonomy ID next to its scientific name.
for taxid in obiTaxonomy.common_taxids():
    print("%-6s %s" % (taxid, obiTaxonomy.name(taxid)))
</xmp>

<p>The output of the script is:</p>

<xmp class=code>3702   Arabidopsis thaliana
9913   Bos taurus
6239   Caenorhabditis elegans
3055   Chlamydomonas reinhardtii
7955   Danio rerio
352472 Dictyostelium discoideum AX4
7227   Drosophila melanogaster
562    Escherichia coli
11103  Hepatitis C virus
9606   Homo sapiens
10090  Mus musculus
2104   Mycoplasma pneumoniae
4530   Oryza sativa
5833   Plasmodium falciparum
4754   Pneumocystis carinii
10116  Rattus norvegicus
4932   Saccharomyces cerevisiae
4896   Schizosaccharomyces pombe
31033  Takifugu rubripes
8355   Xenopus laevis
4577   Zea mays
</xmp>

<h2>Update from other Orange modules</h2>
<p>(This section is for developers only.) For unification, each module (e.g. obiKEGG, obiGO, ...) should provide the following interface:</p>
<dl class=section>
    <dt>from_taxid(taxid)</dt>
    <dd>Convert taxid to the module's internal organism code.</dd>

    <dt>to_taxid(organism)</dt>
    <dd>Convert the module's internal organism code to taxid.</dd>

    <dt>organisms()</dt>
    <dd>Return a list of (taxid, internal_name) tuples.</dd>
</dl>
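
<p>As a rough illustration of this interface, the sketch below implements the three functions for an imaginary module backed by a small lookup table. Everything here is an assumption made for the example: the private tables <code>_CODE_BY_TAXID</code> and <code>_TAXID_BY_CODE</code>, the three KEGG-style codes chosen as sample data, taxids written as strings, and raising <code>KeyError</code> for unknown input. A real module such as obiKEGG or obiGO may behave differently.</p>

```python
# Hypothetical internal codes for an imaginary module. The entries use
# KEGG-style organism codes, but the table is illustrative only.
_CODE_BY_TAXID = {
    "9606": "hsa",
    "10090": "mmu",
    "4932": "sce",
}
_TAXID_BY_CODE = {code: taxid for taxid, code in _CODE_BY_TAXID.items()}


def from_taxid(taxid):
    """Convert an NCBI taxid to this module's internal organism code."""
    return _CODE_BY_TAXID[taxid]  # assumption: unknown taxid raises KeyError


def to_taxid(organism):
    """Convert this module's internal organism code to an NCBI taxid."""
    return _TAXID_BY_CODE[organism]  # assumption: unknown code raises KeyError


def organisms():
    """Return a list of (taxid, internal_name) tuples."""
    return sorted(_CODE_BY_TAXID.items())


# Round trip: every listed organism converts there and back.
for taxid, code in organisms():
    assert to_taxid(from_taxid(taxid)) == taxid
    assert from_taxid(to_taxid(code)) == code
```

<p>Keeping both directions of the mapping in plain dictionaries makes the round trip between taxid and internal code easy to check, which is the property name unification relies on.</p>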