source: orange-bioinformatics/docs/reference/obiMeSH.htm @ 1634:32373d957326

Revision 1634:32373d957326, 10.5 KB checked in by mitar, 2 years ago (diff)

Renamed addon reference documentation.

<html>
<HEAD>
<LINK REL=StyleSheet HREF="../style.css" TYPE="text/css">
<LINK REL=StyleSheet HREF="../style-print.css" TYPE="text/css" MEDIA=print></LINK>
</HEAD>

<BODY>
<h1>obiMeSH</h1>

<index name="modules+multidimensional scaling">

<p>This module provides functionality to compute <a href="http://www.nlm.nih.gov/bsd/disted/mesh/">MeSH</a> term enrichment and to annotate chemicals with terms from the MeSH ontology.</p>

<h2>obiMeSH</h2>

<p><INDEX name="classes/obiMeSH (in obiMeSH)">obiMeSH is the main class for all tasks related to the MeSH ontology.</p>


<P class=section>Methods</P>
<DL class=attributes>
<dt>obiMeSH()</dt>
<DD>The constructor takes no arguments.</DD>

<dt>findFrequentTerms(data, minSizeInTerm, callback=None)</dt>
<DD>Returns a dictionary whose keys are MeSH term ids and whose values are integers giving the number of examples annotated with the corresponding MeSH term. <code>data</code> has to be an instance of <code>ExampleTable</code>. The argument <code>minSizeInTerm</code> restricts the result to MeSH terms that have at least <code>minSizeInTerm</code> annotated examples.</DD>

<dt>findEnrichedTerms(reference, cluster, pThreshold=0.05, callback=None)</dt>
<DD>Returns a dictionary whose keys are MeSH terms and whose values are lists of four numbers (number of annotated reference examples, number of annotated cluster examples, MeSH term enrichment p-value, fold enrichment). The argument <code>pThreshold</code> restricts the returned dictionary to terms whose enrichment p-value is less than or equal to the given constant. Both data sets (<code>reference</code> and <code>cluster</code>) have to be instances of <code>ExampleTable</code>.</DD>

<dt>printMeSH(data, selection = ["term","r","c", "p"])</dt>
<DD>Pretty-prints a dictionary returned by <code>findFrequentTerms</code> or <code>findEnrichedTerms</code>. When printing a dictionary of enriched MeSH terms (returned by <code>findEnrichedTerms</code>) you can also specify which properties to print and in what order. At the moment you can choose among "term" (MeSH term name), "desc" (MeSH term description), "r" (number of examples from reference), "c" (number of examples from cluster), "p" (MeSH term enrichment) and "fold" (fold enrichment).</DD>

<dt>printHtmlMeSH(data, selection = ["term","r","c", "p"])</dt>
<DD>Similar to <code>printMeSH</code>, except that it returns HTML code.</DD>

<dt>findTerms(ids, idType="cid")</dt>
<DD>Returns a dictionary whose keys are the members of the list <code>ids</code> and whose values are lists of MeSH terms that apply to each key. Note that this function uses a local annotation database.</DD>

<dt>parsePubMed(filename, attributes = ["pmid", "title","abstract","mesh"], skipExamplesWithout = ["mesh"])</dt>
<DD>Parses a <a href="http://www.pubmed.gov">PubMed</a> XML file (search results saved in XML format) into Orange's <code>ExampleTable</code>. You can select only certain attributes. At the moment the supported attributes are "pmid" (PubMed ID), "title" (article title), "abstract" (article abstract), "mesh" (MeSH terms) and "affilation".</DD>

<dt>findSubset(examples, meshTerms, callback = None)</dt>
<DD>Returns a new dataset (a subset of <code>examples</code>) containing the examples annotated with one or more MeSH terms from the list <code>meshTerms</code>. The argument <code>examples</code> has to be an instance of <code>ExampleTable</code>.</DD>
</DL>

<p class=section>Attributes</p>
<DL class=attributes>

<dt>toID</dt>
<DD>Dictionary <code>toID</code> maps a MeSH term to its MeSH term ids. Note that some MeSH terms have more than one MeSH term id (a one-to-many relation).</DD>

<dt>toName</dt>
<DD>Dictionary <code>toName</code> maps a MeSH term id to its MeSH term.</DD>

<dt>toDesc</dt>
<DD>Dictionary <code>toDesc</code> maps a MeSH term to its description.</DD>

<dt>fromCID</dt>
<DD>Dictionary <code>fromCID</code> provides a local mapping between a CID (compound id) and a list of MeSH terms.</DD>
</DL>
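<p>The counting described for <code>findFrequentTerms</code> can be pictured with plain Python data. The sketch below is only an illustration under stated assumptions and is not obiMeSH's implementation: the helper <code>count_frequent_terms</code> and the list-of-lists input are hypothetical, and whether obiMeSH also counts an example towards the ancestors of its terms is not stated in this document.</p>

```python
from collections import Counter


def count_frequent_terms(annotations, min_size_in_term):
    """Count how many examples carry each term id and keep the frequent ones.

    annotations is a hypothetical input shape: one list of term ids per example.
    """
    counts = Counter()
    for terms in annotations:
        # set() so that a term repeated within one example is counted once
        counts.update(set(terms))
    return dict((term, n) for term, n in counts.items() if n >= min_size_in_term)


# Three invented examples, annotated with MeSH ids that appear on this page.
annotations = [
    ["D03", "D03.494"],
    ["D03"],
    ["D02.886.369.198", "D03"],
]
frequent = count_frequent_terms(annotations, 2)
```

<p>Here only <code>D03</code> annotates at least two examples, so it is the only key left in <code>frequent</code>.</p>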


<h2>Examples</h2>
<!--
<h3>Basic operations on MeSH ontology</h3>

<p>In our first example, we will show how to manipulate with MeSH ontology. Let's start with simple mapping between MeSH terms and their ids. This is done by the following code:</p>

<p class="header">part of <a href="mesh1.py">mesh1.py</a> </p>

<xmp class=code>import orange
import obiMeSH
</xmp> -->

<h3>Using basic obiMeSH attributes</h3>
<p>The following example shows how to use the main obiMeSH attributes.</p>
<xmp class=code>>>> import obiMeSH
>>> d = obiMeSH.obiMeSH()
>>> dir(d)
# Check whether the local database has an annotation for the chemical with CID 6240.
>>> d.fromCID[6240]
['Chlorpromazine']
# We could also use a function that queries the online database ...
>>> d.findTerms([1,2,3,4,5,6])
{1: ['Acetylcarnitine'], 2: [], 3: [], 4: ['Propanolamines'], 5: [], 6: ['Dinitrochlorobenzene']}
# Once we know the MeSH terms we can get their descriptions ...
>>> d.toDesc['Chlorpromazine']
"The prototypical phenothiazine antipsychotic drug. Like the other drugs in this class chlorpromazine's antipsychotic actions
are thought to be due to long-term adaptation by the brain to blocking DOPAMINE RECEPTORS. Chlorpromazine has several other
actions and therapeutic uses, including as an antiemetic and in the treatment of intractable hiccup."
# or their MeSH ids ...
>>> d.toID['Chlorpromazine']
['D02.886.369.198', 'D03.494.741.198']
# which can easily be converted back to MeSH terms.
>>> d.toName['D03']
'Heterocyclic Compounds'
>>> d.toName['D03.494']
'Heterocyclic Compounds, 3-Ring'
</xmp>
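<p>The ids in the session above are dotted paths, so the ancestors of a term can be listed by cutting the id at each dot. The following is a minimal sketch that runs without Orange: <code>ancestor_ids</code> is a hypothetical helper, not part of obiMeSH, and <code>to_name</code> is a tiny stand-in holding only the two entries shown above.</p>

```python
def ancestor_ids(mesh_id):
    """Return every dotted prefix of a MeSH id, from the top level down to the id itself."""
    parts = mesh_id.split(".")
    return [".".join(parts[:i]) for i in range(1, len(parts) + 1)]


# Stand-in for d.toName with only the two entries shown in the session above.
to_name = {
    "D03": "Heterocyclic Compounds",
    "D03.494": "Heterocyclic Compounds, 3-Ring",
}

path = ancestor_ids("D03.494.741.198")
# Look up the names we know and skip ids that are missing from the small dictionary.
known_names = [to_name[i] for i in path if i in to_name]
```

<p>With a full <code>toName</code> dictionary the same loop would presumably name every level of the path, assuming each prefix is itself a key.</p>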

<h3>Calculating and printing MeSH term enrichment</h3>

<p>The following example shows how to calculate and pretty-print enriched MeSH terms.</p>
<p>part of <a href='obiMeSH-calculate-enrichment.py'>obiMeSH-calculate-enrichment.py</a></p>

<xmp class=code>import obiMeSH
import orange

# load datasets
reference = orange.ExampleTable('obiMeSH-reference-dataset.tab')
cluster = orange.ExampleTable('obiMeSH-cluster-dataset.tab')

# find and print enriched MeSH terms with p-value <= 0.1
d = obiMeSH.obiMeSH()
enrichment = d.findEnrichedTerms(reference, cluster, pThreshold=0.1)
d.printMeSH(enrichment)
</xmp>
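<p>To give a feel for the numbers in each result list, here is a sketch of one common way to score term enrichment: a one-sided hypergeometric test plus a fold ratio. This document does not say which test obiMeSH uses, so treat this as an illustration only. All function names are hypothetical, the counts are invented, and the test assumes the cluster is drawn from the reference set.</p>

```python
def binomial(n, k):
    """Exact binomial coefficient C(n, k) using integer arithmetic."""
    if k < 0 or k > n:
        return 0
    k = min(k, n - k)
    result = 1
    for i in range(1, k + 1):
        result = result * (n - k + i) // i
    return result


def enrichment_p_value(n_reference, r, n_cluster, c):
    """P(X >= c) when n_cluster examples are drawn from n_reference examples,
    r of which are annotated with the term."""
    total = binomial(n_reference, n_cluster)
    upper = min(r, n_cluster)
    tail = sum(binomial(r, x) * binomial(n_reference - r, n_cluster - x)
               for x in range(c, upper + 1))
    return float(tail) / total


def fold_enrichment(n_reference, r, n_cluster, c):
    """Annotated fraction in the cluster divided by the annotated fraction in the reference."""
    return (float(c) / n_cluster) / (float(r) / n_reference)


# Invented counts: 10 reference examples with 5 annotated; 5 cluster examples, all annotated.
p = enrichment_p_value(10, 5, 5, 5)
fold = fold_enrichment(10, 5, 5, 5)
```

<p>For these counts only one of the 252 possible clusters is fully annotated, so <code>p</code> is 1/252, and the annotated fraction doubles from 0.5 to 1.0, so <code>fold</code> is 2.0.</p>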

<!--
<h3>Parsing PubMed XML data</h3>
<p>asdf</p>

<h3>Advanced: using MeSH terms relationship data</h3>
<p>asdf</p>


<p class="header"><a href="mds3.py">mds3.py</a> (uses <a href="reference.tab">reference.tab</a> and <a href="cluster.tab">cluster.tab</a>)</p>
<XMP class= code>import orange
import obiMeSH


</XMP>

<XMP class= code>i = 0
while i < 100:
    i += 1
</XMP>
-->
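<p>For readers unfamiliar with the file that <code>parsePubMed</code> consumes, the sketch below extracts the same four default attributes from a PubMed-style XML string using only the standard library. It is not how obiMeSH parses the file and it builds plain dictionaries rather than an <code>ExampleTable</code>. The helper <code>parse_records</code>, its <code>skip_without_mesh</code> flag and both sample records are invented for illustration.</p>

```python
import xml.etree.ElementTree as ET

# Two invented records in the shape of a PubMed XML export.
SAMPLE = """<PubmedArticleSet>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>12345</PMID>
      <Article>
        <ArticleTitle>An invented title</ArticleTitle>
        <Abstract><AbstractText>An invented abstract.</AbstractText></Abstract>
      </Article>
      <MeshHeadingList>
        <MeshHeading><DescriptorName>Chlorpromazine</DescriptorName></MeshHeading>
        <MeshHeading><DescriptorName>Antipsychotic Agents</DescriptorName></MeshHeading>
      </MeshHeadingList>
    </MedlineCitation>
  </PubmedArticle>
  <PubmedArticle>
    <MedlineCitation>
      <PMID>67890</PMID>
      <Article><ArticleTitle>A record without MeSH terms</ArticleTitle></Article>
    </MedlineCitation>
  </PubmedArticle>
</PubmedArticleSet>"""


def parse_records(xml_text, skip_without_mesh=True):
    """Return one dictionary per article, optionally skipping articles with no MeSH terms."""
    records = []
    root = ET.fromstring(xml_text)
    for article in root.findall("PubmedArticle"):
        citation = article.find("MedlineCitation")
        mesh = [d.text for d in
                citation.findall("MeshHeadingList/MeshHeading/DescriptorName")]
        if skip_without_mesh and not mesh:
            continue
        records.append({
            "pmid": citation.findtext("PMID"),
            "title": citation.findtext("Article/ArticleTitle"),
            # findtext returns None when the article has no abstract
            "abstract": citation.findtext("Article/Abstract/AbstractText"),
            "mesh": mesh,
        })
    return records


records = parse_records(SAMPLE)
```

<p>The second record has no MeSH headings, so it is dropped by default, mirroring the role of <code>skipExamplesWithout = ["mesh"]</code> in the method signature.</p>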

<h2>pubChemAPI</h2>

<p><INDEX name="classes/pubChemAPI (in obiMeSH)">pubChemAPI is the main class used to query the online PubChem database.</p>


<P class=section>Methods</P>
<DL class=attributes>
<dt>getSMILE(id, typ)</dt>
<DD>Returns the chemical structure in SMILES format for the given <code>id</code>. The argument <code>typ</code> indicates the type of identifier. At the moment there are two possible values of <code>typ</code>: 'cid' for compound id and 'sid' for substance id.</DD>

<dt>getMeSHterms(cid)</dt>
<DD>Returns a list of MeSH terms (classification) for a given CID.</DD>

<dt>getCIDfromSID(sid)</dt>
<DD>Returns a list of possible CIDs for a given SID.</DD>

<dt>getPharmActionList(cid)</dt>
<DD>Returns a list of pharmacological actions for a given CID.</DD>

<dt>getCIDs(name, weight)</dt>
<DD>Returns a list of possible CIDs for a given chemical name and molecular weight.</DD>

<dt>getCIDs(name)</dt>
<DD>Returns a list of possible CIDs for a given chemical name.</DD>
</DL>


<h2>Examples</h2>


<h3>Basic functions</h3>

<p>The following example shows how to perform basic operations with the pubChemAPI class.</p>

<xmp class=code>from obiMeSH import *

api = pubChemAPI()
chemical = 'Chlorpromazine'
# getCIDs returns a list of candidate CIDs; the first one is used below
cid = api.getCIDs(chemical)
classTerms = api.getMeSHterms(cid[0])
pharmTerms = api.getPharmActionList(cid[0])
smiles = api.getSMILE(cid[0],'cid')

print 'Drug name', chemical
print 'All available CID numbers', cid
print 'Classification MeSH terms', classTerms
print 'Pharmacological action MeSH terms', pharmTerms
print 'Smiles notation', smiles
</xmp>

<P>The output should be:</P>
<xmp  class=code>Drug name  Chlorpromazine
All available CID numbers  [2726, 70413, 522335, 6240, 9683, 6431825, 4926, 84362, 62861, 443037, 23724898, 165214,
 160588, 122845, 9682, 6474604, 125595, 125358, 6474605, 465099, 465100, 481770, 159916, 465103, 465098, 461555, 107410,
 465104, 486143, 467415, 114324, 117674, 117673, 3026449, 72287, 6420056, 3916, 24182520, 24182516, 91499, 67356, 66064,
 66062, 6602611, 6444542, 6436410, 6436057, 5282418, 5282417, 5281881, 5281878, 5281032, 36207, 2913535, 17012, 5887, 5566,
 5452, 4917, 4748, 4744, 4078, 3372]
Classification MeSH terms  ['Organic Chemicals', 'Heterocyclic Compounds, 3-Ring', 'Phenothiazines', 'Sulfur Compounds', 'Heterocyclic Compounds', 'Chlorpromazine']
Pharmacological action MeSH terms  ['Antiemetics', 'Autonomic Agents', 'Dopamine Antagonists', 'Central Nervous System Agents', 'Chemical Actions and Uses', 'Dopamine Agents', 'Gastrointestinal Agents', 'Molecular Mechanisms of Pharmacological Action', 'Pharmacologic Actions', 'Physiological Effects of Drugs', 'Therapeutic Uses', 'Tranquilizing Agents', 'Central Nervous System Depressants', 'Neurotransmitter Agents', 'Peripheral Nervous System Agents', 'Psychotropic Drugs', 'Antipsychotic Agents']
Smiles notation  CN(C)CCCN1C2=CC=CC=C2SC3=C1C=C(C=C3)Cl
</xmp>

<h3>Generating a drug dataset with pubChemAPI and obiMeSH</h3>

<p>This example shows how to print a chemical dataset suitable for use in Orange. We assume that the initial data is a set of drug names. Note that the procedure is approximate and may not be accurate for all chemicals: <code>getCIDs</code> returns a list of CIDs, and we assume that the first CID is the one we are looking for, which is not always true.</p>
<br>
<p>part of <a href='obiMeSH-dataset-generator.py'>obiMeSH-dataset-generator.py</a></p>
<xmp class=code># names is assumed to hold the drug names and chem a pubChemAPI instance;
# both are set up in the part of the script that is not shown here.
for i in names:
    cids = chem.getCIDs(i.strip())
    if len(cids) > 0:
        # take the first CID, which may not always be the intended compound
        cid = cids[0]
        terms = chem.getMeSHterms([int(cid)])[cid]
        smiles = chem.getSMILE(int(cid),"cid")
        print cid, "\t", i, "\t", smiles, "\t", terms
</xmp>

<P>The output should be:</P>
<xmp  class=code>cid    name    smiles  mesh
string  string  string  string

6427782     1,7-octadiene   CC=CCC=CC   []
24199357    2-Dimethylamnoethyl cloride     CC(C)C1=CC=C(C=C1)C2C(C(=O)NC(=S)N2)C#N.[K]     []
3385    5-fluorouracil  C1=C(C(=O)NC(=O)N1)F    ['Pyrimidines', 'Uracil', 'Fluorouracil', 'Pyrimidinones', 'Heterocyclic Compounds, 1-Ring', 'Heterocyclic Compounds']
</xmp>
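<p>The listing above starts with a three-line header (column names, column types and an empty third row) that the loop shown earlier does not print. Assuming the script writes that header separately, a sketch of producing the whole file content could look like this. <code>tab_lines</code> is a hypothetical helper, and the single data row is copied from the output above with its term list shortened.</p>

```python
def tab_lines(rows):
    """Build the lines of the file: names, types, an empty third row, then the data."""
    header = ["cid", "name", "smiles", "mesh"]
    lines = [
        "\t".join(header),
        "\t".join(["string"] * len(header)),
        "",
    ]
    for row in rows:
        # every column is written as text, matching the 'string' type row
        lines.append("\t".join(str(value) for value in row))
    return lines


# One row from the output above; the MeSH term list is shortened for brevity.
rows = [
    (3385, "5-fluorouracil", "C1=C(C(=O)NC(=O)N1)F", ["Pyrimidines", "Uracil"]),
]
lines = tab_lines(rows)
```

<p>Joining <code>lines</code> with newlines and writing the result to a <code>.tab</code> file would give a file shaped like the listing above.</p>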

</body>
</html>