<html>
<body bgcolor = "#FFFFCC"><basefont face = "Arial">
<h1>EASE: Annotation Over-representation Analysis</h1> <h2>Parameter Information</h2>
<hr size = 10>


<h2>File Updates and Configuration</h2>
<h3>Select EASE File System</h3>
This button enables the selection of a local directory to be used as the source for annotation
files for EASE analysis. Multiple file systems can be present to support a variety of array
types. Selecting a file system directs file choosers to that area; however, file selections
may be made outside of the selected base file system if appropriate. Note that the selected directory
should be the directory that contains the "Data" directory. In MeV's default data directory this
would be the 'ease' directory.

<h3>Update EASE File System</h3>
This button allows the download of EASE annotation file systems for a selected species and clone set.
A selection dialog will allow species selection from a variety of plant and animal species.
A list of many commercially available arrays for the selected species is also presented for selection.
After species and array selection, a dialog will be presented to select a directory as the
destination directory for the EASE file system. Zip files will then be downloaded and automatically
extracted into the destination directory. The new base directory will be labeled with
"ease_" and the selected array name. This new file system can be selected as the default system
(see the "Select EASE File System" option above).

<h2>Mode Selection</h2>
<h3>Cluster Analysis</h3>
This mode performs annotation analysis on a selected subset (sample list or cluster) of the
full data set loaded in MeV. The output is a list of biological 'themes' represented in
the cluster and a statistic reporting the probability that a particular theme is over-represented in
the cluster relative to its representation in the entire data set. The resulting table will
initially be sorted by this statistic.

<h3>Annotation Survey</h3>
The survey mode simply produces a list of biological themes that are represented in the data currently loaded
in the viewer from which EASE is launched. Note that this could be a subset of the total slide data.
If you want to survey all annotation on the slide, you must use a viewer with all of the slide's data
loaded. The initial ordering of the output table is based on the prevalence of a theme in the data set (hit count).
This mode can be used to cluster genes based on biological themes. The clusters can then be
stored and marked (colored) for tracking during cluster analysis.
<br>

<h2>Parameter Pages</h2>
Several parameter input pages are available:
<hr>
<br>
<h2>Population and Cluster Selection</h2>
This section permits selection of a cluster for analysis and defines the population to which the
cluster should be compared. The population selection panel, on top, allows the user to specify whether the
population set of gene indices should be loaded from a file or taken as all indices
loaded in the current Multiple Experiment Viewer. Note that if the current
viewer does not contain all population indices, it is important to use the default option of a
population file.<br><br>
A population file is a list of the indices from which the cluster
was segregated by statistical or other means. The file format consists of a column of indices with
one index per line. The population often contains one index for each element
on the array; however, there are circumstances where one might wish to disregard particular
spots such as internal controls.
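<br><br>
As a rough illustration of the one-index-per-line format described above, a population file could be read along the following lines. This is only a sketch: the function name parse_population and the decision to skip blank lines are assumptions made for the example, not part of MeV.
<pre>
```python
def parse_population(lines):
    """Return the non-empty entries of a one-index-per-line listing."""
    indices = []
    for line in lines:
        entry = line.strip()
        if entry:  # skipping blank lines is an assumption of this sketch
            indices.append(entry)
    return indices
```
</pre>
Passing an open file handle for the population file would yield the indices in file order.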
<br><br>
The cluster panel, below the population panel, displays gene clusters currently stored in MeV's cluster repository.
If no clusters have been saved, a blank browser page or empty table will be displayed and the Cluster Analysis mode option will be
disabled. Selecting a row in the cluster table will display the cluster in the expression graph area
of the browser. EASE cluster analysis will operate on the selected cluster.
<br>
<br>
<h2>Annotation Parameters Page</h2>
This page has three major parts, described below.
<h3>MeV Annotation Key</h3>
This area contains a drop-down list of the available annotation types that can be
used to identify genes. Generally it's best to use an index or accession that 'uniquely' identifies
the spotted material.
<h3>Annotation Conversion File</h3>
This optional file provides the mapping from your annotation key (above) to the index used to map to
biological themes (GO terms, KEGG pathways, etc.). If your annotation key type is the one used in the
linking file (below) then this conversion (mapping) is not needed.
<h3>Gene Annotation / Gene Ontology Linking Files</h3>
This section allows one to specify one or more annotation files. These files contain gene indices
paired with biological themes such as GO terms.
<h3><i>File Selection Scenario</i></h3>
<i>One possible example of the file linking structure could be:<br>
<b>[GenBank#]-->[GenBank#]:[locus_link_id]-->[locus_link_id]:[go_term]</b><br>
This shows the progression from 'Annotation Key', to conversion file (converting GenBank# to locus_link_id),
to final linking with GO terms. Keep in mind that although shown with a single arrow, in general
one gene index will map to many GO terms (or other biological theme or pathway categories).</i>
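<br><br>
The chain in the scenario above can be pictured as two lookups. The sketch below is illustrative only: the function name, the dictionary layout, and the identifiers are hypothetical placeholders, and falling back to the key itself when no conversion entry exists is an assumption that mirrors the case where no conversion file is needed.
<pre>
```python
def themes_for_key(annotation_key, conversion, linking):
    """Follow annotation key -> converted index -> list of themes."""
    # With no conversion entry, assume the key is already the linking index.
    index = conversion.get(annotation_key, annotation_key)
    return linking.get(index, [])


# Hypothetical placeholder identifiers, not real accessions or GO terms.
conversion = {"GENBANK_1": "LOCUS_1"}
linking = {"LOCUS_1": ["go_term_a", "go_term_b"]}
print(themes_for_key("GENBANK_1", conversion, linking))
```
</pre>
Note that the linking table maps one index to a list, reflecting that one gene index generally maps to many themes.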
<br>
<br>
<h2>Statistical Parameters Page</h2>
Several sections on this page are used to specify the reported statistic and the result trimming parameters.
<h2>Reported Statistic</h2>
<h3>Fisher's Exact Probability</h3>
Fisher's Exact Probability is used to assess whether a biological theme is
over-represented in the cluster of interest relative to the representation of that theme in the
total gene population. For example, suppose that one has a gene
list of 50 genes from a population of 10,000 genes. Now suppose that 10 of the 50 genes were related to
pathway "A" but only 13 genes in the total population were associated with pathway "A". This scenario
would yield a low probability that the observed number of hits (occurrences of pathway "A") within the small
sample could be due to chance alone. This statistic is based on the hypergeometric distribution and has
benefits over chi-square in that it is appropriate for finite populations. The reference cited for EASE
describes this statistic at length.
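<br><br>
To make the pathway "A" example concrete, the sketch below computes the upper-tail hypergeometric probability of seeing at least the observed number of list hits. This is the usual one-sided form of the test for over-representation; whether MeV computes exactly this tail is an assumption, and the function and parameter names are invented for the example.
<pre>
```python
from math import comb


def fisher_upper_tail(list_hits, list_size, pop_hits, pop_size):
    """P(X >= list_hits) when list_size genes are drawn without
    replacement from pop_size genes, pop_hits of which carry the theme."""
    max_hits = min(list_size, pop_hits)
    total = comb(pop_size, list_size)
    tail = sum(
        comb(pop_hits, i) * comb(pop_size - pop_hits, list_size - i)
        for i in range(list_hits, max_hits + 1)
    )
    return tail / total


# The example from the text: 10 of 50 list genes hit pathway "A",
# which has 13 members in a population of 10,000 genes.
p = fisher_upper_tail(10, 50, 13, 10000)
print(p)
```
</pre>
The printed value is extremely small, consistent with the statement that such a result is unlikely to be due to chance alone.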
<h3>EASE Score</h3>
The reported EASE Score is essentially a jackknifed Fisher's Exact Probability, obtained
by calculating Fisher's Exact Probability after one occurrence (list hit for a term) has been removed.
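<br><br>
A minimal sketch of this idea follows, assuming the upper-tail hypergeometric form of Fisher's Exact Probability and assuming that only the list hit count is reduced by one while the list and population sizes stay unchanged. The text does not say how the other counts are treated, so that choice, like the function names, is an assumption of the example.
<pre>
```python
from math import comb


def fisher_upper_tail(list_hits, list_size, pop_hits, pop_size):
    """Upper-tail hypergeometric probability P(X >= list_hits)."""
    max_hits = min(list_size, pop_hits)
    total = comb(pop_size, list_size)
    tail = sum(
        comb(pop_hits, i) * comb(pop_size - pop_hits, list_size - i)
        for i in range(list_hits, max_hits + 1)
    )
    return tail / total


def ease_score(list_hits, list_size, pop_hits, pop_size):
    """Same probability with one list hit removed (floored at zero hits)."""
    reduced_hits = max(list_hits - 1, 0)
    return fisher_upper_tail(reduced_hits, list_size, pop_hits, pop_size)
```
</pre>
Because one hit is removed, the score in this sketch is never smaller than the plain probability, and a term supported by a single gene scores 1.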
<h2>Multiplicity Corrections</h2>
Several p-value corrections can be applied to help correct for the chance of arriving at a significant
result when performing multiple tests.
<h3>Bonferroni Correction</h3>
This correction simply multiplies the statistic by the number of results generated. This is the most
stringent correction of the three options.
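<br><br>
The multiplication described above can be sketched as follows. Capping the corrected value at 1 is a common convention and an assumption here; the text does not state whether MeV caps the result.
<pre>
```python
def bonferroni(p_values):
    """Multiply each value by the number of results, capped at 1.0."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]
```
</pre>
With three results, a value of 0.01 becomes roughly 0.03, while 0.5 is capped at 1.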
<h3>Bonferroni Step Down Correction</h3>
This modified Bonferroni correction ranks the results by the statistic in ascending order. Each
value is multiplied by (n-rank) where n is the number of results. In the case of a tie, where two
results have the same probability, the rank is kept constant until the next element with
a higher probability value occurs. The rank is then adjusted for the number of tied elements where rank was constant.
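<br><br>
One way to read this procedure is sketched below. It assumes that ranks are counted from zero, so that the smallest value is multiplied by n, and that corrected values are capped at 1. Neither detail is stated in the text, and the function name is invented for the example.
<pre>
```python
def step_down_bonferroni(p_values):
    """Multiply each value by (n - rank), with tied values sharing a rank."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])
    corrected = [0.0] * n
    rank = 0  # zero-based rank is an assumption of this sketch
    for position, idx in enumerate(order):
        # Hold the rank constant across ties; when a higher value appears,
        # jump to its sorted position, which accounts for the tied elements.
        if position > 0 and p_values[idx] != p_values[order[position - 1]]:
            rank = position
        corrected[idx] = min(1.0, p_values[idx] * (n - rank))
    return corrected
```
</pre>
For the values 0.01, 0.01, 0.02 and 0.5 the ranks are 0, 0, 2 and 3, giving multipliers of 4, 4, 2 and 1.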
<h3>Sidak Method</h3>
This correction uses the following formula, where v' is the corrected value and k is the rank of the result
in terms of the original statistic value. In this case, ties in rank are handled as described in the step down Bonferroni correction.
<br>
v' = 1-(1-v)<sup>k</sup>

<h3>Resampling Probability Analysis</h3>
The resampling option performs a number of analysis iterations in which random
gene lists of the original cluster size are selected from the population without replacement.
The end result reported for a particular term is the probability of obtaining the determined
significance level by chance.
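<br><br>
The resampling idea can be sketched for a single term as follows. The sketch counts how often a random list contains at least as many hits as were observed, which for fixed list and population sizes corresponds to reaching at least the observed significance for that term. Treating each term on its own, and all names and defaults used here, are assumptions of the example rather than a description of MeV's implementation.
<pre>
```python
import random


def resampling_probability(population, term_genes, cluster_size,
                           observed_hits, iterations=1000, seed=0):
    """Fraction of random lists with at least observed_hits term genes."""
    rng = random.Random(seed)
    term_genes = set(term_genes)
    at_least = 0
    for _ in range(iterations):
        # rng.sample draws without replacement from the population list.
        sample = rng.sample(population, cluster_size)
        hits = sum(1 for gene in sample if gene in term_genes)
        if hits >= observed_hits:
            at_least += 1
    return at_least / iterations
```
</pre>
Here population is expected to be a list of gene indices and term_genes the subset annotated with the term of interest.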

<h2>Trim Parameters</h2>
The trim parameters can be applied to filter analysis results based on the number of hits
or the fraction of genes in the cluster that are represented by an annotation term. Sometimes
a term can be found significant but does not represent a large segment of the cluster of interest.
These options can be applied to ensure that a minimum number of genes in the cluster fall under
that particular annotation class. This feature should be used with caution so that biological
themes represented by very few genes are not excluded.
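<br><br>
As an illustration of the two trimming criteria, the sketch below filters a list of (term, hit count) pairs. The data layout, the function name, and the use of inclusive thresholds are assumptions made for the example.
<pre>
```python
def trim_results(results, cluster_size, min_hits=None, min_fraction=None):
    """Keep (term, hits) pairs that meet the optional thresholds."""
    kept = []
    for term, hits in results:
        enough_hits = min_hits is None or hits >= min_hits
        enough_fraction = (
            min_fraction is None or hits / cluster_size >= min_fraction
        )
        if enough_hits and enough_fraction:
            kept.append((term, hits))
    return kept
```
</pre>
With a cluster of 50 genes, a minimum of 3 hits drops terms supported by one or two genes, which is exactly the kind of exclusion the caution above refers to.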
</basefont>
</body>
</html>