svndigest - svndigest

mev-4.0.01/source/org/tigr/microarray/mev/cluster/gui/impl/dialogs/dialogHelpUtil/dialogHelpPages/ease_parameters.html

Rev	Date	Author	Line
2	26 Feb 07	jari	1	<html>
2	26 Feb 07	jari	2	<body bgcolor = "#FFFFCC"><basefont face = "Arial">
2	26 Feb 07	jari	3	<h1>EASE: Annotation Over-representation Analysis</h1> <h2>Parameter Information</h2>
2	26 Feb 07	jari	4	<hr size = 10>
2	26 Feb 07	jari	5
2	26 Feb 07	jari	6
2	26 Feb 07	jari	7	<h2>File Updates and Configuration</h2>
2	26 Feb 07	jari	8	<h3>Select EASE File System</h3>
2	26 Feb 07	jari	9	This button enables the selection of a local directory to be used as the source for annotation
2	26 Feb 07	jari	10	files for EASE analysis. Multiple file systems can be present to support a variety of array
2	26 Feb 07	jari	11	types. Selecting a file system directs file choosers to that area; however, files
2	26 Feb 07	jari	12	may be selected outside of the selected base file system if appropriate. Note that the selected directory
2	26 Feb 07	jari	13	should be the directory that contains the "Data" directory. In MeV's default data directory this
2	26 Feb 07	jari	14	would be the 'ease' directory.
2	26 Feb 07	jari	15
2	26 Feb 07	jari	16	<h3>Update EASE File System</h3>
2	26 Feb 07	jari	17	This button allows the download of EASE annotation file systems for a selected species and clone set.
2	26 Feb 07	jari	18	A selection dialog will allow species selection from a variety of plant and animal species.
2	26 Feb 07	jari	19	A list of many commercially available arrays for the selected species is also presented for selection.
2	26 Feb 07	jari	20	After species and array selection, a dialog will be presented to select a directory as the
2	26 Feb 07	jari	21	destination directory for the EASE file system. Zip files will then be downloaded and automatically
2	26 Feb 07	jari	22	extracted into the destination directory. The new base directory will be labeled with
2	26 Feb 07	jari	23	"ease_" and the selected array name. This new file system can be selected as the default system
2	26 Feb 07	jari	24	(see the "Select EASE File System" option above).
2	26 Feb 07	jari	25
2	26 Feb 07	jari	26	<h2>Mode Selection</h2>
2	26 Feb 07	jari	27	<h3>Cluster Analysis</h3>
2	26 Feb 07	jari	28	This mode performs annotation analysis on a selected subset (sample list or cluster) of the
2	26 Feb 07	jari	29	full data set loaded in MeV. The output is a list of biological 'themes' represented in
2	26 Feb 07	jari	30	the cluster and a statistic reporting the probability that a particular theme is over-represented in
2	26 Feb 07	jari	31	the cluster relative to its representation in the entire data set. The resulting table will
2	26 Feb 07	jari	32	initially be sorted by this statistic.
2	26 Feb 07	jari	33
2	26 Feb 07	jari	34	<h3>Annotation Survey</h3>
2	26 Feb 07	jari	35	The survey mode simply produces a list of biological themes that are represented in the data currently loaded
2	26 Feb 07	jari	36	in the viewer from which EASE is launched. Note that this could be a subset of the total slide data.
2	26 Feb 07	jari	37	If you want to survey all annotation on the slide you have to use a viewer with all of the slide's data
2	26 Feb 07	jari	38	loaded. The initial ordering of the output table is based on the prevalence of a theme in the data set (hit count).
2	26 Feb 07	jari	39	This mode can be used to cluster genes based on biological themes. The clusters can then be
2	26 Feb 07	jari	40	stored and marked (colored) for tracking during cluster analysis.
2	26 Feb 07	jari	41	<br>
2	26 Feb 07	jari	42
2	26 Feb 07	jari	43	<h2>Parameter Pages</h2>
2	26 Feb 07	jari	44	Several parameter input pages are available:
2	26 Feb 07	jari	45	<hr>
2	26 Feb 07	jari	46	<br>
2	26 Feb 07	jari	47	<h2>Population and Cluster Selection</h2>
2	26 Feb 07	jari	48	This section permits selection of a cluster for analysis and defines the population to which the
2	26 Feb 07	jari	49	cluster should be compared. The population selection panel, on top, allows the user to specify whether the
2	26 Feb 07	jari	50	population set of gene indices should be loaded from a file or if the population set should be
2	26 Feb 07	jari	51	taken as all indices loaded in the current Multiple Experiment Viewer. Note that if the current
2	26 Feb 07	jari	52	viewer does not contain all population indices it is important to use the default option of a
2	26 Feb 07	jari	53	population file.<br><br>
2	26 Feb 07	jari	54	A population file lists the indices from which the cluster
2	26 Feb 07	jari	55	was segregated by statistical or other means. The file format consists of a column of indices with
2	26 Feb 07	jari	56	one index per line. The population often contains one index for each element
2	26 Feb 07	jari	57	on the array; however, there are circumstances where one might wish to disregard particular
2	26 Feb 07	jari	58	spots such as internal controls.
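The population file format described above (one index per line) can be read with a few lines of code. This is a minimal sketch rather than MeV's own loader: the function name `read_population` and the `exclude` parameter are assumptions introduced here to illustrate dropping spots such as internal controls.

```python
def read_population(lines, exclude=()):
    """Return population indices, one per line, skipping blanks and excluded entries."""
    excluded = set(exclude)  # hypothetical list of indices to disregard, e.g. controls
    population = []
    for line in lines:
        index = line.strip()
        if index and index not in excluded:
            population.append(index)
    return population
```

Any iterable of lines works, so an open file handle can be passed directly in place of a list.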
2	26 Feb 07	jari	59	<br><br>
2	26 Feb 07	jari	60	The cluster panel, below the population panel, displays gene clusters currently stored in MeV's cluster repository.
2	26 Feb 07	jari	61	If no clusters have been saved, then a blank browser page or empty table will be displayed and the Cluster Analysis mode option will be
2	26 Feb 07	jari	62	disabled. Selecting a row in the cluster table will display the cluster in the expression graph area
2	26 Feb 07	jari	63	of the browser. EASE cluster analysis will operate on the selected cluster.
2	26 Feb 07	jari	64	<br>
2	26 Feb 07	jari	65	<br>
2	26 Feb 07	jari	66	<h2>Annotation Parameters Page</h2>
2	26 Feb 07	jari	67	This page has three major parts described below.
2	26 Feb 07	jari	68	<h3>MeV Annotation Key</h3>
2	26 Feb 07	jari	69	This area contains a drop-down list of the available annotation types that can be
2	26 Feb 07	jari	70	used to identify genes. Generally it's best to use an index or accession which 'uniquely' identifies
2	26 Feb 07	jari	71	the spotted material.
2	26 Feb 07	jari	72	<h3>Annotation Conversion File</h3>
2	26 Feb 07	jari	73	This optional file provides the mapping from your annotation key (above) to the index used to map to
2	26 Feb 07	jari	74	biological themes (GO terms, KEGG pathways, etc.). If your annotation key type is the one used in the
2	26 Feb 07	jari	75	linking file (below) then this conversion (mapping) is not needed.
2	26 Feb 07	jari	76	<h3>Gene Annotation / Gene Ontology Linking Files</h3>
2	26 Feb 07	jari	77	This section allows one to specify one or more annotation files. These files contain gene indices
2	26 Feb 07	jari	78	paired with biological themes such as GO terms.
2	26 Feb 07	jari	79	<h3><i>File Selection Scenario</i></h3>
2	26 Feb 07	jari	80	<i>One possible example of the file linking structure could be:<br>
2	26 Feb 07	jari	81	<b>[GenBank#]-->[GenBank#]:[locus_link_id]-->[locus_link_id]:[go_term]</b><br>
2	26 Feb 07	jari	82	This shows the progression from 'Annotation Key', to conversion file (converting GenBank# to locus_link_id),
2	26 Feb 07	jari	83	to final linking with GO terms. Keep in mind that although shown with a single arrow, in general
2	26 Feb 07	jari	84	one gene index will map to many GO terms (or other biological theme or pathway categories).</i>
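The linking chain in this scenario can be pictured as two lookups. The sketch below is illustrative only: the function name `themes_for_key` and all identifiers in the sample tables are made up, and it assumes a key missing from the conversion file simply yields no themes.

```python
def themes_for_key(key, linking, conversion=None):
    """Follow annotation key -> (optional conversion) -> list of biological themes."""
    # Without a conversion file the annotation key is already the linking index.
    index = key if conversion is None else conversion.get(key)
    return linking.get(index, [])

# Made-up identifiers, for illustration only.
conversion = {"GB_0001": "LL_10", "GB_0002": "LL_20"}
linking = {"LL_10": ["GO:term_a", "GO:term_b"], "LL_20": ["GO:term_b"]}

themes = themes_for_key("GB_0001", linking, conversion)
```

The linking table maps each index to a list because one gene index generally maps to many themes.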
2	26 Feb 07	jari	85	<br>
2	26 Feb 07	jari	86	<br>
2	26 Feb 07	jari	87	<h2>Statistical Parameters Page</h2>
2	26 Feb 07	jari	88	Several sections on this page are used to specify the reported statistic and result trimming parameters.
2	26 Feb 07	jari	89	<h2>Reported Statistic</h2>
2	26 Feb 07	jari	90	<h3>Fisher's Exact Probability</h3>
2	26 Feb 07	jari	91	Fisher's Exact Probability reports the probability that a biological theme is
2	26 Feb 07	jari	92	over-represented in the cluster of interest relative to the representation of that theme in the
2	26 Feb 07	jari	93	total gene population. For example, suppose that one has a gene
2	26 Feb 07	jari	94	list of 50 genes from a population of 10,000 genes. Now suppose that 10 of the 50 genes were related to
2	26 Feb 07	jari	95	pathway "A" but only 13 genes in the total population were associated with pathway "A". This scenario
2	26 Feb 07	jari	96	would yield a low probability that the observed number of hits (occurrences of pathway "A") within the small
2	26 Feb 07	jari	97	sample could be due to chance alone. This statistic is based on the hypergeometric distribution and has
2	26 Feb 07	jari	98	benefits over chi-square in that it is appropriate for finite populations. The reference cited for EASE
2	26 Feb 07	jari	99	describes this statistic at length.
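The pathway "A" example above can be worked through numerically. The following is a minimal sketch, assuming the reported value is the one-sided (upper tail) hypergeometric probability of seeing at least the observed number of hits; the function name `fisher_upper_tail` is introduced here for illustration and is not taken from MeV.

```python
from math import comb

def fisher_upper_tail(list_hits, list_size, pop_hits, pop_size):
    """P(X >= list_hits) when list_size genes are drawn without replacement."""
    total = comb(pop_size, list_size)      # all possible gene lists of this size
    max_hits = min(list_size, pop_hits)    # hits cannot exceed either count
    tail = sum(
        comb(pop_hits, x) * comb(pop_size - pop_hits, list_size - x)
        for x in range(list_hits, max_hits + 1)
    )
    return tail / total

# 10 of 50 list genes hit pathway "A"; 13 of 10,000 population genes do.
p = fisher_upper_tail(10, 50, 13, 10000)
```

Exact integer binomial coefficients are used throughout, so the only rounding happens in the final division.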
2	26 Feb 07	jari	100	<h3>EASE Score</h3>
2	26 Feb 07	jari	101	The EASE Score reported is essentially a jackknifed Fisher's Exact Probability, arrived at
2	26 Feb 07	jari	102	by calculating Fisher's Exact Probability after one occurrence (list hit for a term) has been removed.
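A rough sketch of the jackknife idea follows. It assumes a one-sided hypergeometric tail for Fisher's Exact Probability, and it assumes that removing one list hit lowers both the hit count and the list size by one while the population counts stay unchanged; this page does not state which counts MeV adjusts, so treat that choice as an assumption. Both function names are hypothetical.

```python
from math import comb

def fisher_upper_tail(list_hits, list_size, pop_hits, pop_size):
    """P(X >= list_hits) when list_size genes are drawn without replacement."""
    total = comb(pop_size, list_size)
    max_hits = min(list_size, pop_hits)
    tail = sum(
        comb(pop_hits, x) * comb(pop_size - pop_hits, list_size - x)
        for x in range(list_hits, max_hits + 1)
    )
    return tail / total

def ease_score(list_hits, list_size, pop_hits, pop_size):
    """Fisher's Exact Probability after removing one list hit (assumed adjustment)."""
    if list_hits < 1:
        return 1.0  # nothing to remove; the term is not represented in the list
    return fisher_upper_tail(list_hits - 1, list_size - 1, pop_hits, pop_size)
```

Under these assumptions a term supported by a single list hit always scores 1.0, so the jackknife penalizes themes that rest on very few genes.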
2	26 Feb 07	jari	103	<h2>Multiplicity Corrections</h2>
2	26 Feb 07	jari	104	Several p-value corrections can be applied to help correct for the chance of arriving at a significant
2	26 Feb 07	jari	105	result when performing multiple tests.
2	26 Feb 07	jari	106	<h3>Bonferroni Correction</h3>
2	26 Feb 07	jari	107	This correction simply multiplies the statistic by the number of results generated. This is the most
2	26 Feb 07	jari	108	stringent correction of the three options.
2	26 Feb 07	jari	109	<h3>Bonferroni Step Down Correction</h3>
2	26 Feb 07	jari	110	This modified Bonferroni correction ranks the results by the statistic in ascending order. Each
2	26 Feb 07	jari	111	value is multiplied by (n-rank), where n is the number of results. In the case of a tie, where two
2	26 Feb 07	jari	112	results have the same probability, the rank is kept constant until the next element occurs having
2	26 Feb 07	jari	113	a higher probability value. The rank is then adjusted for the number of tied elements where rank was constant.
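The two Bonferroni variants can be sketched as follows. Two details are assumptions not stated on this page: the rank is taken to start at 0, so that the smallest value is multiplied by n, and corrected values are capped at 1.0. The function names are illustrative.

```python
def bonferroni(p_values):
    """Multiply every value by the number of results (capped at 1.0, an assumption)."""
    n = len(p_values)
    return [min(1.0, p * n) for p in p_values]

def bonferroni_step_down(p_values):
    """Multiply each value by (n - rank); tied values share the same rank."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # ascending by statistic
    corrected = [0.0] * n
    rank = 0  # assumption: ranks start at 0
    for position, index in enumerate(order):
        # A higher value moves the rank past every tied element before it.
        if position > 0 and p_values[index] > p_values[order[position - 1]]:
            rank = position
        corrected[index] = min(1.0, p_values[index] * (n - rank))
    return corrected
```

Results are returned in the original input order, so they can be placed back alongside the uncorrected values.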
2	26 Feb 07	jari	114	<h3>Sidak Method</h3>
2	26 Feb 07	jari	115	This correction uses the formula below, where v is the original value, v' is the corrected value, and k is the rank of the result
2	26 Feb 07	jari	116	in terms of original statistic value. In this case, ties in rank are handled as described in the step down Bonferroni correction.
2	26 Feb 07	jari	117	<br>
2	26 Feb 07	jari	118	v' = 1-(1-v)<sup>k</sup>
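The formula can be applied directly once k is known. The sketch below takes k as an argument and makes no assumption about how the rank is numbered; the function name `sidak` is illustrative.

```python
def sidak(v, k):
    """Apply v' = 1 - (1 - v)**k to a single value v with rank-derived exponent k."""
    return 1.0 - (1.0 - v) ** k

corrected = sidak(0.01, 10)
```

For small v the result is close to, but slightly below, the product v * k.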
2	26 Feb 07	jari	119
2	26 Feb 07	jari	120	<h3>Resampling Probability Analysis</h3>
2	26 Feb 07	jari	121	The resampling option performs a number of analysis iterations in which random
2	26 Feb 07	jari	122	gene lists of the original cluster size are selected from the population without replacement.
2	26 Feb 07	jari	123	The end result reported for a particular term is the probability of obtaining the determined
2	26 Feb 07	jari	124	significance level by chance.
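The resampling idea can be sketched for a single term. This is an illustration under stated assumptions, not MeV's implementation: it uses a one-sided hypergeometric tail as the significance measure, it looks at one term in isolation, and the names `resampled_probability`, `iterations`, and `seed` are introduced here.

```python
import random
from math import comb

def fisher_upper_tail(list_hits, list_size, pop_hits, pop_size):
    """P(X >= list_hits) when list_size genes are drawn without replacement."""
    total = comb(pop_size, list_size)
    max_hits = min(list_size, pop_hits)
    tail = sum(
        comb(pop_hits, x) * comb(pop_size - pop_hits, list_size - x)
        for x in range(list_hits, max_hits + 1)
    )
    return tail / total

def resampled_probability(population, term_genes, cluster, iterations=1000, seed=0):
    """Fraction of random lists that look at least as significant as the cluster."""
    rng = random.Random(seed)          # fixed seed keeps the sketch reproducible
    population = list(population)
    term_genes = set(term_genes)
    pop_hits = sum(1 for gene in population if gene in term_genes)

    def p_value(gene_list):
        hits = sum(1 for gene in gene_list if gene in term_genes)
        return fisher_upper_tail(hits, len(gene_list), pop_hits, len(population))

    observed = p_value(cluster)
    as_extreme = 0
    for _ in range(iterations):
        # Random list of the original cluster size, drawn without replacement.
        random_list = rng.sample(population, len(cluster))
        if p_value(random_list) <= observed:
            as_extreme += 1
    return as_extreme / iterations
```

More iterations give a finer-grained estimate at the cost of run time.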
2	26 Feb 07	jari	125
2	26 Feb 07	jari	126	<h2>Trim Parameters</h2>
2	26 Feb 07	jari	127	The trim parameters can be applied to filter analysis results based on the number of hits
2	26 Feb 07	jari	128	or the fraction of genes in the cluster that are represented by an annotation term. Sometimes
2	26 Feb 07	jari	129	a term can be found significant but does not represent a large segment of the cluster of interest.
2	26 Feb 07	jari	130	These options can be applied to be certain that a minimum number of genes in the cluster fall under
2	26 Feb 07	jari	131	that particular annotation class. This feature should be used with caution so that biological
2	26 Feb 07	jari	132	themes represented by very few genes are not excluded.
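The effect of the trim parameters can be sketched as a simple filter. The tuple layout `(term, hits, cluster_size)` and the parameter names `min_hits` and `min_fraction` are assumptions made for this illustration; with the default thresholds nothing is removed.

```python
def trim_results(results, min_hits=0, min_fraction=0.0):
    """Keep terms meeting the minimum hit count and minimum fraction of the cluster."""
    kept = []
    for term, hits, cluster_size in results:
        if hits >= min_hits and hits / cluster_size >= min_fraction:
            kept.append((term, hits, cluster_size))
    return kept
```

Raising either threshold removes terms backed by few genes, which is why the caution above applies.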
2	26 Feb 07	jari	133	</basefont>
2	26 Feb 07	jari	134	</body>
2	26 Feb 07	jari	135	</html>