<html>
<body bgcolor = "#FFFFCC"><basefont face = "Arial">
<h1>SOTA: Self Organizing Tree Algorithm</h1> <h2>Parameter Information</h2>
<hr size = 10>
<h2>General SOTA Terminology</h2>
The topology of the resulting tree is a binary tree structure where each terminal
node represents a cluster.
<br><br>
Centroid Vector: a vector that is representative of the membership of a node.
<br><br>
Members: Expression Elements associated with a Node.
<br><br>
Node: a structure which contains a Centroid Vector and a number of associated
expression profiles (members).
<br><br>
Cell: a Node which is the terminal Node in a branch of the tree (a.k.a. leaf node).
The members of the cell are considered members of an expression cluster.
<br>
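<br>
The terminology above can be pictured as a small data structure. The following is only an illustrative Python sketch, not MeV's implementation: the class name <code>Node</code>, its field names, and the helper <code>cells</code> are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    """Hypothetical tree node: a centroid vector plus its associated members."""
    centroid: List[float]
    members: List[List[float]] = field(default_factory=list)
    left: Optional["Node"] = None
    right: Optional["Node"] = None
    # The parent link is left out of repr/compare to avoid walking cycles.
    parent: Optional["Node"] = field(default=None, repr=False, compare=False)

    def is_cell(self) -> bool:
        # A cell is a terminal (leaf) node; its members form one cluster.
        return self.left is None and self.right is None


def cells(node: Node) -> List[Node]:
    """Collect every cell (leaf node) in the subtree rooted at `node`."""
    if node.is_cell():
        return [node]
    found: List[Node] = []
    for child in (node.left, node.right):
        if child is not None:
            found.extend(cells(child))
    return found
```

In this sketch the clusters reported by the algorithm would simply be the member lists of the nodes returned by <code>cells(root)</code>.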
<h2>Sample Selection</h2>
This area is used to select whether to cluster genes or samples.
<h2> Distance Metric Selection </h2>
This area allows the selection of the metric to be used to assess gene-to-gene
or sample-to-sample distances. The metric initially displayed (chosen) corresponds to the global
setting in the Multiple Array Viewer's 'Metrics' menu. Changing the
metric in this dialog only alters the metric used for the current
algorithm run. The global setting in the main 'Metrics' menu remains unchanged.
<br><br>
An appendix in the MeV manual describes the distance metrics offered in MeV.
<br>
<h2>Growth Termination Criteria</h2>
<h3>Max Cycles</h3>
This integer value represents the maximum number of iterations allowed. The number
of clusters produced by SOTA is (Max Cycles + 1) unless another termination criterion is satisfied before
the indicated maximum number of cycles is reached.
<br>
<h3>Max epochs/cycle</h3>
This integer value indicates the maximum number of training epochs allowed per cycle.
<br>
<h3>Max. Cell Diversity</h3>
This value represents the maximum variability allowed within a cluster.
All resulting clusters will fall below this level of 'diversity'
(mean gene-to-cluster-centroid distance) if diversity is used as the cell
division criterion, unless Max Cycles is reached first, in which case some
clusters may still exceed this value.
<br>

<h3>Min Epoch Error Improvement</h3>
This value is used as a threshold for signaling the start of a new cycle and a
cell division. The tree diversity is monitored during a training epoch, and
when the diversity fails to improve by more than this value, training is
considered to have stabilized and a new cycle begins.
<br>
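<br>
The stabilization test described above, combined with Max epochs/cycle, can be sketched as follows. This is a hedged illustration only: the function name <code>epochs_until_stable</code> and the input list of per-epoch diversity values are hypothetical, and measuring improvement relative to the previous epoch is an assumption (the text above does not say whether the threshold is relative or absolute).

```python
def epochs_until_stable(epoch_errors, min_improvement, max_epochs):
    """Return the number of epochs run before the cycle ends.

    `epoch_errors` is a hypothetical list of tree diversity values,
    one per training epoch, in the order they were observed.
    """
    previous = None
    for epoch, error in enumerate(epoch_errors[:max_epochs], start=1):
        if previous is not None and previous > 0:
            # Assumption: improvement is relative to the previous epoch.
            improvement = (previous - error) / previous
            if improvement <= min_improvement:
                # Training has stabilized; a new cycle would begin here.
                return epoch
        previous = error
    # No stabilization detected: the cycle ends at the epoch limit.
    return min(len(epoch_errors), max_epochs)
```

For example, with diversities 10.0, 6.0, 5.0, 4.99 and a threshold of 0.01, the fourth epoch improves by only about 0.2 percent, so the cycle would end there.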

<h3>Run Maximum Number of Cycles (unrestricted growth)</h3>
The algorithm will run until Max Cycles is reached or until the input
set is fully partitioned, such that each cluster contains one gene or several identical gene vectors.
<br>

<h2>Centroid Migration and Neighborhood Parameters</h2>
<h3>Migration Weights</h3>
These values are used to scale the movement of cluster centroids
(characteristic gene expression patterns) toward a gene vector which has been
associated with a neighborhood. When a gene is associated with a cluster,
the centroid adapts to become more like the newly associated gene vector.
The parent and sister cell migration weights should be smaller than the
weight for the winning cell (the cell with which the gene vector is associated).
<br>
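<br>
A common way to express this kind of centroid migration is a weighted step toward the gene vector. The sketch below assumes that form of update; the function name <code>migrate</code> and the three weight values are illustrative choices, not settings taken from this dialog.

```python
def migrate(centroid, gene, weight):
    """Move `centroid` a fraction `weight` of the way toward `gene`."""
    return [c + weight * (g - c) for c, g in zip(centroid, gene)]


# Illustrative weights only: winner largest, parent and sister smaller.
WINNER_WEIGHT, PARENT_WEIGHT, SISTER_WEIGHT = 0.01, 0.005, 0.001

gene = [1.0, 3.0]
winner = migrate([0.0, 0.0], gene, WINNER_WEIGHT)   # moves the most
parent = migrate([0.0, 0.0], gene, PARENT_WEIGHT)
sister = migrate([0.0, 0.0], gene, SISTER_WEIGHT)   # moves the least
```

A weight of 0 leaves the centroid where it is and a weight of 1 places it exactly on the gene vector, which is why small weights with the winner largest give gradual adaptation.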

<h3>Neighborhood Level</h3>
This value determines which cells are candidates to accept new expression elements.
When elements are considered for redistribution to a new node during a cell division,
candidate cells are determined by moving up the tree toward the root this number of levels.
From that node, all cells (terminal nodes) within its subtree are targets for possibly
accepting expression vectors. (Each vector moves into the cell to which it is most similar.)
<br>
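<br>
The candidate search described above can be sketched as a short tree walk. The class <code>TreeNode</code> and the function <code>candidate_cells</code> are hypothetical names used for illustration, and stopping at the root when the level exceeds the tree depth is an assumption.

```python
class TreeNode:
    """Minimal hypothetical node holding only the links needed here."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)


def candidate_cells(cell, level):
    """Climb `level` steps toward the root, then return every leaf
    (cell) in the subtree of the node that was reached."""
    top = cell
    for _ in range(level):
        if top.parent is None:
            break  # assumption: never climb past the root
        top = top.parent
    stack, leaves = [top], []
    while stack:
        node = stack.pop()
        if node.children:
            # Reverse so that leaves come out in left-to-right order.
            stack.extend(reversed(node.children))
        else:
            leaves.append(node)
    return leaves
```

With a level of 0 only the dividing cell itself is a candidate; larger levels widen the set of cells that may receive redistributed vectors.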

<h2>Cell Division Criteria</h2>
<h3>Use Cell Diversity</h3>
Cell diversity is the mean distance from the cell's members (expression profiles) to the cell's
centroid vector. When considering which cell to divide, the cell with the greatest diversity
is split, provided its diversity exceeds Max Cell Diversity (see above).
<br>
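<br>
As a rough illustration of the diversity criterion, the sketch below computes the mean member-to-centroid distance and picks the cell to split. The function names are hypothetical, and Euclidean distance is used only as a stand-in for whichever metric is selected in the dialog.

```python
import math


def euclidean(a, b):
    # Stand-in metric; the actual run uses the metric chosen in the dialog.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def diversity(members, centroid, dist=euclidean):
    """Mean distance from each member to the centroid (0.0 if empty)."""
    if not members:
        return 0.0
    return sum(dist(m, centroid) for m in members) / len(members)


def pick_cell_to_split(cells, max_diversity, dist=euclidean):
    """`cells` is a list of (centroid, members) pairs. Return the index of
    the most diverse cell, or None if no cell exceeds `max_diversity`."""
    scores = [diversity(members, centroid, dist) for centroid, members in cells]
    best = max(range(len(cells)), key=scores.__getitem__, default=None)
    if best is None or scores[best] <= max_diversity:
        return None
    return best
```

Returning <code>None</code> corresponds to the case where every cell is already at or below Max Cell Diversity, so growth would stop.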

<h3>Use Cell Variability</h3>
Cell variability is the maximum element-to-element distance within a cell.
The cell having the largest internal gene-to-gene distance is selected as the next cell to divide.
In this case the stopping criterion is changed so that growth
continues until the most variable cell falls below a variability cutoff generated using
the provided pValue (see below).
<br>
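<br>
The variability criterion can be sketched in a few lines. The function names below are hypothetical, Euclidean distance stands in for the selected metric, and the cutoff is passed in as a plain number rather than derived from a pValue.

```python
import math
from itertools import combinations


def euclidean(a, b):
    # Stand-in metric; the actual run uses the metric chosen in the dialog.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def variability(members, dist=euclidean):
    """Largest pairwise distance in a cell (0.0 with fewer than two members)."""
    return max((dist(a, b) for a, b in combinations(members, 2)), default=0.0)


def next_cell_to_divide(cells, cutoff, dist=euclidean):
    """`cells` is a list of member lists. Return the index of the most
    variable cell, or None once it is no longer above `cutoff`."""
    scores = [variability(members, dist) for members in cells]
    best = max(range(len(cells)), key=scores.__getitem__, default=None)
    if best is None or scores[best] <= cutoff:
        return None
    return best
```

Because every pair in a cell is compared, the cost of this check grows quadratically with the number of members in the cell.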

<h3>pValue</h3>
This value is used when variability is the cell division criterion. A distribution of all element-to-element
distances is generated by resampling the data set, with each expression vector having a randomized ordering of its vector elements.
The resulting distribution represents random gene-to-gene distances. The pValue supplied is applied to this
resampled distribution to generate a variability cutoff.
For clusters falling below this variability cutoff, the probability that members
are paired by chance is at or below the supplied pValue.
<br>
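<br>
One way to read the resampling procedure above is sketched below. It is an interpretation, not MeV's code: the function name <code>variability_cutoff</code> is hypothetical, a single shuffling pass is assumed, Euclidean distance stands in for the selected metric, and the cutoff is taken as the lower pValue quantile of the random distances.

```python
import math
import random
from itertools import combinations


def euclidean(a, b):
    # Stand-in metric; the actual run uses the metric chosen in the dialog.
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def variability_cutoff(data, p_value, dist=euclidean, rng=None):
    """Estimate a distance cutoff from randomized expression vectors."""
    if rng is None:
        rng = random.Random()
    shuffled = []
    for vector in data:
        permuted = list(vector)
        rng.shuffle(permuted)  # randomize the order of this vector's elements
        shuffled.append(permuted)
    # Distribution of random element-to-element distances, smallest first.
    distances = sorted(dist(a, b) for a, b in combinations(shuffled, 2))
    if not distances:
        raise ValueError("need at least two expression vectors")
    # Assumption: the cutoff is the distance at the p_value quantile.
    index = min(len(distances) - 1, max(0, int(p_value * len(distances))))
    return distances[index]
```

Under this reading, a smaller pValue gives a smaller cutoff, so the tree must grow further before its most variable cell falls below it.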

<h2>Hierarchical Clustering</h2>
This check box selects whether to perform hierarchical clustering on the elements in each cluster
created.
<br>

</basefont>
</body>
</html>