2 |
26 Feb 07 |
jari |
1 |
<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">

<HTML>
<BODY bgcolor = "#FFFFCC"><basefont face = "Arial">
<h1>KNNC: K-Nearest Neighbors Validation</H1><H2>Parameter Information</H2>
<hr size = 10>
This option validates the training set using leave-one-out cross-validation, without classifying the unknowns.
<br>
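As a rough illustration of leave-one-out cross-validation, the sketch below (written in Python, which is not part of this tool) holds out each training vector in turn and classifies it from the remaining ones. The function names, the stand-in nearest-neighbor classifier, and the accuracy summary are assumptions made for this example, not the program's actual implementation.
<pre>
```python
import math

def nearest_label(query, training):
    """Hypothetical stand-in classifier: label of the single closest training vector."""
    return min(training, key=lambda item: math.dist(query, item[0]))[1]

def leave_one_out(training, classify):
    """Classify each training vector from all the others; return the fraction classified correctly."""
    correct = 0
    for i, (vector, label) in enumerate(training):
        rest = training[:i] + training[i + 1:]  # hold out vector i
        if classify(vector, rest) == label:
            correct += 1
    return correct / len(training)

# Illustrative training set: (vector, class label) pairs.
training = [((1.0, 1.0), 1), ((1.2, 0.8), 1), ((5.0, 5.0), 2), ((5.2, 4.9), 2)]
accuracy = leave_one_out(training, nearest_label)
```
</pre>
Any classifier with the same call shape could be passed in place of the stand-in.
<br>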
In the following description, "vector" refers to a given gene or experiment, depending on what is being classified.
An element of a vector is one of the expression values that constitute that vector. For a gene vector, the elements are
the expression values for that gene across all experiments, while for an experiment vector, the elements are
all the gene expression values for that experiment.
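<br>
The two kinds of vector can be pictured as the rows and columns of one expression matrix. The small Python sketch below assumes, purely for illustration, that rows are genes and columns are experiments; the values and variable names are made up.
<pre>
```python
# Assumed layout for this sketch: one row per gene, one column per experiment.
expression = [
    [0.5, 1.2, -0.3],  # gene A across three experiments
    [2.0, 1.8, 2.2],   # gene B across three experiments
]

# A gene vector is a row: that gene's values across all experiments.
gene_vectors = expression

# An experiment vector is a column: all gene values for that experiment.
experiment_vectors = [list(column) for column in zip(*expression)]
```
</pre>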

<H2>Classify genes or experiments</H2>
Selects whether the vectors to be classified are genes or experiments.
<BR>
<h2>Correlation filter</H2>
The correlation filter removes those vectors in the set to be classified that are not significantly
correlated with at least one member of the training set. The significance of a correlation is determined by its
p-value, which is calculated by a permutation test in which each vector is permuted a user-specified number of times.
<br>
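The filter described above can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the use of Pearson correlation, the one-sided comparison, the significance threshold alpha, and all function names are hypothetical choices for the example, since this page does not specify them. Vectors are assumed to be non-constant so that the correlation is defined.
<pre>
```python
import math
import random

def pearson(x, y):
    """Pearson correlation of two equal-length, non-constant vectors."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def permutation_p_value(vector, reference, n_permutations, rng):
    """Fraction of random permutations of vector that correlate with reference at least as well as the original."""
    observed = pearson(vector, reference)
    shuffled = list(vector)
    hits = 0
    for _ in range(n_permutations):
        rng.shuffle(shuffled)
        if pearson(shuffled, reference) >= observed:
            hits += 1
    return hits / n_permutations

def passes_filter(vector, training_vectors, n_permutations, alpha, rng):
    """Keep the vector if it is significantly correlated with at least one training vector."""
    return any(
        alpha > permutation_p_value(vector, t, n_permutations, rng)
        for t in training_vectors
    )

rng = random.Random(0)  # fixed seed so the sketch is repeatable
reference = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
candidate = [0.1, 0.9, 2.1, 2.9, 4.2, 5.0]
p_value = permutation_p_value(candidate, reference, 1000, rng)
kept = passes_filter(candidate, [reference], 1000, 0.05, rng)
```
</pre>
A larger number of permutations gives a finer-grained p-value at the cost of more computation.
<br>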
<h2>KNN Classification parameters</H2>
This is where the user specifies the expected number of classes (which is also the number of classes present
in the training set) and the number of neighbors, k. The number of neighbors is the number of vectors from the
training set that are chosen as neighbors of a given vector; Euclidean distance is used to determine the neighborhood.
For example, to classify a gene g, g is assigned to the class that is most frequently represented among its k nearest
neighbors from the training set. In case of a tie, gene g remains unassigned.
<br>
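The voting rule described above can be sketched in Python as follows. The function name, the data layout of (vector, class label) pairs, and the use of None for "unassigned" are assumptions made for this example, not the program's own code; k is assumed to be at least 1.
<pre>
```python
import math
from collections import Counter

def knn_classify(query, training, k):
    """Return the majority class among the k nearest training vectors, or None on a tie."""
    # Sort training vectors by Euclidean distance to the query and keep the k closest.
    neighbors = sorted(training, key=lambda item: math.dist(query, item[0]))[:k]
    votes = Counter(label for _, label in neighbors).most_common()
    # A tie for the highest vote count leaves the vector unassigned.
    if len(votes) > 1 and votes[0][1] == votes[1][1]:
        return None
    return votes[0][0]

# Illustrative training set: two classes with two vectors each.
training = [
    ((1.0, 1.0), 1),
    ((1.2, 0.8), 1),
    ((5.0, 5.0), 2),
    ((5.2, 4.9), 2),
]
result_clear = knn_classify((1.1, 1.0), training, k=3)  # two votes for class 1, one for class 2
result_tie = knn_classify((1.1, 1.0), training, k=4)    # two votes each, so unassigned
```
</pre>
With two classes, choosing an odd k avoids ties in the vote.
<br>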
<h2>Create / import training set</H2>
If the user chooses to import a previously created training set, clicking the “Next” button displays a file chooser
from which the training file can be selected. If an appropriate file is chosen, the KNN classification editor
is displayed with the class assignments from the file. If the option to create a new training set from data is chosen,
clicking the “Next” button displays the classification editor directly, with all vectors set to neutral.
<br>
<h2>Hierarchical Clustering</H2>
This checkbox selects whether to perform hierarchical clustering on the elements in each cluster created.
<br>
</basefont>
</BODY>
</HTML>