svndigest - svndigest

mev-4.0.01/source/org/tigr/microarray/mev/cluster/gui/impl/dialogs/dialogHelpUtil/dialogHelpPages/knnc_parameters2.html

Rev	Date	Author	Line
2	26 Feb 07	jari	1	<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
2	26 Feb 07	jari	2
2	26 Feb 07	jari	3	<HTML>
2	26 Feb 07	jari	4	<BODY bgcolor = "#FFFFCC"><basefont face = "Arial">
2	26 Feb 07	jari	5	<h1>KNNC: K-Nearest Neighbors Classification</H1><H2>Parameter Information</H2>
2	26 Feb 07	jari	6	<hr size = 10>
2	26 Feb 07	jari	7	In the following description "vector" refers to a given gene or experiment, depending on what is being classified.
2	26 Feb 07	jari	8	An element of a vector is one of the expression values that constitute that vector. For a gene vector, its elements would
2	26 Feb 07	jari	9	consist of the expression values for that gene across all experiments, while for an experiment vector, its elements would
2	26 Feb 07	jari	10	consist of all the gene expression values for that experiment.
2	26 Feb 07	jari	11
2	26 Feb 07	jari	12	<H2>Classify genes or experiments</H2>
2	26 Feb 07	jari	13	Selects whether the vectors to be classified are genes or experiments.
2	26 Feb 07	jari	14	<BR>
2	26 Feb 07	jari	15	<H2>Variance filter</H2>
2	26 Feb 07	jari	16	This is the first of two noise-reduction filters that can optionally be applied before classification.
2	26 Feb 07	jari	17	The variance filter keeps only those vectors in the entire data set (including the training set) that
2	26 Feb 07	jari	18	have the highest variance across all elements in that vector. The number of vectors to be retained is specified by
2	26 Feb 07	jari	19	the user. Note that after applying variance filtering, the training set might be smaller than the one initially
2	26 Feb 07	jari	20	specified by the user, as some of the training vectors might have been filtered out.
2	26 Feb 07	jari	21	<br>
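The ranking step of the variance filter can be illustrated with a minimal Python sketch. It is not the MeV implementation: the function name `variance_filter` and the choice of population variance are assumptions made for illustration (for vectors of equal length, sample and population variance give the same ranking). It returns the indices of the retained vectors, which makes it easy to see which training vectors survived.

```python
from statistics import pvariance


def variance_filter(vectors, keep):
    """Return the indices of the `keep` highest-variance vectors.

    Hypothetical helper: `vectors` is a list of equal-length lists of
    expression values, `keep` is the user-specified number to retain.
    Indices are returned in their original order.
    """
    ranked = sorted(
        range(len(vectors)),
        key=lambda i: pvariance(vectors[i]),
        reverse=True,
    )
    return sorted(ranked[:keep])


vectors = [
    [1, 1, 1],        # no variation
    [0, 5, 10],       # moderate variation
    [2, 3, 4],        # small variation
    [0, 100, -100],   # large variation
]
kept = variance_filter(vectors, keep=2)
print(kept)
```

If index 0 or 2 had been a training vector, it would no longer be part of the training set after filtering, which is the effect described above.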
2	26 Feb 07	jari	22	<h2>Correlation filter</H2>
2	26 Feb 07	jari	23	The correlation filter is used to filter out those vectors of the set to be classified that are not significantly
2	26 Feb 07	jari	24	correlated with at least one member of the training set. The significance of correlation is determined by the
2	26 Feb 07	jari	25	p-value, which is calculated by a permutation test in which each vector is permuted a user-specified number of times.
2	26 Feb 07	jari	26	<br>
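A permutation test of this kind can be sketched as follows. The text above does not say which correlation measure, which sidedness, or which significance threshold is used, so this sketch assumes Pearson correlation, a one-sided test for positive correlation, and a user-supplied threshold `alpha`. All function and parameter names are hypothetical.

```python
import random
from math import sqrt


def pearson(x, y):
    """Pearson correlation; returns 0.0 if either vector is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def permutation_p_value(x, y, permutations, rng):
    """Fraction of shuffles of x that correlate with y at least as well as x does."""
    observed = pearson(x, y)
    shuffled = list(x)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if pearson(shuffled, y) >= observed:
            hits += 1
    return hits / permutations


def correlation_filter(candidates, training, permutations, alpha, seed=0):
    """Indices of candidates significantly correlated with at least one training vector."""
    rng = random.Random(seed)
    return [
        i
        for i, vector in enumerate(candidates)
        if any(
            permutation_p_value(vector, t, permutations, rng) <= alpha
            for t in training
        )
    ]


training = [[2, 4, 6, 8, 10, 12, 14, 16]]
candidates = [
    [1, 2, 3, 4, 5, 6, 7, 8],   # rises with the training vector
    [8, 7, 6, 5, 4, 3, 2, 1],   # falls as the training vector rises
    [5, 5, 5, 5, 5, 5, 5, 5],   # constant
]
retained = correlation_filter(candidates, training, permutations=200, alpha=0.05)
print(retained)
```

More permutations give a finer-grained p-value (the smallest nonzero value is 1 / permutations) at the cost of run time.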
2	26 Feb 07	jari	27	<h2>KNN Classification parameters</H2>
2	26 Feb 07	jari	28	This is where the user specifies the expected number of classes (which is also the number of classes present
2	26 Feb 07	jari	29	in the training set). The number of neighbors is the number of vectors from the training set that are chosen as
2	26 Feb 07	jari	30	neighbors to a given vector. Euclidean distance is used to determine the neighborhood. Let's say we want to
2	26 Feb 07	jari	31	classify a gene g. Gene g is assigned to the class that is most frequently represented among its k nearest
2	26 Feb 07	jari	32	neighbors from the training set (where k is specified by the user). In case of a tie, gene g remains unassigned.
2	26 Feb 07	jari	33	<br>
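The voting rule described above can be sketched in a few lines of Python. This is an illustration, not the MeV code: the function name `knn_classify` and the representation of the training set as (vector, class label) pairs are assumptions, and the text does not say how equal distances are handled when picking neighbors (here the earlier training vector wins).

```python
from collections import Counter
from math import dist  # Euclidean distance, Python 3.8+


def knn_classify(vector, training, k):
    """Classify `vector` by majority vote among its k nearest training vectors.

    `training` is a list of (vector, class_label) pairs. Returns the winning
    class label, or None if the top classes are tied (vector stays unassigned).
    """
    neighbors = sorted(training, key=lambda item: dist(vector, item[0]))[:k]
    votes = Counter(label for _, label in neighbors).most_common()
    if len(votes) > 1 and votes[0][1] == votes[1][1]:
        return None  # tie between the most frequent classes
    return votes[0][0]


training = [
    ([0.0, 0.0], 1),
    ([0.0, 1.0], 1),
    ([5.0, 5.0], 2),
    ([5.0, 6.0], 2),
]
print(knn_classify([0.2, 0.1], training, k=3))  # two of three neighbors are class 1
print(knn_classify([2.5, 3.0], training, k=4))  # two votes each, so unassigned
```

Choosing an odd k avoids ties when there are only two classes, but ties remain possible with three or more classes.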
2	26 Feb 07	jari	34	<h2>Create / import training set</H2>
2	26 Feb 07	jari	35	If the user chooses to import a previously created training set, on hitting the "Next" button a file chooser is
2	26 Feb 07	jari	36	displayed from which the training file can be chosen. If an appropriate file is chosen, the KNN classification editor is
2	26 Feb 07	jari	37	displayed with the class assignments from the file. If the option to create a new training set from data is chosen,
2	26 Feb 07	jari	38	on hitting the "Next" button the classification editor is directly displayed with all vectors set to neutral.
2	26 Feb 07	jari	39	<br>
2	26 Feb 07	jari	40	<h2>Hierarchical Clustering</H2>
2	26 Feb 07	jari	41	This checkbox selects whether to perform hierarchical clustering on the elements in each cluster created.
2	26 Feb 07	jari	42	<br>
2	26 Feb 07	jari	43	</basefont>
2	26 Feb 07	jari	44	</BODY>
2	26 Feb 07	jari	45	</HTML>