|
Peter Johansson and Jari Häkkinen
Improving missing value imputation of microarray
data by using spot quality weights
BMC Bioinformatics 7, 306
(2006)
Abstract:
Background: Microarray technology has become
popular for gene expression profiling, and many analysis
tools have been developed for data interpretation. Most
of these tools require complete data, but measurement
values are often missing. A way to overcome the problem
of incomplete data is to impute the missing data before
analysis. Many imputation methods have been suggested,
some naïve and others more sophisticated, taking into
account correlations in the data. However, these methods are
binary in the sense that each spot is considered either
missing or present. Hence, they depend on a cutoff
separating poor spots from good spots. We suggest a
different approach in which a continuous spot quality
weight is built into the imputation methods, allowing for
smooth imputation of all spots to a larger or lesser
degree.
Results: We assessed several imputation methods on
three data sets containing replicate measurements, and
found that weighted methods performed better than
non-weighted methods. Of the compared methods, best
performance and robustness were achieved with the
weighted nearest neighbours method (WeNNI), in which both
spot quality and correlations between genes were included
in the imputation.
Conclusion: Including a measure of spot quality
improves the accuracy of the missing value imputation.
WeNNI, the proposed method, is more accurate and less
sensitive to parameters than the widely used kNNimpute
and LSimpute algorithms.
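
The idea of building a continuous spot quality weight into a nearest
neighbours imputation can be illustrated with a minimal sketch. It is
not the WeNNI implementation from the paper: the distance measure, the
inverse-distance averaging, the blending rule, and all names
(weighted_distance, weighted_nn_impute, k, eps) are assumptions chosen
for illustration only. Weights are assumed to lie in [0, 1], with 0
meaning a spot that is not trusted at all.

```python
import math


def weighted_distance(xa, wa, xb, wb):
    """Quality-weighted RMS distance between two expression profiles."""
    num = den = 0.0
    for a, u, b, v in zip(xa, wa, xb, wb):
        joint = u * v  # a pair counts only as much as both spots are trusted
        num += joint * (a - b) ** 2
        den += joint
    return math.sqrt(num / den) if den > 0 else math.inf


def weighted_nn_impute(x, w, k=2, eps=1e-9):
    """Blend every spot with a neighbour-based estimate (illustrative only).

    x[g][j] is the measured value and w[g][j] in [0, 1] the quality weight
    of gene g in experiment j. Returns a new matrix; x is not modified.
    """
    n_genes = len(x)
    result = []
    for g in range(n_genes):
        # Rank all other genes by weighted distance and keep the k closest.
        dists = sorted(
            (weighted_distance(x[g], w[g], x[h], w[h]), h)
            for h in range(n_genes) if h != g
        )
        neighbours = [(d, h) for d, h in dists[:k] if math.isfinite(d)]
        row = []
        for j, (value, weight) in enumerate(zip(x[g], w[g])):
            num = den = 0.0
            for d, h in neighbours:
                # Close neighbours with a trusted spot contribute the most.
                c = w[h][j] / (d + eps)
                num += c * x[h][j]
                den += c
            if den > 0:
                estimate = num / den
                # Smooth imputation: poor spots lean on the estimate,
                # good spots keep most of their measured value.
                row.append(weight * value + (1.0 - weight) * estimate)
            else:
                row.append(value)  # no usable neighbour information
        result.append(row)
    return result
```

In this sketch a spot with weight 0 is replaced entirely by the
neighbour estimate, a spot with weight 1 keeps its measured value, and
intermediate weights give a mixture of the two, so no cutoff between
poor and good spots is needed.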
LU TP 05-40
|