C. elegans microarray data seen through a novel nonmetric multidimensional scaling method
Y-h. Taguchi (1), Y. Oono (2)
(1) tag@granular.com, Department of Physics, Chuo University; (2) y-oono@uiuc.edu, Department of Physics, UIUC
We have developed a novel nonmetric multidimensional scaling
method that avoids the intermediate distances conventionally needed for
monotone fitting. Consequently, the algorithm is maximally nonmetric
and is efficient enough to allow large-scale data analysis on a
small computer.
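As a rough illustration of what "nonmetric" means in this setting, the following Python sketch scores a candidate embedding using only the rank order of the pairwise dissimilarities. It is neither the authors' Fortran code nor their algorithm. The function names pair_ranks and rank_agreement are hypothetical, and the Spearman-type rank correlation is an assumed figure of merit chosen for illustration.

```python
import numpy as np

def pair_ranks(square):
    """Rank the upper-triangle entries of a symmetric matrix (0 = smallest).

    Ties are broken by position rather than averaged, which is adequate
    for continuous data where exact ties are rare.
    """
    rows, cols = np.triu_indices(square.shape[0], k=1)
    values = square[rows, cols]
    ranks = np.empty(values.size, dtype=float)
    ranks[np.argsort(values, kind="stable")] = np.arange(values.size)
    return ranks

def rank_agreement(dissimilarity, coords):
    """Rank correlation between input dissimilarities and embedded distances.

    dissimilarity: (n, n) symmetric matrix; coords: (n, d) embedding.
    Only the ordering of the dissimilarities enters the result.
    """
    diff = coords[:, None, :] - coords[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    return np.corrcoef(pair_ranks(dissimilarity), pair_ranks(dist))[0, 1]
```

Because only ranks are compared, replacing the dissimilarities by any strictly increasing function of themselves leaves the score unchanged, which is the sense in which such a criterion involves no intermediate distances.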
The C. elegans microarray data available at
http://www.sciencemag.org/feature/data/kim1061603/kimbig.zip
have been analyzed by this method. We have found that the
information captured in the correlation coefficients of the gene
expression levels can be visualized in a 3D space as a thick
spherical shell. The dimensionality of this space is uniquely
specified by the data. The locations of the genes are consistent
with their known annotation results and are stable against the
choice of experiments and genes used in the analysis, provided the
number of randomly chosen experiments is more than 100, and
that of genes more than 1000. The 3D coordinates of the
embedded genes turn out to be expressible as linear
combinations of a small number of the original microarray
experimental data. The Fortran source code is available at
http://www.granular.com/MDS/.
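
Two of the steps described above can be sketched in Python under stated assumptions. The abstract does not say how correlation coefficients are converted to dissimilarities, so the 1 - r form below is an assumption. The least-squares fit is likewise only one possible way to check whether embedded coordinates are linear combinations of the experimental data. The function names correlation_dissimilarity and linear_fit_r2 are hypothetical and are not taken from the authors' code.

```python
import numpy as np

def correlation_dissimilarity(expr):
    """Gene-by-gene dissimilarity from Pearson correlation.

    expr: (genes, experiments) expression matrix.
    Assumed form: 1 - r, so perfectly correlated genes are at 0
    and perfectly anticorrelated genes are at 2.
    """
    return 1.0 - np.corrcoef(expr)

def linear_fit_r2(expr, coords):
    """R^2 of a least-squares fit of each embedding axis to the experiments.

    expr: (genes, experiments); coords: (genes, d) embedded coordinates.
    A constant column is appended so the fit may include an offset.
    """
    design = np.hstack([expr, np.ones((expr.shape[0], 1))])
    weights, *_ = np.linalg.lstsq(design, coords, rcond=None)
    residual = coords - design @ weights
    ss_res = (residual ** 2).sum(axis=0)
    ss_tot = ((coords - coords.mean(axis=0)) ** 2).sum(axis=0)
    return 1.0 - ss_res / ss_tot
```

Passing only a few selected columns of the expression matrix to linear_fit_r2 would indicate how well a small number of experiments accounts for each coordinate axis; values near 1 mean the axis is close to a linear combination of those columns.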