MIMS EPrints

2006.403: Microarray Data Analysis Using Probabilistic Methods

2006.403: Xuejun Liu (2006) Microarray Data Analysis Using Probabilistic Methods. PhD thesis, The University of Manchester.


Affymetrix microarrays are currently the most widely used microarray technology. Because microarray experiments are complex, the resulting data are very noisy. Many summarization methods have been developed to estimate gene expression levels from Affymetrix probe-level data, but most of the currently popular methods do not provide a measure of uncertainty for the estimated expression level of each gene. Probabilistic models can overcome this limitation. This thesis extends a previously developed probabilistic model, mgMOS, to obtain an improved model, multi-mgMOS. The new model provides improved accuracy and is more computationally efficient than the alternatives. It also provides an estimate of the uncertainty associated with each measured gene expression level. This probe-level measurement error is useful information for the downstream analysis of gene expression data.

To show the advantage of the probe-level probabilistic model, the resulting uncertainty is propagated through two downstream analyses of gene expression data: detecting differential gene expression, and clustering. A Bayesian hierarchical model is proposed that includes probe-level measurement error in the detection of differential gene expression from replicated experiments, and a standard model-based clustering method is augmented to incorporate probe-level measurement error. Including the probe-level measurement error makes the downstream probabilistic models more complicated or even intractable. To perform inference with these augmented models efficiently, several approximate inference approaches are compared in this thesis, including maximum a posteriori (MAP) estimation, the Laplace approximation, a variational method and Markov chain Monte Carlo. Results from both benchmark data sets and a real-world data set demonstrate that incorporating the probe-level measurement error improves the performance of the downstream probabilistic analysis.
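The idea of propagating measurement error into a downstream analysis can be illustrated with a deliberately simple sketch. This is not the thesis's hierarchical model: it assumes, purely for illustration, that each replicate's expression estimate y_i is Gaussian around a common mean with variance s2 + v_i, where v_i is the measurement variance reported by the probe-level model and s2 is a known between-replicate variance. The function name and all inputs are hypothetical.

```python
import numpy as np

def combine_replicates(y, v, s2):
    """Combine replicate estimates y with per-replicate measurement variances v.

    Assumed toy model (not taken from the thesis):
        y_i ~ Normal(mu, s2 + v_i), with a flat prior on mu.
    Returns the posterior mean and variance of mu, i.e. a precision-weighted
    average in which noisy replicates receive less weight.
    """
    y = np.asarray(y, dtype=float)
    v = np.asarray(v, dtype=float)
    w = 1.0 / (s2 + v)                  # precision of each replicate
    post_var = 1.0 / np.sum(w)          # posterior variance of mu
    post_mean = post_var * np.sum(w * y)
    return post_mean, post_var

# Hypothetical log-expression estimates for one gene in three replicates;
# the third replicate has a large measurement variance and is down-weighted.
mean, var = combine_replicates([7.9, 8.1, 11.0], [0.05, 0.05, 4.0], s2=0.1)
print(mean, var)
```

With equal measurement variances this reduces to the ordinary sample mean; the measurement error only changes the answer when it differs between replicates.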

Item Type: Thesis (PhD)
Additional Information: Dr. Liu worked with Magnus Rattray in the School of Computer Science.
Uncontrolled Keywords: probabilistic models, machine learning, Affymetrix microarrays
Subjects: MSC 2000 > 62 Statistics
MSC 2000 > 68 Computer science
MSC 2000 > 92 Biology and other natural sciences
MIMS number: 2006.403
Deposited By: Dr Mark Muldoon
Deposited On: 21 November 2006
