2006.403: Microarray Data Analysis Using Probabilistic Methods
Xuejun Liu (2006) Microarray Data Analysis Using Probabilistic Methods. PhD thesis, The University of Manchester.
Affymetrix microarrays are currently the most widely used microarray technology. Because microarray experiments are complex, the experimental data are very noisy. Many summarization methods have been developed to estimate gene expression levels from Affymetrix probe-level data, but most of the currently popular methods do not provide a measure of uncertainty for the estimated expression level of each gene. Probabilistic models can overcome this limitation. This thesis extends a previously developed probabilistic model, mgMOS, to obtain an improved model, multi-mgMOS. The new model is more accurate and more computationally efficient than the alternatives. It also provides a level of uncertainty associated with each measured gene expression level. This probe-level measurement error provides useful information for the downstream analysis of gene expression data.
To show the advantage of the probe-level probabilistic model, the obtained uncertainty is propagated into two downstream analyses of gene expression data: the detection of differential gene expression and clustering. A Bayesian hierarchical model is proposed to include probe-level measurement error in the detection of differential gene expression from replicated experiments, and a standard model-based clustering method is augmented to incorporate probe-level measurement error. Including the probe-level measurement error makes the downstream probabilistic models more complicated or even intractable. To perform inference with these augmented models efficiently, several approximate inference approaches are compared in this thesis, including maximum a posteriori (MAP) estimation, the Laplace approximation, a variational method and Markov chain Monte Carlo. Results from both benchmark data sets and a real-world data set demonstrate that incorporating the probe-level measurement error improves the performance of the downstream probabilistic analysis.
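The idea of carrying a per-measurement uncertainty into a downstream comparison can be illustrated with a much simpler model than the one the thesis proposes. The sketch below is not the thesis's Bayesian hierarchical model. It assumes Gaussian measurement error with known variance for each replicate, a flat prior on the true log expression level, and no additional between-replicate variability. The function names `combine_replicates` and `prob_upregulated` are hypothetical and chosen only for this example.

```python
import math


def combine_replicates(values, variances):
    """Precision-weighted estimate of one gene's log expression in one condition.

    values    -- replicate log expression estimates
    variances -- measurement variance reported for each replicate
    Returns (mean, variance) of the combined estimate under the simplifying
    assumptions stated above (Gaussian error, flat prior, no biological variance).
    """
    if not values or len(values) != len(variances):
        raise ValueError("values and variances must be non-empty and equal length")
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    precisions = [1.0 / v for v in variances]
    total_precision = sum(precisions)
    # Noisy replicates (large variance) get small weights.
    mean = sum(p * x for p, x in zip(precisions, values)) / total_precision
    return mean, 1.0 / total_precision


def prob_upregulated(mean_a, var_a, mean_b, var_b):
    """Probability that condition B's level exceeds condition A's.

    Treats the two combined estimates as independent Gaussians, so their
    difference is Gaussian with variance var_a + var_b.
    """
    z = (mean_b - mean_a) / math.sqrt(var_a + var_b)
    # Standard normal CDF expressed with the error function.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

With made-up inputs, `combine_replicates([0.0, 4.0], [1.0, 3.0])` weights the first replicate three times as heavily as the second, so a replicate with a large reported measurement variance has little influence on the combined estimate or on the resulting probability of differential expression.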
|Item Type:||Thesis (PhD)|
Dr. Liu worked with Magnus Rattray in the School of Computer Science
|Uncontrolled Keywords:||probabilistic models, machine learning, Affymetrix microarrays|
|Subjects:||MSC 2000 > 62 Statistics|
MSC 2000 > 68 Computer science
MSC 2000 > 92 Biology and other natural sciences
|Deposited By:||Dr Mark Muldoon|
|Deposited On:||21 November 2006|