Shape Analysis In Protein Structure Alignment

Gkolias, Theodoros (2018) Shape Analysis In Protein Structure Alignment. Doctor of Philosophy (PhD) thesis, University of Kent.

In this thesis we explore the problem of structural alignment of protein molecules using statistical shape analysis techniques. The structural alignment problem can be divided into three smaller ones: the representation of protein structures, the sampling of possible alignments between the molecules, and the evaluation of a given alignment. Previous work in this field can be divided into two approaches: an ad hoc algorithmic approach from the bioinformatics literature, and an approach using statistical methods in either a likelihood or a Bayesian framework. The two approaches address the problem from different perspectives. For example, the algorithmic approach is easy to implement but lacks an overall modelling framework, whereas the Bayesian approach addresses this issue but its implementation is sometimes not straightforward. We develop a method which is easy to implement and is based on statistical assumptions. In order to assess the quality of a given alignment we use a size-and-shape likelihood density which is based on the structural information of the molecules. This likelihood density is also extended to include sequence information and gap penalty parameters so that biologically meaningful solutions can be produced. Furthermore, we develop a search algorithm to explore possible alignments from a given starting point. The results suggest that our approach produces alignments that are better than or equal to those of the most recent structural alignment methods. In most cases we achieve a higher number of matched atoms combined with a high TM-score. Moreover, we extend our method using Bayesian techniques to perform alignments based on posterior modes. In our approach, we directly estimate the mode of the posterior distribution, which provides the final alignment between two molecules. We also choose a different approach for treating the mean parameter. In previous methods the mean was either integrated out of the likelihood density or considered fixed. We choose to assign a prior to it and obtain its posterior mode. Finally, we consider an extension of the likelihood model assuming a Normal density for both the matched and unmatched parts of a molecule and a diagonal covariance structure. We explore two variants. In the first we consider a fixed zero mean for the unmatched parts of the molecules, and in the second we consider a common mean for both the matched and unmatched parts. Based on results for simulated and real data, both models seem to perform well in obtaining a high number of matched atoms and a high TM-score.
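
For readers unfamiliar with the evaluation criterion mentioned above, the following is a minimal sketch of how a given set of matched atoms might be scored. It is not the likelihood-based method of the thesis. It assumes the matching is already known, superposes the two sets with a standard least-squares rigid-body fit (the Kabsch procedure), and evaluates the usual TM-score formula at that single superposition. The function names `kabsch_superpose` and `tm_score`, the 0.5 lower bound on the distance scale `d0`, and the choice of normalising length are assumptions made for illustration.

```python
import numpy as np


def kabsch_superpose(mobile, target):
    """Rigidly superpose `mobile` onto `target` (both n x 3, rows already matched)."""
    mobile_centroid = mobile.mean(axis=0)
    target_centroid = target.mean(axis=0)
    p = mobile - mobile_centroid
    q = target - target_centroid
    # SVD of the 3 x 3 cross-covariance matrix gives the optimal rotation.
    u, _, vt = np.linalg.svd(p.T @ q)
    # Force a proper rotation (determinant +1) so reflections are excluded.
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    return p @ rotation.T + target_centroid


def tm_score(mobile, target, l_target=None):
    """TM-score of matched atoms at the least-squares superposition.

    Published TM-score programs search over superpositions to maximise the
    score; evaluating at one least-squares fit, as here, is a simplification.
    """
    l_target = len(target) if l_target is None else l_target
    # Length-dependent distance scale; the 0.5 floor is an assumption that
    # keeps the scale positive for very short chains.
    d0 = max(1.24 * np.cbrt(l_target - 15.0) - 1.8, 0.5)
    dist = np.linalg.norm(kabsch_superpose(mobile, target) - target, axis=1)
    return float(np.sum(1.0 / (1.0 + (dist / d0) ** 2)) / l_target)
```

Under these assumptions, two coordinate sets that differ only by a rotation and translation score 1, and the score decreases towards 0 as the matched atoms move further apart after superposition.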

Item Type: Thesis (Doctor of Philosophy (PhD))
Thesis advisor: Kume, Alfred
Uncontrolled keywords: Shape Analysis, Bioinformatics, Protein Structure Alignment
Subjects: Q Science > QA Mathematics (inc Computing science)
Q Science > QA Mathematics (inc Computing science) > QA276 Mathematical statistics
Q Science > QP Physiology (Living systems)
Divisions: Faculties > Sciences > School of Mathematics Statistics and Actuarial Science
SWORD Depositor: System Moodle
Depositing User: System Moodle
Date Deposited: 10 Apr 2018 12:16 UTC
Last Modified: 29 May 2019 20:27 UTC