Fabris, Fabio, Palmer, Daniel, Salama, Khalid M., de Magalhaes, Joao Pedro, Freitas, Alex A. (2019) Using deep learning to associate human genes with age-related diseases. Bioinformatics, 36 (7). pp. 2202-2208. ISSN 1367-4803. (doi:10.1093/bioinformatics/btz887) (KAR id:79932)
PDF (corrected proof)
Publisher pdf
Language: English
This work is licensed under a Creative Commons Attribution 4.0 International License.
PDF
Author's Accepted Manuscript
Language: English
Official URL: http://dx.doi.org/10.1093/bioinformatics/btz887
Abstract
Motivation: One way to identify genes possibly associated with ageing is to build a classification model (from the machine learning field) capable of classifying genes as associated with multiple age-related diseases. To build this model, we use a pre-compiled list of human genes associated with age-related diseases and apply a novel Deep Neural Network (DNN) method to find associations between gene descriptors (e.g. Gene Ontology terms, protein–protein interaction data and biological pathway information) and age-related diseases.

Results: The novelty of our new DNN method is its modular architecture, which has the capability of combining several sources of biological data to predict which ageing-related diseases a gene is associated with (if any). Our DNN method achieves better predictive performance than standard DNN approaches, a Gradient Boosted Tree classifier (a strong baseline method) and a Logistic Regression classifier. Given the DNN model produced by our method, we use two approaches to identify human genes that are not known to be associated with age-related diseases according to our dataset. First, we investigate genes that are close to other disease-associated genes in a complex multi-dimensional feature space learned by the DNN algorithm. Second, using the class label probabilities output by our DNN approach, we identify genes with a high probability of being associated with age-related diseases according to the model. We provide evidence of these putative associations retrieved from the DNN model with literature support.

The source code and datasets can be found at: https://github.com/fabiofabris/Bioinfo2019.
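The ideas in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation, which is available in the repository linked above. It assumes one small encoder module per data source, a shared sigmoid output layer with one probability per disease, and untrained random weights. The source names, layer sizes, gene indices and the helper functions `init_layer`, `dense`, `embed` and `predict_proba` are all hypothetical. The sketch shows the two candidate-selection approaches: the nearest known gene in the learned feature space, and ranking by predicted class probability.

```python
import math
import random

random.seed(0)

def init_layer(n_in, n_out):
    """Untrained small random weights (n_in x n_out) plus a zero bias vector."""
    weights = [[random.gauss(0.0, 0.1) for _ in range(n_out)] for _ in range(n_in)]
    return weights, [0.0] * n_out

def dense(x, layer, activation):
    """Apply one fully connected layer to a single feature vector x."""
    weights, bias = layer
    return [
        activation(sum(xi * row[j] for xi, row in zip(x, weights)) + bias[j])
        for j in range(len(bias))
    ]

def relu(z):
    return max(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical sizes: three feature sources, four disease labels, six genes.
SOURCE_DIMS = {"go_terms": 20, "ppi": 15, "pathways": 10}
HIDDEN, N_DISEASES, N_GENES = 8, 4, 6

# One encoder module per data source (assumed reading of "modular").
encoders = {name: init_layer(dim, HIDDEN) for name, dim in SOURCE_DIMS.items()}
# A shared multi-label output layer over the concatenated module outputs.
output_layer = init_layer(HIDDEN * len(SOURCE_DIMS), N_DISEASES)

def embed(gene):
    """Concatenate the per-source encodings into one learned feature vector."""
    embedding = []
    for name, layer in encoders.items():
        embedding.extend(dense(gene[name], layer, relu))
    return embedding

def predict_proba(gene):
    """One independent probability per disease (multi-label output)."""
    return dense(embed(gene), output_layer, sigmoid)

# Random binary descriptors stand in for real gene features.
genes = [
    {name: [float(random.randint(0, 1)) for _ in range(dim)]
     for name, dim in SOURCE_DIMS.items()}
    for _ in range(N_GENES)
]
embeddings = [embed(g) for g in genes]
probabilities = [predict_proba(g) for g in genes]

known = [0, 1, 2]       # genes with known disease associations (hypothetical)
candidates = [3, 4, 5]  # genes without known associations (hypothetical)

# Approach 1: for each candidate, the closest known gene in embedding space.
nearest_known = {
    c: min(known, key=lambda k: math.dist(embeddings[c], embeddings[k]))
    for c in candidates
}

# Approach 2: rank candidates by their highest predicted disease probability.
ranking = sorted(candidates, key=lambda c: max(probabilities[c]), reverse=True)
```

Because the weights are untrained, the resulting neighbours and ranking carry no biological meaning. In practice the network would first be fitted to the labelled genes.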
Item Type: | Article |
---|---|
DOI/Identification number: | 10.1093/bioinformatics/btz887 |
Uncontrolled keywords: | data mining, machine learning, classification, bioinformatics, ageing, deep learning |
Subjects: | Q Science > Q Science (General) > Q335 Artificial intelligence |
Divisions: | Divisions > Division of Computing, Engineering and Mathematical Sciences > School of Computing |
Depositing User: | Alex Freitas |
Date Deposited: | 03 Feb 2020 18:20 UTC |
Last Modified: | 11 Jan 2024 11:31 UTC |
Resource URI: | https://kar.kent.ac.uk/id/eprint/79932 (The current URI for this page, for reference purposes) |