Investigating biomarkers in Parkinson's disease using machine learning

Keywords

Parkinson's disease, machine learning, GWAS data, feature selection

Degree Level

masters

Degree Name

M. Sc.

Publisher

Memorial University of Newfoundland

Abstract

Genome-Wide Association Studies (GWAS) identify genetic variations whose allele or genotype frequencies differ significantly between individuals affected by a disease, such as Parkinson's disease (PD), and individuals who are free of it. GWAS data can therefore be used to identify genetic variations associated with the disease of interest. However, GWAS datasets are extensive and contain many more Single Nucleotide Polymorphisms (SNPs, pronounced “snips”) than individual samples. To address these challenges, we applied Singular-Vectors Feature Selection (SVFS) to PD GWAS datasets. We discovered a group of SNPs that are potentially novel PD biomarkers: they have not been directly associated with PD before, but we found indirect links between them and PD in the literature. A direct association means that the current literature directly links a SNP with PD, while an indirect link means that the current literature implicates a SNP in a disease other than PD that co-occurs with PD in a significant number of PD patients. These indirectly linked SNPs open new potential lines of investigation. The directly linked SNPs identified by our method are rs11248060, rs239748, rs999473, and rs2313982. The full list of identified SNPs is given in Section 4.4.
