Matches in SemOpenAlex for { <https://semopenalex.org/work/W48427536> ?p ?o ?g. }
- W48427536 abstract "Single nucleotide polymorphisms (SNPs) are DNA sequence variations that occur when a single nucleotide in the genome sequence is altered. Since variations in DNA sequence can have a major impact on complex human diseases such as obesity, epilepsy, type 2 diabetes, and rheumatoid arthritis, SNPs have become increasingly significant in the identification of such complex diseases. Recent biological studies point out that a single altered gene may have a small effect on a complex disease, whereas interactions between multiple genes may have a significant role. Therefore, identifying multiple genes associated with complex disorders is essential. In this spirit, combinations of multiple SNPs rather than individual SNPs should be analyzed. However, assessing a very large number of SNP combinations is computationally challenging, and as a result the literature contains only a limited number of studies on extracting statistically significant SNP combinations. In this thesis work, we focus on this challenging problem and develop a five-step disease-associated multi-SNP combinations search to identify statistically significant SNP combinations and the significant rules defining the associations between SNPs and a specified disease. The proposed five-step multi-SNP combinations procedure is applied to the simulated rheumatoid arthritis data set provided by Genetic Analysis Workshop 15. In each step, statistically significant SNPs are extracted from the available set of SNPs that are not yet classified as significant or insignificant. In the first step, genome-wide association (GWA) analysis is performed on the original complete multi-family data set. Then, in the second step, we use a tag SNP selection algorithm to find a smaller subset of informative SNP markers. In the literature, most tag SNP selection methods are based on pairwise (two-marker) linkage disequilibrium (LD) measures. 
In this thesis, however, both pairwise and multiple-marker LD measures are incorporated to improve the genetic coverage. Up to the third step, the procedure aims to identify individual significant SNPs. In the third step, a genetic algorithm (GA) based feature selection method is applied. It provides a significant combination of SNPs; the GA constructs this combination by maximizing the explanatory power of the selected SNPs while trying to decrease the number of selected SNPs dynamically. Since the GA is a probabilistic search approach, it may provide a different SNP combination at each execution. We apply the GA several times to obtain multiple significant SNP combinations, and for each combination we calculate the associated pseudo r-square value and apply statistical tests to check its significance. We also consider the union and intersection of the SNP combinations identified by the GA as potentially significant SNP combinations. After identifying multiple statistically significant SNP combinations, in the fourth and fifth steps we focus on extracting rules to explain the association between the SNPs and the disease. In the fourth step, we apply a classification method called Decision Tree Forest to calculate the importance values of individual SNPs that belong to at least one of the SNP combinations found by the GA. Since each marker in a SNP combination is in bi-allelic form, the genotypes of a SNP can affect the disease status. Different genotypes of SNPs are therefore considered to define candidate rules. Then, in the fifth step, we use the calculated importance values and the occurrence percentage of each candidate rule in the data set to perform our proposed rule extraction method, which selects rules from among the candidates. The literature offers many classification approaches, such as the decision tree, decision forest, and random forest. Each of these methods considers SNP interactions that are explanatory for a large subset of patients. 
However, in real life some SNP interactions that are observed only in a small subset of patients might cause the disease. The existing classification methods do not identify such interactions as significant. The proposed five-step multi-SNP combinations procedure, by contrast, extracts these interactions as well as the others. This is a significant contribution to the research on identifying significant interactions that may cause a human to have the disease." @default.
- W48427536 created "2016-06-24" @default.
- W48427536 creator A5072744411 @default.
- W48427536 date "2010-01-01" @default.
- W48427536 modified "2023-09-27" @default.
- W48427536 title "Identification of disease related significant SNPs" @default.
- W48427536 cites W1504974927 @default.
- W48427536 cites W1523989055 @default.
- W48427536 cites W1559339763 @default.
- W48427536 cites W1565377632 @default.
- W48427536 cites W1569213097 @default.
- W48427536 cites W1581077884 @default.
- W48427536 cites W1608549042 @default.
- W48427536 cites W1730404227 @default.
- W48427536 cites W1826158314 @default.
- W48427536 cites W1849729440 @default.
- W48427536 cites W1973948212 @default.
- W48427536 cites W1981903823 @default.
- W48427536 cites W1985035489 @default.
- W48427536 cites W1987464952 @default.
- W48427536 cites W1996767070 @default.
- W48427536 cites W2001701937 @default.
- W48427536 cites W2009144436 @default.
- W48427536 cites W2009435671 @default.
- W48427536 cites W2015408724 @default.
- W48427536 cites W2016382728 @default.
- W48427536 cites W2021070595 @default.
- W48427536 cites W2022471918 @default.
- W48427536 cites W2030991556 @default.
- W48427536 cites W2042103448 @default.
- W48427536 cites W2042556118 @default.
- W48427536 cites W2045814707 @default.
- W48427536 cites W2054153086 @default.
- W48427536 cites W2071672580 @default.
- W48427536 cites W2078977693 @default.
- W48427536 cites W2082536587 @default.
- W48427536 cites W2095809779 @default.
- W48427536 cites W2100631108 @default.
- W48427536 cites W2103333826 @default.
- W48427536 cites W2105000755 @default.
- W48427536 cites W2107820675 @default.
- W48427536 cites W2113665069 @default.
- W48427536 cites W2115053056 @default.
- W48427536 cites W2119479037 @default.
- W48427536 cites W2119795709 @default.
- W48427536 cites W2124225314 @default.
- W48427536 cites W2129905273 @default.
- W48427536 cites W2139144622 @default.
- W48427536 cites W2140412509 @default.
- W48427536 cites W2141585001 @default.
- W48427536 cites W2149250260 @default.
- W48427536 cites W2152596080 @default.
- W48427536 cites W2157752701 @default.
- W48427536 cites W2159242834 @default.
- W48427536 cites W2162530578 @default.
- W48427536 cites W2165999916 @default.
- W48427536 cites W2168175751 @default.
- W48427536 cites W2169484904 @default.
- W48427536 cites W2592974517 @default.
- W48427536 cites W2799061466 @default.
- W48427536 cites W3203298545 @default.
- W48427536 cites W633021719 @default.
- W48427536 cites W63807868 @default.
- W48427536 hasPublicationYear "2010" @default.
- W48427536 type Work @default.
- W48427536 sameAs 48427536 @default.
- W48427536 citedByCount "0" @default.
- W48427536 crossrefType "dissertation" @default.
- W48427536 hasAuthorship W48427536A5072744411 @default.
- W48427536 hasConcept C104317684 @default.
- W48427536 hasConcept C106208931 @default.
- W48427536 hasConcept C116834253 @default.
- W48427536 hasConcept C122060243 @default.
- W48427536 hasConcept C135763542 @default.
- W48427536 hasConcept C139275648 @default.
- W48427536 hasConcept C141231307 @default.
- W48427536 hasConcept C153209595 @default.
- W48427536 hasConcept C186413461 @default.
- W48427536 hasConcept C197077220 @default.
- W48427536 hasConcept C54355233 @default.
- W48427536 hasConcept C55060382 @default.
- W48427536 hasConcept C59822182 @default.
- W48427536 hasConcept C70721500 @default.
- W48427536 hasConcept C86803240 @default.
- W48427536 hasConceptScore W48427536C104317684 @default.
- W48427536 hasConceptScore W48427536C106208931 @default.
- W48427536 hasConceptScore W48427536C116834253 @default.
- W48427536 hasConceptScore W48427536C122060243 @default.
- W48427536 hasConceptScore W48427536C135763542 @default.
- W48427536 hasConceptScore W48427536C139275648 @default.
- W48427536 hasConceptScore W48427536C141231307 @default.
- W48427536 hasConceptScore W48427536C153209595 @default.
- W48427536 hasConceptScore W48427536C186413461 @default.
- W48427536 hasConceptScore W48427536C197077220 @default.
- W48427536 hasConceptScore W48427536C54355233 @default.
- W48427536 hasConceptScore W48427536C55060382 @default.
- W48427536 hasConceptScore W48427536C59822182 @default.
- W48427536 hasConceptScore W48427536C70721500 @default.
- W48427536 hasConceptScore W48427536C86803240 @default.
- W48427536 hasLocation W484275361 @default.