Matches in SemOpenAlex for { <https://semopenalex.org/work/W2988551063> ?p ?o ?g. }
- W2988551063 abstract "Principal Component Analysis (PCA) of genetic data is routinely used to infer ancestry and control for population structure in various genetic analyses. However, conducting PCA can be complicated and has several potential pitfalls. These pitfalls include (1) capturing Linkage Disequilibrium (LD) structure instead of population structure, (2) projected PCs that suffer from shrinkage bias, (3) detecting sample outliers, and (4) uneven population sizes. In this work, we explore these potential issues when using PCA, and present efficient solutions to them. Following applications to the UK Biobank and the 1000 Genomes project datasets, we make recommendations for best practices and provide efficient and user-friendly implementations of the proposed solutions in R packages bigsnpr and bigutilsr. For example, we find that PC19 to PC40 in the UK Biobank capture complex LD structure rather than population structure. Using our automatic algorithm for removing long-range LD regions, we recover 16 PCs that capture population structure only. Therefore, we recommend using only 16-18 PCs from the UK Biobank to account for population structure confounding. We also show how to use PCA to restrict analyses to individuals of homogeneous ancestry. Finally, when projecting individual genotypes onto the PCA computed from the 1000 Genomes project data, we find a shrinkage bias that becomes large for PC5 and beyond. We then demonstrate how to obtain unbiased projections efficiently using bigsnpr. Overall, we believe this work would be of interest for anyone using PCA in their analyses of genetic data, as well as for other omics data." @default.
- W2988551063 created "2019-11-22" @default.
- W2988551063 creator A5033000330 @default.
- W2988551063 creator A5036805387 @default.
- W2988551063 creator A5039794476 @default.
- W2988551063 creator A5044233598 @default.
- W2988551063 creator A5078389898 @default.
- W2988551063 date "2019-11-14" @default.
- W2988551063 modified "2023-09-26" @default.
- W2988551063 title "Efficient toolkit implementing best practices for principal component analysis of population genetic data" @default.
- W2988551063 cites W1966775465 @default.
- W2988551063 cites W1974611538 @default.
- W2988551063 cites W1980431326 @default.
- W2988551063 cites W1989638282 @default.
- W2988551063 cites W1992085420 @default.
- W2988551063 cites W2009588715 @default.
- W2988551063 cites W2024753568 @default.
- W2988551063 cites W2027455260 @default.
- W2988551063 cites W2039792137 @default.
- W2988551063 cites W2040730345 @default.
- W2988551063 cites W2041184937 @default.
- W2988551063 cites W2047165046 @default.
- W2988551063 cites W2049454545 @default.
- W2988551063 cites W2086062071 @default.
- W2988551063 cites W2099085143 @default.
- W2988551063 cites W2102213696 @default.
- W2988551063 cites W2104549677 @default.
- W2988551063 cites W2107916366 @default.
- W2988551063 cites W2108169091 @default.
- W2988551063 cites W2127288683 @default.
- W2988551063 cites W2134857847 @default.
- W2988551063 cites W2155496693 @default.
- W2988551063 cites W2157324002 @default.
- W2988551063 cites W2157752701 @default.
- W2988551063 cites W2167680278 @default.
- W2988551063 cites W2168354474 @default.
- W2988551063 cites W2284253967 @default.
- W2988551063 cites W2484383958 @default.
- W2988551063 cites W2794694102 @default.
- W2988551063 cites W2895486342 @default.
- W2988551063 cites W2938965511 @default.
- W2988551063 cites W2949231000 @default.
- W2988551063 cites W2951349772 @default.
- W2988551063 cites W2951456052 @default.
- W2988551063 cites W2957716708 @default.
- W2988551063 cites W2963655370 @default.
- W2988551063 cites W2963985726 @default.
- W2988551063 cites W2966338033 @default.
- W2988551063 doi "https://doi.org/10.1101/841452" @default.
- W2988551063 hasPublicationYear "2019" @default.
- W2988551063 type Work @default.
- W2988551063 sameAs 2988551063 @default.
- W2988551063 citedByCount "3" @default.
- W2988551063 countsByYear W29885510632020 @default.
- W2988551063 countsByYear W29885510632023 @default.
- W2988551063 crossrefType "posted-content" @default.
- W2988551063 hasAuthorship W2988551063A5033000330 @default.
- W2988551063 hasAuthorship W2988551063A5036805387 @default.
- W2988551063 hasAuthorship W2988551063A5039794476 @default.
- W2988551063 hasAuthorship W2988551063A5044233598 @default.
- W2988551063 hasAuthorship W2988551063A5078389898 @default.
- W2988551063 hasBestOaLocation W29885510631 @default.
- W2988551063 hasConcept C104317684 @default.
- W2988551063 hasConcept C116567970 @default.
- W2988551063 hasConcept C124101348 @default.
- W2988551063 hasConcept C135763542 @default.
- W2988551063 hasConcept C153209595 @default.
- W2988551063 hasConcept C154945302 @default.
- W2988551063 hasConcept C185592680 @default.
- W2988551063 hasConcept C197754878 @default.
- W2988551063 hasConcept C198531522 @default.
- W2988551063 hasConcept C27438332 @default.
- W2988551063 hasConcept C2908647359 @default.
- W2988551063 hasConcept C35605836 @default.
- W2988551063 hasConcept C41008148 @default.
- W2988551063 hasConcept C43617362 @default.
- W2988551063 hasConcept C54355233 @default.
- W2988551063 hasConcept C60644358 @default.
- W2988551063 hasConcept C71924100 @default.
- W2988551063 hasConcept C79337645 @default.
- W2988551063 hasConcept C86803240 @default.
- W2988551063 hasConcept C97425143 @default.
- W2988551063 hasConcept C99454951 @default.
- W2988551063 hasConceptScore W2988551063C104317684 @default.
- W2988551063 hasConceptScore W2988551063C116567970 @default.
- W2988551063 hasConceptScore W2988551063C124101348 @default.
- W2988551063 hasConceptScore W2988551063C135763542 @default.
- W2988551063 hasConceptScore W2988551063C153209595 @default.
- W2988551063 hasConceptScore W2988551063C154945302 @default.
- W2988551063 hasConceptScore W2988551063C185592680 @default.
- W2988551063 hasConceptScore W2988551063C197754878 @default.
- W2988551063 hasConceptScore W2988551063C198531522 @default.
- W2988551063 hasConceptScore W2988551063C27438332 @default.
- W2988551063 hasConceptScore W2988551063C2908647359 @default.
- W2988551063 hasConceptScore W2988551063C35605836 @default.
- W2988551063 hasConceptScore W2988551063C41008148 @default.
- W2988551063 hasConceptScore W2988551063C43617362 @default.
- W2988551063 hasConceptScore W2988551063C54355233 @default.
- W2988551063 hasConceptScore W2988551063C60644358 @default.
- W2988551063 hasConceptScore W2988551063C71924100 @default.