Matches in SemOpenAlex for { <https://semopenalex.org/work/W2214774680> ?p ?o ?g. }
- W2214774680 abstract "Author(s): Bhaskar, Anand | Advisor(s): Song, Yun S | Abstract: The recent availability of large-sample high-throughput sequencing data has given us an unprecedented opportunity to very finely resolve the details of historical demographic processes that have shaped the genomes of modern human populations. Such understanding of population demography is important for several applications: to avoid false positives in genome-wide association studies; to calibrate null models of neutral genome evolution in order to find regions under selection; to study the impact of bottlenecks and small founder populations on genetic mutational load; to reconstruct large-scale historical human migration and admixture events; and so on. In this dissertation, we consider some statistical, algorithmic and robustness aspects of demographic inference from genomic variation data. In particular, we study the problem of determining the historical effective size of a population from the sample frequency spectrum (SFS), which measures the distribution of allele frequencies in a sample of sequences drawn from the population. From the statistical or information-theoretic perspective, it is known that this inverse problem does not have a unique solution in general, no matter how large the sample size. For any population allele frequency distribution, there exist infinitely many population size functions that are consistent with this distribution. While such a non-identifiability result might appear to pose a serious problem to statistical inference algorithms, we show that the situation is not so bad in practice. In particular, we prove that if the true population size function is piecewise-defined with each piece belonging to some family of biologically-motivated functions, then the SFS of a finite sample of sequences uniquely determines the underlying demography.
We obtain a general bound on the sample size sufficient for identifiability; this bound depends on the number of pieces in the demographic model and on the family of functions for each piece. We also give concrete instantiations of this bound for piecewise-constant and piecewise-exponential models that are commonly used in demographic inference analyses. From the algorithmic perspective, we build on analytic results for the expected SFS of a time-varying population size function and develop an efficient likelihood-based algorithm to infer piecewise-exponential population size histories from large sample allele frequency data. By considering very large samples, our method can resolve details of the population history from the very recent past that are not otherwise accessible using smaller samples. The third aspect of this dissertation is concerned with understanding the robustness of widely used evolutionary models to violations of model assumptions. Continuous-time evolutionary models like Kingman's coalescent and its dual diffusion process are derived from discrete models of random mating by assuming that the sample size being analyzed is much smaller than the population size. However, the very large sample datasets being produced due to advances in high-throughput sequencing technologies are approaching the limits of this assumption. To investigate this issue, we develop exact algorithms for computation under the discrete-time Wright-Fisher model and use these algorithms to study the distortions in several genealogical quantities arising due to the coalescent approximation. Our findings indicate that for several demographic models inferred from large-scale sequence data, there can be substantial genealogical deviations introduced by the coalescent approximation that might influence the results of inference studies." @default.
- W2214774680 created "2016-06-24" @default.
- W2214774680 creator A5003822824 @default.
- W2214774680 date "2013-01-01" @default.
- W2214774680 modified "2023-09-26" @default.
- W2214774680 title "Statistical, algorithmic, and robustness aspects of population demographic inference from genomic variation data" @default.
- W2214774680 cites W114397555 @default.
- W2214774680 cites W1448623749 @default.
- W2214774680 cites W1536721400 @default.
- W2214774680 cites W1547520239 @default.
- W2214774680 cites W1561726411 @default.
- W2214774680 cites W1566277748 @default.
- W2214774680 cites W1884759635 @default.
- W2214774680 cites W1897267496 @default.
- W2214774680 cites W1912710617 @default.
- W2214774680 cites W1964547306 @default.
- W2214774680 cites W1967144980 @default.
- W2214774680 cites W1973490559 @default.
- W2214774680 cites W1976593262 @default.
- W2214774680 cites W1982516282 @default.
- W2214774680 cites W1984324163 @default.
- W2214774680 cites W1984726788 @default.
- W2214774680 cites W1986265706 @default.
- W2214774680 cites W1986342834 @default.
- W2214774680 cites W1987754412 @default.
- W2214774680 cites W1990178623 @default.
- W2214774680 cites W1990275569 @default.
- W2214774680 cites W1990824786 @default.
- W2214774680 cites W2001036195 @default.
- W2214774680 cites W2001904007 @default.
- W2214774680 cites W2004209173 @default.
- W2214774680 cites W2009132217 @default.
- W2214774680 cites W2013753628 @default.
- W2214774680 cites W2016503861 @default.
- W2214774680 cites W2019956985 @default.
- W2214774680 cites W2020302614 @default.
- W2214774680 cites W2028704045 @default.
- W2214774680 cites W2036780474 @default.
- W2214774680 cites W2039608769 @default.
- W2214774680 cites W2047923046 @default.
- W2214774680 cites W2058348475 @default.
- W2214774680 cites W2059253408 @default.
- W2214774680 cites W2059545122 @default.
- W2214774680 cites W2065794571 @default.
- W2214774680 cites W2079115388 @default.
- W2214774680 cites W2079544968 @default.
- W2214774680 cites W2082967637 @default.
- W2214774680 cites W2083605491 @default.
- W2214774680 cites W2085642459 @default.
- W2214774680 cites W2089205969 @default.
- W2214774680 cites W2090044071 @default.
- W2214774680 cites W2091705028 @default.
- W2214774680 cites W2093012110 @default.
- W2214774680 cites W2094608047 @default.
- W2214774680 cites W2096005812 @default.
- W2214774680 cites W2097560173 @default.
- W2214774680 cites W2099733570 @default.
- W2214774680 cites W2101549247 @default.
- W2214774680 cites W2103844658 @default.
- W2214774680 cites W2104481553 @default.
- W2214774680 cites W2105990937 @default.
- W2214774680 cites W2107746932 @default.
- W2214774680 cites W2114493739 @default.
- W2214774680 cites W2114761259 @default.
- W2214774680 cites W2115596355 @default.
- W2214774680 cites W2116668053 @default.
- W2214774680 cites W2117715276 @default.
- W2214774680 cites W2119936645 @default.
- W2214774680 cites W2123871098 @default.
- W2214774680 cites W2125300654 @default.
- W2214774680 cites W2128978199 @default.
- W2214774680 cites W2131909281 @default.
- W2214774680 cites W2134599124 @default.
- W2214774680 cites W2136910990 @default.
- W2214774680 cites W2146638375 @default.
- W2214774680 cites W2146973661 @default.
- W2214774680 cites W2147476214 @default.
- W2214774680 cites W2147911093 @default.
- W2214774680 cites W2148401134 @default.
- W2214774680 cites W2150883566 @default.
- W2214774680 cites W2157752701 @default.
- W2214774680 cites W2161644980 @default.
- W2214774680 cites W2161692256 @default.
- W2214774680 cites W2164061579 @default.
- W2214774680 cites W2165141658 @default.
- W2214774680 cites W2165794141 @default.
- W2214774680 cites W2168933019 @default.
- W2214774680 cites W2169215767 @default.
- W2214774680 cites W2170058195 @default.
- W2214774680 cites W2170937702 @default.
- W2214774680 cites W2171777347 @default.
- W2214774680 cites W2319462502 @default.
- W2214774680 cites W2463561659 @default.
- W2214774680 cites W2593899568 @default.
- W2214774680 cites W2764433274 @default.
- W2214774680 cites W61164733 @default.
- W2214774680 cites W76126619 @default.
- W2214774680 hasPublicationYear "2013" @default.
- W2214774680 type Work @default.
- W2214774680 sameAs 2214774680 @default.
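
Illustrative note (not part of the SemOpenAlex record): the abstract defines the sample frequency spectrum (SFS) as the distribution of allele frequencies in a sample of sequences. A minimal sketch of that definition follows. It assumes biallelic sites encoded as 0 (ancestral) and 1 (derived); the function name `sample_frequency_spectrum` and the toy data are hypothetical and do not come from the dissertation.

```python
from collections import Counter


def sample_frequency_spectrum(haplotypes):
    """Return the unfolded SFS of a sample.

    haplotypes: list of n equal-length sequences of 0/1 values
    (assumed encoding: 0 = ancestral allele, 1 = derived allele).
    Entry i-1 of the result is the number of sites at which exactly
    i of the n sequences carry the derived allele, for i = 1..n-1.
    """
    n = len(haplotypes)
    if n == 0:
        return []
    num_sites = len(haplotypes[0])
    # Count derived alleles at each site, then tally sites by that count.
    derived_counts = Counter(
        sum(h[site] for h in haplotypes) for site in range(num_sites)
    )
    # Sites with 0 or n derived copies are not segregating and are excluded.
    return [derived_counts.get(i, 0) for i in range(1, n)]


# Toy sample: 4 sequences, 4 sites (hypothetical data).
toy_sample = [
    [1, 0, 0, 1],
    [1, 1, 0, 0],
    [0, 0, 0, 1],
    [0, 0, 0, 1],
]
toy_sfs = sample_frequency_spectrum(toy_sample)
```

In the toy sample, the derived-allele counts per site are 2, 1, 0 and 3, so one site falls in each of the classes i = 1, 2, 3 and the monomorphic site is dropped. Inference methods of the kind the abstract describes compare an observed spectrum like this against the spectrum expected under a candidate population size history.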