Matches in SemOpenAlex for { <https://semopenalex.org/work/W4291017687> ?p ?o ?g. }
- W4291017687 abstract "Viruses are among the shortest yet highly abundant species that harbor minimal instructions to infect cells, adapt, multiply, and exist. However, with the current substantial availability of viral genome sequences, the scientific repertory lacks a complexity landscape that automatically enlights viral genomes' organization, relation, and fundamental characteristics. This work provides a comprehensive landscape of the viral genome's complexity (or quantity of information), identifying the most redundant and complex groups regarding their genome sequence while providing their distribution and characteristics at a large and local scale. Moreover, we identify and quantify inverted repeats abundance in viral genomes. For this purpose, we measure the sequence complexity of each available viral genome using data compression, demonstrating that adequate data compressors can efficiently quantify the complexity of viral genome sequences, including subsequences better represented by algorithmic sources (e.g., inverted repeats). Using a state-of-the-art genomic compressor on an extensive viral genomes database, we show that double-stranded DNA viruses are, on average, the most redundant viruses while single-stranded DNA viruses are the least. Contrarily, double-stranded RNA viruses show a lower redundancy relative to single-stranded RNA. Furthermore, we extend the ability of data compressors to quantify local complexity (or information content) in viral genomes using complexity profiles, unprecedently providing a direct complexity analysis of human herpesviruses. We also conceive a features-based classification methodology that can accurately distinguish viral genomes at different taxonomic levels without direct comparisons between sequences.
This methodology combines data compression with simple measures such as GC-content percentage and sequence length, followed by machine learning classifiers. This article presents methodologies and findings that are highly relevant for understanding the patterns of similarity and singularity between viral groups, opening new frontiers for studying viral genomes' organization while depicting the complexity trends and classification components of these genomes at different taxonomic levels. The whole study is supported by an extensive website (https://asilab.github.io/canvas/) for comprehending the viral genome characterization using dynamic and interactive approaches." @default.
- W4291017687 created "2022-08-13" @default.
- W4291017687 creator A5015662517 @default.
- W4291017687 creator A5051693451 @default.
- W4291017687 creator A5057184118 @default.
- W4291017687 creator A5073280030 @default.
- W4291017687 date "2022-01-01" @default.
- W4291017687 modified "2023-09-30" @default.
- W4291017687 title "The complexity landscape of viral genomes" @default.
- W4291017687 cites W1556844948 @default.
- W4291017687 cites W1563088657 @default.
- W4291017687 cites W163747003 @default.
- W4291017687 cites W1970231722 @default.
- W4291017687 cites W1985518458 @default.
- W4291017687 cites W1987370597 @default.
- W4291017687 cites W1988281359 @default.
- W4291017687 cites W2006390792 @default.
- W4291017687 cites W2007512406 @default.
- W4291017687 cites W2017555693 @default.
- W4291017687 cites W2017580487 @default.
- W4291017687 cites W2025753533 @default.
- W4291017687 cites W2032682947 @default.
- W4291017687 cites W2054008728 @default.
- W4291017687 cites W2055072782 @default.
- W4291017687 cites W2055258243 @default.
- W4291017687 cites W2056375209 @default.
- W4291017687 cites W2059040509 @default.
- W4291017687 cites W2061694709 @default.
- W4291017687 cites W2064675550 @default.
- W4291017687 cites W2078689375 @default.
- W4291017687 cites W2083147729 @default.
- W4291017687 cites W2084207273 @default.
- W4291017687 cites W2089284948 @default.
- W4291017687 cites W2090344392 @default.
- W4291017687 cites W2090667881 @default.
- W4291017687 cites W2091777602 @default.
- W4291017687 cites W2096410194 @default.
- W4291017687 cites W2099707172 @default.
- W4291017687 cites W2100238259 @default.
- W4291017687 cites W2107912813 @default.
- W4291017687 cites W2109721582 @default.
- W4291017687 cites W2109825315 @default.
- W4291017687 cites W2113018255 @default.
- W4291017687 cites W2113878642 @default.
- W4291017687 cites W2131694312 @default.
- W4291017687 cites W2132219316 @default.
- W4291017687 cites W2135089739 @default.
- W4291017687 cites W2135625884 @default.
- W4291017687 cites W2136777381 @default.
- W4291017687 cites W2138630447 @default.
- W4291017687 cites W2142604867 @default.
- W4291017687 cites W2143726250 @default.
- W4291017687 cites W2144934389 @default.
- W4291017687 cites W2146028503 @default.
- W4291017687 cites W2150959949 @default.
- W4291017687 cites W2154462988 @default.
- W4291017687 cites W2155490398 @default.
- W4291017687 cites W2173732482 @default.
- W4291017687 cites W2461540377 @default.
- W4291017687 cites W2493667413 @default.
- W4291017687 cites W2501848746 @default.
- W4291017687 cites W2560975170 @default.
- W4291017687 cites W2574133781 @default.
- W4291017687 cites W2582697902 @default.
- W4291017687 cites W2770677113 @default.
- W4291017687 cites W2804435636 @default.
- W4291017687 cites W2809850832 @default.
- W4291017687 cites W2810934702 @default.
- W4291017687 cites W2891646971 @default.
- W4291017687 cites W2902331361 @default.
- W4291017687 cites W2903955444 @default.
- W4291017687 cites W2910721672 @default.
- W4291017687 cites W2922407406 @default.
- W4291017687 cites W2951893579 @default.
- W4291017687 cites W2954821630 @default.
- W4291017687 cites W2963123289 @default.
- W4291017687 cites W2963916141 @default.
- W4291017687 cites W2980059354 @default.
- W4291017687 cites W2980272550 @default.
- W4291017687 cites W2986984493 @default.
- W4291017687 cites W2999044305 @default.
- W4291017687 cites W3004880395 @default.
- W4291017687 cites W3019956990 @default.
- W4291017687 cites W3027273802 @default.
- W4291017687 cites W3034996672 @default.
- W4291017687 cites W3035786310 @default.
- W4291017687 cites W3036227754 @default.
- W4291017687 cites W3036449824 @default.
- W4291017687 cites W3037876060 @default.
- W4291017687 cites W3038746171 @default.
- W4291017687 cites W3041509768 @default.
- W4291017687 cites W3048428396 @default.
- W4291017687 cites W3081718598 @default.
- W4291017687 cites W3098741556 @default.
- W4291017687 cites W3102476541 @default.
- W4291017687 cites W3104596207 @default.
- W4291017687 cites W3113123910 @default.
- W4291017687 cites W3118380717 @default.
- W4291017687 cites W31231482 @default.
- W4291017687 cites W3127499951 @default.