Matches in SemOpenAlex for { <https://semopenalex.org/work/W2019912702> ?p ?o ?g. }
- W2019912702 abstract "The impact of gene annotation quality on functional and comparative genomics makes gene prediction an important process, particularly in non-model species, including many fungi. Sets of homologous protein sequences are rarely complete with respect to the fungal species of interest and are often small or unreliable, especially when closely related species have not been sequenced or annotated in detail. In these cases, protein homology-based evidence fails to correctly annotate many genes, or significantly improve ab initio predictions. Generalised hidden Markov models (GHMM) have proven to be invaluable tools in gene annotation and, recently, RNA-seq has emerged as a cost-effective means to significantly improve the quality of automated gene annotation. As these methods do not require sets of homologous proteins, improving gene prediction from these resources is of benefit to fungal researchers. While many pipelines now incorporate RNA-seq data in training GHMMs, there has been relatively little investigation into additionally combining RNA-seq data at the point of prediction, and room for improvement in this area motivates this study. CodingQuarry is a highly accurate, self-training GHMM fungal gene predictor designed to work with assembled, aligned RNA-seq transcripts. RNA-seq data informs annotations both during gene-model training and in prediction. Our approach capitalises on the high quality of fungal transcript assemblies by incorporating predictions made directly from transcript sequences. Correct predictions are made despite transcript assembly problems, including those caused by overlap between the transcripts of adjacent gene loci. Stringent benchmarking against high-confidence annotation subsets showed CodingQuarry predicted 91.3% of Schizosaccharomyces pombe genes and 90.4% of Saccharomyces cerevisiae genes perfectly. These results are 4-5% better than those of AUGUSTUS, the next best performing RNA-seq driven gene predictor tested. Comparisons against whole genome Sc. pombe and S. cerevisiae annotations further substantiate a 4-5% improvement in the number of correctly predicted genes. We demonstrate the success of a novel method of incorporating RNA-seq data into GHMM fungal gene prediction. This shows that a high quality annotation can be achieved without relying on protein homology or a training set of genes. CodingQuarry is freely available (https://sourceforge.net/projects/codingquarry/), and suitable for incorporation into genome annotation pipelines." @default.
- W2019912702 created "2016-06-24" @default.
- W2019912702 creator A5004869121 @default.
- W2019912702 creator A5020423413 @default.
- W2019912702 creator A5061866289 @default.
- W2019912702 creator A5076836505 @default.
- W2019912702 date "2015-03-11" @default.
- W2019912702 modified "2023-10-10" @default.
- W2019912702 title "CodingQuarry: highly accurate hidden Markov model gene prediction in fungal genomes using RNA-seq transcripts" @default.
- W2019912702 cites W1484347389 @default.
- W2019912702 cites W1487554690 @default.
- W2019912702 cites W1511730061 @default.
- W2019912702 cites W1580768657 @default.
- W2019912702 cites W1966833356 @default.
- W2019912702 cites W1969550578 @default.
- W2019912702 cites W1971813238 @default.
- W2019912702 cites W1996007233 @default.
- W2019912702 cites W2001554597 @default.
- W2019912702 cites W2029607585 @default.
- W2019912702 cites W2032961150 @default.
- W2019912702 cites W2036897871 @default.
- W2019912702 cites W2039269704 @default.
- W2019912702 cites W2058296373 @default.
- W2019912702 cites W2062946300 @default.
- W2019912702 cites W2095691309 @default.
- W2019912702 cites W2095699986 @default.
- W2019912702 cites W2096476896 @default.
- W2019912702 cites W2096525273 @default.
- W2019912702 cites W2097065948 @default.
- W2019912702 cites W2098937968 @default.
- W2019912702 cites W2099753867 @default.
- W2019912702 cites W2103592640 @default.
- W2019912702 cites W2108108770 @default.
- W2019912702 cites W2109087868 @default.
- W2019912702 cites W2111756560 @default.
- W2019912702 cites W2116041602 @default.
- W2019912702 cites W2117688129 @default.
- W2019912702 cites W2117887553 @default.
- W2019912702 cites W2118025286 @default.
- W2019912702 cites W2118751052 @default.
- W2019912702 cites W2123788376 @default.
- W2019912702 cites W2125165191 @default.
- W2019912702 cites W2126419817 @default.
- W2019912702 cites W2128242967 @default.
- W2019912702 cites W2129296076 @default.
- W2019912702 cites W2130412392 @default.
- W2019912702 cites W2140729960 @default.
- W2019912702 cites W2141144953 @default.
- W2019912702 cites W2141165178 @default.
- W2019912702 cites W2141458291 @default.
- W2019912702 cites W2141572089 @default.
- W2019912702 cites W2142678478 @default.
- W2019912702 cites W2147526198 @default.
- W2019912702 cites W2151635271 @default.
- W2019912702 cites W2151776726 @default.
- W2019912702 cites W2154128645 @default.
- W2019912702 cites W2154171675 @default.
- W2019912702 cites W2160053034 @default.
- W2019912702 cites W2162595238 @default.
- W2019912702 cites W2162630069 @default.
- W2019912702 cites W2166098883 @default.
- W2019912702 cites W2168292723 @default.
- W2019912702 cites W2169105416 @default.
- W2019912702 cites W2171263406 @default.
- W2019912702 cites W4212920271 @default.
- W2019912702 cites W4230784961 @default.
- W2019912702 doi "https://doi.org/10.1186/s12864-015-1344-4" @default.
- W2019912702 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/4363200" @default.
- W2019912702 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/25887563" @default.
- W2019912702 hasPublicationYear "2015" @default.
- W2019912702 type Work @default.
- W2019912702 sameAs 2019912702 @default.
- W2019912702 citedByCount "131" @default.
- W2019912702 countsByYear W20199127022015 @default.
- W2019912702 countsByYear W20199127022016 @default.
- W2019912702 countsByYear W20199127022017 @default.
- W2019912702 countsByYear W20199127022018 @default.
- W2019912702 countsByYear W20199127022019 @default.
- W2019912702 countsByYear W20199127022020 @default.
- W2019912702 countsByYear W20199127022021 @default.
- W2019912702 countsByYear W20199127022022 @default.
- W2019912702 countsByYear W20199127022023 @default.
- W2019912702 crossrefType "journal-article" @default.
- W2019912702 hasAuthorship W2019912702A5004869121 @default.
- W2019912702 hasAuthorship W2019912702A5020423413 @default.
- W2019912702 hasAuthorship W2019912702A5061866289 @default.
- W2019912702 hasAuthorship W2019912702A5076836505 @default.
- W2019912702 hasBestOaLocation W20199127021 @default.
- W2019912702 hasConcept C104317684 @default.
- W2019912702 hasConcept C105565629 @default.
- W2019912702 hasConcept C107397762 @default.
- W2019912702 hasConcept C141231307 @default.
- W2019912702 hasConcept C150194340 @default.
- W2019912702 hasConcept C154945302 @default.
- W2019912702 hasConcept C162317418 @default.
- W2019912702 hasConcept C189206191 @default.
- W2019912702 hasConcept C23224414 @default.
- W2019912702 hasConcept C2776321320 @default.
- W2019912702 hasConcept C2908923196 @default.
- W2019912702 hasConcept C41008148 @default.