Matches in SemOpenAlex for { <https://semopenalex.org/work/W2100038328> ?p ?o ?g. }
- W2100038328 abstract "Background: The k-mer hash length is a key factor affecting the output of de novo transcriptome assembly packages using de Bruijn graph algorithms. Assemblies constructed with varying single k-mer choices might result in the loss of unique contiguous sequences (contigs) and relevant biological information. A common solution to this problem is the clustering of single k-mer assemblies. Even though annotation is one of the primary goals of a transcriptome assembly, the success of assembly strategies does not consider the impact of k-mer selection on the annotation output. This study provides an in-depth k-mer selection analysis that is focused on the degree of functional annotation achieved for a non-model organism where no reference genome information is available. Individual k-mers and clustered assemblies (CA) were considered using three representative software packages. Pair-wise comparison analyses (between individual k-mers and CAs) were produced to reveal missing Kyoto Encyclopedia of Genes and Genomes (KEGG) ortholog identifiers (KOIs), and to determine a strategy that maximizes the recovery of biological information in a de novo transcriptome assembly.
Results: Analyses of single k-mer assemblies resulted in the generation of various quantities of contigs and functional annotations within the selection window of k-mers (k-19 to k-63). For each k-mer in this window, generated assemblies contained certain unique contigs and KOIs that were not present in the other k-mer assemblies. Producing a non-redundant CA of k-mers 19 to 63 resulted in a more complete functional annotation than any single k-mer assembly. However, a fraction of unique annotations remained (~0.19 to 0.27% of total KOIs) in the assemblies of individual k-mers (k-19 to k-63) that were not present in the non-redundant CA. A workflow to recover these unique annotations is presented.
Conclusions: This study demonstrated that different k-mer choices result in various quantities of unique contigs per single k-mer assembly which affects biological information that is retrievable from the transcriptome. This undesirable effect can be minimized, but not eliminated, with clustering of multi-k assemblies with redundancy removal. The complete extraction of biological information in de novo transcriptomics studies requires both the production of a CA and efforts to identify unique contigs that are present in individual k-mer assemblies but not in the CA." @default.
- W2100038328 created "2016-06-24" @default.
- W2100038328 creator A5035681909 @default.
- W2100038328 creator A5057853675 @default.
- W2100038328 creator A5063527759 @default.
- W2100038328 creator A5075958882 @default.
- W2100038328 date "2012-07-18" @default.
- W2100038328 modified "2023-10-14" @default.
- W2100038328 title "Optimization of de novo transcriptome assembly from high-throughput short read sequencing data improves functional annotation for non-model organisms" @default.
- W2100038328 cites W1971899779 @default.
- W2100038328 cites W2000894445 @default.
- W2100038328 cites W2017416733 @default.
- W2100038328 cites W2033083661 @default.
- W2100038328 cites W2039311276 @default.
- W2100038328 cites W2047281856 @default.
- W2100038328 cites W2097205777 @default.
- W2100038328 cites W2098423662 @default.
- W2100038328 cites W2101972919 @default.
- W2100038328 cites W2112888168 @default.
- W2100038328 cites W2124985265 @default.
- W2100038328 cites W2126419817 @default.
- W2100038328 cites W2128222086 @default.
- W2100038328 cites W2133579817 @default.
- W2100038328 cites W2140544891 @default.
- W2100038328 cites W2156125289 @default.
- W2100038328 cites W2160525528 @default.
- W2100038328 cites W2160969485 @default.
- W2100038328 cites W2161546116 @default.
- W2100038328 cites W2167719367 @default.
- W2100038328 doi "https://doi.org/10.1186/1471-2105-13-170" @default.
- W2100038328 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/3489510" @default.
- W2100038328 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/22808927" @default.
- W2100038328 hasPublicationYear "2012" @default.
- W2100038328 type Work @default.
- W2100038328 sameAs 2100038328 @default.
- W2100038328 citedByCount "32" @default.
- W2100038328 countsByYear W21000383282013 @default.
- W2100038328 countsByYear W21000383282014 @default.
- W2100038328 countsByYear W21000383282015 @default.
- W2100038328 countsByYear W21000383282016 @default.
- W2100038328 countsByYear W21000383282017 @default.
- W2100038328 countsByYear W21000383282018 @default.
- W2100038328 countsByYear W21000383282019 @default.
- W2100038328 countsByYear W21000383282020 @default.
- W2100038328 countsByYear W21000383282021 @default.
- W2100038328 countsByYear W21000383282022 @default.
- W2100038328 crossrefType "journal-article" @default.
- W2100038328 hasAuthorship W2100038328A5035681909 @default.
- W2100038328 hasAuthorship W2100038328A5057853675 @default.
- W2100038328 hasAuthorship W2100038328A5063527759 @default.
- W2100038328 hasAuthorship W2100038328A5075958882 @default.
- W2100038328 hasBestOaLocation W21000383281 @default.
- W2100038328 hasConcept C104317684 @default.
- W2100038328 hasConcept C132525143 @default.
- W2100038328 hasConcept C141231307 @default.
- W2100038328 hasConcept C150194340 @default.
- W2100038328 hasConcept C152724338 @default.
- W2100038328 hasConcept C154945302 @default.
- W2100038328 hasConcept C162317418 @default.
- W2100038328 hasConcept C18949551 @default.
- W2100038328 hasConcept C20218877 @default.
- W2100038328 hasConcept C2776321320 @default.
- W2100038328 hasConcept C41008148 @default.
- W2100038328 hasConcept C50489715 @default.
- W2100038328 hasConcept C54355233 @default.
- W2100038328 hasConcept C59582021 @default.
- W2100038328 hasConcept C70721500 @default.
- W2100038328 hasConcept C80444323 @default.
- W2100038328 hasConcept C81917197 @default.
- W2100038328 hasConcept C86803240 @default.
- W2100038328 hasConceptScore W2100038328C104317684 @default.
- W2100038328 hasConceptScore W2100038328C132525143 @default.
- W2100038328 hasConceptScore W2100038328C141231307 @default.
- W2100038328 hasConceptScore W2100038328C150194340 @default.
- W2100038328 hasConceptScore W2100038328C152724338 @default.
- W2100038328 hasConceptScore W2100038328C154945302 @default.
- W2100038328 hasConceptScore W2100038328C162317418 @default.
- W2100038328 hasConceptScore W2100038328C18949551 @default.
- W2100038328 hasConceptScore W2100038328C20218877 @default.
- W2100038328 hasConceptScore W2100038328C2776321320 @default.
- W2100038328 hasConceptScore W2100038328C41008148 @default.
- W2100038328 hasConceptScore W2100038328C50489715 @default.
- W2100038328 hasConceptScore W2100038328C54355233 @default.
- W2100038328 hasConceptScore W2100038328C59582021 @default.
- W2100038328 hasConceptScore W2100038328C70721500 @default.
- W2100038328 hasConceptScore W2100038328C80444323 @default.
- W2100038328 hasConceptScore W2100038328C81917197 @default.
- W2100038328 hasConceptScore W2100038328C86803240 @default.
- W2100038328 hasIssue "1" @default.
- W2100038328 hasLocation W21000383281 @default.
- W2100038328 hasLocation W21000383282 @default.
- W2100038328 hasLocation W21000383283 @default.
- W2100038328 hasLocation W21000383284 @default.
- W2100038328 hasLocation W21000383285 @default.
- W2100038328 hasOpenAccess W2100038328 @default.
- W2100038328 hasPrimaryLocation W21000383281 @default.
- W2100038328 hasRelatedWork W2038319227 @default.
- W2100038328 hasRelatedWork W2049364822 @default.
- W2100038328 hasRelatedWork W2054081168 @default.
- W2100038328 hasRelatedWork W2119121014 @default.