Matches in SemOpenAlex for { <https://semopenalex.org/work/W2133736457> ?p ?o ?g. }
- W2133736457 abstract "The secondary structure of RNA molecules is intimately related to their function and often more conserved than the sequence. Hence, the important task of searching databases for RNAs requires matching sequence-structure patterns. Unfortunately, current tools for this task have, in the best case, a running time that is only linear in the size of sequence databases. Furthermore, established index data structures for fast sequence matching, like suffix trees or arrays, cannot benefit from the complementarity constraints introduced by the secondary structure of RNAs. We present a novel method and readily applicable software for time-efficient matching of RNA sequence-structure patterns in sequence databases. Our approach is based on affix arrays, a recently introduced index data structure, preprocessed from the target database. Affix arrays support bidirectional pattern search, which is required for efficiently handling the structural constraints of the pattern. Structural patterns like stem-loops can be matched inside out, such that the loop region is matched first and then the pairing bases on the boundaries are matched consecutively. This allows base pairing information to be exploited for search space reduction and leads to an expected running time that is sublinear in the size of the sequence database. The incorporation of a new chaining approach in the search of RNA sequence-structure patterns enables the description of molecules folding into complex secondary structures with multiple ordered patterns. The chaining approach removes spurious matches from the set of intermediate results, in particular for patterns with little specificity. In benchmark experiments on the Rfam database, our method runs up to two orders of magnitude faster than previous methods. The presented method's sublinear expected running time makes it well suited for RNA sequence-structure pattern matching in large sequence databases. RNA molecules containing several stem-loop substructures can be described by multiple sequence-structure patterns and their matches are efficiently handled by a novel chaining method. Beyond our algorithmic contributions, we provide Structator, a complete and robust open-source software solution for index-based search of RNA sequence-structure patterns. The Structator software is available at http://www.zbh.uni-hamburg.de/Structator." @default.
- W2133736457 created "2016-06-24" @default.
- W2133736457 creator A5006524527 @default.
- W2133736457 creator A5017672883 @default.
- W2133736457 creator A5054057647 @default.
- W2133736457 creator A5060071039 @default.
- W2133736457 creator A5068139203 @default.
- W2133736457 date "2011-05-27" @default.
- W2133736457 modified "2023-10-14" @default.
- W2133736457 title "Structator: fast index-based search for RNA sequence-structure patterns" @default.
- W2133736457 cites W1489909987 @default.
- W2133736457 cites W1497529086 @default.
- W2133736457 cites W1530664063 @default.
- W2133736457 cites W1531414226 @default.
- W2133736457 cites W1549037892 @default.
- W2133736457 cites W1585306636 @default.
- W2133736457 cites W1853093403 @default.
- W2133736457 cites W1950410402 @default.
- W2133736457 cites W1967864248 @default.
- W2133736457 cites W1976021976 @default.
- W2133736457 cites W1976765586 @default.
- W2133736457 cites W1981549166 @default.
- W2133736457 cites W1990853496 @default.
- W2133736457 cites W2009736453 @default.
- W2133736457 cites W2017265532 @default.
- W2133736457 cites W2018216725 @default.
- W2133736457 cites W2020982416 @default.
- W2133736457 cites W2024177760 @default.
- W2133736457 cites W2065398612 @default.
- W2133736457 cites W2095145214 @default.
- W2133736457 cites W2105070391 @default.
- W2133736457 cites W2109062349 @default.
- W2133736457 cites W2111491633 @default.
- W2133736457 cites W2111773652 @default.
- W2133736457 cites W2115215562 @default.
- W2133736457 cites W2121918723 @default.
- W2133736457 cites W2122035121 @default.
- W2133736457 cites W2128356721 @default.
- W2133736457 cites W2133279153 @default.
- W2133736457 cites W2134283755 @default.
- W2133736457 cites W2137640371 @default.
- W2133736457 cites W2139270299 @default.
- W2133736457 cites W2140651796 @default.
- W2133736457 cites W2141670298 @default.
- W2133736457 cites W2143394857 @default.
- W2133736457 cites W2143407915 @default.
- W2133736457 cites W2146120203 @default.
- W2133736457 cites W2149379826 @default.
- W2133736457 cites W2154531880 @default.
- W2133736457 cites W2154845231 @default.
- W2133736457 cites W2158874082 @default.
- W2133736457 cites W2159647614 @default.
- W2133736457 cites W2159707167 @default.
- W2133736457 cites W2170157200 @default.
- W2133736457 cites W2170748882 @default.
- W2133736457 cites W2413886249 @default.
- W2133736457 cites W2911967969 @default.
- W2133736457 cites W938539187 @default.
- W2133736457 doi "https://doi.org/10.1186/1471-2105-12-214" @default.
- W2133736457 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/3154205" @default.
- W2133736457 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/21619640" @default.
- W2133736457 hasPublicationYear "2011" @default.
- W2133736457 type Work @default.
- W2133736457 sameAs 2133736457 @default.
- W2133736457 citedByCount "28" @default.
- W2133736457 countsByYear W21337364572012 @default.
- W2133736457 countsByYear W21337364572013 @default.
- W2133736457 countsByYear W21337364572014 @default.
- W2133736457 countsByYear W21337364572015 @default.
- W2133736457 countsByYear W21337364572016 @default.
- W2133736457 countsByYear W21337364572017 @default.
- W2133736457 countsByYear W21337364572022 @default.
- W2133736457 crossrefType "journal-article" @default.
- W2133736457 hasAuthorship W2133736457A5006524527 @default.
- W2133736457 hasAuthorship W2133736457A5017672883 @default.
- W2133736457 hasAuthorship W2133736457A5054057647 @default.
- W2133736457 hasAuthorship W2133736457A5060071039 @default.
- W2133736457 hasAuthorship W2133736457A5068139203 @default.
- W2133736457 hasBestOaLocation W21337364571 @default.
- W2133736457 hasConcept C11413529 @default.
- W2133736457 hasConcept C124101348 @default.
- W2133736457 hasConcept C154945302 @default.
- W2133736457 hasConcept C15744967 @default.
- W2133736457 hasConcept C2778112365 @default.
- W2133736457 hasConcept C41008148 @default.
- W2133736457 hasConcept C49020025 @default.
- W2133736457 hasConcept C542102704 @default.
- W2133736457 hasConcept C54355233 @default.
- W2133736457 hasConcept C68859911 @default.
- W2133736457 hasConcept C86803240 @default.
- W2133736457 hasConceptScore W2133736457C11413529 @default.
- W2133736457 hasConceptScore W2133736457C124101348 @default.
- W2133736457 hasConceptScore W2133736457C154945302 @default.
- W2133736457 hasConceptScore W2133736457C15744967 @default.
- W2133736457 hasConceptScore W2133736457C2778112365 @default.
- W2133736457 hasConceptScore W2133736457C41008148 @default.
- W2133736457 hasConceptScore W2133736457C49020025 @default.
- W2133736457 hasConceptScore W2133736457C542102704 @default.
- W2133736457 hasConceptScore W2133736457C54355233 @default.
- W2133736457 hasConceptScore W2133736457C68859911 @default.