Matches in SemOpenAlex for { <https://semopenalex.org/work/W2760930003> ?p ?o ?g. }
- W2760930003 abstract "Diseases like cancer can manifest themselves through changes in protein abundance, and microRNAs (miRNAs) play a key role in the modulation of protein quantity. MicroRNAs are used throughout all kingdoms and have been shown to be exploited by viruses to modulate their host environment. Since the experimental detection of miRNAs is difficult, computational methods have been developed. Many such tools employ machine learning for pre-miRNA detection, and many features for miRNA parameterization have been proposed. To train machine learning models, negative data is of importance yet hard to come by; therefore, we recently started to employ pre-miRNAs from one species as positive data versus another species’ pre-miRNAs as negative examples based on sequence motifs and k-mers. Here, we introduce the additional usage of information-theoretic (IT) features. Pre-miRNAs from one species were used as positive and another species’ pre-miRNAs as negative training data for machine learning. The categorization capability of IT and k-mer features was investigated. Both feature sets and their combinations yielded a very high accuracy, which is as good as the previously suggested sequence motif and k-mer based method. However, for obtaining a high performance, a sufficiently large phylogenetic distance between the species and sufficiently high number of pre-miRNAs in the training set is required. To examine the contribution of the IT and k-mer features, an information gain-based feature ranking was performed. Although the top 3 are IT features, 80% of the top 100 features are k-mers. The comparison of all three individual approaches (motifs, IT, and k-mers) shows that the distinction of species based on their pre-miRNAs k-mers are sufficient. IT sequence feature extraction enables the distinction among species and is less computationally expensive than motif calculations. However, since IT features need larger amounts of data to have enough statistics for producing highly accurate results, future categorization into species can be effectively done using k-mers only. The biological reasoning for this is the existence of a codon bias between species which can, at least, be observed in exonic miRNAs. Future work in this direction will be the ab initio detection of pre-miRNA. In addition, prediction of pre-miRNA from RNA-seq can be done." @default.
- W2760930003 created "2017-10-20" @default.
- W2760930003 creator A5043036637 @default.
- W2760930003 creator A5052454991 @default.
- W2760930003 creator A5066611657 @default.
- W2760930003 creator A5084678415 @default.
- W2760930003 creator A5091900270 @default.
- W2760930003 date "2017-10-13" @default.
- W2760930003 modified "2023-10-02" @default.
- W2760930003 title "Categorization of species based on their microRNAs employing sequence motifs, information-theoretic sequence feature extraction, and k-mers" @default.
- W2760930003 cites W1845151788 @default.
- W2760930003 cites W190896647 @default.
- W2760930003 cites W1920564293 @default.
- W2760930003 cites W1954076721 @default.
- W2760930003 cites W1965157323 @default.
- W2760930003 cites W1965555277 @default.
- W2760930003 cites W1966833356 @default.
- W2760930003 cites W1984385688 @default.
- W2760930003 cites W1995082021 @default.
- W2760930003 cites W1995875735 @default.
- W2760930003 cites W2006676204 @default.
- W2760930003 cites W2016106370 @default.
- W2760930003 cites W2017426710 @default.
- W2760930003 cites W2023968607 @default.
- W2760930003 cites W2030573509 @default.
- W2760930003 cites W2046584841 @default.
- W2760930003 cites W2051535323 @default.
- W2760930003 cites W2063555458 @default.
- W2760930003 cites W2067926173 @default.
- W2760930003 cites W2070310544 @default.
- W2760930003 cites W2093213907 @default.
- W2760930003 cites W2094019030 @default.
- W2760930003 cites W2107961632 @default.
- W2760930003 cites W2109553965 @default.
- W2760930003 cites W2109923675 @default.
- W2760930003 cites W2112192177 @default.
- W2760930003 cites W2113466460 @default.
- W2760930003 cites W2114973325 @default.
- W2760930003 cites W2120583228 @default.
- W2760930003 cites W2124351063 @default.
- W2760930003 cites W2125068693 @default.
- W2760930003 cites W2135995702 @default.
- W2760930003 cites W2138154105 @default.
- W2760930003 cites W2138822106 @default.
- W2760930003 cites W2140446903 @default.
- W2760930003 cites W2140751493 @default.
- W2760930003 cites W2144557646 @default.
- W2760930003 cites W2146757999 @default.
- W2760930003 cites W2147154770 @default.
- W2760930003 cites W2153598678 @default.
- W2760930003 cites W2154531054 @default.
- W2760930003 cites W2156847802 @default.
- W2760930003 cites W2157009395 @default.
- W2760930003 cites W2295729186 @default.
- W2760930003 cites W2308910954 @default.
- W2760930003 cites W2410565007 @default.
- W2760930003 cites W2470585343 @default.
- W2760930003 cites W2472053654 @default.
- W2760930003 cites W2478708596 @default.
- W2760930003 cites W2602423701 @default.
- W2760930003 cites W2602538792 @default.
- W2760930003 cites W4253157748 @default.
- W2760930003 doi "https://doi.org/10.1186/s13634-017-0506-8" @default.
- W2760930003 hasPublicationYear "2017" @default.
- W2760930003 type Work @default.
- W2760930003 sameAs 2760930003 @default.
- W2760930003 citedByCount "11" @default.
- W2760930003 countsByYear W27609300032018 @default.
- W2760930003 countsByYear W27609300032019 @default.
- W2760930003 countsByYear W27609300032020 @default.
- W2760930003 countsByYear W27609300032021 @default.
- W2760930003 crossrefType "journal-article" @default.
- W2760930003 hasAuthorship W2760930003A5043036637 @default.
- W2760930003 hasAuthorship W2760930003A5052454991 @default.
- W2760930003 hasAuthorship W2760930003A5066611657 @default.
- W2760930003 hasAuthorship W2760930003A5084678415 @default.
- W2760930003 hasAuthorship W2760930003A5091900270 @default.
- W2760930003 hasBestOaLocation W27609300031 @default.
- W2760930003 hasConcept C104317684 @default.
- W2760930003 hasConcept C119857082 @default.
- W2760930003 hasConcept C124101348 @default.
- W2760930003 hasConcept C138885662 @default.
- W2760930003 hasConcept C145059251 @default.
- W2760930003 hasConcept C154945302 @default.
- W2760930003 hasConcept C189430467 @default.
- W2760930003 hasConcept C193252679 @default.
- W2760930003 hasConcept C2279292 @default.
- W2760930003 hasConcept C2776401178 @default.
- W2760930003 hasConcept C2778112365 @default.
- W2760930003 hasConcept C41008148 @default.
- W2760930003 hasConcept C41895202 @default.
- W2760930003 hasConcept C51679486 @default.
- W2760930003 hasConcept C54355233 @default.
- W2760930003 hasConcept C60644358 @default.
- W2760930003 hasConcept C70721500 @default.
- W2760930003 hasConcept C86803240 @default.
- W2760930003 hasConcept C94124525 @default.
- W2760930003 hasConceptScore W2760930003C104317684 @default.
- W2760930003 hasConceptScore W2760930003C119857082 @default.
- W2760930003 hasConceptScore W2760930003C124101348 @default.