Matches in SemOpenAlex for { <https://semopenalex.org/work/W2110367804> ?p ?o ?g. }
- W2110367804 endingPage "e99982" @default.
- W2110367804 startingPage "e99982" @default.
- W2110367804 abstract "Many open problems in bioinformatics involve elucidating underlying functional signals in biological sequences. DNA sequences, in particular, are characterized by rich architectures in which functional signals are increasingly found to combine local and distal interactions at the nucleotide level. Problems of interest include detection of regulatory regions, splice sites, exons, hypersensitive sites, and more. These problems naturally lend themselves to formulation as classification problems in machine learning. When classification is based on features extracted from the sequences under investigation, success is critically dependent on the chosen set of features. We present an algorithmic framework (EFFECT) for automated detection of functional signals in biological sequences. We focus here on classification problems involving DNA sequences which state-of-the-art work in machine learning shows to be challenging and involve complex combinations of local and distal features. EFFECT uses a two-stage process to first construct a set of candidate sequence-based features and then select a most effective subset for the classification task at hand. Both stages make heavy use of evolutionary algorithms to efficiently guide the search towards informative features capable of discriminating between sequences that contain a particular functional signal and those that do not. To demonstrate its generality, EFFECT is applied to three separate problems of importance in DNA research: the recognition of hypersensitive sites, splice sites, and ALU sites. Comparisons with state-of-the-art algorithms show that the framework is both general and powerful. In addition, a detailed analysis of the constructed features shows that they contain valuable biological information about DNA architecture, allowing biologists and other researchers to directly inspect the features and potentially use the insights obtained to assist wet-laboratory studies on retainment or modification of a specific signal. Code, documentation, and all data for the applications presented here are provided for the community at http://www.cs.gmu.edu/~ashehu/?q=OurTools." @default.
- W2110367804 created "2016-06-24" @default.
- W2110367804 creator A5044722808 @default.
- W2110367804 creator A5057162840 @default.
- W2110367804 creator A5076724368 @default.
- W2110367804 date "2014-07-17" @default.
- W2110367804 modified "2023-09-26" @default.
- W2110367804 title "Effective Automated Feature Construction and Selection for Classification of Biological Sequences" @default.
- W2110367804 cites W1606840064 @default.
- W2110367804 cites W1692895552 @default.
- W2110367804 cites W1963990067 @default.
- W2110367804 cites W196871588 @default.
- W2110367804 cites W1970074386 @default.
- W2110367804 cites W1970966964 @default.
- W2110367804 cites W1974414866 @default.
- W2110367804 cites W1976526581 @default.
- W2110367804 cites W1982179013 @default.
- W2110367804 cites W1983572364 @default.
- W2110367804 cites W1988848004 @default.
- W2110367804 cites W2005687228 @default.
- W2110367804 cites W2009635329 @default.
- W2110367804 cites W2015249729 @default.
- W2110367804 cites W2015870870 @default.
- W2110367804 cites W2017337590 @default.
- W2110367804 cites W2023825247 @default.
- W2110367804 cites W2024587346 @default.
- W2110367804 cites W2026686302 @default.
- W2110367804 cites W2027582332 @default.
- W2110367804 cites W2034691490 @default.
- W2110367804 cites W2035564383 @default.
- W2110367804 cites W2037330912 @default.
- W2110367804 cites W2038303173 @default.
- W2110367804 cites W2038567802 @default.
- W2110367804 cites W2039286232 @default.
- W2110367804 cites W2041956242 @default.
- W2110367804 cites W2042543706 @default.
- W2110367804 cites W2042877132 @default.
- W2110367804 cites W2060178110 @default.
- W2110367804 cites W2063634833 @default.
- W2110367804 cites W2063861171 @default.
- W2110367804 cites W2064199329 @default.
- W2110367804 cites W2066842884 @default.
- W2110367804 cites W2068140368 @default.
- W2110367804 cites W2071294216 @default.
- W2110367804 cites W2074999764 @default.
- W2110367804 cites W2086240273 @default.
- W2110367804 cites W2093799708 @default.
- W2110367804 cites W2098223336 @default.
- W2110367804 cites W2099665470 @default.
- W2110367804 cites W2099961908 @default.
- W2110367804 cites W2100253618 @default.
- W2110367804 cites W2100827887 @default.
- W2110367804 cites W2102831150 @default.
- W2110367804 cites W2103385178 @default.
- W2110367804 cites W2104139298 @default.
- W2110367804 cites W2105801262 @default.
- W2110367804 cites W2109306224 @default.
- W2110367804 cites W2110400205 @default.
- W2110367804 cites W2110863141 @default.
- W2110367804 cites W2112857266 @default.
- W2110367804 cites W2114090084 @default.
- W2110367804 cites W2117135157 @default.
- W2110367804 cites W2117549173 @default.
- W2110367804 cites W2122818453 @default.
- W2110367804 cites W2125062296 @default.
- W2110367804 cites W2125903555 @default.
- W2110367804 cites W2126193532 @default.
- W2110367804 cites W2129061533 @default.
- W2110367804 cites W2130759652 @default.
- W2110367804 cites W2135190479 @default.
- W2110367804 cites W2141408320 @default.
- W2110367804 cites W2143861446 @default.
- W2110367804 cites W2143935213 @default.
- W2110367804 cites W2151945801 @default.
- W2110367804 cites W2152657434 @default.
- W2110367804 cites W2154855927 @default.
- W2110367804 cites W2155014107 @default.
- W2110367804 cites W2156909104 @default.
- W2110367804 cites W2158544517 @default.
- W2110367804 cites W2159164310 @default.
- W2110367804 cites W2159833474 @default.
- W2110367804 cites W2166187656 @default.
- W2110367804 cites W2166204890 @default.
- W2110367804 cites W2168909179 @default.
- W2110367804 cites W2169157770 @default.
- W2110367804 cites W2177510297 @default.
- W2110367804 cites W2419410951 @default.
- W2110367804 cites W4245200611 @default.
- W2110367804 cites W4247292500 @default.
- W2110367804 cites W4256291942 @default.
- W2110367804 doi "https://doi.org/10.1371/journal.pone.0099982" @default.
- W2110367804 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/4102475" @default.
- W2110367804 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/25033270" @default.
- W2110367804 hasPublicationYear "2014" @default.
- W2110367804 type Work @default.
- W2110367804 sameAs 2110367804 @default.
- W2110367804 citedByCount "45" @default.
- W2110367804 countsByYear W21103678042014 @default.