Matches in SemOpenAlex for { <https://semopenalex.org/work/W1972447769> ?p ?o ?g. }
- W1972447769 endingPage "176" @default.
- W1972447769 startingPage "164" @default.
- W1972447769 abstract "Data mining and knowledge discovery techniques have greatly progressed in the last decade. They are now able to handle larger and larger datasets, process heterogeneous information, integrate complex metadata, and extract and visualize new knowledge. Often these advances were driven by new challenges arising from real-world domains, with biology and biotechnology a prime source of diverse and hard (e.g., high volume, high throughput, high variety, and high noise) data analytics problems. The aim of this article is to show the broad spectrum of data mining tasks and challenges present in biological data, and how these challenges have driven us over the years to design new data mining and knowledge discovery procedures for biodata. This is illustrated with the help of two kinds of case studies. The first kind is focused on the field of protein structure prediction, where we have contributed in several areas: by designing, through regression, functions that can distinguish between good and bad models of a protein's predicted structure; by creating new measures to characterize aspects of a protein's structure associated with individual positions in a protein's sequence, measures containing information that might be useful for protein structure prediction; and by creating accurate estimators of these structural aspects. The second kind of case study is focused on omics data analytics, a class of biological data characterized by extremely high dimensionality. Our methods were able not only to generate very accurate classification models, but also to discover new biological knowledge that was later ratified by experimentalists.
Finally, we describe several strategies to tightly integrate knowledge extraction and data mining in order to create a new class of biodata mining algorithms that can natively embrace the complexity of biological data, efficiently generate accurate information in the form of classification/regression models, and extract valuable new knowledge. Thus, a complete data-to-information-to-knowledge pipeline is presented." @default.
- W1972447769 created "2016-06-24" @default.
- W1972447769 creator A5012372361 @default.
- W1972447769 creator A5043620976 @default.
- W1972447769 creator A5062947296 @default.
- W1972447769 creator A5091154883 @default.
- W1972447769 date "2014-09-01" @default.
- W1972447769 modified "2023-09-27" @default.
- W1972447769 title "Hard Data Analytics Problems Make for Better Data Analysis Algorithms: Bioinformatics as an Example" @default.
- W1972447769 cites W112811424 @default.
- W1972447769 cites W1906461790 @default.
- W1972447769 cites W1983789522 @default.
- W1972447769 cites W1986162418 @default.
- W1972447769 cites W2001903569 @default.
- W1972447769 cites W2005336557 @default.
- W1972447769 cites W2008545402 @default.
- W1972447769 cites W2019492310 @default.
- W1972447769 cites W2045761815 @default.
- W1972447769 cites W2067811518 @default.
- W1972447769 cites W2071659396 @default.
- W1972447769 cites W2073338313 @default.
- W1972447769 cites W2107498970 @default.
- W1972447769 cites W2116172285 @default.
- W1972447769 cites W2117088188 @default.
- W1972447769 cites W2124371326 @default.
- W1972447769 cites W2127142405 @default.
- W1972447769 cites W2129560785 @default.
- W1972447769 cites W2134876494 @default.
- W1972447769 cites W2141097399 @default.
- W1972447769 cites W2143035592 @default.
- W1972447769 cites W2146739527 @default.
- W1972447769 cites W2147649238 @default.
- W1972447769 cites W2171777347 @default.
- W1972447769 cites W2171930212 @default.
- W1972447769 cites W2237837248 @default.
- W1972447769 cites W42416252 @default.
- W1972447769 doi "https://doi.org/10.1089/big.2014.0023" @default.
- W1972447769 hasPubMedCentralId "https://www.ncbi.nlm.nih.gov/pmc/articles/4174911" @default.
- W1972447769 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/25276500" @default.
- W1972447769 hasPublicationYear "2014" @default.
- W1972447769 type Work @default.
- W1972447769 sameAs 1972447769 @default.
- W1972447769 citedByCount "13" @default.
- W1972447769 countsByYear W19724477692015 @default.
- W1972447769 countsByYear W19724477692016 @default.
- W1972447769 countsByYear W19724477692017 @default.
- W1972447769 countsByYear W19724477692018 @default.
- W1972447769 countsByYear W19724477692019 @default.
- W1972447769 countsByYear W19724477692020 @default.
- W1972447769 countsByYear W19724477692022 @default.
- W1972447769 countsByYear W19724477692023 @default.
- W1972447769 crossrefType "journal-article" @default.
- W1972447769 hasAuthorship W1972447769A5012372361 @default.
- W1972447769 hasAuthorship W1972447769A5043620976 @default.
- W1972447769 hasAuthorship W1972447769A5062947296 @default.
- W1972447769 hasAuthorship W1972447769A5091154883 @default.
- W1972447769 hasBestOaLocation W19724477691 @default.
- W1972447769 hasConcept C111919701 @default.
- W1972447769 hasConcept C120567893 @default.
- W1972447769 hasConcept C124101348 @default.
- W1972447769 hasConcept C136197465 @default.
- W1972447769 hasConcept C136764020 @default.
- W1972447769 hasConcept C154945302 @default.
- W1972447769 hasConcept C175801342 @default.
- W1972447769 hasConcept C201797286 @default.
- W1972447769 hasConcept C202444582 @default.
- W1972447769 hasConcept C2522767166 @default.
- W1972447769 hasConcept C33923547 @default.
- W1972447769 hasConcept C41008148 @default.
- W1972447769 hasConcept C60644358 @default.
- W1972447769 hasConcept C79158427 @default.
- W1972447769 hasConcept C86803240 @default.
- W1972447769 hasConcept C93518851 @default.
- W1972447769 hasConcept C9652623 @default.
- W1972447769 hasConcept C98045186 @default.
- W1972447769 hasConceptScore W1972447769C111919701 @default.
- W1972447769 hasConceptScore W1972447769C120567893 @default.
- W1972447769 hasConceptScore W1972447769C124101348 @default.
- W1972447769 hasConceptScore W1972447769C136197465 @default.
- W1972447769 hasConceptScore W1972447769C136764020 @default.
- W1972447769 hasConceptScore W1972447769C154945302 @default.
- W1972447769 hasConceptScore W1972447769C175801342 @default.
- W1972447769 hasConceptScore W1972447769C201797286 @default.
- W1972447769 hasConceptScore W1972447769C202444582 @default.
- W1972447769 hasConceptScore W1972447769C2522767166 @default.
- W1972447769 hasConceptScore W1972447769C33923547 @default.
- W1972447769 hasConceptScore W1972447769C41008148 @default.
- W1972447769 hasConceptScore W1972447769C60644358 @default.
- W1972447769 hasConceptScore W1972447769C79158427 @default.
- W1972447769 hasConceptScore W1972447769C86803240 @default.
- W1972447769 hasConceptScore W1972447769C93518851 @default.
- W1972447769 hasConceptScore W1972447769C9652623 @default.
- W1972447769 hasConceptScore W1972447769C98045186 @default.
- W1972447769 hasIssue "3" @default.
- W1972447769 hasLocation W19724477691 @default.
- W1972447769 hasLocation W19724477692 @default.
- W1972447769 hasLocation W19724477693 @default.
- W1972447769 hasLocation W19724477694 @default.