Matches in SemOpenAlex for { <https://semopenalex.org/work/W2887480377> ?p ?o ?g. }
Showing items 1 to 85 of 85 with 100 items per page.
- W2887480377 abstract "Pathogens such as bacteria and viruses are leading causes of disease worldwide, which makes it essential to identify them in DNA samples. Instead of analysing raw DNA sequences, mathematical models based on Variable Length Markov Chains (VLMCs), known as genomic signatures, make it possible to classify DNA samples faster than with traditional alignment-based methods. To analyse a set of genomic signatures, we use clustering, which is an unsupervised machine-learning method. For the clustering of VLMCs, an accurate and fast similarity measure (distance function) is needed. To analyse distance functions and clusters, we define metrics based primarily on the taxonomic ranks of the underlying organisms. For the distance functions, we primarily analysed whether the VLMCs within the same taxonomic rank were closest to each other. For the cluster analysis, we use the silhouette metric to determine how well separated the clusters are and define the average percentages, sensitivity, and specificity of the captured taxonomic ranks. We present a new distance function for VLMCs, called Frobenius-intersection, which correlates accurately with the well-known Kullback-Leibler distance function, while also being several orders of magnitude faster. We use average-link clustering together with the Frobenius-intersection distance to cluster data sets of known viruses and bacteria with relatively short DNA sequences. The clusters of VLMCs correspond accurately to the Baltimore types of the viruses as well as the viruses’ and bacteria’s taxonomic families. However, most of the classifications of viruses are also subdivided into multiple clusters. Moreover, when combining the set of bacteria and viruses, the clusters start to mix the viruses and bacteria before finding all of the taxonomic families. The clustering of the genomic signatures is accurate with respect to, for instance, taxonomic ordering. Therefore, it can help in identifying unclassified pathogens. Future research may reveal other causes of similarity between the genomic signatures." @default.
- W2887480377 created "2018-08-22" @default.
- W2887480377 creator A5072871153 @default.
- W2887480377 creator A5088672317 @default.
- W2887480377 date "2018-01-01" @default.
- W2887480377 modified "2023-09-24" @default.
- W2887480377 title "Clustering genomic signatures: A new distance measure for variable length Markov chains" @default.
- W2887480377 hasPublicationYear "2018" @default.
- W2887480377 type Work @default.
- W2887480377 sameAs 2887480377 @default.
- W2887480377 citedByCount "0" @default.
- W2887480377 crossrefType "dissertation" @default.
- W2887480377 hasAuthorship W2887480377A5072871153 @default.
- W2887480377 hasAuthorship W2887480377A5088672317 @default.
- W2887480377 hasConcept C103278499 @default.
- W2887480377 hasConcept C105795698 @default.
- W2887480377 hasConcept C115961682 @default.
- W2887480377 hasConcept C14036430 @default.
- W2887480377 hasConcept C153180895 @default.
- W2887480377 hasConcept C154945302 @default.
- W2887480377 hasConcept C162324750 @default.
- W2887480377 hasConcept C176217482 @default.
- W2887480377 hasConcept C177264268 @default.
- W2887480377 hasConcept C199360897 @default.
- W2887480377 hasConcept C205649164 @default.
- W2887480377 hasConcept C21547014 @default.
- W2887480377 hasConcept C33923547 @default.
- W2887480377 hasConcept C41008148 @default.
- W2887480377 hasConcept C54355233 @default.
- W2887480377 hasConcept C58103923 @default.
- W2887480377 hasConcept C58640448 @default.
- W2887480377 hasConcept C64543145 @default.
- W2887480377 hasConcept C70721500 @default.
- W2887480377 hasConcept C73555534 @default.
- W2887480377 hasConcept C86803240 @default.
- W2887480377 hasConcept C98763669 @default.
- W2887480377 hasConceptScore W2887480377C103278499 @default.
- W2887480377 hasConceptScore W2887480377C105795698 @default.
- W2887480377 hasConceptScore W2887480377C115961682 @default.
- W2887480377 hasConceptScore W2887480377C14036430 @default.
- W2887480377 hasConceptScore W2887480377C153180895 @default.
- W2887480377 hasConceptScore W2887480377C154945302 @default.
- W2887480377 hasConceptScore W2887480377C162324750 @default.
- W2887480377 hasConceptScore W2887480377C176217482 @default.
- W2887480377 hasConceptScore W2887480377C177264268 @default.
- W2887480377 hasConceptScore W2887480377C199360897 @default.
- W2887480377 hasConceptScore W2887480377C205649164 @default.
- W2887480377 hasConceptScore W2887480377C21547014 @default.
- W2887480377 hasConceptScore W2887480377C33923547 @default.
- W2887480377 hasConceptScore W2887480377C41008148 @default.
- W2887480377 hasConceptScore W2887480377C54355233 @default.
- W2887480377 hasConceptScore W2887480377C58103923 @default.
- W2887480377 hasConceptScore W2887480377C58640448 @default.
- W2887480377 hasConceptScore W2887480377C64543145 @default.
- W2887480377 hasConceptScore W2887480377C70721500 @default.
- W2887480377 hasConceptScore W2887480377C73555534 @default.
- W2887480377 hasConceptScore W2887480377C86803240 @default.
- W2887480377 hasConceptScore W2887480377C98763669 @default.
- W2887480377 hasLocation W28874803771 @default.
- W2887480377 hasOpenAccess W2887480377 @default.
- W2887480377 hasPrimaryLocation W28874803771 @default.
- W2887480377 hasRelatedWork W1821736210 @default.
- W2887480377 hasRelatedWork W1885350542 @default.
- W2887480377 hasRelatedWork W1969234006 @default.
- W2887480377 hasRelatedWork W1996640618 @default.
- W2887480377 hasRelatedWork W2018066308 @default.
- W2887480377 hasRelatedWork W2021267107 @default.
- W2887480377 hasRelatedWork W2030609175 @default.
- W2887480377 hasRelatedWork W2059327440 @default.
- W2887480377 hasRelatedWork W2087286364 @default.
- W2887480377 hasRelatedWork W2110734043 @default.
- W2887480377 hasRelatedWork W2152320416 @default.
- W2887480377 hasRelatedWork W2352569578 @default.
- W2887480377 hasRelatedWork W2468175122 @default.
- W2887480377 hasRelatedWork W2616357386 @default.
- W2887480377 hasRelatedWork W2804499622 @default.
- W2887480377 hasRelatedWork W2990840231 @default.
- W2887480377 hasRelatedWork W3097580977 @default.
- W2887480377 hasRelatedWork W3120796308 @default.
- W2887480377 hasRelatedWork W3194953835 @default.
- W2887480377 hasRelatedWork W2115927566 @default.
- W2887480377 isParatext "false" @default.
- W2887480377 isRetracted "false" @default.
- W2887480377 magId "2887480377" @default.
- W2887480377 workType "dissertation" @default.