Matches in SemOpenAlex for { <https://semopenalex.org/work/W4311013516> ?p ?o ?g. }
Showing items 1 to 75 of 75 with 100 items per page.
- W4311013516 abstract "Abstract Clustering is widely used in bioinformatics and many other fields, with applications from exploratory analysis to prediction. Many types of data have associated uncertainty or measurement error, but this is rarely used to inform the clustering. We present Dirichlet Process Mixtures with Uncertainty (DPMUnc), an extension of a Bayesian nonparametric clustering algorithm which makes use of the uncertainty associated with data points. We show that DPMUnc out-performs existing methods on simulated data. We cluster immune-mediated diseases (IMD) using GWAS summary statistics, which have uncertainty linked with the sample size of the study. DPMUnc separates autoimmune from autoinflammatory diseases and isolates other subgroups such as adult-onset arthritis. We additionally consider how DPMUnc can be used to cluster gene expression datasets that have been summarised using gene signatures. We first introduce a novel procedure for generating a summary of a gene signature on a dataset different to the one where it was discovered. Since the genes in the gene signature are unlikely to be as strongly correlated as in the original dataset, it is important to quantify the variance of the gene signature for each individual. We summarise three public gene expression datasets containing patients with a range of IMD, using three relevant gene signatures. We find association between disease and the clusters returned by DPMUnc, with clustering structure replicated across the datasets. The significance of this work is two-fold. Firstly, we demonstrate that when data has associated uncertainty, this uncertainty should be used to inform clustering and we present a method which does this, DPMUnc. Secondly, we present a procedure for using gene signatures in datasets other than where they were originally defined. We show the value of this procedure by summarising gene expression data from patients with immune-mediated diseases using relevant gene signatures, and clustering these patients using DPMUnc. Author Summary Identifying groups of items that are similar to each other, a process called clustering, has a range of applications. For example, if patients split into two distinct groups this suggests that a disease may have subtypes which should be treated differently. Real data often has measurement error associated with it, but this error is frequently discarded by clustering methods. We propose a clustering method which makes use of the measurement error and use it to cluster diseases linked to the immune system. Gene expression datasets measure the activity level of all ~20,000 genes in the human genome. We propose a procedure for summarising gene expression data using gene signatures, lists of genes produced by highly focused studies. For example, a study might list the genes which increase activity after exposure to a particular virus. The genes in the gene signature may not be as tightly correlated in a new dataset, and so our procedure measures the strength of the gene signature in the new dataset, effectively defining measurement error for the summary. We summarise gene expression datasets related to the immune system using relevant gene signatures and find that our method groups patients with the same disease." @default.
- W4311013516 created "2022-12-22" @default.
- W4311013516 creator A5006577577 @default.
- W4311013516 creator A5058209508 @default.
- W4311013516 creator A5088262269 @default.
- W4311013516 date "2022-12-10" @default.
- W4311013516 modified "2023-10-15" @default.
- W4311013516 title "Bayesian clustering with uncertain data" @default.
- W4311013516 cites W2045949302 @default.
- W4311013516 cites W2071949631 @default.
- W4311013516 cites W2103056503 @default.
- W4311013516 cites W2109363337 @default.
- W4311013516 cites W2122598723 @default.
- W4311013516 cites W2126510876 @default.
- W4311013516 cites W2127371447 @default.
- W4311013516 cites W2141012957 @default.
- W4311013516 cites W2145825942 @default.
- W4311013516 cites W2150593711 @default.
- W4311013516 cites W2150932450 @default.
- W4311013516 cites W2168197335 @default.
- W4311013516 cites W2519132385 @default.
- W4311013516 cites W2592935721 @default.
- W4311013516 cites W2952892633 @default.
- W4311013516 cites W3109129987 @default.
- W4311013516 cites W3139716430 @default.
- W4311013516 cites W4230306435 @default.
- W4311013516 cites W4235169531 @default.
- W4311013516 doi "https://doi.org/10.1101/2022.12.07.519476" @default.
- W4311013516 hasPublicationYear "2022" @default.
- W4311013516 type Work @default.
- W4311013516 citedByCount "0" @default.
- W4311013516 crossrefType "posted-content" @default.
- W4311013516 hasAuthorship W4311013516A5006577577 @default.
- W4311013516 hasAuthorship W4311013516A5058209508 @default.
- W4311013516 hasAuthorship W4311013516A5088262269 @default.
- W4311013516 hasBestOaLocation W43110135161 @default.
- W4311013516 hasConcept C107673813 @default.
- W4311013516 hasConcept C124101348 @default.
- W4311013516 hasConcept C154945302 @default.
- W4311013516 hasConcept C186767784 @default.
- W4311013516 hasConcept C2781280628 @default.
- W4311013516 hasConcept C33704608 @default.
- W4311013516 hasConcept C41008148 @default.
- W4311013516 hasConcept C70721500 @default.
- W4311013516 hasConcept C73555534 @default.
- W4311013516 hasConcept C86803240 @default.
- W4311013516 hasConcept C94641424 @default.
- W4311013516 hasConceptScore W4311013516C107673813 @default.
- W4311013516 hasConceptScore W4311013516C124101348 @default.
- W4311013516 hasConceptScore W4311013516C154945302 @default.
- W4311013516 hasConceptScore W4311013516C186767784 @default.
- W4311013516 hasConceptScore W4311013516C2781280628 @default.
- W4311013516 hasConceptScore W4311013516C33704608 @default.
- W4311013516 hasConceptScore W4311013516C41008148 @default.
- W4311013516 hasConceptScore W4311013516C70721500 @default.
- W4311013516 hasConceptScore W4311013516C73555534 @default.
- W4311013516 hasConceptScore W4311013516C86803240 @default.
- W4311013516 hasConceptScore W4311013516C94641424 @default.
- W4311013516 hasLocation W43110135161 @default.
- W4311013516 hasLocation W43110135162 @default.
- W4311013516 hasOpenAccess W4311013516 @default.
- W4311013516 hasPrimaryLocation W43110135161 @default.
- W4311013516 hasRelatedWork W1990063425 @default.
- W4311013516 hasRelatedWork W2094360749 @default.
- W4311013516 hasRelatedWork W2106304879 @default.
- W4311013516 hasRelatedWork W2170098929 @default.
- W4311013516 hasRelatedWork W2409272344 @default.
- W4311013516 hasRelatedWork W2588165676 @default.
- W4311013516 hasRelatedWork W2588528840 @default.
- W4311013516 hasRelatedWork W3042958706 @default.
- W4311013516 hasRelatedWork W3197105638 @default.
- W4311013516 hasRelatedWork W4297904238 @default.
- W4311013516 isParatext "false" @default.
- W4311013516 isRetracted "false" @default.
- W4311013516 workType "article" @default.