Matches in SemOpenAlex for { <https://semopenalex.org/work/W4288574691> ?p ?o ?g. }
Showing items 1 to 55 of 55 with 100 items per page.
- W4288574691 abstract "Mixtures of Unigrams are one of the simplest and most efficient tools for clustering textual data, as they assume that documents related to the same topic have similar distributions of terms, naturally described by Multinomials. When the classification task is particularly challenging, such as when the document-term matrix is high-dimensional and extremely sparse, a more composite representation can provide better insight on the grouping structure. In this work, we developed a deep version of mixtures of Unigrams for the unsupervised classification of very short documents with a large number of terms, by allowing for models with further deeper latent layers; the proposal is derived in a Bayesian framework. The behaviour of the Deep Mixtures of Unigrams is empirically compared with that of other traditional and state-of-the-art methods, namely $k$-means with cosine distance, $k$-means with Euclidean distance on data transformed according to Semantic Analysis, Partition Around Medoids, Mixture of Gaussians on semantic-based transformed data, hierarchical clustering according to Ward's method with cosine dissimilarity, Latent Dirichlet Allocation, Mixtures of Unigrams estimated via the EM algorithm, Spectral Clustering and Affinity Propagation clustering. The performance is evaluated in terms of both correct classification rate and Adjusted Rand Index. Simulation studies and real data analysis prove that going deep in clustering such data highly improves the classification accuracy." @default.
- W4288574691 created "2022-07-29" @default.
- W4288574691 creator A5051054730 @default.
- W4288574691 creator A5060308577 @default.
- W4288574691 date "2019-02-18" @default.
- W4288574691 modified "2023-09-23" @default.
- W4288574691 title "Deep Mixtures of Unigrams for uncovering Topics in Textual Data" @default.
- W4288574691 doi "https://doi.org/10.48550/arxiv.1902.06615" @default.
- W4288574691 hasPublicationYear "2019" @default.
- W4288574691 type Work @default.
- W4288574691 citedByCount "0" @default.
- W4288574691 crossrefType "posted-content" @default.
- W4288574691 hasAuthorship W4288574691A5051054730 @default.
- W4288574691 hasAuthorship W4288574691A5060308577 @default.
- W4288574691 hasBestOaLocation W42885746911 @default.
- W4288574691 hasConcept C105611402 @default.
- W4288574691 hasConcept C111442797 @default.
- W4288574691 hasConcept C120174047 @default.
- W4288574691 hasConcept C124101348 @default.
- W4288574691 hasConcept C153180895 @default.
- W4288574691 hasConcept C154945302 @default.
- W4288574691 hasConcept C171686336 @default.
- W4288574691 hasConcept C41008148 @default.
- W4288574691 hasConcept C500882744 @default.
- W4288574691 hasConcept C61224824 @default.
- W4288574691 hasConcept C63085389 @default.
- W4288574691 hasConcept C73555534 @default.
- W4288574691 hasConceptScore W4288574691C105611402 @default.
- W4288574691 hasConceptScore W4288574691C111442797 @default.
- W4288574691 hasConceptScore W4288574691C120174047 @default.
- W4288574691 hasConceptScore W4288574691C124101348 @default.
- W4288574691 hasConceptScore W4288574691C153180895 @default.
- W4288574691 hasConceptScore W4288574691C154945302 @default.
- W4288574691 hasConceptScore W4288574691C171686336 @default.
- W4288574691 hasConceptScore W4288574691C41008148 @default.
- W4288574691 hasConceptScore W4288574691C500882744 @default.
- W4288574691 hasConceptScore W4288574691C61224824 @default.
- W4288574691 hasConceptScore W4288574691C63085389 @default.
- W4288574691 hasConceptScore W4288574691C73555534 @default.
- W4288574691 hasLocation W42885746911 @default.
- W4288574691 hasOpenAccess W4288574691 @default.
- W4288574691 hasPrimaryLocation W42885746911 @default.
- W4288574691 hasRelatedWork W1783400983 @default.
- W4288574691 hasRelatedWork W1970698127 @default.
- W4288574691 hasRelatedWork W2076529521 @default.
- W4288574691 hasRelatedWork W2151933617 @default.
- W4288574691 hasRelatedWork W2742735109 @default.
- W4288574691 hasRelatedWork W2972489755 @default.
- W4288574691 hasRelatedWork W3087019468 @default.
- W4288574691 hasRelatedWork W3111251828 @default.
- W4288574691 hasRelatedWork W3135844644 @default.
- W4288574691 hasRelatedWork W3152859144 @default.
- W4288574691 isParatext "false" @default.
- W4288574691 isRetracted "false" @default.
- W4288574691 workType "article" @default.