Matches in SemOpenAlex for { <https://semopenalex.org/work/W2018592249> ?p ?o ?g. }
- W2018592249 abstract "Vector Space Model (VSM) is widely used to represent documents and web pages. It is simple and easy to deal with computationally, but it also oversimplifies a document into a vector, is susceptible to noise, and cannot explicitly represent underlying topics of a document. A matrix representation of documents is proposed in this paper: rows represent distinct terms and columns represent cohesive segments. The matrix model views a document as a set of segments, and each segment is a probability distribution over a limited number of latent topics which can be mapped to clustering structures. The latent topic extraction based on the matrix representation of documents is formulated as a constrained optimization problem in which each matrix (i.e., a document) A_i is factorized into a common base determined by non-negative matrices L and R^T, and a non-negative weight matrix M_i such that the sum of reconstruction error on all documents is minimized. Empirical evaluation demonstrates that it is feasible to use the matrix model for document clustering: (1) compared with vector representation, using matrix representation improves clustering quality consistently, and the proposed approach achieves a relative accuracy improvement of up to 66% on the studied datasets, and (2) the proposed method outperforms baseline methods such as k-means and NMF, and complements the state-of-the-art methods like LDA and PLSI. Furthermore, the proposed matrix model allows more refined information retrieval at a segment level instead of at a document level, which enables the return of more relevant documents in information retrieval tasks." @default.
- W2018592249 created "2016-06-24" @default.
- W2018592249 creator A5013881064 @default.
- W2018592249 creator A5016813964 @default.
- W2018592249 creator A5040639891 @default.
- W2018592249 date "2011-12-01" @default.
- W2018592249 modified "2023-09-27" @default.
- W2018592249 title "Document Clustering via Matrix Representation" @default.
- W2018592249 cites W1626945812 @default.
- W2018592249 cites W1880262756 @default.
- W2018592249 cites W1914599625 @default.
- W2018592249 cites W1976391658 @default.
- W2018592249 cites W2013029404 @default.
- W2018592249 cites W2043545458 @default.
- W2018592249 cites W2059745395 @default.
- W2018592249 cites W2065476260 @default.
- W2018592249 cites W2108207327 @default.
- W2018592249 cites W2110096996 @default.
- W2018592249 cites W2113359929 @default.
- W2018592249 cites W2114461971 @default.
- W2018592249 cites W2118718620 @default.
- W2018592249 cites W2124890708 @default.
- W2018592249 cites W2132914434 @default.
- W2018592249 cites W2133576408 @default.
- W2018592249 cites W2134731454 @default.
- W2018592249 cites W2140156730 @default.
- W2018592249 cites W2142522063 @default.
- W2018592249 cites W2147057843 @default.
- W2018592249 cites W2169153112 @default.
- W2018592249 cites W2169279737 @default.
- W2018592249 cites W95524657 @default.
- W2018592249 doi "https://doi.org/10.1109/icdm.2011.59" @default.
- W2018592249 hasPublicationYear "2011" @default.
- W2018592249 type Work @default.
- W2018592249 sameAs 2018592249 @default.
- W2018592249 citedByCount "14" @default.
- W2018592249 countsByYear W20185922492012 @default.
- W2018592249 countsByYear W20185922492013 @default.
- W2018592249 countsByYear W20185922492014 @default.
- W2018592249 countsByYear W20185922492015 @default.
- W2018592249 countsByYear W20185922492017 @default.
- W2018592249 countsByYear W20185922492018 @default.
- W2018592249 crossrefType "proceedings-article" @default.
- W2018592249 hasAuthorship W2018592249A5013881064 @default.
- W2018592249 hasAuthorship W2018592249A5016813964 @default.
- W2018592249 hasAuthorship W2018592249A5040639891 @default.
- W2018592249 hasBestOaLocation W20185922492 @default.
- W2018592249 hasConcept C154945302 @default.
- W2018592249 hasConcept C17744445 @default.
- W2018592249 hasConcept C199539241 @default.
- W2018592249 hasConcept C2776359362 @default.
- W2018592249 hasConcept C41008148 @default.
- W2018592249 hasConcept C73555534 @default.
- W2018592249 hasConcept C94625758 @default.
- W2018592249 hasConceptScore W2018592249C154945302 @default.
- W2018592249 hasConceptScore W2018592249C17744445 @default.
- W2018592249 hasConceptScore W2018592249C199539241 @default.
- W2018592249 hasConceptScore W2018592249C2776359362 @default.
- W2018592249 hasConceptScore W2018592249C41008148 @default.
- W2018592249 hasConceptScore W2018592249C73555534 @default.
- W2018592249 hasConceptScore W2018592249C94625758 @default.
- W2018592249 hasLocation W20185922491 @default.
- W2018592249 hasLocation W20185922492 @default.
- W2018592249 hasOpenAccess W2018592249 @default.
- W2018592249 hasPrimaryLocation W20185922491 @default.
- W2018592249 hasRelatedWork W1549289070 @default.
- W2018592249 hasRelatedWork W1849651648 @default.
- W2018592249 hasRelatedWork W1889934247 @default.
- W2018592249 hasRelatedWork W1999627569 @default.
- W2018592249 hasRelatedWork W2088025572 @default.
- W2018592249 hasRelatedWork W2104995483 @default.
- W2018592249 hasRelatedWork W2388668815 @default.
- W2018592249 hasRelatedWork W2912933387 @default.
- W2018592249 hasRelatedWork W3107474891 @default.
- W2018592249 hasRelatedWork W763609066 @default.
- W2018592249 isParatext "false" @default.
- W2018592249 isRetracted "false" @default.
- W2018592249 magId "2018592249" @default.
- W2018592249 workType "article" @default.
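
The factorization described in the abstract can be written out as a single objective. The following is only a sketch under stated assumptions, not the paper's exact formulation: the abstract says each document matrix A_i is factorized using shared non-negative matrices L and R (transposed) and a per-document non-negative weight matrix M_i, minimizing the summed reconstruction error. The squared Frobenius norm as the error measure, the document count n, and the product order L M_i R^T are assumptions introduced here for illustration.

```latex
% Assumed form of the objective; the norm, n, and the product order are illustrative.
\min_{L \ge 0,\; R \ge 0,\; M_1, \dots, M_n \ge 0}
  \sum_{i=1}^{n} \left\lVert A_i - L \, M_i \, R^{\top} \right\rVert_F^{2}
```

Under this reading, L and R are shared by every document and play the role of the "common base", while M_i carries the document-specific weights that would be used for clustering.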