Matches in SemOpenAlex for { <https://semopenalex.org/work/W3207617013> ?p ?o ?g. }
- W3207617013 endingPage "29" @default.
- W3207617013 startingPage "1" @default.
- W3207617013 abstract "Existing topic modeling and text segmentation methodologies generally require large datasets for training, limiting their capabilities when only small collections of text are available. In this work, we reexamine the inter-related problems of “topic identification” and “text segmentation” for sparse document learning, when there is a single new text of interest. In developing a methodology to handle single documents, we face two major challenges. First is sparse information: with access to only one document, we cannot train traditional topic models or deep learning algorithms. Second is significant noise: a considerable portion of words in any single document will produce only noise and not help discern topics or segments. To tackle these issues, we design an unsupervised, computationally efficient methodology called Biclustering Approach to Topic modeling and Segmentation (BATS). BATS leverages three key ideas to simultaneously identify topics and segment text: (i) a new mechanism that uses word order information to reduce sample complexity, (ii) a statistically sound graph-based biclustering technique that identifies latent structures of words and sentences, and (iii) a collection of effective heuristics that remove noise words and award important words to further improve performance. Experiments on six datasets show that our approach outperforms several state-of-the-art baselines when considering topic coherence, topic diversity, segmentation, and runtime comparison metrics." @default.
- W3207617013 created "2021-10-25" @default.
- W3207617013 creator A5019087968 @default.
- W3207617013 creator A5024985714 @default.
- W3207617013 creator A5041033878 @default.
- W3207617013 creator A5067433299 @default.
- W3207617013 creator A5070036007 @default.
- W3207617013 creator A5078544502 @default.
- W3207617013 creator A5078569264 @default.
- W3207617013 date "2021-10-15" @default.
- W3207617013 modified "2023-09-24" @default.
- W3207617013 title "BATS: A Spectral Biclustering Approach to Single Document Topic Modeling and Segmentation" @default.
- W3207617013 cites W1557074680 @default.
- W3207617013 cites W1714665356 @default.
- W3207617013 cites W1880262756 @default.
- W3207617013 cites W1979469248 @default.
- W3207617013 cites W1994448786 @default.
- W3207617013 cites W2024364296 @default.
- W3207617013 cites W2038043464 @default.
- W3207617013 cites W2061922307 @default.
- W3207617013 cites W2063904635 @default.
- W3207617013 cites W2066873261 @default.
- W3207617013 cites W2080179128 @default.
- W3207617013 cites W2089923519 @default.
- W3207617013 cites W2107743791 @default.
- W3207617013 cites W2110485445 @default.
- W3207617013 cites W2119998616 @default.
- W3207617013 cites W2130158951 @default.
- W3207617013 cites W2132914434 @default.
- W3207617013 cites W2133576408 @default.
- W3207617013 cites W2147152072 @default.
- W3207617013 cites W2150593711 @default.
- W3207617013 cites W2150731624 @default.
- W3207617013 cites W2159083595 @default.
- W3207617013 cites W2166753626 @default.
- W3207617013 cites W2171313960 @default.
- W3207617013 cites W2178725228 @default.
- W3207617013 cites W2250539671 @default.
- W3207617013 cites W2416799949 @default.
- W3207617013 cites W2434205482 @default.
- W3207617013 cites W2461271816 @default.
- W3207617013 cites W2472333518 @default.
- W3207617013 cites W2561715741 @default.
- W3207617013 cites W2744808284 @default.
- W3207617013 cites W2749627623 @default.
- W3207617013 cites W2750981579 @default.
- W3207617013 cites W2788615138 @default.
- W3207617013 cites W2896763200 @default.
- W3207617013 cites W2926555354 @default.
- W3207617013 cites W2946516957 @default.
- W3207617013 cites W2962716111 @default.
- W3207617013 cites W2962878234 @default.
- W3207617013 cites W2962943175 @default.
- W3207617013 cites W2963563735 @default.
- W3207617013 cites W2963726741 @default.
- W3207617013 cites W2970507669 @default.
- W3207617013 cites W3020633614 @default.
- W3207617013 cites W3101767658 @default.
- W3207617013 cites W3101919829 @default.
- W3207617013 cites W3200421407 @default.
- W3207617013 doi "https://doi.org/10.1145/3468268" @default.
- W3207617013 hasPublicationYear "2021" @default.
- W3207617013 type Work @default.
- W3207617013 sameAs 3207617013 @default.
- W3207617013 citedByCount "5" @default.
- W3207617013 countsByYear W32076170132021 @default.
- W3207617013 countsByYear W32076170132022 @default.
- W3207617013 crossrefType "journal-article" @default.
- W3207617013 hasAuthorship W3207617013A5019087968 @default.
- W3207617013 hasAuthorship W3207617013A5024985714 @default.
- W3207617013 hasAuthorship W3207617013A5041033878 @default.
- W3207617013 hasAuthorship W3207617013A5067433299 @default.
- W3207617013 hasAuthorship W3207617013A5070036007 @default.
- W3207617013 hasAuthorship W3207617013A5078544502 @default.
- W3207617013 hasAuthorship W3207617013A5078569264 @default.
- W3207617013 hasBestOaLocation W32076170132 @default.
- W3207617013 hasConcept C111919701 @default.
- W3207617013 hasConcept C119857082 @default.
- W3207617013 hasConcept C124101348 @default.
- W3207617013 hasConcept C127705205 @default.
- W3207617013 hasConcept C144817290 @default.
- W3207617013 hasConcept C153180895 @default.
- W3207617013 hasConcept C154945302 @default.
- W3207617013 hasConcept C171686336 @default.
- W3207617013 hasConcept C17212007 @default.
- W3207617013 hasConcept C204321447 @default.
- W3207617013 hasConcept C23123220 @default.
- W3207617013 hasConcept C33704608 @default.
- W3207617013 hasConcept C41008148 @default.
- W3207617013 hasConcept C73555534 @default.
- W3207617013 hasConcept C89600930 @default.
- W3207617013 hasConceptScore W3207617013C111919701 @default.
- W3207617013 hasConceptScore W3207617013C119857082 @default.
- W3207617013 hasConceptScore W3207617013C124101348 @default.
- W3207617013 hasConceptScore W3207617013C127705205 @default.
- W3207617013 hasConceptScore W3207617013C144817290 @default.
- W3207617013 hasConceptScore W3207617013C153180895 @default.
- W3207617013 hasConceptScore W3207617013C154945302 @default.