Matches in SemOpenAlex for { <https://semopenalex.org/work/W2100456378> ?p ?o ?g. }
- W2100456378 endingPage "147" @default.
- W2100456378 startingPage "111" @default.
- W2100456378 abstract "Probabilistic topic models are unsupervised generative models which model document content as a two-step generation process, that is, documents are observed as mixtures of latent concepts or topics, while topics are probability distributions over vocabulary words. Recently, a significant research effort has been invested into transferring the probabilistic topic modeling concept from monolingual to multilingual settings. Novel topic models have been designed to work with parallel and comparable texts. We define multilingual probabilistic topic modeling (MuPTM) and present the first full overview of the current research, methodology, advantages and limitations in MuPTM. As a representative example, we choose a natural extension of the omnipresent LDA model to multilingual settings called bilingual LDA (BiLDA). We provide a thorough overview of this representative multilingual model from its high-level modeling assumptions down to its mathematical foundations. We demonstrate how to use the data representation by means of output sets of (i) per-topic word distributions and (ii) per-document topic distributions coming from a multilingual probabilistic topic model in various real-life cross-lingual tasks involving different languages, without any external language pair dependent translation resource: (1) cross-lingual event-centered news clustering, (2) cross-lingual document classification, (3) cross-lingual semantic similarity, and (4) cross-lingual information retrieval. We also briefly review several other applications present in the relevant literature, and introduce and illustrate two related modeling concepts: topic smoothing and topic pruning. In summary, this article encompasses the current research in multilingual probabilistic topic modeling. By presenting a series of potential applications, we reveal the importance of the language-independent and language pair independent data representations by means of MuPTM. We provide clear directions for future research in the field by providing a systematic overview of how to link and transfer aspect knowledge across corpora written in different languages via the shared space of latent cross-lingual topics, that is, how to effectively employ learned per-topic word distributions and per-document topic distributions of any multilingual probabilistic topic model in various cross-lingual applications." @default.
- W2100456378 created "2016-06-24" @default.
- W2100456378 creator A5014866912 @default.
- W2100456378 creator A5049102441 @default.
- W2100456378 creator A5058894102 @default.
- W2100456378 creator A5075796989 @default.
- W2100456378 date "2015-01-01" @default.
- W2100456378 modified "2023-10-07" @default.
- W2100456378 title "Probabilistic topic modeling in multilingual settings: An overview of its methodology and applications" @default.
- W2100456378 cites W119481552 @default.
- W2100456378 cites W142374489 @default.
- W2100456378 cites W1505113900 @default.
- W2100456378 cites W1556255569 @default.
- W2100456378 cites W1608685024 @default.
- W2100456378 cites W1662133657 @default.
- W2100456378 cites W1963753835 @default.
- W2100456378 cites W1966820603 @default.
- W2100456378 cites W1972594981 @default.
- W2100456378 cites W1975570633 @default.
- W2100456378 cites W1978400666 @default.
- W2100456378 cites W1979606348 @default.
- W2100456378 cites W1979663447 @default.
- W2100456378 cites W1983623418 @default.
- W2100456378 cites W1984251878 @default.
- W2100456378 cites W2000748429 @default.
- W2100456378 cites W2001082470 @default.
- W2100456378 cites W2016181719 @default.
- W2100456378 cites W2020842694 @default.
- W2100456378 cites W2020999234 @default.
- W2100456378 cites W2021232959 @default.
- W2100456378 cites W2028080169 @default.
- W2100456378 cites W2033593667 @default.
- W2100456378 cites W2035121712 @default.
- W2100456378 cites W2041232209 @default.
- W2100456378 cites W2042980227 @default.
- W2100456378 cites W2046878543 @default.
- W2100456378 cites W2047295649 @default.
- W2100456378 cites W2049870103 @default.
- W2100456378 cites W2062711980 @default.
- W2100456378 cites W2063397738 @default.
- W2100456378 cites W2066576933 @default.
- W2100456378 cites W2068463433 @default.
- W2100456378 cites W2071399839 @default.
- W2100456378 cites W2087743880 @default.
- W2100456378 cites W2092094655 @default.
- W2100456378 cites W2095958485 @default.
- W2100456378 cites W2102070017 @default.
- W2100456378 cites W2102997946 @default.
- W2100456378 cites W2104210067 @default.
- W2100456378 cites W2105577415 @default.
- W2100456378 cites W2107676735 @default.
- W2100456378 cites W2107695330 @default.
- W2100456378 cites W2112301665 @default.
- W2100456378 cites W2121415745 @default.
- W2100456378 cites W2122678284 @default.
- W2100456378 cites W2124585778 @default.
- W2100456378 cites W2127365348 @default.
- W2100456378 cites W2129294185 @default.
- W2100456378 cites W2131689821 @default.
- W2100456378 cites W2133837072 @default.
- W2100456378 cites W2133852251 @default.
- W2100456378 cites W2135399361 @default.
- W2100456378 cites W2140903445 @default.
- W2100456378 cites W2144100511 @default.
- W2100456378 cites W2144750001 @default.
- W2100456378 cites W2146536297 @default.
- W2100456378 cites W2146610201 @default.
- W2100456378 cites W2146950091 @default.
- W2100456378 cites W2149021215 @default.
- W2100456378 cites W2150461699 @default.
- W2100456378 cites W2151521349 @default.
- W2100456378 cites W2155188720 @default.
- W2100456378 cites W2163679724 @default.
- W2100456378 cites W2165612380 @default.
- W2100456378 cites W2166098990 @default.
- W2100456378 cites W2166354010 @default.
- W2100456378 cites W2171343266 @default.
- W2100456378 cites W2882319491 @default.
- W2100456378 cites W4206765718 @default.
- W2100456378 cites W4233135949 @default.
- W2100456378 cites W4245107743 @default.
- W2100456378 cites W4246858749 @default.
- W2100456378 cites W74584387 @default.
- W2100456378 doi "https://doi.org/10.1016/j.ipm.2014.08.003" @default.
- W2100456378 hasPublicationYear "2015" @default.
- W2100456378 type Work @default.
- W2100456378 sameAs 2100456378 @default.
- W2100456378 citedByCount "82" @default.
- W2100456378 countsByYear W21004563782015 @default.
- W2100456378 countsByYear W21004563782016 @default.
- W2100456378 countsByYear W21004563782017 @default.
- W2100456378 countsByYear W21004563782018 @default.
- W2100456378 countsByYear W21004563782019 @default.
- W2100456378 countsByYear W21004563782020 @default.
- W2100456378 countsByYear W21004563782021 @default.
- W2100456378 countsByYear W21004563782022 @default.
- W2100456378 countsByYear W21004563782023 @default.
- W2100456378 crossrefType "journal-article" @default.