Matches in SemOpenAlex for { <https://semopenalex.org/work/W82218792> ?p ?o ?g. }
- W82218792 abstract "In this thesis the development and application of probabilistic models of documents are considered. The initial focus is on language models, which provide a way of modelling plain text documents. In particular, the hierarchical Dirichlet language model, which is derived from simple Bayesian theory, is investigated and is shown to be well approximated by an existing method known as generalised PPM-A. Using this equivalence, generalised PPM-A is extended to produce a language model which, while working at the level of individual letter-like symbols, is able to make use of the division of the text stream into words. It is shown that the new model can be used in conjunction with a word list to improve performance when very little information is available from which to learn the statistics of the language. The hierarchical Dirichlet model is then applied to the task of information retrieval, producing a new retrieval method which naturally includes document frequency information. This information has traditionally been used in retrieval systems, but had previously either been missing or been introduced heuristically in language-model-based approaches to the problem. The hierarchical approach is also extended to the task of retrieval at the passage level, where it is shown to give promising results. Finally, the scope of the investigation is broadened to include documents which contain diagrams as well as plain text. A method is developed to group fragments of digitised ink strokes into perceptually relevant components of a diagram, while at the same time labelling the components with an object class. The approach, which is based on the conditional random field, is shown to work well both in terms of grouping and improving labelling performance when compared to other methods." @default.
- W82218792 created "2016-06-24" @default.
- W82218792 creator A5073624135 @default.
- W82218792 date "2006-01-01" @default.
- W82218792 modified "2023-09-27" @default.
- W82218792 title "Probabilistic Document Modelling" @default.
- W82218792 cites W10097614 @default.
- W82218792 cites W1482214997 @default.
- W82218792 cites W1494864219 @default.
- W82218792 cites W1508165687 @default.
- W82218792 cites W1509562192 @default.
- W82218792 cites W1512277306 @default.
- W82218792 cites W1540124269 @default.
- W82218792 cites W1558322748 @default.
- W82218792 cites W1560013842 @default.
- W82218792 cites W1563141555 @default.
- W82218792 cites W1574901103 @default.
- W82218792 cites W1592871157 @default.
- W82218792 cites W1597533204 @default.
- W82218792 cites W1603920809 @default.
- W82218792 cites W1648885110 @default.
- W82218792 cites W1755360231 @default.
- W82218792 cites W1833224072 @default.
- W82218792 cites W1880262756 @default.
- W82218792 cites W189514790 @default.
- W82218792 cites W1956559956 @default.
- W82218792 cites W1965061793 @default.
- W82218792 cites W1965555277 @default.
- W82218792 cites W1966812932 @default.
- W82218792 cites W1972099155 @default.
- W82218792 cites W1975965284 @default.
- W82218792 cites W1996228297 @default.
- W82218792 cites W1996903695 @default.
- W82218792 cites W2001082470 @default.
- W82218792 cites W2004545875 @default.
- W82218792 cites W2010291682 @default.
- W82218792 cites W2012603689 @default.
- W82218792 cites W2019509999 @default.
- W82218792 cites W2024932032 @default.
- W82218792 cites W2036407715 @default.
- W82218792 cites W2037139490 @default.
- W82218792 cites W2043909051 @default.
- W82218792 cites W2048045485 @default.
- W82218792 cites W2050037476 @default.
- W82218792 cites W2059800182 @default.
- W82218792 cites W2062270497 @default.
- W82218792 cites W2063266501 @default.
- W82218792 cites W2066636486 @default.
- W82218792 cites W2068502090 @default.
- W82218792 cites W2068905009 @default.
- W82218792 cites W2069429561 @default.
- W82218792 cites W2075201173 @default.
- W82218792 cites W2079145130 @default.
- W82218792 cites W2080676333 @default.
- W82218792 cites W2087309226 @default.
- W82218792 cites W2089319476 @default.
- W82218792 cites W2089330921 @default.
- W82218792 cites W2089484716 @default.
- W82218792 cites W2093390569 @default.
- W82218792 cites W2097333193 @default.
- W82218792 cites W2098162425 @default.
- W82218792 cites W2101913237 @default.
- W82218792 cites W2107743791 @default.
- W82218792 cites W2113641473 @default.
- W82218792 cites W2119878143 @default.
- W82218792 cites W2121927366 @default.
- W82218792 cites W2126163471 @default.
- W82218792 cites W2129031807 @default.
- W82218792 cites W2129652681 @default.
- W82218792 cites W2130952710 @default.
- W82218792 cites W2131054109 @default.
- W82218792 cites W2132339004 @default.
- W82218792 cites W2132957691 @default.
- W82218792 cites W2134237567 @default.
- W82218792 cites W2137813581 @default.
- W82218792 cites W2138523425 @default.
- W82218792 cites W2138621811 @default.
- W82218792 cites W2140679639 @default.
- W82218792 cites W2143877328 @default.
- W82218792 cites W2147152072 @default.
- W82218792 cites W2147565737 @default.
- W82218792 cites W2147880316 @default.
- W82218792 cites W2158190429 @default.
- W82218792 cites W2158266063 @default.
- W82218792 cites W2158823144 @default.
- W82218792 cites W2159399018 @default.
- W82218792 cites W2168938909 @default.
- W82218792 cites W2171763016 @default.
- W82218792 cites W2182025540 @default.
- W82218792 cites W2183238488 @default.
- W82218792 cites W2326912991 @default.
- W82218792 cites W2611071497 @default.
- W82218792 cites W3140968660 @default.
- W82218792 cites W3145738572 @default.
- W82218792 cites W80430187 @default.
- W82218792 hasPublicationYear "2006" @default.
- W82218792 type Work @default.
- W82218792 sameAs 82218792 @default.
- W82218792 citedByCount "10" @default.
- W82218792 countsByYear W822187922013 @default.