Matches in SemOpenAlex for { <https://semopenalex.org/work/W196969749> ?p ?o ?g. }
- W196969749 abstract "This thesis deals with the use of Conditional Random Fields (CRFs; Lafferty et al. (2001)) for Natural Language Processing (NLP). CRFs are probabilistic models for sequence labelling which are particularly well suited to NLP. They have many compelling advantages over other popular models such as Hidden Markov Models and Maximum Entropy Markov Models (Rabiner, 1990; McCallum et al., 2001), and have been applied to a number of NLP tasks with considerable success (e.g., Sha and Pereira (2003) and Smith et al. (2005)). Despite their apparent success, CRFs suffer from two main failings. Firstly, they often over-fit the training sample. This is a consequence of their considerable expressive power, and can be limited by a prior over the model parameters (Sha and Pereira, 2003; Peng and McCallum, 2004). Their second failing is that the standard methods for CRF training are often very slow, sometimes requiring weeks of processing time. This efficiency problem is largely ignored in the current literature, although in practice the cost of training prevents the application of CRFs to many new, more complex tasks, and also prevents the use of densely connected graphs, which would allow for much richer feature sets. This thesis addresses the issue of training efficiency. Firstly, we demonstrate that the asymptotic time complexity of standard training for a linear chain CRF is quadratic in the size of the label set, linear in the number of features and almost quadratic in the size of the training sample. The cost of inference in cyclic graphs, such as lattice-structured Dynamic CRFs (Sutton et al., 2004), is even greater. The complexity of training limits the application of CRFs to large and complex tasks. We compare the accuracy of a number of popular approximate training techniques, which can greatly reduce the training cost. However, for most tasks this saving is coupled with a substantial loss in accuracy. For this reason we propose two novel training methods, which both reduce the resource requirements and improve the scalability of training, such that CRFs can be applied to substantially larger tasks. The first method uses error-correcting output coding, a method originally devised for classifier combination (Freund and Schapire, 1996). This method decomposes the task of training a multiclass CRF (modelling a problem with three or more labels) into a series of simpler binary labelling tasks. Each of these sub-tasks is modelled with a CRF and is trained in the standard manner. Overall, these constituent models are considerably faster to train than a full multiclass CRF; critically, this can also reduce the complexity of training. Once trained, these constituent models can be combined to decode unlabelled test instances, and this results in similar accuracy to standard training. We introduce a second alternative training method which uses feature constraints to improve the time cost of inference. These constraints tie groups of features in the model, and these ties are exploited in sum-product and max-product belief propagation (Pearl, 1988). This leads to faster training and decoding. Overall, even simple tying strategies" @default.
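The complexity claim in the abstract (quadratic in the label set, linear in the features, near-quadratic in the sample) comes from the forward-backward recursions at the heart of standard linear-chain CRF training. The sketch below is illustrative only, not code from the thesis; the array name `log_phi` and the fixed start state are assumptions made for the example.

```python
import numpy as np

def forward_log_alpha(log_phi):
    """Forward pass for a linear-chain CRF in log space.

    log_phi: array of shape (T, Y, Y); log_phi[t, i, j] is the log
    potential of label i at position t-1 followed by label j at
    position t (row 0 of log_phi[0] encodes the start transition,
    an assumption for this sketch). The nested loops over label
    pairs cost O(|Y|^2) per position, so one pass is O(T * |Y|^2):
    the quadratic dependence on the label set the abstract refers to.
    """
    T, Y, _ = log_phi.shape
    log_alpha = np.full((T, Y), -np.inf)
    log_alpha[0] = log_phi[0, 0]          # assumed fixed start state 0
    for t in range(1, T):                 # O(T) positions
        for j in range(Y):                # O(|Y|) target labels
            # logsumexp over O(|Y|) source labels
            log_alpha[t, j] = np.logaddexp.reduce(
                log_alpha[t - 1] + log_phi[t, :, j]
            )
    return log_alpha
```

Each optimisation iteration repeats this pass (plus a backward counterpart) over every training sequence, which is where the near-quadratic dependence on training-sample size enters.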
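The error-correcting output coding method the abstract describes can be pictured with a toy code matrix: each column defines one binary relabelling of the data, each binary task is trained as an ordinary CRF, and the constituent predictions are combined by nearest-code-word decoding. This is a minimal sketch with an invented label set and code matrix, not the coding scheme used in the thesis.

```python
import numpy as np

# Hypothetical 3-label task; each column of CODE defines one binary
# sub-task (one constituent binary CRF). Rows are label code words.
LABELS = ["PER", "LOC", "ORG"]
CODE = np.array([
    [0, 0, 1, 1],   # PER
    [0, 1, 0, 1],   # LOC
    [1, 0, 0, 1],   # ORG
])

def relabel(sequence, column):
    """Map a multiclass label sequence to the binary task for one column."""
    return [CODE[LABELS.index(y), column] for y in sequence]

def decode(bit_predictions):
    """Combine the constituent models' predicted bit sequences: at each
    position, choose the label whose code word is nearest in Hamming
    distance to the predicted bits."""
    bits = np.array(bit_predictions).T    # (positions, columns)
    return [LABELS[np.argmin(np.abs(CODE - row).sum(axis=1))]
            for row in bits]
```

Because each binary sub-task has only two labels, the quadratic label-set term in training cost shrinks to a constant per constituent model, which is the source of the speed-up the abstract reports.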
- W196969749 created "2016-06-24" @default.
- W196969749 creator A5078530959 @default.
- W196969749 date "2007-01-01" @default.
- W196969749 modified "2023-09-27" @default.
- W196969749 title "Scaling conditional random fields for natural language processing" @default.
- W196969749 cites W1500698297 @default.
- W196969749 cites W1516193414 @default.
- W196969749 cites W1523415694 @default.
- W196969749 cites W1528056001 @default.
- W196969749 cites W1528941926 @default.
- W196969749 cites W1534730506 @default.
- W196969749 cites W1585221258 @default.
- W196969749 cites W1592796124 @default.
- W196969749 cites W1595781024 @default.
- W196969749 cites W1600295424 @default.
- W196969749 cites W1606480398 @default.
- W196969749 cites W1632114991 @default.
- W196969749 cites W1676820704 @default.
- W196969749 cites W1775188621 @default.
- W196969749 cites W1857926807 @default.
- W196969749 cites W1902387477 @default.
- W196969749 cites W1932968309 @default.
- W196969749 cites W1934019294 @default.
- W196969749 cites W1979711143 @default.
- W196969749 cites W1982972110 @default.
- W196969749 cites W1991133427 @default.
- W196969749 cites W1996430422 @default.
- W196969749 cites W1998235220 @default.
- W196969749 cites W2001792610 @default.
- W196969749 cites W2004384146 @default.
- W196969749 cites W2008652694 @default.
- W196969749 cites W2008830554 @default.
- W196969749 cites W2019599312 @default.
- W196969749 cites W2020999234 @default.
- W196969749 cites W2040870580 @default.
- W196969749 cites W204260652 @default.
- W196969749 cites W2051203581 @default.
- W196969749 cites W2051669046 @default.
- W196969749 cites W2056451646 @default.
- W196969749 cites W2059415415 @default.
- W196969749 cites W2079182758 @default.
- W196969749 cites W2081612620 @default.
- W196969749 cites W2086699924 @default.
- W196969749 cites W2092654472 @default.
- W196969749 cites W2093647425 @default.
- W196969749 cites W2096175520 @default.
- W196969749 cites W2096765155 @default.
- W196969749 cites W2098379588 @default.
- W196969749 cites W2099960657 @default.
- W196969749 cites W2102667697 @default.
- W196969749 cites W2104029044 @default.
- W196969749 cites W2109189215 @default.
- W196969749 cites W2112076978 @default.
- W196969749 cites W2114220616 @default.
- W196969749 cites W2114521167 @default.
- W196969749 cites W2117689109 @default.
- W196969749 cites W2118696796 @default.
- W196969749 cites W2122410182 @default.
- W196969749 cites W2125838338 @default.
- W196969749 cites W2126851059 @default.
- W196969749 cites W2135843243 @default.
- W196969749 cites W2137813581 @default.
- W196969749 cites W2138043057 @default.
- W196969749 cites W2138309709 @default.
- W196969749 cites W2138388410 @default.
- W196969749 cites W2139193890 @default.
- W196969749 cites W2141099517 @default.
- W196969749 cites W2141732516 @default.
- W196969749 cites W2144068644 @default.
- W196969749 cites W2144087279 @default.
- W196969749 cites W2144578941 @default.
- W196969749 cites W2147880316 @default.
- W196969749 cites W2148124601 @default.
- W196969749 cites W2149660837 @default.
- W196969749 cites W2152455533 @default.
- W196969749 cites W2156515921 @default.
- W196969749 cites W2158148237 @default.
- W196969749 cites W2158188757 @default.
- W196969749 cites W2158570381 @default.
- W196969749 cites W2158823144 @default.
- W196969749 cites W2158847908 @default.
- W196969749 cites W2159080219 @default.
- W196969749 cites W2160842254 @default.
- W196969749 cites W2160988325 @default.
- W196969749 cites W2163364417 @default.
- W196969749 cites W2165849002 @default.
- W196969749 cites W2167055186 @default.
- W196969749 cites W2167216307 @default.
- W196969749 cites W2170469979 @default.
- W196969749 cites W2175160295 @default.
- W196969749 cites W2534584408 @default.
- W196969749 cites W2764956387 @default.
- W196969749 cites W2807232057 @default.
- W196969749 cites W28766783 @default.
- W196969749 cites W2962735828 @default.
- W196969749 cites W2994982620 @default.
- W196969749 cites W3029645440 @default.
- W196969749 cites W3140968660 @default.
- W196969749 cites W90568776 @default.