Matches in SemOpenAlex for { <https://semopenalex.org/work/W1831001034> ?p ?o ?g. }
- W1831001034 abstract "Information Extraction (IE) can be defined as the task of automatically extracting prespecified kinds of information from a text document. The extracted information is encoded in the required format and can then be used, for example, for text summarization or as an accurate index to retrieve new documents.<br/><br/>The main issue when building IE systems is how to obtain the knowledge needed to identify relevant information in a document. Today, IE systems are commonly based on extraction rules or IE patterns to represent the kind of information to be extracted. Most approaches to IE pattern acquisition require expert human intervention in many steps of the acquisition process. This dissertation presents a novel method for acquiring IE patterns, Essence, that significantly reduces the need for human intervention. The method is based on ELA, a specifically designed learning algorithm for acquiring IE patterns from unannotated corpora.<br/><br/>The distinctive features of Essence and ELA are that 1) they permit the automatic acquisition of IE patterns from unrestricted and untagged text representative of the domain, due to 2) their ability to identify regularities around semantically relevant concept-words for the IE task by 3) using non-domain-specific lexical knowledge tools such as WordNet and 4) restricting human intervention to defining the task, and validating and typifying the set of IE patterns obtained.<br/><br/>Since Essence does not require a corpus annotated with the type of information to be extracted and it makes use of a general-purpose ontology and widely applied syntactic tools, it reduces the expert effort required to build an IE system and therefore also reduces the effort of porting the method to any domain.<br/><br/>To validate Essence, we conducted a set of experiments to test the performance of the method. We used Essence to generate IE patterns for a MUC-like task. 
Nevertheless, the evaluation procedure for MUC competitions does not provide a sound evaluation of IE systems, especially of learning systems. For this reason, we conducted an exhaustive set of experiments to further test the abilities of Essence.<br/>The results of these experiments indicate that the proposed method is able to learn effective IE patterns." @default.
- W1831001034 created "2016-06-24" @default.
- W1831001034 creator A5023133214 @default.
- W1831001034 date "2023-10-07" @default.
- W1831001034 modified "2023-10-18" @default.
- W1831001034 title "Acquiring information extraction patterns from unannotated corpora" @default.
- W1831001034 cites W109237585 @default.
- W1831001034 cites W138033052 @default.
- W1831001034 cites W139994555 @default.
- W1831001034 cites W1489949474 @default.
- W1831001034 cites W1497891933 @default.
- W1831001034 cites W1502749598 @default.
- W1831001034 cites W1507666093 @default.
- W1831001034 cites W1512181064 @default.
- W1831001034 cites W1513874326 @default.
- W1831001034 cites W1520232900 @default.
- W1831001034 cites W1540589956 @default.
- W1831001034 cites W1543686276 @default.
- W1831001034 cites W1543771749 @default.
- W1831001034 cites W1545224087 @default.
- W1831001034 cites W1552513054 @default.
- W1831001034 cites W1553019137 @default.
- W1831001034 cites W1568869800 @default.
- W1831001034 cites W1580155436 @default.
- W1831001034 cites W1594056618 @default.
- W1831001034 cites W1602846073 @default.
- W1831001034 cites W1604574761 @default.
- W1831001034 cites W1605730117 @default.
- W1831001034 cites W1666605052 @default.
- W1831001034 cites W1768418803 @default.
- W1831001034 cites W183602602 @default.
- W1831001034 cites W188328919 @default.
- W1831001034 cites W1940552124 @default.
- W1831001034 cites W1970161214 @default.
- W1831001034 cites W1970406594 @default.
- W1831001034 cites W1971563386 @default.
- W1831001034 cites W197270748 @default.
- W1831001034 cites W1982229380 @default.
- W1831001034 cites W1999138184 @default.
- W1831001034 cites W2009207944 @default.
- W1831001034 cites W2020270115 @default.
- W1831001034 cites W2038826915 @default.
- W1831001034 cites W2046491822 @default.
- W1831001034 cites W2047283406 @default.
- W1831001034 cites W2048360155 @default.
- W1831001034 cites W2048679005 @default.
- W1831001034 cites W2068882115 @default.
- W1831001034 cites W2073308541 @default.
- W1831001034 cites W2076233242 @default.
- W1831001034 cites W2079690930 @default.
- W1831001034 cites W2081580037 @default.
- W1831001034 cites W2093717447 @default.
- W1831001034 cites W2097847889 @default.
- W1831001034 cites W2100299651 @default.
- W1831001034 cites W2101210369 @default.
- W1831001034 cites W2103931177 @default.
- W1831001034 cites W2104884878 @default.
- W1831001034 cites W2108348839 @default.
- W1831001034 cites W2114663556 @default.
- W1831001034 cites W2116184909 @default.
- W1831001034 cites W2124514536 @default.
- W1831001034 cites W2124634352 @default.
- W1831001034 cites W2126185232 @default.
- W1831001034 cites W2136000097 @default.
- W1831001034 cites W2137096228 @default.
- W1831001034 cites W2143349571 @default.
- W1831001034 cites W2147810216 @default.
- W1831001034 cites W2151023586 @default.
- W1831001034 cites W2153752143 @default.
- W1831001034 cites W2156049581 @default.
- W1831001034 cites W2162340487 @default.
- W1831001034 cites W2163915185 @default.
- W1831001034 cites W2164949130 @default.
- W1831001034 cites W2167536328 @default.
- W1831001034 cites W2169318468 @default.
- W1831001034 cites W2170278674 @default.
- W1831001034 cites W2185964905 @default.
- W1831001034 cites W2475291952 @default.
- W1831001034 cites W2496491292 @default.
- W1831001034 cites W2555445555 @default.
- W1831001034 cites W2595408825 @default.
- W1831001034 cites W2612120747 @default.
- W1831001034 cites W26772505 @default.
- W1831001034 cites W2785349534 @default.
- W1831001034 cites W2952340579 @default.
- W1831001034 cites W2952509172 @default.
- W1831001034 cites W2999729612 @default.
- W1831001034 cites W3041070915 @default.
- W1831001034 cites W3089288688 @default.
- W1831001034 cites W40360995 @default.
- W1831001034 doi "https://doi.org/10.5821/dissertation-2117-93982" @default.
- W1831001034 hasPublicationYear "2023" @default.
- W1831001034 type Work @default.
- W1831001034 sameAs 1831001034 @default.
- W1831001034 citedByCount "2" @default.
- W1831001034 countsByYear W18310010342014 @default.
- W1831001034 crossrefType "dissertation" @default.
- W1831001034 hasAuthorship W1831001034A5023133214 @default.
- W1831001034 hasBestOaLocation W18310010341 @default.
- W1831001034 hasConcept C134306372 @default.