Matches in SemOpenAlex for { <https://semopenalex.org/work/W2153476384> ?p ?o ?g. }
- W2153476384 endingPage "274" @default.
- W2153476384 startingPage "268" @default.
- W2153476384 abstract "The problem of automatically extracting structured information from texts is an important, unsolved problem within the field of Natural Language Processing. The extraction of such information can facilitate activities such as the building of knowledge bases, automatic summarisation and sentiment analysis. A human reader can easily discern the events described in a text, along with the participants and the relationships between them, but using a computer to automatically discover the same information is much more challenging. Particular focus has been given to extracting relations between the entities in a text, such as those representing geographical locations, personal and social relationships, and employment. In this thesis, we consider two closely related entity relationships, which are interesting, frequent and have not been tackled previously, which we refer to collectively as entity instantiations. We define an entity instantiation as an entity relation in which a set of entities is introduced, and either a member or subset of this set is mentioned. In the example below, we see a set membership instantiation, between ‘several EU countries’ and ‘the UK’, along with a subset instantiation, between the same set and ‘the low countries’. Inflation has increased sharply in several EU countries. In the UK, this has accompanied a drop in interest rates, but in the low countries rates have remained steady. This thesis details the creation of the first corpus of entity instantiations. The final corpus consists of 4,521 instantiations, 2,118 of which are intersentential, and 2,403 of which are intrasentential, annotated over 75 Penn Treebank Wall Street Journal newswire texts.
The subsequent annotation study shows high levels of inter-annotator agreement and our corpus study analyses the annotated entity instantiations in terms of their internal structure, the distance between arguments and their syntactic relationship, finding a particularly strong link between syntactic parent-child relationships and sentence-internal entity instantiations. To establish that the accurate automatic identification of entity instantiations is possible, we develop the first instantiation identification algorithm, which uses a supervised machine learning approach. The feature set draws on surface, syntactic, contextual, salience and knowledge features to aid classification. We separately apply our classifier to intersentential and intrasentential entity instantiations and experiment with both balanced data, with a 50/50 positive/negative split, and the original unbalanced corpus. The classifier records highly significant performance increases over both unigram-based and majority class baselines on the balanced data, and also on the original distribution of intrasentential instantiations. In order to take advantage of the aforementioned link between syntax and intrasentential entity instantiations, tree kernels were employed to learn directly from the syntactic parse trees which contain the two potential participants in an intrasentential instantiation. The tree kernel features perform similarly to the unstructured feature set, with a much shorter development time. Combining tree kernels with unstructured features gives further improvements over both the baselines, and either method in isolation. We also apply our entity instantiations to the difficult problem of implicit discourse relation classification, hypothesising that introducing features identifying the presence of an entity instantiation between the arguments of a discourse relation can improve classification performance.
Our experiments show that an entity instantiation is a strong indicator of the presence of an Expansion.Instantiation discourse relation. We create a binary Expansion.Instantiation classifier, based on the feature set detailed in Sporleder and Lascarides (2008), but augment it by adding entity instantiation features based on gold standard annotations. The classifier which includes entity instantiation data performs significantly better than the same classifier without entity instantiation data. We also experiment with the incorporation of machine-identified entity instantiations. However, our entity instantiation classifier is not sufficiently accurate to impact on discourse relation classification." @default.
- W2153476384 created "2016-06-24" @default.
- W2153476384 creator A5031894916 @default.
- W2153476384 creator A5082971401 @default.
- W2153476384 date "2011-09-01" @default.
- W2153476384 modified "2023-09-23" @default.
- W2153476384 title "Modelling Entity Instantiations" @default.
- W2153476384 cites W101214240 @default.
- W2153476384 cites W102059219 @default.
- W2153476384 cites W109969595 @default.
- W2153476384 cites W110692952 @default.
- W2153476384 cites W115166160 @default.
- W2153476384 cites W117665274 @default.
- W2153476384 cites W128995279 @default.
- W2153476384 cites W131863957 @default.
- W2153476384 cites W138910040 @default.
- W2153476384 cites W141052578 @default.
- W2153476384 cites W14680811 @default.
- W2153476384 cites W1480643256 @default.
- W2153476384 cites W1493490255 @default.
- W2153476384 cites W1495022714 @default.
- W2153476384 cites W1495061682 @default.
- W2153476384 cites W1495981708 @default.
- W2153476384 cites W1502749598 @default.
- W2153476384 cites W1521626219 @default.
- W2153476384 cites W1528859321 @default.
- W2153476384 cites W1529628505 @default.
- W2153476384 cites W1543515964 @default.
- W2153476384 cites W1549016832 @default.
- W2153476384 cites W1550588214 @default.
- W2153476384 cites W1560360748 @default.
- W2153476384 cites W1570601904 @default.
- W2153476384 cites W1572758419 @default.
- W2153476384 cites W1574440611 @default.
- W2153476384 cites W1576504150 @default.
- W2153476384 cites W1576520375 @default.
- W2153476384 cites W1577423537 @default.
- W2153476384 cites W1587871245 @default.
- W2153476384 cites W1588383342 @default.
- W2153476384 cites W1588592216 @default.
- W2153476384 cites W1593784001 @default.
- W2153476384 cites W1602846073 @default.
- W2153476384 cites W1605174196 @default.
- W2153476384 cites W1632114991 @default.
- W2153476384 cites W1647671624 @default.
- W2153476384 cites W1710827551 @default.
- W2153476384 cites W1730134856 @default.
- W2153476384 cites W17883611 @default.
- W2153476384 cites W1806015339 @default.
- W2153476384 cites W1890589545 @default.
- W2153476384 cites W1894881655 @default.
- W2153476384 cites W1930023685 @default.
- W2153476384 cites W1933365859 @default.
- W2153476384 cites W1966216266 @default.
- W2153476384 cites W1967080262 @default.
- W2153476384 cites W1968679607 @default.
- W2153476384 cites W1968905923 @default.
- W2153476384 cites W1971579567 @default.
- W2153476384 cites W1980366261 @default.
- W2153476384 cites W1981082061 @default.
- W2153476384 cites W1982246600 @default.
- W2153476384 cites W1985661114 @default.
- W2153476384 cites W1987411183 @default.
- W2153476384 cites W1991145433 @default.
- W2153476384 cites W1991169218 @default.
- W2153476384 cites W1997587194 @default.
- W2153476384 cites W2004763266 @default.
- W2153476384 cites W2006793114 @default.
- W2153476384 cites W2006860812 @default.
- W2153476384 cites W2009611543 @default.
- W2153476384 cites W2012167895 @default.
- W2153476384 cites W2015765684 @default.
- W2153476384 cites W2015933299 @default.
- W2153476384 cites W2017337590 @default.
- W2153476384 cites W2017483382 @default.
- W2153476384 cites W2019416425 @default.
- W2153476384 cites W2020082880 @default.
- W2153476384 cites W2020270115 @default.
- W2153476384 cites W2020734619 @default.
- W2153476384 cites W2023482952 @default.
- W2153476384 cites W2026789015 @default.
- W2153476384 cites W2027869740 @default.
- W2153476384 cites W2032327522 @default.
- W2153476384 cites W2035432878 @default.
- W2153476384 cites W2038721957 @default.
- W2153476384 cites W2039217078 @default.
- W2153476384 cites W2040960947 @default.
- W2153476384 cites W2042128417 @default.
- W2153476384 cites W2044276975 @default.
- W2153476384 cites W2045738181 @default.
- W2153476384 cites W2046491822 @default.
- W2153476384 cites W2047283406 @default.
- W2153476384 cites W2052286186 @default.
- W2153476384 cites W2053154970 @default.
- W2153476384 cites W2053238041 @default.
- W2153476384 cites W2053463056 @default.
- W2153476384 cites W2059933135 @default.
- W2153476384 cites W2067025414 @default.