Matches in SemOpenAlex for { <https://semopenalex.org/work/W2003530519> ?p ?o ?g. }
Showing items 1 to 88 of 88 with 100 items per page.
- W2003530519 endingPage "450" @default.
- W2003530519 startingPage "425" @default.
- W2003530519 abstract "IceMorph: An Automated Morphological Analyzer and English-Language Lookup Tool for Old Icelandic. Timothy R. Tangherlini, Aurelijus Vijūnas, Kryztof Urban, and Peter M. Broadwell. Introduction. The advent of inexpensive computing and the creation of large machine-actionable corpora consisting of well-structured digital texts have made it possible to analyze and mark for morphosyntactic features significant amounts of text (> 1,000,000 tokens) with a high degree of accuracy (> 80 percent) rapidly and automatically. Although the problem of automatically tagging text with part-of-speech (POS) information has been largely solved for languages with little morphonological complexity, more complex languages, such as Old Icelandic (OIc) and other ancient languages, continue to pose problems for automated systems. Despite these difficulties, rich morphosyntactic markup that includes lemmatization holds great promise for both linguistic and textual scholarship. Accurate markup would enable the development of sophisticated online study environments that allow researchers to perform complex searches, make comparisons across multiple texts, and generate calculations concerning word-use and syntactical patterns.
Our work, focusing on Old Icelandic, confirms that even for morphonologically complex Indo-European languages, the information gain offered by automatic morphosyntactic analysis of texts, measured as the percentage of correctly tagged tokens, sentences, and complete texts over the extant corpus, offers a marked improvement over previously available hand-marked texts (Rögnvaldsson and Helgadóttir 2008, 2011). Even the most detailed and accurate indexes produced in the past centuries, such as Ordförrådet i de älsta isländska handskrifterna (Larsson 1891), which provides an accurate and exhaustive word-form index for a number of the oldest Old Icelandic manuscripts (ranging from late twelfth- to mid-thirteenth-century manuscripts), offer only minimal coverage when compared to the very large number of extant Old Icelandic texts. For a researcher interested in the study of the entire Old Icelandic corpus (or a large sub-corpus of Old Icelandic literature), these early handbooks, no matter how accurately compiled, are of limited use. Unfortunately, it is not economically feasible to extend the earlier practice of manual encoding to a greater number of manuscripts; the manual compilation of handbooks is costly and requires tremendous amounts of time, expertise, and energy. The old “paper-and-pen” approach does not, to borrow a term from computer science, “scale” well. A dream of many researchers in Old Icelandic is to be able to work with a large number of texts (and manuscript witnesses to texts), or even a comprehensive corpus, that include the high level of morphosyntactic detail of the early handbooks mentioned above. Similarly, historical linguists (especially syntacticians) are eager to work with a much larger parsed corpus of Old Icelandic texts than is currently available. Recent work, such as that of the Icelandic Parsed Historical Corpus group (IcePaHC) (Wallenberg et al. 2011) is a major step toward making such resources available, as it provides a considerable number of texts tagged in a semi-supervised fashion, and moves us closer to a comprehensive parsed Old Icelandic corpus. Yet, it is unlikely that IcePaHC alone will provide adequate coverage for Old Icelandic textual research, in part because it is focused on the historical development of Icelandic up through the present, and in part because it provides limited lemmatization of the texts. As such, IcePaHC diverges from our project, which has as its sole focus the morphosyntactic analysis and lemmatization of Old Icelandic texts. We believe that the computational methods developed by our group can augment those of IcePaHC and others, and have the potential to not only extend the necessarily limited scope of the earlier historical handbooks, but also increase considerably the number of richly marked texts available to researchers. Automatic morphosyntactic analysis of Old Icelandic offers an efficient method for accurately tagging millions of tokens in the growing corpus of machine-actionable texts. Rögnvaldsson and Helgadóttir, for instance, estimate the total number of tokens in their target Old Icelandic corpus at ~1.6 million (Rögnvaldsson and Helgadóttir 2011, 67). This estimate is only a fraction of the overall Old Icelandic corpus, as their corpus does not include the poetic corpus, the Kings..." @default.
- W2003530519 created "2016-06-24" @default.
- W2003530519 creator A5012913329 @default.
- W2003530519 creator A5023138416 @default.
- W2003530519 creator A5037673211 @default.
- W2003530519 creator A5083866825 @default.
- W2003530519 date "2014-01-01" @default.
- W2003530519 modified "2023-10-16" @default.
- W2003530519 title "IceMorph: An Automated Morphological Analyzer and English-Language Lookup Tool for Old Icelandic" @default.
- W2003530519 cites W1499353465 @default.
- W2003530519 cites W1575203283 @default.
- W2003530519 cites W1578977377 @default.
- W2003530519 cites W1647671624 @default.
- W2003530519 cites W1978984912 @default.
- W2003530519 cites W2003495731 @default.
- W2003530519 cites W2011395968 @default.
- W2003530519 cites W2016209609 @default.
- W2003530519 cites W2032767277 @default.
- W2003530519 cites W2081687495 @default.
- W2003530519 cites W2081772263 @default.
- W2003530519 cites W2100299687 @default.
- W2003530519 cites W2155114362 @default.
- W2003530519 cites W2156712632 @default.
- W2003530519 cites W2293341446 @default.
- W2003530519 cites W2334801970 @default.
- W2003530519 cites W2518667411 @default.
- W2003530519 cites W2802544722 @default.
- W2003530519 cites W2913846571 @default.
- W2003530519 cites W596169545 @default.
- W2003530519 doi "https://doi.org/10.1353/scd.2014.0036" @default.
- W2003530519 hasPublicationYear "2014" @default.
- W2003530519 type Work @default.
- W2003530519 sameAs 2003530519 @default.
- W2003530519 citedByCount "0" @default.
- W2003530519 crossrefType "journal-article" @default.
- W2003530519 hasAuthorship W2003530519A5012913329 @default.
- W2003530519 hasAuthorship W2003530519A5023138416 @default.
- W2003530519 hasAuthorship W2003530519A5037673211 @default.
- W2003530519 hasAuthorship W2003530519A5083866825 @default.
- W2003530519 hasConcept C136764020 @default.
- W2003530519 hasConcept C138885662 @default.
- W2003530519 hasConcept C154945302 @default.
- W2003530519 hasConcept C161831844 @default.
- W2003530519 hasConcept C178300618 @default.
- W2003530519 hasConcept C204321447 @default.
- W2003530519 hasConcept C2776957530 @default.
- W2003530519 hasConcept C41008148 @default.
- W2003530519 hasConcept C41895202 @default.
- W2003530519 hasConcept C45874996 @default.
- W2003530519 hasConcept C60048249 @default.
- W2003530519 hasConcept C78458016 @default.
- W2003530519 hasConcept C86803240 @default.
- W2003530519 hasConcept C8797682 @default.
- W2003530519 hasConceptScore W2003530519C136764020 @default.
- W2003530519 hasConceptScore W2003530519C138885662 @default.
- W2003530519 hasConceptScore W2003530519C154945302 @default.
- W2003530519 hasConceptScore W2003530519C161831844 @default.
- W2003530519 hasConceptScore W2003530519C178300618 @default.
- W2003530519 hasConceptScore W2003530519C204321447 @default.
- W2003530519 hasConceptScore W2003530519C2776957530 @default.
- W2003530519 hasConceptScore W2003530519C41008148 @default.
- W2003530519 hasConceptScore W2003530519C41895202 @default.
- W2003530519 hasConceptScore W2003530519C45874996 @default.
- W2003530519 hasConceptScore W2003530519C60048249 @default.
- W2003530519 hasConceptScore W2003530519C78458016 @default.
- W2003530519 hasConceptScore W2003530519C86803240 @default.
- W2003530519 hasConceptScore W2003530519C8797682 @default.
- W2003530519 hasIssue "4" @default.
- W2003530519 hasLocation W20035305191 @default.
- W2003530519 hasOpenAccess W2003530519 @default.
- W2003530519 hasPrimaryLocation W20035305191 @default.
- W2003530519 hasRelatedWork W1556925648 @default.
- W2003530519 hasRelatedWork W2333691506 @default.
- W2003530519 hasRelatedWork W2791579156 @default.
- W2003530519 hasRelatedWork W2966776769 @default.
- W2003530519 hasRelatedWork W2982490159 @default.
- W2003530519 hasRelatedWork W4233770330 @default.
- W2003530519 hasRelatedWork W4299801215 @default.
- W2003530519 hasRelatedWork W657831969 @default.
- W2003530519 hasRelatedWork W1551406738 @default.
- W2003530519 hasRelatedWork W2572765386 @default.
- W2003530519 hasVolume "86" @default.
- W2003530519 isParatext "false" @default.
- W2003530519 isRetracted "false" @default.
- W2003530519 magId "2003530519" @default.
- W2003530519 workType "article" @default.