Matches in SemOpenAlex for { <https://semopenalex.org/work/W3178155035> ?p ?o ?g. }
- W3178155035 endingPage "131" @default.
- W3178155035 startingPage "104" @default.
- W3178155035 abstract "This paper discusses the process of part-of-speech tagging the Corpus of Early English Correspondence Extension (CEECE), as well as the end result. The process involved normalisation of historical spelling variation, conversion from a legacy format into TEI-XML, and finally, tokenisation and tagging by the CLAWS software. At each stage, we had to face and work around problems such as whether to retain original spelling variants in corpus markup, how to implement overlapping hierarchies in XML, and how to calculate the accuracy of tagging in a way that acknowledges errors in tokenisation. The final tagged corpus is estimated to have an accuracy of 94.5 per cent (in the C7 tagset), which is circa two percentage points (pp) lower than that of present-day corpora but respectable for Late Modern English. The most accurate tag groups include pronouns and numerals, whereas adjectives and adverbs are among the least accurate. Normalisation increased the overall accuracy of tagging by circa 3.7pp. The combination of POS tagging and social metadata will make the corpus attractive to linguists interested in the interplay between language-internal and -external factors affecting variation and change." @default.
- W3178155035 created "2021-07-19" @default.
- W3178155035 creator A5000404764 @default.
- W3178155035 creator A5017797301 @default.
- W3178155035 creator A5063116902 @default.
- W3178155035 creator A5084099087 @default.
- W3178155035 date "2021-01-01" @default.
- W3178155035 modified "2023-10-18" @default.
- W3178155035 title "The burden of legacy: Producing the Tagged Corpus of Early English Correspondence Extension (TCEECE)" @default.
- W3178155035 cites W1524429405 @default.
- W3178155035 cites W1972626100 @default.
- W3178155035 cites W2059822824 @default.
- W3178155035 cites W2070727180 @default.
- W3178155035 cites W2095503871 @default.
- W3178155035 cites W2137171871 @default.
- W3178155035 cites W2206693671 @default.
- W3178155035 cites W2251948173 @default.
- W3178155035 cites W2419628098 @default.
- W3178155035 cites W260880220 @default.
- W3178155035 cites W2724671435 @default.
- W3178155035 cites W2729795264 @default.
- W3178155035 cites W2745212158 @default.
- W3178155035 cites W2796567301 @default.
- W3178155035 cites W2802769976 @default.
- W3178155035 cites W2913006223 @default.
- W3178155035 cites W2968817374 @default.
- W3178155035 cites W2994934388 @default.
- W3178155035 cites W3009098180 @default.
- W3178155035 cites W3029267689 @default.
- W3178155035 cites W3135142735 @default.
- W3178155035 cites W4206014004 @default.
- W3178155035 cites W4233128241 @default.
- W3178155035 cites W4242476331 @default.
- W3178155035 cites W4301420590 @default.
- W3178155035 cites W612903867 @default.
- W3178155035 cites W642889819 @default.
- W3178155035 doi "https://doi.org/10.32714/ricl.09.01.07" @default.
- W3178155035 hasPublicationYear "2021" @default.
- W3178155035 type Work @default.
- W3178155035 sameAs 3178155035 @default.
- W3178155035 citedByCount "1" @default.
- W3178155035 countsByYear W31781550352023 @default.
- W3178155035 crossrefType "journal-article" @default.
- W3178155035 hasAuthorship W3178155035A5000404764 @default.
- W3178155035 hasAuthorship W3178155035A5017797301 @default.
- W3178155035 hasAuthorship W3178155035A5063116902 @default.
- W3178155035 hasAuthorship W3178155035A5084099087 @default.
- W3178155035 hasBestOaLocation W31781550351 @default.
- W3178155035 hasConcept C121332964 @default.
- W3178155035 hasConcept C136764020 @default.
- W3178155035 hasConcept C138885662 @default.
- W3178155035 hasConcept C154945302 @default.
- W3178155035 hasConcept C161831844 @default.
- W3178155035 hasConcept C199360897 @default.
- W3178155035 hasConcept C204321447 @default.
- W3178155035 hasConcept C23123220 @default.
- W3178155035 hasConcept C2777801307 @default.
- W3178155035 hasConcept C2778029271 @default.
- W3178155035 hasConcept C2778334786 @default.
- W3178155035 hasConcept C41008148 @default.
- W3178155035 hasConcept C41895202 @default.
- W3178155035 hasConcept C44870925 @default.
- W3178155035 hasConcept C45874996 @default.
- W3178155035 hasConcept C8797682 @default.
- W3178155035 hasConcept C93518851 @default.
- W3178155035 hasConceptScore W3178155035C121332964 @default.
- W3178155035 hasConceptScore W3178155035C136764020 @default.
- W3178155035 hasConceptScore W3178155035C138885662 @default.
- W3178155035 hasConceptScore W3178155035C154945302 @default.
- W3178155035 hasConceptScore W3178155035C161831844 @default.
- W3178155035 hasConceptScore W3178155035C199360897 @default.
- W3178155035 hasConceptScore W3178155035C204321447 @default.
- W3178155035 hasConceptScore W3178155035C23123220 @default.
- W3178155035 hasConceptScore W3178155035C2777801307 @default.
- W3178155035 hasConceptScore W3178155035C2778029271 @default.
- W3178155035 hasConceptScore W3178155035C2778334786 @default.
- W3178155035 hasConceptScore W3178155035C41008148 @default.
- W3178155035 hasConceptScore W3178155035C41895202 @default.
- W3178155035 hasConceptScore W3178155035C44870925 @default.
- W3178155035 hasConceptScore W3178155035C45874996 @default.
- W3178155035 hasConceptScore W3178155035C8797682 @default.
- W3178155035 hasConceptScore W3178155035C93518851 @default.
- W3178155035 hasIssue "1" @default.
- W3178155035 hasLocation W31781550351 @default.
- W3178155035 hasLocation W31781550352 @default.
- W3178155035 hasOpenAccess W3178155035 @default.
- W3178155035 hasPrimaryLocation W31781550351 @default.
- W3178155035 hasRelatedWork W1521911848 @default.
- W3178155035 hasRelatedWork W1585034923 @default.
- W3178155035 hasRelatedWork W1965294778 @default.
- W3178155035 hasRelatedWork W2076264610 @default.
- W3178155035 hasRelatedWork W2240497660 @default.
- W3178155035 hasRelatedWork W2361349944 @default.
- W3178155035 hasRelatedWork W3107474891 @default.
- W3178155035 hasRelatedWork W1551406738 @default.
- W3178155035 hasRelatedWork W2112842618 @default.
- W3178155035 hasRelatedWork W2594596051 @default.
- W3178155035 hasVolume "9" @default.