Matches in SemOpenAlex for { <https://semopenalex.org/work/W2182026414> ?p ?o ?g. }
Showing items 1 to 92 of 92 with 100 items per page.
- W2182026414 abstract "Recently there have been several initiatives to create locally accessible large scale corpora based on the contents of the Internet. In this paper we present a survey on several such corpora created for different languages. We compare their distinctive features and the amount of additional annotations provided by the developers of those corpora. Text corpora are vital linguistic resources in natural language processing. Recent development in technology, one of the symptoms of which is frequent switching to 64-bit machines and operating systems, has allowed for compilation and efficient processing of billion-word and larger corpora. There have been several initiatives to create large scale corpora, and the need for those is constantly growing. In this paper we describe some of those initiatives. In the following sections we first address the question of whether it is necessary to create corpora of that size. Next, we describe some such corpora, compare their scale, features and the amount of additional annotations performed on them. We also dedicate a separate section to compare corpora in the Japanese language. Finally, we conclude the paper and list some of the possible applications of large corpora." @default.
- W2182026414 created "2016-06-24" @default.
- W2182026414 creator A5015548774 @default.
- W2182026414 creator A5052933412 @default.
- W2182026414 creator A5063147657 @default.
- W2182026414 creator A5065713476 @default.
- W2182026414 date "2012-01-01" @default.
- W2182026414 modified "2023-09-26" @default.
- W2182026414 title "A Survey on Large Scale Web Based Corpora" @default.
- W2182026414 cites W1543096214 @default.
- W2182026414 cites W1551583332 @default.
- W2182026414 cites W1587985368 @default.
- W2182026414 cites W1983774632 @default.
- W2182026414 cites W2055866867 @default.
- W2182026414 cites W2077700035 @default.
- W2182026414 cites W2079656678 @default.
- W2182026414 cites W2115054880 @default.
- W2182026414 cites W2133322356 @default.
- W2182026414 cites W2155870214 @default.
- W2182026414 cites W2162095731 @default.
- W2182026414 cites W2288218587 @default.
- W2182026414 cites W2363741699 @default.
- W2182026414 cites W2406530069 @default.
- W2182026414 cites W24675619 @default.
- W2182026414 cites W53767355 @default.
- W2182026414 cites W2053131585 @default.
- W2182026414 hasPublicationYear "2012" @default.
- W2182026414 type Work @default.
- W2182026414 sameAs 2182026414 @default.
- W2182026414 citedByCount "0" @default.
- W2182026414 crossrefType "journal-article" @default.
- W2182026414 hasAuthorship W2182026414A5015548774 @default.
- W2182026414 hasAuthorship W2182026414A5052933412 @default.
- W2182026414 hasAuthorship W2182026414A5063147657 @default.
- W2182026414 hasAuthorship W2182026414A5065713476 @default.
- W2182026414 hasConcept C110875604 @default.
- W2182026414 hasConcept C121332964 @default.
- W2182026414 hasConcept C136764020 @default.
- W2182026414 hasConcept C138885662 @default.
- W2182026414 hasConcept C154945302 @default.
- W2182026414 hasConcept C203005215 @default.
- W2182026414 hasConcept C204321447 @default.
- W2182026414 hasConcept C23123220 @default.
- W2182026414 hasConcept C2474386 @default.
- W2182026414 hasConcept C2778755073 @default.
- W2182026414 hasConcept C2985367798 @default.
- W2182026414 hasConcept C41008148 @default.
- W2182026414 hasConcept C41895202 @default.
- W2182026414 hasConcept C62520636 @default.
- W2182026414 hasConcept C90805587 @default.
- W2182026414 hasConceptScore W2182026414C110875604 @default.
- W2182026414 hasConceptScore W2182026414C121332964 @default.
- W2182026414 hasConceptScore W2182026414C136764020 @default.
- W2182026414 hasConceptScore W2182026414C138885662 @default.
- W2182026414 hasConceptScore W2182026414C154945302 @default.
- W2182026414 hasConceptScore W2182026414C203005215 @default.
- W2182026414 hasConceptScore W2182026414C204321447 @default.
- W2182026414 hasConceptScore W2182026414C23123220 @default.
- W2182026414 hasConceptScore W2182026414C2474386 @default.
- W2182026414 hasConceptScore W2182026414C2778755073 @default.
- W2182026414 hasConceptScore W2182026414C2985367798 @default.
- W2182026414 hasConceptScore W2182026414C41008148 @default.
- W2182026414 hasConceptScore W2182026414C41895202 @default.
- W2182026414 hasConceptScore W2182026414C62520636 @default.
- W2182026414 hasConceptScore W2182026414C90805587 @default.
- W2182026414 hasLocation W21820264141 @default.
- W2182026414 hasOpenAccess W2182026414 @default.
- W2182026414 hasPrimaryLocation W21820264141 @default.
- W2182026414 hasRelatedWork W119453913 @default.
- W2182026414 hasRelatedWork W1830951065 @default.
- W2182026414 hasRelatedWork W1968848598 @default.
- W2182026414 hasRelatedWork W2134287790 @default.
- W2182026414 hasRelatedWork W2185853379 @default.
- W2182026414 hasRelatedWork W2250251061 @default.
- W2182026414 hasRelatedWork W2251542277 @default.
- W2182026414 hasRelatedWork W2252238828 @default.
- W2182026414 hasRelatedWork W2392575348 @default.
- W2182026414 hasRelatedWork W2397016496 @default.
- W2182026414 hasRelatedWork W2594145042 @default.
- W2182026414 hasRelatedWork W2599814363 @default.
- W2182026414 hasRelatedWork W2763856713 @default.
- W2182026414 hasRelatedWork W2913434276 @default.
- W2182026414 hasRelatedWork W3021161450 @default.
- W2182026414 hasRelatedWork W3030901678 @default.
- W2182026414 hasRelatedWork W3137010024 @default.
- W2182026414 hasRelatedWork W3204363142 @default.
- W2182026414 hasRelatedWork W1872130062 @default.
- W2182026414 hasRelatedWork W2586548518 @default.
- W2182026414 isParatext "false" @default.
- W2182026414 isRetracted "false" @default.
- W2182026414 magId "2182026414" @default.
- W2182026414 workType "article" @default.
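
Each row above follows the shape of the quad pattern in the header: a subject, a predicate (?p), an object (?o) and a graph (?g, here always "default"). As a minimal sketch for reading such rows programmatically, the Python below splits one row into its four parts. The row layout is inferred from this listing only and is not an official SemOpenAlex export format; the names `Quad` and `parse_row` are hypothetical helpers introduced for illustration.

```python
import re
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """One row of the listing (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Assumed row layout, inferred from the listing itself:
#   - <subject> <predicate> <object> @<graph>.
# The object is either a double-quoted literal or a single bare token.
_ROW = re.compile(
    r'^-\s+(\S+)\s+(\S+)\s+(?:"(.*)"|(\S+))\s+@(\w+)\.$',
    re.DOTALL,
)


def parse_row(line: str) -> Optional[Quad]:
    """Return the row as a Quad, or None if the line is not a listing row."""
    m = _ROW.match(line.strip())
    if m is None:
        return None
    subject, predicate, literal, token, graph = m.groups()
    # Quoted literals are returned without their surrounding quotes.
    obj = literal if literal is not None else token
    return Quad(subject, predicate, obj, graph)
```

Applied to the rows above, `parse_row` yields one `Quad` per row and `None` for the header and pagination lines, so the results can be grouped by predicate (for example, collecting every `cites` or `hasRelatedWork` object).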