Matches in SemOpenAlex for { <https://semopenalex.org/work/W2004459722> ?p ?o ?g. }
- W2004459722 abstract "This paper describes a method of detecting Japanese Katakana variants from a large corpus. Katakana words, which are mainly used as loanwords, cause problems with information retrieval and so on, because transliteration creates several variations in spelling and all of these can be orthographic. Previous works manually defined Katakana rewrite rules such as ベ (be) and ヴェ (ve) being replaceable with each other, for generating variants and also defined the weight of each operation to edit one string into another to detect these variants. However, these previous researches have not been able to keep up with the ever-increasing number of loanwords and their variants. With our method proposed in this paper, the weight of each edit operation is mechanically assigned based on Web data. In experiments, it performed almost as well as one with manually determined weights. Thus, the advantages of our method are: 1) need no expertise in linguistics to determine weight of each operation, and 2) able to keep up with new Katakana loanwords only by collecting text data from Web and acquiring new weights of edit operations automatically. It also achieved 98.6% recall and 86.3% precision in the task of extracting Katakana variant pairs from 38 years' worth of corpora of Japanese newspaper articles." @default.
- W2004459722 created "2016-06-24" @default.
- W2004459722 creator A5020912760 @default.
- W2004459722 creator A5025153902 @default.
- W2004459722 date "2005-08-15" @default.
- W2004459722 modified "2023-10-02" @default.
- W2004459722 title "Web-based acquisition of Japanese katakana variants" @default.
- W2004459722 cites W2010392031 @default.
- W2004459722 cites W2011806618 @default.
- W2004459722 cites W2032571942 @default.
- W2004459722 cites W2060165117 @default.
- W2004459722 cites W2097734711 @default.
- W2004459722 doi "https://doi.org/10.1145/1076034.1076093" @default.
- W2004459722 hasPublicationYear "2005" @default.
- W2004459722 type Work @default.
- W2004459722 sameAs 2004459722 @default.
- W2004459722 citedByCount "4" @default.
- W2004459722 countsByYear W20044597222013 @default.
- W2004459722 crossrefType "proceedings-article" @default.
- W2004459722 hasAuthorship W2004459722A5020912760 @default.
- W2004459722 hasAuthorship W2004459722A5025153902 @default.
- W2004459722 hasConcept C136764020 @default.
- W2004459722 hasConcept C154945302 @default.
- W2004459722 hasConcept C204321447 @default.
- W2004459722 hasConcept C41008148 @default.
- W2004459722 hasConceptScore W2004459722C136764020 @default.
- W2004459722 hasConceptScore W2004459722C154945302 @default.
- W2004459722 hasConceptScore W2004459722C204321447 @default.
- W2004459722 hasConceptScore W2004459722C41008148 @default.
- W2004459722 hasLocation W20044597221 @default.
- W2004459722 hasOpenAccess W2004459722 @default.
- W2004459722 hasPrimaryLocation W20044597221 @default.
- W2004459722 hasRelatedWork W1552159754 @default.
- W2004459722 hasRelatedWork W2131420137 @default.
- W2004459722 hasRelatedWork W2148757832 @default.
- W2004459722 hasRelatedWork W2293457016 @default.
- W2004459722 hasRelatedWork W2368651715 @default.
- W2004459722 hasRelatedWork W2611614995 @default.
- W2004459722 hasRelatedWork W2748952813 @default.
- W2004459722 hasRelatedWork W2789919619 @default.
- W2004459722 hasRelatedWork W3107474891 @default.
- W2004459722 hasRelatedWork W3169305685 @default.
- W2004459722 isParatext "false" @default.
- W2004459722 isRetracted "false" @default.
- W2004459722 magId "2004459722" @default.
- W2004459722 workType "article" @default.