Matches in SemOpenAlex for { <https://semopenalex.org/work/W2742700688> ?p ?o ?g. }
Showing items 1 to 94 of 94 with 100 items per page.
- W2742700688 endingPage "305" @default.
- W2742700688 startingPage "305" @default.
- W2742700688 abstract "Diacritic restoration is a necessity in the processing of languages with Latin-based scripts that utilize letters outside the basic Latin alphabet used by the English language. Yorùbá is one such language, marking underdot (dot-below) on three characters and tone marks on all seven vowels and two syllabic nasals. The problem of restoring underdotted characters has been fairly addressed using characters as linguistic units for restoration. However, the existing character-based and word-based approaches have not been able to sufficiently address restoration of tone marks in Yorùbá. In this study, we address tone mark restoration as a subset of diacritic restoration. We propose using the syllable (derived from the word) as the linguistic token for tone mark restoration. In our experimental setup, we used Yoruba text collected from various sources as data, with a total word count of 250,336 words. These words, on syllabification, yielded 464,274 syllables. The syllables were divided into training and testing data in different proportions, ranging from 99% used for training and 1% used for testing to 70% used for training and 30% used for testing. The aim of evaluating different proportions was to determine how the ratio of training-to-test data affects the variations that may occur in the result. We applied memory-based learning to train the models. We also set up a similar experiment using character tokens to be able to compare the performance. The results showed that using syllables increased accuracy at word level to 96.23%, an average of almost 15% over that obtained from using characters. We also found that using 75% of data for training and the remaining 25% for testing gives the results with the least variation in a ten-fold cross-validation test. Hybridizing the syllable-based approach with other methods like lexicon lookup might lead to improvement over the current result." @default.
- W2742700688 created "2017-08-17" @default.
- W2742700688 creator A5022332556 @default.
- W2742700688 creator A5032599441 @default.
- W2742700688 creator A5060870831 @default.
- W2742700688 date "2017-01-01" @default.
- W2742700688 modified "2023-10-16" @default.
- W2742700688 title "RESTORING TONE-MARKS IN STANDARD YORÙBÁ ELECTRONIC TEXT: IMPROVED MODEL" @default.
- W2742700688 cites W149366531 @default.
- W2742700688 cites W1496925889 @default.
- W2742700688 cites W1501212164 @default.
- W2742700688 cites W1575976310 @default.
- W2742700688 cites W195358446 @default.
- W2742700688 cites W2009840120 @default.
- W2742700688 cites W2098846312 @default.
- W2742700688 cites W2102131353 @default.
- W2742700688 cites W2109613320 @default.
- W2742700688 cites W2135421390 @default.
- W2742700688 cites W2149995043 @default.
- W2742700688 cites W2169141432 @default.
- W2742700688 cites W2182640088 @default.
- W2742700688 cites W2403070565 @default.
- W2742700688 doi "https://doi.org/10.7494/csci.2017.18.3.2128" @default.
- W2742700688 hasPublicationYear "2017" @default.
- W2742700688 type Work @default.
- W2742700688 sameAs 2742700688 @default.
- W2742700688 citedByCount "11" @default.
- W2742700688 countsByYear W27427006882018 @default.
- W2742700688 countsByYear W27427006882019 @default.
- W2742700688 countsByYear W27427006882020 @default.
- W2742700688 countsByYear W27427006882021 @default.
- W2742700688 countsByYear W27427006882022 @default.
- W2742700688 crossrefType "journal-article" @default.
- W2742700688 hasAuthorship W2742700688A5022332556 @default.
- W2742700688 hasAuthorship W2742700688A5032599441 @default.
- W2742700688 hasAuthorship W2742700688A5060870831 @default.
- W2742700688 hasBestOaLocation W27427006881 @default.
- W2742700688 hasConcept C109089402 @default.
- W2742700688 hasConcept C130727458 @default.
- W2742700688 hasConcept C138885662 @default.
- W2742700688 hasConcept C14999030 @default.
- W2742700688 hasConcept C154945302 @default.
- W2742700688 hasConcept C204321447 @default.
- W2742700688 hasConcept C2524010 @default.
- W2742700688 hasConcept C2777568999 @default.
- W2742700688 hasConcept C2779211743 @default.
- W2742700688 hasConcept C2779581591 @default.
- W2742700688 hasConcept C2780583480 @default.
- W2742700688 hasConcept C2780861071 @default.
- W2742700688 hasConcept C28490314 @default.
- W2742700688 hasConcept C33923547 @default.
- W2742700688 hasConcept C41008148 @default.
- W2742700688 hasConcept C41895202 @default.
- W2742700688 hasConcept C520968082 @default.
- W2742700688 hasConcept C90805587 @default.
- W2742700688 hasConceptScore W2742700688C109089402 @default.
- W2742700688 hasConceptScore W2742700688C130727458 @default.
- W2742700688 hasConceptScore W2742700688C138885662 @default.
- W2742700688 hasConceptScore W2742700688C14999030 @default.
- W2742700688 hasConceptScore W2742700688C154945302 @default.
- W2742700688 hasConceptScore W2742700688C204321447 @default.
- W2742700688 hasConceptScore W2742700688C2524010 @default.
- W2742700688 hasConceptScore W2742700688C2777568999 @default.
- W2742700688 hasConceptScore W2742700688C2779211743 @default.
- W2742700688 hasConceptScore W2742700688C2779581591 @default.
- W2742700688 hasConceptScore W2742700688C2780583480 @default.
- W2742700688 hasConceptScore W2742700688C2780861071 @default.
- W2742700688 hasConceptScore W2742700688C28490314 @default.
- W2742700688 hasConceptScore W2742700688C33923547 @default.
- W2742700688 hasConceptScore W2742700688C41008148 @default.
- W2742700688 hasConceptScore W2742700688C41895202 @default.
- W2742700688 hasConceptScore W2742700688C520968082 @default.
- W2742700688 hasConceptScore W2742700688C90805587 @default.
- W2742700688 hasIssue "3" @default.
- W2742700688 hasLocation W27427006881 @default.
- W2742700688 hasOpenAccess W2742700688 @default.
- W2742700688 hasPrimaryLocation W27427006881 @default.
- W2742700688 hasRelatedWork W2017959170 @default.
- W2742700688 hasRelatedWork W2040310682 @default.
- W2742700688 hasRelatedWork W2050631214 @default.
- W2742700688 hasRelatedWork W2129139599 @default.
- W2742700688 hasRelatedWork W2134541711 @default.
- W2742700688 hasRelatedWork W2169373125 @default.
- W2742700688 hasRelatedWork W2355417428 @default.
- W2742700688 hasRelatedWork W2742700688 @default.
- W2742700688 hasRelatedWork W3162848039 @default.
- W2742700688 hasRelatedWork W4286543243 @default.
- W2742700688 hasVolume "18" @default.
- W2742700688 isParatext "false" @default.
- W2742700688 isRetracted "false" @default.
- W2742700688 magId "2742700688" @default.
- W2742700688 workType "article" @default.
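
The listing above follows a regular line shape (`- SUBJECT predicate OBJECT @graph.`), so it can be loaded into a simple lookup table. The following is a minimal sketch under that assumption only; `LINE_RE` and `parse_listing` are hypothetical names introduced for illustration and are not part of any SemOpenAlex tooling.

```python
import re
from collections import defaultdict

# Assumed line shape, inferred from the listing itself:
#   - SUBJECT predicate OBJECT @graph.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\s+@(\S+?)\.$', re.DOTALL)


def parse_listing(text):
    """Group objects by predicate; lines that do not match are skipped."""
    props = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # header or pagination lines
        _subject, predicate, obj, _graph = match.groups()
        # Strip the surrounding quotes from literal values.
        if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
            obj = obj[1:-1]
        props[predicate].append(obj)
    return dict(props)


# A few lines copied from the listing, used as sample input.
sample = """\
Showing items 1 to 94 of 94 with 100 items per page.
- W2742700688 title "RESTORING TONE-MARKS IN STANDARD YORÙBÁ ELECTRONIC TEXT: IMPROVED MODEL" @default.
- W2742700688 hasVolume "18" @default.
- W2742700688 hasIssue "3" @default.
- W2742700688 cites W149366531 @default.
- W2742700688 cites W1496925889 @default.
"""

record = parse_listing(sample)
print(record["hasVolume"], record["hasIssue"], len(record["cites"]))
```

Repeated predicates such as `cites` or `hasConcept` accumulate into a list, which is why every value is stored as a list even when a predicate occurs only once.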