Matches in SemOpenAlex for { <https://semopenalex.org/work/W4287662979> ?p ?o ?g. }
Showing items 1 to 75 of 75 with 100 items per page.
- W4287662979 abstract "We present an open-source speech corpus for the Kazakh language. The Kazakh speech corpus (KSC) contains around 332 hours of transcribed audio comprising over 153,000 utterances spoken by participants from different regions and age groups, as well as both genders. It was carefully inspected by native Kazakh speakers to ensure high quality. The KSC is the largest publicly available database developed to advance various Kazakh speech and language processing applications. In this paper, we first describe the data collection and preprocessing procedures followed by a description of the database specifications. We also share our experience and challenges faced during the database construction, which might benefit other researchers planning to build a speech corpus for a low-resource language. To demonstrate the reliability of the database, we performed preliminary speech recognition experiments. The experimental results imply that the quality of audio and transcripts is promising (2.8% character error rate and 8.7% word error rate on the test set). To enable experiment reproducibility and ease the corpus usage, we also released an ESPnet recipe for our speech recognition models." @default.
- W4287662979 created "2022-07-25" @default.
- W4287662979 creator A5008567170 @default.
- W4287662979 creator A5044782949 @default.
- W4287662979 creator A5047231885 @default.
- W4287662979 creator A5047310084 @default.
- W4287662979 creator A5064373757 @default.
- W4287662979 creator A5071916620 @default.
- W4287662979 date "2020-09-22" @default.
- W4287662979 modified "2023-09-27" @default.
- W4287662979 title "A Crowdsourced Open-Source Kazakh Speech Corpus and Initial Speech Recognition Baseline" @default.
- W4287662979 doi "https://doi.org/10.48550/arxiv.2009.10334" @default.
- W4287662979 hasPublicationYear "2020" @default.
- W4287662979 type Work @default.
- W4287662979 citedByCount "0" @default.
- W4287662979 crossrefType "posted-content" @default.
- W4287662979 hasAuthorship W4287662979A5008567170 @default.
- W4287662979 hasAuthorship W4287662979A5044782949 @default.
- W4287662979 hasAuthorship W4287662979A5047231885 @default.
- W4287662979 hasAuthorship W4287662979A5047310084 @default.
- W4287662979 hasAuthorship W4287662979A5064373757 @default.
- W4287662979 hasAuthorship W4287662979A5071916620 @default.
- W4287662979 hasBestOaLocation W42876629791 @default.
- W4287662979 hasConcept C111368507 @default.
- W4287662979 hasConcept C12725497 @default.
- W4287662979 hasConcept C127313418 @default.
- W4287662979 hasConcept C138885662 @default.
- W4287662979 hasConcept C14999030 @default.
- W4287662979 hasConcept C154945302 @default.
- W4287662979 hasConcept C155635449 @default.
- W4287662979 hasConcept C157968479 @default.
- W4287662979 hasConcept C204321447 @default.
- W4287662979 hasConcept C2781297163 @default.
- W4287662979 hasConcept C28490314 @default.
- W4287662979 hasConcept C34736171 @default.
- W4287662979 hasConcept C40969351 @default.
- W4287662979 hasConcept C41008148 @default.
- W4287662979 hasConcept C41895202 @default.
- W4287662979 hasConcept C61328038 @default.
- W4287662979 hasConcept C73411735 @default.
- W4287662979 hasConcept C91863865 @default.
- W4287662979 hasConceptScore W4287662979C111368507 @default.
- W4287662979 hasConceptScore W4287662979C12725497 @default.
- W4287662979 hasConceptScore W4287662979C127313418 @default.
- W4287662979 hasConceptScore W4287662979C138885662 @default.
- W4287662979 hasConceptScore W4287662979C14999030 @default.
- W4287662979 hasConceptScore W4287662979C154945302 @default.
- W4287662979 hasConceptScore W4287662979C155635449 @default.
- W4287662979 hasConceptScore W4287662979C157968479 @default.
- W4287662979 hasConceptScore W4287662979C204321447 @default.
- W4287662979 hasConceptScore W4287662979C2781297163 @default.
- W4287662979 hasConceptScore W4287662979C28490314 @default.
- W4287662979 hasConceptScore W4287662979C34736171 @default.
- W4287662979 hasConceptScore W4287662979C40969351 @default.
- W4287662979 hasConceptScore W4287662979C41008148 @default.
- W4287662979 hasConceptScore W4287662979C41895202 @default.
- W4287662979 hasConceptScore W4287662979C61328038 @default.
- W4287662979 hasConceptScore W4287662979C73411735 @default.
- W4287662979 hasConceptScore W4287662979C91863865 @default.
- W4287662979 hasLocation W42876629791 @default.
- W4287662979 hasOpenAccess W4287662979 @default.
- W4287662979 hasPrimaryLocation W42876629791 @default.
- W4287662979 hasRelatedWork W1531539397 @default.
- W4287662979 hasRelatedWork W2009810445 @default.
- W4287662979 hasRelatedWork W2060471688 @default.
- W4287662979 hasRelatedWork W2784059283 @default.
- W4287662979 hasRelatedWork W2790613087 @default.
- W4287662979 hasRelatedWork W3088421464 @default.
- W4287662979 hasRelatedWork W3112480982 @default.
- W4287662979 hasRelatedWork W3128983566 @default.
- W4287662979 hasRelatedWork W4223610296 @default.
- W4287662979 hasRelatedWork W4287662979 @default.
- W4287662979 isParatext "false" @default.
- W4287662979 isRetracted "false" @default.
- W4287662979 workType "article" @default.