Matches in SemOpenAlex for { <https://semopenalex.org/work/W2254575701> ?p ?o ?g. }
Showing items 1 to 67 of 67 with 100 items per page.
- W2254575701 abstract "The objective of this thesis is to design and build a high-quality Hidden Markov Model (HMM)-based Text-To-Speech (TTS) system for Vietnamese, a tonal language. The system is called VTED (Vietnamese TExt-to-speech Development system). In view of the great importance of lexical tones, a “tonophone” (an allophone in tonal context) was proposed as a new speech unit in our TTS system. A new training corpus, VDTS (Vietnamese Di-Tonophone Speech corpus), was designed for 100% coverage of di-phones in tonal contexts (i.e. di-tonophones), selected from a large raw text using a greedy algorithm. A total of about 4,000 sentences of VDTS were recorded and pre-processed as a training corpus for VTED. In HMM-based speech synthesis, although pause duration can be modeled as a phoneme, the appearance of pauses cannot be predicted by HMMs. Lower phrasing levels above words may not be completely modeled with basic features. This research aimed at automatic prosodic phrasing for Vietnamese TTS using durational clues alone, as it appeared too difficult to disentangle intonation from lexical tones. Syntactic blocks, i.e. syntactic phrases with a bounded number of syllables (n), were proposed for predicting final lengthening (n = 6) and pause appearance (n = 10). Final-lengthening prediction was improved by several strategies for grouping single syntactic blocks. The predictive J48 decision-tree model for pause appearance, using syntactic blocks combined with syntactic link and POS (Part-Of-Speech) features, reached an F-score of 81.4% (Precision=87.6%, Recall=75.9%), much better than that of the model with only POS (F-score=43.6%) or syntactic link (F-score=52.6%) alone. The architecture of the system was proposed on the basis of the core architecture of HTS, extended with a Natural Language Processing part for Vietnamese. Pause appearance was predicted by the proposed model. The contextual feature set included phone identity features, locational features, tone-related features, and prosodic features (i.e. POS, final lengthening, break levels). Mary TTS was chosen as the platform for implementing VTED. In the MOS (Mean Opinion Score) test, the first VTED, trained with the old corpus and basic features, scored rather well: 0.81 points (on a 5-point MOS scale) higher than the previous system, HoaSung (which uses non-uniform unit selection with the same training corpus), but still 1.2-1.5 points lower than natural speech. The quality of the final VTED, trained with the new corpus and the prosodic phrasing model, improved by about 1.04 points over the first VTED, and its gap with natural speech narrowed considerably. In the tone intelligibility test, the final VTED achieved a high correct rate of 95.4%, only 2.6% lower than natural speech and 18% higher than the initial one. The error rate of the first VTED in the intelligibility test with the Latin square design was about 6-12% higher than that of natural speech, depending on the syllable, tone or phone level. The final VTED differed from natural speech by only about 0.4-1.4%." @default.
- W2254575701 created "2016-06-24" @default.
- W2254575701 creator A5050347338 @default.
- W2254575701 date "2015-09-24" @default.
- W2254575701 modified "2023-09-23" @default.
- W2254575701 title "HMM-based Vietnamese Text-To-Speech : Prosodic Phrasing Modeling, Corpus Design, System Design, and Evaluation" @default.
- W2254575701 cites W3139965388 @default.
- W2254575701 hasPublicationYear "2015" @default.
- W2254575701 type Work @default.
- W2254575701 sameAs 2254575701 @default.
- W2254575701 citedByCount "1" @default.
- W2254575701 countsByYear W22545757012020 @default.
- W2254575701 crossrefType "dissertation" @default.
- W2254575701 hasAuthorship W2254575701A5050347338 @default.
- W2254575701 hasConcept C103621254 @default.
- W2254575701 hasConcept C138885662 @default.
- W2254575701 hasConcept C14999030 @default.
- W2254575701 hasConcept C151730666 @default.
- W2254575701 hasConcept C154945302 @default.
- W2254575701 hasConcept C204321447 @default.
- W2254575701 hasConcept C23224414 @default.
- W2254575701 hasConcept C2779343474 @default.
- W2254575701 hasConcept C2781045179 @default.
- W2254575701 hasConcept C28490314 @default.
- W2254575701 hasConcept C41008148 @default.
- W2254575701 hasConcept C41895202 @default.
- W2254575701 hasConcept C86803240 @default.
- W2254575701 hasConceptScore W2254575701C103621254 @default.
- W2254575701 hasConceptScore W2254575701C138885662 @default.
- W2254575701 hasConceptScore W2254575701C14999030 @default.
- W2254575701 hasConceptScore W2254575701C151730666 @default.
- W2254575701 hasConceptScore W2254575701C154945302 @default.
- W2254575701 hasConceptScore W2254575701C204321447 @default.
- W2254575701 hasConceptScore W2254575701C23224414 @default.
- W2254575701 hasConceptScore W2254575701C2779343474 @default.
- W2254575701 hasConceptScore W2254575701C2781045179 @default.
- W2254575701 hasConceptScore W2254575701C28490314 @default.
- W2254575701 hasConceptScore W2254575701C41008148 @default.
- W2254575701 hasConceptScore W2254575701C41895202 @default.
- W2254575701 hasConceptScore W2254575701C86803240 @default.
- W2254575701 hasLocation W22545757011 @default.
- W2254575701 hasOpenAccess W2254575701 @default.
- W2254575701 hasPrimaryLocation W22545757011 @default.
- W2254575701 hasRelatedWork W104669487 @default.
- W2254575701 hasRelatedWork W121607380 @default.
- W2254575701 hasRelatedWork W157687613 @default.
- W2254575701 hasRelatedWork W2034090682 @default.
- W2254575701 hasRelatedWork W2043209642 @default.
- W2254575701 hasRelatedWork W2102322458 @default.
- W2254575701 hasRelatedWork W2169631155 @default.
- W2254575701 hasRelatedWork W2295125710 @default.
- W2254575701 hasRelatedWork W2355381779 @default.
- W2254575701 hasRelatedWork W2393947547 @default.
- W2254575701 hasRelatedWork W2396108715 @default.
- W2254575701 hasRelatedWork W2515145832 @default.
- W2254575701 hasRelatedWork W2617892953 @default.
- W2254575701 hasRelatedWork W2792578205 @default.
- W2254575701 hasRelatedWork W2952712483 @default.
- W2254575701 hasRelatedWork W3008260965 @default.
- W2254575701 hasRelatedWork W1703163951 @default.
- W2254575701 hasRelatedWork W2554190555 @default.
- W2254575701 hasRelatedWork W2844871513 @default.
- W2254575701 hasRelatedWork W3150471556 @default.
- W2254575701 isParatext "false" @default.
- W2254575701 isRetracted "false" @default.
- W2254575701 magId "2254575701" @default.
- W2254575701 workType "dissertation" @default.
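
The abstract reports an F-score together with its precision and recall for the pause-appearance model. As a minimal sketch, assuming the usual balanced F1 definition (the harmonic mean of precision and recall; the abstract does not say which F-measure variant was used), the relation between the three figures can be checked as follows. The function name `f_score` is hypothetical and used only for illustration.

```python
def f_score(precision: float, recall: float) -> float:
    """Balanced F1: harmonic mean of precision and recall (assumed definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Precision and recall as quoted in the abstract for the combined model.
precision = 0.876
recall = 0.759

f1 = f_score(precision, recall)
print(f"F1 from rounded precision/recall: {f1 * 100:.1f}%")
```

With the rounded inputs the result is about 81.3%, within a tenth of a point of the 81.4% quoted in the abstract; a difference of that size is consistent with the reported precision and recall themselves being rounded.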