Matches in SemOpenAlex for { <https://semopenalex.org/work/W4313394195> ?p ?o ?g. }
Showing items 1 to 89 of 89 with 100 items per page.
- W4313394195 endingPage "45" @default.
- W4313394195 startingPage "36" @default.
- W4313394195 abstract "There have been lots of previous studies on fluency evaluation of spontaneous speech. However, most of them focus on lexical cues, and little emphasis is placed on how diverse acoustic features and deep end-to-end models contribute to improving the performance. In this paper, we describe multi-layer neural network to investigate not only lexical features extracted from transcription, but also consider utterance-level acoustic features from audio data. We also conduct the experiments to investigate the performance of end-to-end approaches with mel-spectrogram in this task. As the speech fluency evaluation task, we evaluate our proposed method in two binary classification tasks of fluent speech detection and disfluent speech detection. Speech data of around 10 seconds duration each with the annotation of the three classes of “fluent,” “neutral,” and “disfluent” is used for evaluation. According to the two way splits of those three classes, the task of fluent speech detection is defined as binary classification of fluent vs. neutral and disfluent, while that of disfluent speech detection is defined as binary classification of fluent and neutral vs. disfluent. We then conduct experiments with the purpose of comparative evaluation of multi-layer neural network with diverse features as well as end-to-end models. For the fluent speech detection, in the comparison of utterance-level disfluency-based, prosodic, and acoustic features with multi-layer neural network, disfluency-based and prosodic features only are better. More specifically, the performance improved a lot when removing all of the acoustic features from the full set of features, while the performance is damaged a lot if fillers related features are removed. Overall, however, the end-to-end Transformer+VGGNet model with mel-spectrogram achieves the best results. 
For the disfluent speech detection, the multi-layer neural network using disfluency-based, prosodic, and acoustic features without fillers achieves the best results. The end-to-end Transformer+VGGNet architecture also obtains high scores, whereas it is exceeded by the best results with the multi-layer neural network with significant difference. Thus, unlike in the fluent speech detection, disfluency-based and prosodic features other than fillers are still necessary in the disfluent speech detection." @default.
- W4313394195 created "2023-01-06" @default.
- W4313394195 creator A5014342555 @default.
- W4313394195 creator A5033502032 @default.
- W4313394195 creator A5066456246 @default.
- W4313394195 creator A5079042022 @default.
- W4313394195 date "2023-01-01" @default.
- W4313394195 modified "2023-09-25" @default.
- W4313394195 title "Comparative Evaluation of Diverse Features in Fluency Evaluation of Spontaneous Speech" @default.
- W4313394195 cites W101111729 @default.
- W4313394195 cites W1501669607 @default.
- W4313394195 cites W2141001701 @default.
- W4313394195 cites W2161274063 @default.
- W4313394195 cites W2296199791 @default.
- W4313394195 cites W2890964092 @default.
- W4313394195 cites W2892009249 @default.
- W4313394195 cites W2963729456 @default.
- W4313394195 cites W2964172015 @default.
- W4313394195 cites W2972328063 @default.
- W4313394195 cites W3016010032 @default.
- W4313394195 cites W3016114816 @default.
- W4313394195 cites W3016128928 @default.
- W4313394195 cites W3097777922 @default.
- W4313394195 cites W4205354701 @default.
- W4313394195 cites W80810014 @default.
- W4313394195 doi "https://doi.org/10.1587/transinf.2022edp7047" @default.
- W4313394195 hasPublicationYear "2023" @default.
- W4313394195 type Work @default.
- W4313394195 citedByCount "0" @default.
- W4313394195 crossrefType "journal-article" @default.
- W4313394195 hasAuthorship W4313394195A5014342555 @default.
- W4313394195 hasAuthorship W4313394195A5033502032 @default.
- W4313394195 hasAuthorship W4313394195A5066456246 @default.
- W4313394195 hasAuthorship W4313394195A5079042022 @default.
- W4313394195 hasBestOaLocation W43133941951 @default.
- W4313394195 hasConcept C12267149 @default.
- W4313394195 hasConcept C138885662 @default.
- W4313394195 hasConcept C154945302 @default.
- W4313394195 hasConcept C162324750 @default.
- W4313394195 hasConcept C177264268 @default.
- W4313394195 hasConcept C187736073 @default.
- W4313394195 hasConcept C199360897 @default.
- W4313394195 hasConcept C204321447 @default.
- W4313394195 hasConcept C2775852435 @default.
- W4313394195 hasConcept C2777413886 @default.
- W4313394195 hasConcept C2780451532 @default.
- W4313394195 hasConcept C28490314 @default.
- W4313394195 hasConcept C41008148 @default.
- W4313394195 hasConcept C41895202 @default.
- W4313394195 hasConcept C45273575 @default.
- W4313394195 hasConcept C50644808 @default.
- W4313394195 hasConcept C66905080 @default.
- W4313394195 hasConceptScore W4313394195C12267149 @default.
- W4313394195 hasConceptScore W4313394195C138885662 @default.
- W4313394195 hasConceptScore W4313394195C154945302 @default.
- W4313394195 hasConceptScore W4313394195C162324750 @default.
- W4313394195 hasConceptScore W4313394195C177264268 @default.
- W4313394195 hasConceptScore W4313394195C187736073 @default.
- W4313394195 hasConceptScore W4313394195C199360897 @default.
- W4313394195 hasConceptScore W4313394195C204321447 @default.
- W4313394195 hasConceptScore W4313394195C2775852435 @default.
- W4313394195 hasConceptScore W4313394195C2777413886 @default.
- W4313394195 hasConceptScore W4313394195C2780451532 @default.
- W4313394195 hasConceptScore W4313394195C28490314 @default.
- W4313394195 hasConceptScore W4313394195C41008148 @default.
- W4313394195 hasConceptScore W4313394195C41895202 @default.
- W4313394195 hasConceptScore W4313394195C45273575 @default.
- W4313394195 hasConceptScore W4313394195C50644808 @default.
- W4313394195 hasConceptScore W4313394195C66905080 @default.
- W4313394195 hasIssue "1" @default.
- W4313394195 hasLocation W43133941951 @default.
- W4313394195 hasOpenAccess W4313394195 @default.
- W4313394195 hasPrimaryLocation W43133941951 @default.
- W4313394195 hasRelatedWork W2071629692 @default.
- W4313394195 hasRelatedWork W2396552128 @default.
- W4313394195 hasRelatedWork W2403303957 @default.
- W4313394195 hasRelatedWork W2807333695 @default.
- W4313394195 hasRelatedWork W2964808434 @default.
- W4313394195 hasRelatedWork W3013539567 @default.
- W4313394195 hasRelatedWork W3047505566 @default.
- W4313394195 hasRelatedWork W3137244216 @default.
- W4313394195 hasRelatedWork W3186330609 @default.
- W4313394195 hasRelatedWork W4313394195 @default.
- W4313394195 hasVolume "E106.D" @default.
- W4313394195 isParatext "false" @default.
- W4313394195 isRetracted "false" @default.
- W4313394195 workType "article" @default.