Matches in SemOpenAlex for { <https://semopenalex.org/work/W135040384> ?p ?o ?g. }
Showing items 1 to 93 of 93 with 100 items per page.
- W135040384 endingPage "6" @default.
- W135040384 startingPage "1" @default.
- W135040384 abstract "We investigate the use of multi-stream HMMs in the automatic recognition of audio-visual speech. Multi-stream HMMs allow the modeling of asynchrony between the audio and visual state sequences at a variety of levels (phone, syllable, word, etc.) and are equivalent to product, or composite, HMMs. In this paper, we consider such models synchronized at the phone boundary level, allowing various degrees of audio and visual state-sequence asynchrony. Furthermore, we investigate joint training of all product HMM parameters, instead of just composing the model from separately trained audio- and visual-only HMMs. We report experiments on a multi-subject connected digit recognition task, as well as on a more complex, speaker-independent large-vocabulary dictation task. Our results demonstrate that in both cases, joint multi-stream HMM training is superior to separate training of single-stream HMMs. In addition, we observe that allowing state-sequence asynchrony between the HMM audio and visual components improves connected digit recognition significantly; however, it degrades performance on the dictation task. The resulting multi-stream models dramatically improve speech recognition robustness to noise, by successfully exploiting the visual modality speech information: For example, at 11 dB SNR, they reduce connected digit word error rate from the audio-only 2.3% to 0.77% audio-visual, and, for the large-vocabulary task, from 28.3% to 19.5%. Compared to the audio-only performance at 10 dB SNR, the use of multi-stream HMMs achieves an effective SNR gain of up to 9 dB and 7 dB respectively, for the two recognition tasks considered." @default.
- W135040384 created "2016-06-24" @default.
- W135040384 creator A5008190208 @default.
- W135040384 creator A5009895550 @default.
- W135040384 creator A5024184433 @default.
- W135040384 date "2002-03-24" @default.
- W135040384 modified "2023-09-24" @default.
- W135040384 title "Asynchrony modeling for audio-visual speech recognition" @default.
- W135040384 cites W1572240262 @default.
- W135040384 cites W1800365115 @default.
- W135040384 cites W1922557984 @default.
- W135040384 cites W1929897159 @default.
- W135040384 cites W1984867864 @default.
- W135040384 cites W2032243055 @default.
- W135040384 cites W2121486117 @default.
- W135040384 cites W2124174353 @default.
- W135040384 cites W2137075158 @default.
- W135040384 cites W2144788278 @default.
- W135040384 cites W2146871184 @default.
- W135040384 cites W2157190406 @default.
- W135040384 cites W2163680580 @default.
- W135040384 cites W2487271655 @default.
- W135040384 cites W3099202502 @default.
- W135040384 cites W3179102351 @default.
- W135040384 cites W322227076 @default.
- W135040384 hasPublicationYear "2002" @default.
- W135040384 type Work @default.
- W135040384 sameAs 135040384 @default.
- W135040384 citedByCount "21" @default.
- W135040384 countsByYear W1350403842012 @default.
- W135040384 countsByYear W1350403842015 @default.
- W135040384 countsByYear W1350403842016 @default.
- W135040384 countsByYear W1350403842017 @default.
- W135040384 countsByYear W1350403842019 @default.
- W135040384 crossrefType "proceedings-article" @default.
- W135040384 hasAuthorship W135040384A5008190208 @default.
- W135040384 hasAuthorship W135040384A5009895550 @default.
- W135040384 hasAuthorship W135040384A5024184433 @default.
- W135040384 hasConcept C104317684 @default.
- W135040384 hasConcept C138885662 @default.
- W135040384 hasConcept C154945302 @default.
- W135040384 hasConcept C185592680 @default.
- W135040384 hasConcept C23224414 @default.
- W135040384 hasConcept C2777601683 @default.
- W135040384 hasConcept C2779077324 @default.
- W135040384 hasConcept C28490314 @default.
- W135040384 hasConcept C40969351 @default.
- W135040384 hasConcept C41008148 @default.
- W135040384 hasConcept C41895202 @default.
- W135040384 hasConcept C55493867 @default.
- W135040384 hasConcept C63479239 @default.
- W135040384 hasConceptScore W135040384C104317684 @default.
- W135040384 hasConceptScore W135040384C138885662 @default.
- W135040384 hasConceptScore W135040384C154945302 @default.
- W135040384 hasConceptScore W135040384C185592680 @default.
- W135040384 hasConceptScore W135040384C23224414 @default.
- W135040384 hasConceptScore W135040384C2777601683 @default.
- W135040384 hasConceptScore W135040384C2779077324 @default.
- W135040384 hasConceptScore W135040384C28490314 @default.
- W135040384 hasConceptScore W135040384C40969351 @default.
- W135040384 hasConceptScore W135040384C41008148 @default.
- W135040384 hasConceptScore W135040384C41895202 @default.
- W135040384 hasConceptScore W135040384C55493867 @default.
- W135040384 hasConceptScore W135040384C63479239 @default.
- W135040384 hasLocation W1350403841 @default.
- W135040384 hasOpenAccess W135040384 @default.
- W135040384 hasPrimaryLocation W1350403841 @default.
- W135040384 hasRelatedWork W1560013842 @default.
- W135040384 hasRelatedWork W1576996184 @default.
- W135040384 hasRelatedWork W1978380426 @default.
- W135040384 hasRelatedWork W2015394094 @default.
- W135040384 hasRelatedWork W2049633694 @default.
- W135040384 hasRelatedWork W2096391593 @default.
- W135040384 hasRelatedWork W2098923380 @default.
- W135040384 hasRelatedWork W2110575115 @default.
- W135040384 hasRelatedWork W2121430128 @default.
- W135040384 hasRelatedWork W2121486117 @default.
- W135040384 hasRelatedWork W2122678358 @default.
- W135040384 hasRelatedWork W2124174353 @default.
- W135040384 hasRelatedWork W2125838338 @default.
- W135040384 hasRelatedWork W2152239535 @default.
- W135040384 hasRelatedWork W2157190406 @default.
- W135040384 hasRelatedWork W2157827878 @default.
- W135040384 hasRelatedWork W2164450870 @default.
- W135040384 hasRelatedWork W22517275 @default.
- W135040384 hasRelatedWork W3099202502 @default.
- W135040384 hasRelatedWork W88081813 @default.
- W135040384 isParatext "false" @default.
- W135040384 isRetracted "false" @default.
- W135040384 magId "135040384" @default.
- W135040384 workType "article" @default.
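
The entries above all follow one shape: a subject, a predicate, an object (either a bare identifier or a quoted literal), and a graph name after `@`. As a minimal sketch, assuming that shape holds for every line, the listing could be parsed and grouped by predicate as follows. The regular expression and the helper names `parse_line` and `group_by_predicate` are hypothetical and inferred from this page only; they are not part of any SemOpenAlex tooling.

```python
import re
from collections import defaultdict

# Assumed line shape, inferred from the listing above (not an official format):
#   - <subject> <predicate> <object> @<graph>.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*)\s+@(\S+?)\.$')


def parse_line(line):
    """Return (subject, predicate, object, graph), or None if the line does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    # Quoted literals lose their surrounding quotes; bare tokens are kept as identifiers.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return subject, predicate, obj, graph


def group_by_predicate(lines):
    """Collect object values per predicate, skipping lines that do not parse."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            _, predicate, obj, _ = parsed
            grouped[predicate].append(obj)
    return dict(grouped)


# A few lines copied from the listing above.
sample = [
    '- W135040384 title "Asynchrony modeling for audio-visual speech recognition" @default.',
    '- W135040384 cites W1572240262 @default.',
    '- W135040384 cites W1800365115 @default.',
    '- W135040384 citedByCount "21" @default.',
]
result = group_by_predicate(sample)
print(result["cites"])
```

Grouping by predicate keeps repeated properties such as `cites` or `hasRelatedWork` together as lists, which is usually the more convenient view of a record like this one. Header lines such as the "Showing items" line do not start with `- ` and are skipped.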