Matches in SemOpenAlex for { <https://semopenalex.org/work/W4328114908> ?p ?o ?g. }
- W4328114908 abstract "As one of the most effective methods to improve the accuracy and robustness of speech tasks, the audio–visual fusion approach has recently been introduced into the field of Keyword Spotting (KWS). However, existing audio–visual keyword spotting models are limited to detecting isolated words, while keyword spotting for unconstrained speech is still a challenging problem. To this end, an Audio–Visual Keyword Transformer (AVKT) network is proposed to spot keywords in unconstrained video clips. The authors present a transformer classifier with learnable CLS tokens to extract distinctive keyword features from the variable-length audio and visual inputs. The outputs of audio and visual branches are combined in a decision fusion module. As humans can easily notice whether a keyword appears in a sentence or not, our AVKT network can detect whether a video clip with a spoken sentence contains a pre-specified keyword. Moreover, the position of the keyword is localised in the attention map without additional position labels. Experimental results on the LRS2-KWS dataset and our newly collected PKU-KWS dataset show that the accuracy of AVKT exceeded 99% in clean scenes and 85% in extremely noisy conditions. The code is available at https://github.com/jialeren/AVKT." @default.
- W4328114908 created "2023-03-22" @default.
- W4328114908 creator A5040489711 @default.
- W4328114908 creator A5051448705 @default.
- W4328114908 creator A5057287134 @default.
- W4328114908 creator A5071411382 @default.
- W4328114908 creator A5072379029 @default.
- W4328114908 creator A5079738340 @default.
- W4328114908 date "2023-03-20" @default.
- W4328114908 modified "2023-09-28" @default.
- W4328114908 title "Audio–visual keyword transformer for unconstrained sentence‐level keyword spotting" @default.
- W4328114908 cites W1503933356 @default.
- W4328114908 cites W1978380426 @default.
- W4328114908 cites W2015394094 @default.
- W4328114908 cites W2034940213 @default.
- W4328114908 cites W2035777533 @default.
- W4328114908 cites W2040818685 @default.
- W4328114908 cites W2052274902 @default.
- W4328114908 cites W2056986588 @default.
- W4328114908 cites W2101346879 @default.
- W4328114908 cites W2121486117 @default.
- W4328114908 cites W2122797512 @default.
- W4328114908 cites W2161301211 @default.
- W4328114908 cites W2285716245 @default.
- W4328114908 cites W2783089003 @default.
- W4328114908 cites W2799804273 @default.
- W4328114908 cites W2890952074 @default.
- W4328114908 cites W2910275879 @default.
- W4328114908 cites W2953219395 @default.
- W4328114908 cites W3046052470 @default.
- W4328114908 cites W3138516171 @default.
- W4328114908 cites W3160207687 @default.
- W4328114908 cites W3183430956 @default.
- W4328114908 cites W3198035615 @default.
- W4328114908 cites W3199527474 @default.
- W4328114908 cites W3211278025 @default.
- W4328114908 cites W4226135414 @default.
- W4328114908 cites W4284966478 @default.
- W4328114908 cites W4297841641 @default.
- W4328114908 doi "https://doi.org/10.1049/cit2.12212" @default.
- W4328114908 hasPublicationYear "2023" @default.
- W4328114908 type Work @default.
- W4328114908 citedByCount "0" @default.
- W4328114908 crossrefType "journal-article" @default.
- W4328114908 hasAuthorship W4328114908A5040489711 @default.
- W4328114908 hasAuthorship W4328114908A5051448705 @default.
- W4328114908 hasAuthorship W4328114908A5057287134 @default.
- W4328114908 hasAuthorship W4328114908A5071411382 @default.
- W4328114908 hasAuthorship W4328114908A5072379029 @default.
- W4328114908 hasAuthorship W4328114908A5079738340 @default.
- W4328114908 hasBestOaLocation W43281149081 @default.
- W4328114908 hasConcept C104317684 @default.
- W4328114908 hasConcept C121332964 @default.
- W4328114908 hasConcept C154945302 @default.
- W4328114908 hasConcept C165801399 @default.
- W4328114908 hasConcept C185592680 @default.
- W4328114908 hasConcept C204321447 @default.
- W4328114908 hasConcept C2777530160 @default.
- W4328114908 hasConcept C2781213101 @default.
- W4328114908 hasConcept C28490314 @default.
- W4328114908 hasConcept C3017588708 @default.
- W4328114908 hasConcept C41008148 @default.
- W4328114908 hasConcept C49774154 @default.
- W4328114908 hasConcept C55493867 @default.
- W4328114908 hasConcept C62520636 @default.
- W4328114908 hasConcept C63479239 @default.
- W4328114908 hasConcept C66322947 @default.
- W4328114908 hasConceptScore W4328114908C104317684 @default.
- W4328114908 hasConceptScore W4328114908C121332964 @default.
- W4328114908 hasConceptScore W4328114908C154945302 @default.
- W4328114908 hasConceptScore W4328114908C165801399 @default.
- W4328114908 hasConceptScore W4328114908C185592680 @default.
- W4328114908 hasConceptScore W4328114908C204321447 @default.
- W4328114908 hasConceptScore W4328114908C2777530160 @default.
- W4328114908 hasConceptScore W4328114908C2781213101 @default.
- W4328114908 hasConceptScore W4328114908C28490314 @default.
- W4328114908 hasConceptScore W4328114908C3017588708 @default.
- W4328114908 hasConceptScore W4328114908C41008148 @default.
- W4328114908 hasConceptScore W4328114908C49774154 @default.
- W4328114908 hasConceptScore W4328114908C55493867 @default.
- W4328114908 hasConceptScore W4328114908C62520636 @default.
- W4328114908 hasConceptScore W4328114908C63479239 @default.
- W4328114908 hasConceptScore W4328114908C66322947 @default.
- W4328114908 hasFunder F4320321001 @default.
- W4328114908 hasLocation W43281149081 @default.
- W4328114908 hasOpenAccess W4328114908 @default.
- W4328114908 hasPrimaryLocation W43281149081 @default.
- W4328114908 hasRelatedWork W1517743118 @default.
- W4328114908 hasRelatedWork W1567338489 @default.
- W4328114908 hasRelatedWork W159132833 @default.
- W4328114908 hasRelatedWork W1978971213 @default.
- W4328114908 hasRelatedWork W1987454298 @default.
- W4328114908 hasRelatedWork W2155289555 @default.
- W4328114908 hasRelatedWork W3022217344 @default.
- W4328114908 hasRelatedWork W38394648 @default.
- W4328114908 hasRelatedWork W4287888637 @default.
- W4328114908 hasRelatedWork W4318978824 @default.
- W4328114908 isParatext "false" @default.
- W4328114908 isRetracted "false" @default.
- W4328114908 workType "article" @default.