Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387582860> ?p ?o ?g. }
Showing items 1 to 55 of 55 with 100 items per page.
- W4387582860 abstract "In this paper, we propose a novel method for speaker adaptation in lip reading, motivated by two observations. Firstly, a speaker's own characteristics can always be portrayed well by his/her few facial images or even a single image with shallow networks, while the fine-grained dynamic features associated with speech content expressed by the talking face always need deep sequential networks to represent accurately. Therefore, we treat the shallow and deep layers differently for speaker adaptive lip reading. Secondly, we observe that a speaker's unique characteristics (e.g., prominent oral cavity and mandible) have varied effects on lip reading performance for different words and pronunciations, necessitating adaptive enhancement or suppression of the features for robust lip reading. Based on these two observations, we propose to take advantage of the speaker's own characteristics to automatically learn separable hidden unit contributions with different targets for shallow layers and deep layers respectively. For shallow layers where features related to the speaker's characteristics are stronger than the speech content related features, we introduce speaker-adaptive features to learn for enhancing the speech content features. For deep layers where both the speaker's features and the speech content features are all expressed well, we introduce the speaker-adaptive features to learn for suppressing the speech content irrelevant noise for robust lip reading. Our approach consistently outperforms existing methods, as confirmed by comprehensive analysis and comparison across different settings. Besides the evaluation on the popular LRW-ID and GRID datasets, we also release a new dataset for evaluation, CAS-VSR-S68h, to further assess the performance in an extreme setting where just a few speakers are available but the speech content covers a large and diversified range." @default.
- W4387582860 created "2023-10-13" @default.
- W4387582860 creator A5009171844 @default.
- W4387582860 creator A5012581730 @default.
- W4387582860 creator A5020086523 @default.
- W4387582860 creator A5083420537 @default.
- W4387582860 date "2023-10-08" @default.
- W4387582860 modified "2023-10-14" @default.
- W4387582860 title "Learning Separable Hidden Unit Contributions for Speaker-Adaptive Lip-Reading" @default.
- W4387582860 doi "https://doi.org/10.48550/arxiv.2310.05058" @default.
- W4387582860 hasPublicationYear "2023" @default.
- W4387582860 type Work @default.
- W4387582860 citedByCount "0" @default.
- W4387582860 crossrefType "posted-content" @default.
- W4387582860 hasAuthorship W4387582860A5009171844 @default.
- W4387582860 hasAuthorship W4387582860A5012581730 @default.
- W4387582860 hasAuthorship W4387582860A5020086523 @default.
- W4387582860 hasAuthorship W4387582860A5083420537 @default.
- W4387582860 hasBestOaLocation W43875828601 @default.
- W4387582860 hasConcept C133892786 @default.
- W4387582860 hasConcept C138885662 @default.
- W4387582860 hasConcept C149838564 @default.
- W4387582860 hasConcept C153180895 @default.
- W4387582860 hasConcept C154945302 @default.
- W4387582860 hasConcept C2779304628 @default.
- W4387582860 hasConcept C28490314 @default.
- W4387582860 hasConcept C41008148 @default.
- W4387582860 hasConcept C41895202 @default.
- W4387582860 hasConcept C554936623 @default.
- W4387582860 hasConceptScore W4387582860C133892786 @default.
- W4387582860 hasConceptScore W4387582860C138885662 @default.
- W4387582860 hasConceptScore W4387582860C149838564 @default.
- W4387582860 hasConceptScore W4387582860C153180895 @default.
- W4387582860 hasConceptScore W4387582860C154945302 @default.
- W4387582860 hasConceptScore W4387582860C2779304628 @default.
- W4387582860 hasConceptScore W4387582860C28490314 @default.
- W4387582860 hasConceptScore W4387582860C41008148 @default.
- W4387582860 hasConceptScore W4387582860C41895202 @default.
- W4387582860 hasConceptScore W4387582860C554936623 @default.
- W4387582860 hasLocation W43875828601 @default.
- W4387582860 hasOpenAccess W4387582860 @default.
- W4387582860 hasPrimaryLocation W43875828601 @default.
- W4387582860 hasRelatedWork W1497807607 @default.
- W4387582860 hasRelatedWork W1509309911 @default.
- W4387582860 hasRelatedWork W1521049138 @default.
- W4387582860 hasRelatedWork W1813780412 @default.
- W4387582860 hasRelatedWork W1940231550 @default.
- W4387582860 hasRelatedWork W2118860825 @default.
- W4387582860 hasRelatedWork W2128773298 @default.
- W4387582860 hasRelatedWork W2144208207 @default.
- W4387582860 hasRelatedWork W2160753975 @default.
- W4387582860 hasRelatedWork W2499802997 @default.
- W4387582860 isParatext "false" @default.
- W4387582860 isRetracted "false" @default.
- W4387582860 workType "article" @default.