Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386908258> ?p ?o ?g. }
- W4386908258 abstract "Audio-visual representation learning aims to develop systems with human-like perception by utilizing correlation between auditory and visual information. However, current models often focus on a limited set of tasks, and generalization abilities of learned representations are unclear. To this end, we propose the AV-SUPERB benchmark that enables general-purpose evaluation of unimodal audio/visual and bimodal fusion representations on 7 datasets covering 5 audio-visual tasks in speech and audio processing. We evaluate 5 recent self-supervised models and show that none of these models generalize to all tasks, emphasizing the need for future study on improving universal model performance. In addition, we show that representations may be improved with intermediate-task fine-tuning and audio event classification with AudioSet serves as a strong intermediate task. We release our benchmark with evaluation code and a model submission platform to encourage further research in audio-visual learning." @default.
- W4386908258 created "2023-09-21" @default.
- W4386908258 creator A5004717608 @default.
- W4386908258 creator A5011276883 @default.
- W4386908258 creator A5018543342 @default.
- W4386908258 creator A5027396181 @default.
- W4386908258 creator A5029566548 @default.
- W4386908258 creator A5040508737 @default.
- W4386908258 creator A5042186348 @default.
- W4386908258 creator A5044008055 @default.
- W4386908258 creator A5044999870 @default.
- W4386908258 creator A5046324698 @default.
- W4386908258 creator A5053466746 @default.
- W4386908258 creator A5063149046 @default.
- W4386908258 creator A5066850676 @default.
- W4386908258 creator A5069644229 @default.
- W4386908258 creator A5075735963 @default.
- W4386908258 creator A5082327841 @default.
- W4386908258 creator A5082418737 @default.
- W4386908258 creator A5083886640 @default.
- W4386908258 creator A5084236961 @default.
- W4386908258 date "2023-09-19" @default.
- W4386908258 modified "2023-09-28" @default.
- W4386908258 title "AV-SUPERB: A Multi-Task Evaluation Benchmark for Audio-Visual Representation Models" @default.
- W4386908258 doi "https://doi.org/10.48550/arxiv.2309.10787" @default.
- W4386908258 hasPublicationYear "2023" @default.
- W4386908258 type Work @default.
- W4386908258 citedByCount "0" @default.
- W4386908258 crossrefType "posted-content" @default.
- W4386908258 hasAuthorship W4386908258A5004717608 @default.
- W4386908258 hasAuthorship W4386908258A5011276883 @default.
- W4386908258 hasAuthorship W4386908258A5018543342 @default.
- W4386908258 hasAuthorship W4386908258A5027396181 @default.
- W4386908258 hasAuthorship W4386908258A5029566548 @default.
- W4386908258 hasAuthorship W4386908258A5040508737 @default.
- W4386908258 hasAuthorship W4386908258A5042186348 @default.
- W4386908258 hasAuthorship W4386908258A5044008055 @default.
- W4386908258 hasAuthorship W4386908258A5044999870 @default.
- W4386908258 hasAuthorship W4386908258A5046324698 @default.
- W4386908258 hasAuthorship W4386908258A5053466746 @default.
- W4386908258 hasAuthorship W4386908258A5063149046 @default.
- W4386908258 hasAuthorship W4386908258A5066850676 @default.
- W4386908258 hasAuthorship W4386908258A5069644229 @default.
- W4386908258 hasAuthorship W4386908258A5075735963 @default.
- W4386908258 hasAuthorship W4386908258A5082327841 @default.
- W4386908258 hasAuthorship W4386908258A5082418737 @default.
- W4386908258 hasAuthorship W4386908258A5083886640 @default.
- W4386908258 hasAuthorship W4386908258A5084236961 @default.
- W4386908258 hasBestOaLocation W43869082581 @default.
- W4386908258 hasConcept C119857082 @default.
- W4386908258 hasConcept C13280743 @default.
- W4386908258 hasConcept C134306372 @default.
- W4386908258 hasConcept C154945302 @default.
- W4386908258 hasConcept C15744967 @default.
- W4386908258 hasConcept C162324750 @default.
- W4386908258 hasConcept C169760540 @default.
- W4386908258 hasConcept C177148314 @default.
- W4386908258 hasConcept C177264268 @default.
- W4386908258 hasConcept C17744445 @default.
- W4386908258 hasConcept C185798385 @default.
- W4386908258 hasConcept C187736073 @default.
- W4386908258 hasConcept C199360897 @default.
- W4386908258 hasConcept C199539241 @default.
- W4386908258 hasConcept C205649164 @default.
- W4386908258 hasConcept C26760741 @default.
- W4386908258 hasConcept C2776359362 @default.
- W4386908258 hasConcept C2776760102 @default.
- W4386908258 hasConcept C2780451532 @default.
- W4386908258 hasConcept C28490314 @default.
- W4386908258 hasConcept C3017588708 @default.
- W4386908258 hasConcept C33923547 @default.
- W4386908258 hasConcept C41008148 @default.
- W4386908258 hasConcept C49774154 @default.
- W4386908258 hasConcept C94625758 @default.
- W4386908258 hasConceptScore W4386908258C119857082 @default.
- W4386908258 hasConceptScore W4386908258C13280743 @default.
- W4386908258 hasConceptScore W4386908258C134306372 @default.
- W4386908258 hasConceptScore W4386908258C154945302 @default.
- W4386908258 hasConceptScore W4386908258C15744967 @default.
- W4386908258 hasConceptScore W4386908258C162324750 @default.
- W4386908258 hasConceptScore W4386908258C169760540 @default.
- W4386908258 hasConceptScore W4386908258C177148314 @default.
- W4386908258 hasConceptScore W4386908258C177264268 @default.
- W4386908258 hasConceptScore W4386908258C17744445 @default.
- W4386908258 hasConceptScore W4386908258C185798385 @default.
- W4386908258 hasConceptScore W4386908258C187736073 @default.
- W4386908258 hasConceptScore W4386908258C199360897 @default.
- W4386908258 hasConceptScore W4386908258C199539241 @default.
- W4386908258 hasConceptScore W4386908258C205649164 @default.
- W4386908258 hasConceptScore W4386908258C26760741 @default.
- W4386908258 hasConceptScore W4386908258C2776359362 @default.
- W4386908258 hasConceptScore W4386908258C2776760102 @default.
- W4386908258 hasConceptScore W4386908258C2780451532 @default.
- W4386908258 hasConceptScore W4386908258C28490314 @default.
- W4386908258 hasConceptScore W4386908258C3017588708 @default.
- W4386908258 hasConceptScore W4386908258C33923547 @default.
- W4386908258 hasConceptScore W4386908258C41008148 @default.
- W4386908258 hasConceptScore W4386908258C49774154 @default.
- W4386908258 hasConceptScore W4386908258C94625758 @default.
- W4386908258 hasLocation W43869082581 @default.
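
The listing above pairs each predicate (`?p`) with an object (`?o`) and a graph (`?g`) for a single work. As a minimal sketch, assuming each line follows the `- <subject> <predicate> <object> @<graph>.` layout shown here, the lines could be grouped by predicate as follows. The helper name `parse_listing` and the regular expression are illustrative assumptions, not part of SemOpenAlex or any library.

```python
import re
from collections import defaultdict

# Assumed line layout (taken from the listing, not from any official spec):
#   - <subject> <predicate> <object> @<graph>.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+?)\.?$')


def parse_listing(lines):
    """Group objects by predicate; hypothetical helper for this listing format."""
    grouped = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            # Skip lines that do not look like a match entry (e.g. the header).
            continue
        _subject, predicate, obj, _graph = match.groups()
        # Literal values are quoted in the listing; drop the surrounding quotes.
        grouped[predicate].append(obj.strip('"'))
    return dict(grouped)


sample = [
    'Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386908258> ?p ?o ?g. }',
    '- W4386908258 created "2023-09-21" @default.',
    '- W4386908258 creator A5004717608 @default.',
    '- W4386908258 creator A5011276883 @default.',
    '- W4386908258 type Work @default.',
]
result = parse_listing(sample)
```

Grouping by predicate makes repeated properties such as `creator` or `hasConcept` easy to count or iterate over, while single-valued properties such as `created` end up as one-element lists.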