Matches in SemOpenAlex for { <https://semopenalex.org/work/W2996889020> ?p ?o ?g. }
- W2996889020 endingPage "428" @default.
- W2996889020 startingPage "416" @default.
- W2996889020 abstract "Audio-visual (AV) representation learning is an important task from the perspective of designing machines with the ability to understand complex events. To this end, we propose a novel multimodal framework that instantiates multiple instance learning. Specifically, we develop methods that identify events and localize corresponding AV cues in unconstrained videos. Importantly, this is done using weak labels where only video-level event labels are known without any information about their location in time. We show that the learnt representations are useful for performing several tasks such as event/object classification, audio event detection, audio source separation and visual object localization. An important feature of our method is its capacity to learn from unsynchronized audio-visual events. We also demonstrate our framework's ability to separate out the audio source of interest through a novel use of nonnegative matrix factorization. State-of-the-art classification results, with an F1-score of 65.0, are achieved on DCASE 2017 smart cars challenge data with promising generalization to diverse object types such as musical instruments. Visualizations of localized visual regions and audio segments substantiate our system's efficacy, especially when dealing with noisy situations where modality-specific cues appear asynchronously." @default.
- W2996889020 created "2020-01-10" @default.
- W2996889020 creator A5001145583 @default.
- W2996889020 creator A5041584595 @default.
- W2996889020 creator A5055423112 @default.
- W2996889020 creator A5060031161 @default.
- W2996889020 creator A5073788938 @default.
- W2996889020 creator A5076170578 @default.
- W2996889020 date "2020-01-01" @default.
- W2996889020 modified "2023-10-17" @default.
- W2996889020 title "Weakly Supervised Representation Learning for Audio-Visual Scene Analysis" @default.
- W2996889020 cites W1511170870 @default.
- W2996889020 cites W1536680647 @default.
- W2996889020 cites W1559046793 @default.
- W2996889020 cites W1777628566 @default.
- W2996889020 cites W1952794764 @default.
- W2996889020 cites W1994488211 @default.
- W2996889020 cites W2015433306 @default.
- W2996889020 cites W2036931824 @default.
- W2996889020 cites W2039844283 @default.
- W2996889020 cites W2065274193 @default.
- W2996889020 cites W2086384421 @default.
- W2996889020 cites W2088049833 @default.
- W2996889020 cites W2102605133 @default.
- W2996889020 cites W2104446196 @default.
- W2996889020 cites W2105582566 @default.
- W2996889020 cites W2108598243 @default.
- W2996889020 cites W2109255472 @default.
- W2996889020 cites W2110119381 @default.
- W2996889020 cites W2110226160 @default.
- W2996889020 cites W2115447976 @default.
- W2996889020 cites W2127851351 @default.
- W2996889020 cites W2133324800 @default.
- W2996889020 cites W2141355815 @default.
- W2996889020 cites W2152617463 @default.
- W2996889020 cites W2175354415 @default.
- W2996889020 cites W2295107390 @default.
- W2996889020 cites W2306289963 @default.
- W2996889020 cites W2354870669 @default.
- W2996889020 cites W2464894339 @default.
- W2996889020 cites W2511428026 @default.
- W2996889020 cites W2519284461 @default.
- W2996889020 cites W2526050071 @default.
- W2996889020 cites W2593116425 @default.
- W2996889020 cites W2618530766 @default.
- W2996889020 cites W2632052911 @default.
- W2996889020 cites W2783473931 @default.
- W2996889020 cites W2939574508 @default.
- W2996889020 cites W2963099423 @default.
- W2996889020 cites W2963115079 @default.
- W2996889020 cites W2963150697 @default.
- W2996889020 cites W2963603913 @default.
- W2996889020 cites W2963610932 @default.
- W2996889020 cites W2964345931 @default.
- W2996889020 cites W3123940584 @default.
- W2996889020 cites W4245923654 @default.
- W2996889020 cites W4289665794 @default.
- W2996889020 cites W639708223 @default.
- W2996889020 cites W874179280 @default.
- W2996889020 doi "https://doi.org/10.1109/taslp.2019.2957889" @default.
- W2996889020 hasPublicationYear "2020" @default.
- W2996889020 type Work @default.
- W2996889020 sameAs 2996889020 @default.
- W2996889020 citedByCount "15" @default.
- W2996889020 countsByYear W29968890202012 @default.
- W2996889020 countsByYear W29968890202020 @default.
- W2996889020 countsByYear W29968890202021 @default.
- W2996889020 countsByYear W29968890202022 @default.
- W2996889020 crossrefType "journal-article" @default.
- W2996889020 hasAuthorship W2996889020A5001145583 @default.
- W2996889020 hasAuthorship W2996889020A5041584595 @default.
- W2996889020 hasAuthorship W2996889020A5055423112 @default.
- W2996889020 hasAuthorship W2996889020A5060031161 @default.
- W2996889020 hasAuthorship W2996889020A5073788938 @default.
- W2996889020 hasAuthorship W2996889020A5076170578 @default.
- W2996889020 hasBestOaLocation W29968890202 @default.
- W2996889020 hasConcept C121332964 @default.
- W2996889020 hasConcept C12713177 @default.
- W2996889020 hasConcept C134306372 @default.
- W2996889020 hasConcept C138885662 @default.
- W2996889020 hasConcept C153180895 @default.
- W2996889020 hasConcept C154945302 @default.
- W2996889020 hasConcept C177148314 @default.
- W2996889020 hasConcept C17744445 @default.
- W2996889020 hasConcept C199539241 @default.
- W2996889020 hasConcept C2776359362 @default.
- W2996889020 hasConcept C2776401178 @default.
- W2996889020 hasConcept C2779662365 @default.
- W2996889020 hasConcept C2781238097 @default.
- W2996889020 hasConcept C28490314 @default.
- W2996889020 hasConcept C3017588708 @default.
- W2996889020 hasConcept C33923547 @default.
- W2996889020 hasConcept C41008148 @default.
- W2996889020 hasConcept C41895202 @default.
- W2996889020 hasConcept C49774154 @default.
- W2996889020 hasConcept C59404180 @default.
- W2996889020 hasConcept C62520636 @default.
- W2996889020 hasConcept C94625758 @default.