Matches in SemOpenAlex for { <https://semopenalex.org/work/W3128585087> ?p ?o ?g. }
- W3128585087 abstract "People can easily imagine the potential sound while seeing an event. This natural synchronization between audio and visual signals reveals their intrinsic correlations. To this end, we propose to learn the audio-visual correlations from the perspective of cross-modal generation in a self-supervised manner; the learned correlations can then be readily applied in multiple downstream tasks such as audio-visual cross-modal localization and retrieval. We introduce a novel Variational AutoEncoder (VAE) framework that consists of Multiple encoders and a Shared decoder (MS-VAE) with an additional Wasserstein distance constraint to tackle the problem. Extensive experiments demonstrate that the optimized latent representation of the proposed MS-VAE can effectively learn the audio-visual correlations and can be readily applied in multiple audio-visual downstream tasks to achieve competitive performance even without any given label information during training." @default.
- W3128585087 created "2021-02-15" @default.
- W3128585087 creator A5041766127 @default.
- W3128585087 creator A5045556326 @default.
- W3128585087 creator A5048411184 @default.
- W3128585087 creator A5072690470 @default.
- W3128585087 creator A5086734254 @default.
- W3128585087 date "2021-02-05" @default.
- W3128585087 modified "2023-09-23" @default.
- W3128585087 title "Learning Audio-Visual Correlations from Variational Cross-Modal Generation" @default.
- W3128585087 cites W1523385540 @default.
- W3128585087 cites W1959608418 @default.
- W3128585087 cites W2019106840 @default.
- W3128585087 cites W2593116425 @default.
- W3128585087 cites W2619697695 @default.
- W3128585087 cites W2739748921 @default.
- W3128585087 cites W2788962824 @default.
- W3128585087 cites W2859444450 @default.
- W3128585087 cites W2915434260 @default.
- W3128585087 cites W2931433835 @default.
- W3128585087 cites W2937949426 @default.
- W3128585087 cites W2962699416 @default.
- W3128585087 cites W2962756039 @default.
- W3128585087 cites W2962865004 @default.
- W3128585087 cites W2962960500 @default.
- W3128585087 cites W2963066677 @default.
- W3128585087 cites W2963115079 @default.
- W3128585087 cites W2963207848 @default.
- W3128585087 cites W2963218389 @default.
- W3128585087 cites W2963680395 @default.
- W3128585087 cites W2963746531 @default.
- W3128585087 cites W2964048159 @default.
- W3128585087 cites W2964109005 @default.
- W3128585087 cites W2964951956 @default.
- W3128585087 cites W2982619606 @default.
- W3128585087 cites W2990113535 @default.
- W3128585087 cites W3017343282 @default.
- W3128585087 cites W3021321555 @default.
- W3128585087 cites W3108240585 @default.
- W3128585087 doi "https://doi.org/10.48550/arxiv.2102.03424" @default.
- W3128585087 hasPublicationYear "2021" @default.
- W3128585087 type Work @default.
- W3128585087 sameAs 3128585087 @default.
- W3128585087 citedByCount "1" @default.
- W3128585087 countsByYear W31285850872021 @default.
- W3128585087 crossrefType "posted-content" @default.
- W3128585087 hasAuthorship W3128585087A5041766127 @default.
- W3128585087 hasAuthorship W3128585087A5045556326 @default.
- W3128585087 hasAuthorship W3128585087A5048411184 @default.
- W3128585087 hasAuthorship W3128585087A5072690470 @default.
- W3128585087 hasAuthorship W3128585087A5086734254 @default.
- W3128585087 hasBestOaLocation W31285850871 @default.
- W3128585087 hasConcept C101738243 @default.
- W3128585087 hasConcept C108583219 @default.
- W3128585087 hasConcept C111919701 @default.
- W3128585087 hasConcept C118505674 @default.
- W3128585087 hasConcept C12713177 @default.
- W3128585087 hasConcept C127162648 @default.
- W3128585087 hasConcept C13895895 @default.
- W3128585087 hasConcept C153180895 @default.
- W3128585087 hasConcept C154945302 @default.
- W3128585087 hasConcept C17744445 @default.
- W3128585087 hasConcept C185592680 @default.
- W3128585087 hasConcept C188027245 @default.
- W3128585087 hasConcept C199539241 @default.
- W3128585087 hasConcept C2524010 @default.
- W3128585087 hasConcept C2776036281 @default.
- W3128585087 hasConcept C2776359362 @default.
- W3128585087 hasConcept C2778562939 @default.
- W3128585087 hasConcept C28490314 @default.
- W3128585087 hasConcept C3017588708 @default.
- W3128585087 hasConcept C31258907 @default.
- W3128585087 hasConcept C33923547 @default.
- W3128585087 hasConcept C41008148 @default.
- W3128585087 hasConcept C49774154 @default.
- W3128585087 hasConcept C64922751 @default.
- W3128585087 hasConcept C71139939 @default.
- W3128585087 hasConcept C94625758 @default.
- W3128585087 hasConceptScore W3128585087C101738243 @default.
- W3128585087 hasConceptScore W3128585087C108583219 @default.
- W3128585087 hasConceptScore W3128585087C111919701 @default.
- W3128585087 hasConceptScore W3128585087C118505674 @default.
- W3128585087 hasConceptScore W3128585087C12713177 @default.
- W3128585087 hasConceptScore W3128585087C127162648 @default.
- W3128585087 hasConceptScore W3128585087C13895895 @default.
- W3128585087 hasConceptScore W3128585087C153180895 @default.
- W3128585087 hasConceptScore W3128585087C154945302 @default.
- W3128585087 hasConceptScore W3128585087C17744445 @default.
- W3128585087 hasConceptScore W3128585087C185592680 @default.
- W3128585087 hasConceptScore W3128585087C188027245 @default.
- W3128585087 hasConceptScore W3128585087C199539241 @default.
- W3128585087 hasConceptScore W3128585087C2524010 @default.
- W3128585087 hasConceptScore W3128585087C2776036281 @default.
- W3128585087 hasConceptScore W3128585087C2776359362 @default.
- W3128585087 hasConceptScore W3128585087C2778562939 @default.
- W3128585087 hasConceptScore W3128585087C28490314 @default.
- W3128585087 hasConceptScore W3128585087C3017588708 @default.
- W3128585087 hasConceptScore W3128585087C31258907 @default.
- W3128585087 hasConceptScore W3128585087C33923547 @default.
- W3128585087 hasConceptScore W3128585087C41008148 @default.