Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386076291> ?p ?o ?g. }
Showing items 1 to 97 of 97 with 100 items per page.
- W4386076291 abstract "How does audio describe the world around us? In this paper, we propose a method for generating an image of a scene from sound. Our method addresses the challenges of dealing with the large gaps that often exist between sight and sound. We design a model that works by scheduling the learning procedure of each model component to associate audio-visual modalities despite their information gaps. The key idea is to enrich the audio features with visual information by learning to align audio to visual latent space. We translate the input audio to visual features, then use a pre-trained generator to produce an image. To further improve the quality of our generated images, we use sound source localization to select the audio-visual pairs that have strong cross-modal correlations. We obtain substantially better results on the VEGAS and VGGSound datasets than prior approaches. We also show that we can control our model's predictions by applying simple manipulations to the input waveform, or to the latent space." @default.
- W4386076291 created "2023-08-23" @default.
- W4386076291 creator A5043008237 @default.
- W4386076291 creator A5070707693 @default.
- W4386076291 creator A5071880694 @default.
- W4386076291 creator A5078114111 @default.
- W4386076291 creator A5080036318 @default.
- W4386076291 date "2023-06-01" @default.
- W4386076291 modified "2023-10-06" @default.
- W4386076291 title "Sound to Visual Scene Generation by Audio-to-Visual Latent Alignment" @default.
- W4386076291 cites W2108598243 @default.
- W4386076291 cites W2183341477 @default.
- W4386076291 cites W2194775991 @default.
- W4386076291 cites W2619697695 @default.
- W4386076291 cites W2951611190 @default.
- W4386076291 cites W2962795401 @default.
- W4386076291 cites W2962969419 @default.
- W4386076291 cites W2963066677 @default.
- W4386076291 cites W2963184176 @default.
- W4386076291 cites W2963663420 @default.
- W4386076291 cites W2963680395 @default.
- W4386076291 cites W2963807156 @default.
- W4386076291 cites W2964345931 @default.
- W4386076291 cites W2965833116 @default.
- W4386076291 cites W2979157532 @default.
- W4386076291 cites W2982619606 @default.
- W4386076291 cites W2985068832 @default.
- W4386076291 cites W2988200020 @default.
- W4386076291 cites W2989980422 @default.
- W4386076291 cites W3015371781 @default.
- W4386076291 cites W3034368386 @default.
- W4386076291 cites W3110013267 @default.
- W4386076291 cites W3150091376 @default.
- W4386076291 cites W3162322471 @default.
- W4386076291 cites W3170088426 @default.
- W4386076291 cites W3176913662 @default.
- W4386076291 cites W3190580390 @default.
- W4386076291 cites W3204840685 @default.
- W4386076291 cites W4212847156 @default.
- W4386076291 cites W4214926101 @default.
- W4386076291 cites W4284898017 @default.
- W4386076291 cites W4319300082 @default.
- W4386076291 cites W4372340819 @default.
- W4386076291 doi "https://doi.org/10.1109/cvpr52729.2023.00622" @default.
- W4386076291 hasPublicationYear "2023" @default.
- W4386076291 type Work @default.
- W4386076291 citedByCount "1" @default.
- W4386076291 crossrefType "proceedings-article" @default.
- W4386076291 hasAuthorship W4386076291A5043008237 @default.
- W4386076291 hasAuthorship W4386076291A5070707693 @default.
- W4386076291 hasAuthorship W4386076291A5071880694 @default.
- W4386076291 hasAuthorship W4386076291A5078114111 @default.
- W4386076291 hasAuthorship W4386076291A5080036318 @default.
- W4386076291 hasConcept C127220857 @default.
- W4386076291 hasConcept C13895895 @default.
- W4386076291 hasConcept C154945302 @default.
- W4386076291 hasConcept C167310288 @default.
- W4386076291 hasConcept C185592680 @default.
- W4386076291 hasConcept C188027245 @default.
- W4386076291 hasConcept C28490314 @default.
- W4386076291 hasConcept C3017588708 @default.
- W4386076291 hasConcept C31972630 @default.
- W4386076291 hasConcept C36464697 @default.
- W4386076291 hasConcept C41008148 @default.
- W4386076291 hasConcept C49774154 @default.
- W4386076291 hasConcept C64922751 @default.
- W4386076291 hasConcept C71139939 @default.
- W4386076291 hasConceptScore W4386076291C127220857 @default.
- W4386076291 hasConceptScore W4386076291C13895895 @default.
- W4386076291 hasConceptScore W4386076291C154945302 @default.
- W4386076291 hasConceptScore W4386076291C167310288 @default.
- W4386076291 hasConceptScore W4386076291C185592680 @default.
- W4386076291 hasConceptScore W4386076291C188027245 @default.
- W4386076291 hasConceptScore W4386076291C28490314 @default.
- W4386076291 hasConceptScore W4386076291C3017588708 @default.
- W4386076291 hasConceptScore W4386076291C31972630 @default.
- W4386076291 hasConceptScore W4386076291C36464697 @default.
- W4386076291 hasConceptScore W4386076291C41008148 @default.
- W4386076291 hasConceptScore W4386076291C49774154 @default.
- W4386076291 hasConceptScore W4386076291C64922751 @default.
- W4386076291 hasConceptScore W4386076291C71139939 @default.
- W4386076291 hasLocation W43860762911 @default.
- W4386076291 hasOpenAccess W4386076291 @default.
- W4386076291 hasPrimaryLocation W43860762911 @default.
- W4386076291 hasRelatedWork W1891287906 @default.
- W4386076291 hasRelatedWork W2036807459 @default.
- W4386076291 hasRelatedWork W2114565900 @default.
- W4386076291 hasRelatedWork W2170815394 @default.
- W4386076291 hasRelatedWork W2775347418 @default.
- W4386076291 hasRelatedWork W2902072547 @default.
- W4386076291 hasRelatedWork W2977078804 @default.
- W4386076291 hasRelatedWork W3007022793 @default.
- W4386076291 hasRelatedWork W3019533076 @default.
- W4386076291 hasRelatedWork W41373443 @default.
- W4386076291 isParatext "false" @default.
- W4386076291 isRetracted "false" @default.
- W4386076291 workType "article" @default.