Matches in SemOpenAlex for { <https://semopenalex.org/work/W4310873011> ?p ?o ?g. }
- W4310873011 endingPage "151" @default.
- W4310873011 startingPage "137" @default.
- W4310873011 abstract "Pre-trained models are essential as feature extractors in modern machine learning systems in various domains. In this study, we hypothesize that representations effective for general audio tasks should provide multiple aspects of robust features of the input sound. For recognizing sounds regardless of perturbations such as varying pitch or timbre, features should be robust to these perturbations. For serving the diverse needs of tasks such as recognition of emotions or music genres, representations should provide multiple aspects of information, such as local and global features. To implement our principle, we propose a self-supervised learning method: Bootstrap Your Own Latent (BYOL) for Audio (BYOL-A, pronounced “viola”). BYOL-A pre-trains representations of the input sound invariant to audio data augmentations, which makes the learned representations robust to the perturbations of sounds. Meanwhile, the BYOL-A encoder combines local and global features and calculates their statistics so that the representation provides multi-aspect information. As a result, the learned representations should provide robust and multi-aspect information to serve various needs of diverse tasks. We evaluated the general audio task performance of BYOL-A compared to previous state-of-the-art methods, and BYOL-A demonstrated generalizability with the best average result of 72.4% and the best VoxCeleb1 result of 57.6%. Extensive ablation experiments revealed that the BYOL-A encoder architecture accounts for most of the performance, and the remaining critical portion is attributable to the BYOL framework and BYOL-A augmentations. Our code is available online for future studies." @default.
- W4310873011 created "2022-12-19" @default.
- W4310873011 creator A5049600455 @default.
- W4310873011 creator A5054467679 @default.
- W4310873011 creator A5061465935 @default.
- W4310873011 creator A5062509967 @default.
- W4310873011 creator A5091219538 @default.
- W4310873011 date "2023-01-01" @default.
- W4310873011 modified "2023-10-01" @default.
- W4310873011 title "BYOL for Audio: Exploring Pre-Trained General-Purpose Audio Representations" @default.
- W4310873011 cites W1494198834 @default.
- W4310873011 cites W2030931454 @default.
- W4310873011 cites W2033875152 @default.
- W4310873011 cites W2038484192 @default.
- W4310873011 cites W2052666245 @default.
- W4310873011 cites W2133824856 @default.
- W4310873011 cites W2194775991 @default.
- W4310873011 cites W2549139847 @default.
- W4310873011 cites W2593116425 @default.
- W4310873011 cites W2619697695 @default.
- W4310873011 cites W2939574508 @default.
- W4310873011 cites W2949676527 @default.
- W4310873011 cites W2962904371 @default.
- W4310873011 cites W2963031676 @default.
- W4310873011 cites W2963361348 @default.
- W4310873011 cites W2969940675 @default.
- W4310873011 cites W2973109987 @default.
- W4310873011 cites W2982223350 @default.
- W4310873011 cites W3006926732 @default.
- W4310873011 cites W3015213852 @default.
- W4310873011 cites W3035524453 @default.
- W4310873011 cites W3041671906 @default.
- W4310873011 cites W3082319384 @default.
- W4310873011 cites W3094550259 @default.
- W4310873011 cites W3097791920 @default.
- W4310873011 cites W3137857706 @default.
- W4310873011 cites W3159239022 @default.
- W4310873011 cites W3162391496 @default.
- W4310873011 cites W3165587897 @default.
- W4310873011 cites W3176445421 @default.
- W4310873011 cites W3196818043 @default.
- W4310873011 cites W3196974791 @default.
- W4310873011 cites W3198452188 @default.
- W4310873011 cites W3198882010 @default.
- W4310873011 cites W3199246634 @default.
- W4310873011 cites W3201065055 @default.
- W4310873011 cites W3201143670 @default.
- W4310873011 cites W3204696009 @default.
- W4310873011 cites W3205743929 @default.
- W4310873011 cites W3209059054 @default.
- W4310873011 cites W3209984917 @default.
- W4310873011 cites W4225713393 @default.
- W4310873011 cites W4252814261 @default.
- W4310873011 cites W4284898017 @default.
- W4310873011 doi "https://doi.org/10.1109/taslp.2022.3221007" @default.
- W4310873011 hasPublicationYear "2023" @default.
- W4310873011 type Work @default.
- W4310873011 citedByCount "3" @default.
- W4310873011 countsByYear W43108730112023 @default.
- W4310873011 crossrefType "journal-article" @default.
- W4310873011 hasAuthorship W4310873011A5049600455 @default.
- W4310873011 hasAuthorship W4310873011A5054467679 @default.
- W4310873011 hasAuthorship W4310873011A5061465935 @default.
- W4310873011 hasAuthorship W4310873011A5062509967 @default.
- W4310873011 hasAuthorship W4310873011A5091219538 @default.
- W4310873011 hasBestOaLocation W43108730111 @default.
- W4310873011 hasConcept C105795698 @default.
- W4310873011 hasConcept C119857082 @default.
- W4310873011 hasConcept C142362112 @default.
- W4310873011 hasConcept C153349607 @default.
- W4310873011 hasConcept C154945302 @default.
- W4310873011 hasConcept C17744445 @default.
- W4310873011 hasConcept C199539241 @default.
- W4310873011 hasConcept C27158222 @default.
- W4310873011 hasConcept C2776359362 @default.
- W4310873011 hasConcept C2776539107 @default.
- W4310873011 hasConcept C28490314 @default.
- W4310873011 hasConcept C33923547 @default.
- W4310873011 hasConcept C41008148 @default.
- W4310873011 hasConcept C558565934 @default.
- W4310873011 hasConcept C59404180 @default.
- W4310873011 hasConcept C94625758 @default.
- W4310873011 hasConceptScore W4310873011C105795698 @default.
- W4310873011 hasConceptScore W4310873011C119857082 @default.
- W4310873011 hasConceptScore W4310873011C142362112 @default.
- W4310873011 hasConceptScore W4310873011C153349607 @default.
- W4310873011 hasConceptScore W4310873011C154945302 @default.
- W4310873011 hasConceptScore W4310873011C17744445 @default.
- W4310873011 hasConceptScore W4310873011C199539241 @default.
- W4310873011 hasConceptScore W4310873011C27158222 @default.
- W4310873011 hasConceptScore W4310873011C2776359362 @default.
- W4310873011 hasConceptScore W4310873011C2776539107 @default.
- W4310873011 hasConceptScore W4310873011C28490314 @default.
- W4310873011 hasConceptScore W4310873011C33923547 @default.
- W4310873011 hasConceptScore W4310873011C41008148 @default.
- W4310873011 hasConceptScore W4310873011C558565934 @default.
- W4310873011 hasConceptScore W4310873011C59404180 @default.
- W4310873011 hasConceptScore W4310873011C94625758 @default.