Matches in SemOpenAlex for { <https://semopenalex.org/work/W4367000246> ?p ?o ?g. }
Showing items 1 to 63 of 63 with 100 items per page.
- W4367000246 abstract "Singing voice transcription converts recorded singing audio to musical notation. Sound contamination (such as accompaniment) and lack of annotated data make singing voice transcription an extremely difficult task. We take two approaches to tackle the above challenges: 1) introducing multimodal learning for singing voice transcription together with a new multimodal singing dataset, N20EMv2, enhancing noise robustness by utilizing video information (lip movements to predict the onset/offset of notes), and 2) adapting self-supervised learning models from the speech domain to the singing voice transcription task, significantly reducing annotated data requirements while preserving pretrained features. We build a self-supervised learning based audio-only singing voice transcription system, which not only outperforms current state-of-the-art technologies as a strong baseline, but also generalizes well to out-of-domain singing data. We then develop a self-supervised learning based video-only singing voice transcription system that detects note onsets and offsets with an accuracy of about 80%. Finally, based on the powerful acoustic and visual representations extracted by the above two systems as well as the feature fusion design, we create an audio-visual singing voice transcription system that improves the noise robustness significantly under different acoustic environments compared to the audio-only systems." @default.
- W4367000246 created "2023-04-27" @default.
- W4367000246 creator A5003653417 @default.
- W4367000246 creator A5029934565 @default.
- W4367000246 creator A5063943033 @default.
- W4367000246 creator A5065592637 @default.
- W4367000246 creator A5090291830 @default.
- W4367000246 date "2023-04-24" @default.
- W4367000246 modified "2023-10-18" @default.
- W4367000246 title "Deep Audio-Visual Singing Voice Transcription based on Self-Supervised Learning Models" @default.
- W4367000246 doi "https://doi.org/10.48550/arxiv.2304.12082" @default.
- W4367000246 hasPublicationYear "2023" @default.
- W4367000246 type Work @default.
- W4367000246 citedByCount "0" @default.
- W4367000246 crossrefType "posted-content" @default.
- W4367000246 hasAuthorship W4367000246A5003653417 @default.
- W4367000246 hasAuthorship W4367000246A5029934565 @default.
- W4367000246 hasAuthorship W4367000246A5063943033 @default.
- W4367000246 hasAuthorship W4367000246A5065592637 @default.
- W4367000246 hasAuthorship W4367000246A5090291830 @default.
- W4367000246 hasBestOaLocation W43670002461 @default.
- W4367000246 hasConcept C104317684 @default.
- W4367000246 hasConcept C121332964 @default.
- W4367000246 hasConcept C138885662 @default.
- W4367000246 hasConcept C154945302 @default.
- W4367000246 hasConcept C179926584 @default.
- W4367000246 hasConcept C185592680 @default.
- W4367000246 hasConcept C24890656 @default.
- W4367000246 hasConcept C28490314 @default.
- W4367000246 hasConcept C41008148 @default.
- W4367000246 hasConcept C41895202 @default.
- W4367000246 hasConcept C44819458 @default.
- W4367000246 hasConcept C55493867 @default.
- W4367000246 hasConcept C63479239 @default.
- W4367000246 hasConceptScore W4367000246C104317684 @default.
- W4367000246 hasConceptScore W4367000246C121332964 @default.
- W4367000246 hasConceptScore W4367000246C138885662 @default.
- W4367000246 hasConceptScore W4367000246C154945302 @default.
- W4367000246 hasConceptScore W4367000246C179926584 @default.
- W4367000246 hasConceptScore W4367000246C185592680 @default.
- W4367000246 hasConceptScore W4367000246C24890656 @default.
- W4367000246 hasConceptScore W4367000246C28490314 @default.
- W4367000246 hasConceptScore W4367000246C41008148 @default.
- W4367000246 hasConceptScore W4367000246C41895202 @default.
- W4367000246 hasConceptScore W4367000246C44819458 @default.
- W4367000246 hasConceptScore W4367000246C55493867 @default.
- W4367000246 hasConceptScore W4367000246C63479239 @default.
- W4367000246 hasLocation W43670002461 @default.
- W4367000246 hasOpenAccess W4367000246 @default.
- W4367000246 hasPrimaryLocation W43670002461 @default.
- W4367000246 hasRelatedWork W1508603446 @default.
- W4367000246 hasRelatedWork W1528909574 @default.
- W4367000246 hasRelatedWork W178932670 @default.
- W4367000246 hasRelatedWork W2034995162 @default.
- W4367000246 hasRelatedWork W218087806 @default.
- W4367000246 hasRelatedWork W2189603775 @default.
- W4367000246 hasRelatedWork W2577202280 @default.
- W4367000246 hasRelatedWork W2676493621 @default.
- W4367000246 hasRelatedWork W3132121394 @default.
- W4367000246 hasRelatedWork W3207249698 @default.
- W4367000246 isParatext "false" @default.
- W4367000246 isRetracted "false" @default.
- W4367000246 workType "article" @default.