Matches in SemOpenAlex for { <https://semopenalex.org/work/W4225939130> ?p ?o ?g. }
Showing items 1 to 87 of 87 with 100 items per page.
- W4225939130 endingPage "1082" @default.
- W4225939130 startingPage "1073" @default.
- W4225939130 abstract "Audio scene analysis (ASA) is a challenging and multifaceted task in audio signal processing that uncovers information about the nature of an audio recording. Regardless of the analysis goal, a number of audio sources are observed in any audio scene. However, this consideration is usually not explored or given considerable thought in research. This work aims to demonstrate the utility of audio source counting with a novel solution consisting of a multimodal system for ASA. Both speaker counting and sound event counting techniques use deep neural networks (DNN) to predict the number of sources. We are able to present competitive results for audio source counting by achieving prediction accuracy of 46.03% and 89.57% with a margin of error of $\pm 1$ for speaker counting, which outperforms state-of-the-art systems for similar tasks. For sound event counting we achieve 50.55% and 86.59% prediction accuracy and accuracy with a margin of error of $\pm 1$, respectively, that establishes a clear baseline. Our system also demonstrates real-time aspects with an overall processing time of $\sim 0.4614$ s per audio recording." @default.
- W4225939130 created "2022-05-05" @default.
- W4225939130 creator A5046605996 @default.
- W4225939130 creator A5086845888 @default.
- W4225939130 date "2022-01-01" @default.
- W4225939130 modified "2023-09-30" @default.
- W4225939130 title "Multimodal System for Audio Scene Source Counting and Analysis" @default.
- W4225939130 cites W2029880715 @default.
- W4225939130 cites W2051906523 @default.
- W4225939130 cites W2064675550 @default.
- W4225939130 cites W2081074144 @default.
- W4225939130 cites W2094128754 @default.
- W4225939130 cites W2103869314 @default.
- W4225939130 cites W2171679232 @default.
- W4225939130 cites W2194775991 @default.
- W4225939130 cites W2526050071 @default.
- W4225939130 cites W2591013610 @default.
- W4225939130 cites W2791535566 @default.
- W4225939130 cites W2801612492 @default.
- W4225939130 cites W2896538040 @default.
- W4225939130 cites W2898395910 @default.
- W4225939130 cites W2905111007 @default.
- W4225939130 cites W2907228390 @default.
- W4225939130 cites W2916103538 @default.
- W4225939130 cites W2952752702 @default.
- W4225939130 cites W2962696180 @default.
- W4225939130 cites W2963227667 @default.
- W4225939130 cites W2963470929 @default.
- W4225939130 cites W2982429715 @default.
- W4225939130 cites W2989159420 @default.
- W4225939130 cites W2990016708 @default.
- W4225939130 cites W3016244460 @default.
- W4225939130 cites W3024400986 @default.
- W4225939130 cites W3033907960 @default.
- W4225939130 cites W3038713101 @default.
- W4225939130 cites W3048319238 @default.
- W4225939130 cites W3135871143 @default.
- W4225939130 cites W2746871870 @default.
- W4225939130 doi "https://doi.org/10.1109/taslp.2022.3156795" @default.
- W4225939130 hasPublicationYear "2022" @default.
- W4225939130 type Work @default.
- W4225939130 citedByCount "3" @default.
- W4225939130 countsByYear W42259391302023 @default.
- W4225939130 crossrefType "journal-article" @default.
- W4225939130 hasAuthorship W4225939130A5046605996 @default.
- W4225939130 hasAuthorship W4225939130A5086845888 @default.
- W4225939130 hasConcept C119857082 @default.
- W4225939130 hasConcept C127220857 @default.
- W4225939130 hasConcept C13895895 @default.
- W4225939130 hasConcept C154945302 @default.
- W4225939130 hasConcept C28490314 @default.
- W4225939130 hasConcept C33923547 @default.
- W4225939130 hasConcept C41008148 @default.
- W4225939130 hasConcept C45357846 @default.
- W4225939130 hasConcept C64922751 @default.
- W4225939130 hasConcept C774472 @default.
- W4225939130 hasConcept C94375191 @default.
- W4225939130 hasConceptScore W4225939130C119857082 @default.
- W4225939130 hasConceptScore W4225939130C127220857 @default.
- W4225939130 hasConceptScore W4225939130C13895895 @default.
- W4225939130 hasConceptScore W4225939130C154945302 @default.
- W4225939130 hasConceptScore W4225939130C28490314 @default.
- W4225939130 hasConceptScore W4225939130C33923547 @default.
- W4225939130 hasConceptScore W4225939130C41008148 @default.
- W4225939130 hasConceptScore W4225939130C45357846 @default.
- W4225939130 hasConceptScore W4225939130C64922751 @default.
- W4225939130 hasConceptScore W4225939130C774472 @default.
- W4225939130 hasConceptScore W4225939130C94375191 @default.
- W4225939130 hasLocation W42259391301 @default.
- W4225939130 hasOpenAccess W4225939130 @default.
- W4225939130 hasPrimaryLocation W42259391301 @default.
- W4225939130 hasRelatedWork W1603949574 @default.
- W4225939130 hasRelatedWork W1828907027 @default.
- W4225939130 hasRelatedWork W2098448993 @default.
- W4225939130 hasRelatedWork W2121646400 @default.
- W4225939130 hasRelatedWork W2338027094 @default.
- W4225939130 hasRelatedWork W2359932478 @default.
- W4225939130 hasRelatedWork W2379113420 @default.
- W4225939130 hasRelatedWork W2562176306 @default.
- W4225939130 hasRelatedWork W2604447241 @default.
- W4225939130 hasRelatedWork W2754746744 @default.
- W4225939130 hasVolume "30" @default.
- W4225939130 isParatext "false" @default.
- W4225939130 isRetracted "false" @default.
- W4225939130 workType "article" @default.
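
The abstract in this record reports two figures per counting task: exact prediction accuracy and accuracy within a margin of error of ±1. As a minimal sketch of how such a pair of metrics could be computed, assuming integer source counts per recording, the Python below counts a prediction as correct when it lies within a given margin of the true count. The function name `counting_accuracy` and the sample counts are hypothetical, chosen for illustration; they are not taken from the paper and this is not the authors' evaluation code.

```python
from typing import Sequence


def counting_accuracy(true_counts: Sequence[int],
                      predicted_counts: Sequence[int],
                      margin: int = 0) -> float:
    """Return the fraction of predictions within `margin` of the true count.

    margin=0 gives exact prediction accuracy; margin=1 gives accuracy
    with a margin of error of +/-1. (Hypothetical helper, not from the paper.)
    """
    if len(true_counts) != len(predicted_counts):
        raise ValueError("inputs must have the same length")
    if not true_counts:
        raise ValueError("inputs must be non-empty")
    hits = sum(abs(t - p) <= margin
               for t, p in zip(true_counts, predicted_counts))
    return hits / len(true_counts)


# Made-up source counts for five recordings; illustration only.
true_counts = [1, 2, 3, 4, 2]
predicted_counts = [1, 3, 3, 2, 2]

exact = counting_accuracy(true_counts, predicted_counts)              # margin 0
within_one = counting_accuracy(true_counts, predicted_counts, margin=1)
print(f"exact: {exact:.2f}, within +/-1: {within_one:.2f}")
```

With these made-up counts, three of five predictions match exactly and four of five fall within ±1, which shows why the ±1 figure is always at least as high as the exact one.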