Matches in SemOpenAlex for { <https://semopenalex.org/work/W4285197326> ?p ?o ?g. }
- W4285197326 endingPage "102063" @default.
- W4285197326 startingPage "102048" @default.
- W4285197326 abstract "Speech separation has been employed in important applications such as automatic speech, paralinguistics, speech recognition, hearing aids, and human-machine interactions. In recent years, deep neural networks have been widely used for speech and music separation. Some of these breakthrough successful models based on embedding vectors have been proposed, such as deep clustering. In this paper, we propose a node encoder Squash-norm deep clustering (ESDC) as an enhanced discriminative learning framework by combining node encoder, Squash-norm, and deep clustering (DC). First, a node encoder is used to create intermediate features. Node encoders are developed through a matrix factorization-based learning method for graph representations. It creates distinguishable intermediate features that play an important role in improving performance. These discriminated intermediate features are then used as input features for the separation block. The decoder block finally constructs the estimation mask through the clustering method and reconstructs the estimated signal for each source. In particular, we apply a normalization function, Squash-norm, to the input and output vectors to enhance the distinction between high-dimensional embedding vectors. This nonlinear function amplifies the differences in the input vectors, resulting in highly unique features, which are scalar products of the vectors. Similar to the input vector, Squash-norm also enhances the discrimination of the output vector, thereby enhancing the ability to construct an estimated mask by clustering the output vector. Overall, the proposed ESDC achieves 1.27 – 2.09 dB SDR, 1.28 – 2.21 dB SDRi, and 1.3 – 2.44 dB SI-SNRi gain compared to the DC baseline separation performance across genders on the TSP and TIMIT datasets. With the same gender, our proposed ESDC achieves 1.14 – 2.71 dB SDR, 0.99 – 2.74 dB SDRi, and 0.62 – 2.86 dB SI-SNRi gain compared with the DC baseline on the TIMIT dataset. In all cases, the proposed ESDC model consistently maintains STOI and PESQ higher than the DC baselines on the TSP and TIMIT datasets." @default.
- W4285197326 created "2022-07-14" @default.
- W4285197326 creator A5007952996 @default.
- W4285197326 creator A5022504648 @default.
- W4285197326 creator A5026413645 @default.
- W4285197326 creator A5048205934 @default.
- W4285197326 creator A5056726536 @default.
- W4285197326 creator A5085561698 @default.
- W4285197326 date "2022-01-01" @default.
- W4285197326 modified "2023-10-13" @default.
- W4285197326 title "Speech Separation Using Augmented-Discrimination Learning on Squash-Norm Embedding Vector and Node Encoder" @default.
- W4285197326 cites W1602144089 @default.
- W4285197326 cites W165783309 @default.
- W4285197326 cites W1780344239 @default.
- W4285197326 cites W185399533 @default.
- W4285197326 cites W1987906574 @default.
- W4285197326 cites W2031647436 @default.
- W4285197326 cites W2037351952 @default.
- W4285197326 cites W2053165762 @default.
- W4285197326 cites W2069681747 @default.
- W4285197326 cites W2088361146 @default.
- W4285197326 cites W2100495367 @default.
- W4285197326 cites W2127851351 @default.
- W4285197326 cites W2130178255 @default.
- W4285197326 cites W2133340843 @default.
- W4285197326 cites W2141411743 @default.
- W4285197326 cites W2141998673 @default.
- W4285197326 cites W2146608839 @default.
- W4285197326 cites W2149425615 @default.
- W4285197326 cites W2150376021 @default.
- W4285197326 cites W2150415460 @default.
- W4285197326 cites W2159202424 @default.
- W4285197326 cites W2170256193 @default.
- W4285197326 cites W2221409856 @default.
- W4285197326 cites W2460742184 @default.
- W4285197326 cites W2552071709 @default.
- W4285197326 cites W2558649592 @default.
- W4285197326 cites W2610674366 @default.
- W4285197326 cites W2626544737 @default.
- W4285197326 cites W2734774145 @default.
- W4285197326 cites W2767832833 @default.
- W4285197326 cites W2804644188 @default.
- W4285197326 cites W2890111732 @default.
- W4285197326 cites W2891405874 @default.
- W4285197326 cites W2962701080 @default.
- W4285197326 cites W2962715207 @default.
- W4285197326 cites W2962905190 @default.
- W4285197326 cites W2962935966 @default.
- W4285197326 cites W2963551828 @default.
- W4285197326 cites W3015199127 @default.
- W4285197326 cites W3020724926 @default.
- W4285197326 cites W3026111682 @default.
- W4285197326 cites W3031404175 @default.
- W4285197326 cites W3041647828 @default.
- W4285197326 cites W3096893582 @default.
- W4285197326 cites W3160903688 @default.
- W4285197326 cites W3162698503 @default.
- W4285197326 cites W3163114796 @default.
- W4285197326 cites W3163652268 @default.
- W4285197326 cites W3185109982 @default.
- W4285197326 cites W3186221976 @default.
- W4285197326 cites W3217000580 @default.
- W4285197326 cites W4200529817 @default.
- W4285197326 cites W4206215724 @default.
- W4285197326 cites W4225281800 @default.
- W4285197326 cites W80444264 @default.
- W4285197326 doi "https://doi.org/10.1109/access.2022.3188712" @default.
- W4285197326 hasPublicationYear "2022" @default.
- W4285197326 type Work @default.
- W4285197326 citedByCount "2" @default.
- W4285197326 countsByYear W42851973262023 @default.
- W4285197326 crossrefType "journal-article" @default.
- W4285197326 hasAuthorship W4285197326A5007952996 @default.
- W4285197326 hasAuthorship W4285197326A5022504648 @default.
- W4285197326 hasAuthorship W4285197326A5026413645 @default.
- W4285197326 hasAuthorship W4285197326A5048205934 @default.
- W4285197326 hasAuthorship W4285197326A5056726536 @default.
- W4285197326 hasAuthorship W4285197326A5085561698 @default.
- W4285197326 hasBestOaLocation W42851973261 @default.
- W4285197326 hasConcept C108583219 @default.
- W4285197326 hasConcept C136886441 @default.
- W4285197326 hasConcept C144024400 @default.
- W4285197326 hasConcept C153180895 @default.
- W4285197326 hasConcept C154945302 @default.
- W4285197326 hasConcept C17744445 @default.
- W4285197326 hasConcept C19165224 @default.
- W4285197326 hasConcept C191795146 @default.
- W4285197326 hasConcept C199539241 @default.
- W4285197326 hasConcept C28490314 @default.
- W4285197326 hasConcept C41008148 @default.
- W4285197326 hasConcept C41608201 @default.
- W4285197326 hasConcept C73555534 @default.
- W4285197326 hasConceptScore W4285197326C108583219 @default.
- W4285197326 hasConceptScore W4285197326C136886441 @default.
- W4285197326 hasConceptScore W4285197326C144024400 @default.
- W4285197326 hasConceptScore W4285197326C153180895 @default.
- W4285197326 hasConceptScore W4285197326C154945302 @default.
- W4285197326 hasConceptScore W4285197326C17744445 @default.