Matches in SemOpenAlex for { <https://semopenalex.org/work/W3180992661> ?p ?o ?g. }
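The pattern above asks for every predicate (`?p`), object (`?o`) and graph (`?g`) attached to this work. A minimal sketch of an equivalent SPARQL query string follows, assuming that the `@default` marker on each row denotes the default graph, so the graph variable can be dropped. The helper name `build_match_query` is hypothetical, and the code only builds the query text; it does not contact any endpoint.

```python
# Characters that may not appear inside a SPARQL IRI reference (<...>).
_FORBIDDEN_IRI_CHARS = set('<>"{}|^`\\ ')


def build_match_query(subject_iri: str) -> str:
    """Build a SPARQL SELECT listing every predicate/object of one subject.

    Assumption: all rows live in the default graph (shown as "@default"),
    so a plain triple pattern is used instead of a GRAPH clause.
    """
    if any(ch in _FORBIDDEN_IRI_CHARS or ord(ch) <= 0x20 for ch in subject_iri):
        raise ValueError(f"not a valid IRI for SPARQL: {subject_iri!r}")
    return (
        "SELECT ?p ?o WHERE {\n"
        f"  <{subject_iri}> ?p ?o .\n"
        "}"
    )


query = build_match_query("https://semopenalex.org/work/W3180992661")
print(query)
```

Each result row of such a query would correspond to one `- W3180992661 predicate object @default.` line in the listing below.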
- W3180992661 abstract "Lately, the self-attention mechanism has marked a new milestone in the field of automatic speech recognition (ASR). Nevertheless, its performance is susceptible to environmental intrusions as the system predicts the next output symbol depending on the full input sequence and the previous predictions. A popular solution for this problem is adding an independent speech enhancement module as the front-end. Nonetheless, due to being trained separately from the ASR module, the independent enhancement front-end falls into the sub-optimum easily. Besides, the handcrafted loss function of the enhancement module tends to introduce unseen distortions, which even degrade the ASR performance. Inspired by the extensive applications of the generative adversarial networks (GANs) in speech enhancement and ASR tasks, we propose an adversarial joint training framework with the self-attention mechanism to boost the noise robustness of the ASR system. Generally, it consists of a self-attention speech enhancement GAN and a self-attention end-to-end ASR model. There are two advantages which are worth noting in this proposed framework. One is that it benefits from the advancement of both self-attention mechanism and GANs, while the other is that the discriminator of GAN plays the role of the global discriminant network in the stage of the adversarial joint training, which guides the enhancement front-end to capture more compatible structures for the subsequent ASR module and thereby offsets the limitation of the separate training and handcrafted loss functions. With the adversarial joint optimization, the proposed framework is expected to learn more robust representations suitable for the ASR task. We execute systematic experiments on the corpus AISHELL-1, and the experimental results show that on the artificial noisy test set, the proposed framework achieves the relative improvements of 66% compared to the ASR model trained by clean data solely, 35.1% compared to the speech enhancement and ASR scheme without joint training, and 5.3% compared to multi-condition training." @default.
- W3180992661 created "2021-07-19" @default.
- W3180992661 creator A5000758959 @default.
- W3180992661 creator A5035094320 @default.
- W3180992661 creator A5039092855 @default.
- W3180992661 creator A5078449382 @default.
- W3180992661 creator A5079663771 @default.
- W3180992661 creator A5085411179 @default.
- W3180992661 date "2021-07-05" @default.
- W3180992661 modified "2023-10-09" @default.
- W3180992661 title "Adversarial joint training with self-attention mechanism for robust end-to-end speech recognition" @default.
- W3180992661 cites W1482149378 @default.
- W3180992661 cites W1598508708 @default.
- W3180992661 cites W1677182931 @default.
- W3180992661 cites W1974387177 @default.
- W3180992661 cites W1995536493 @default.
- W3180992661 cites W2042141988 @default.
- W3180992661 cites W2044893557 @default.
- W3180992661 cites W2046869671 @default.
- W3180992661 cites W2069681747 @default.
- W3180992661 cites W2107776732 @default.
- W3180992661 cites W2113556376 @default.
- W3180992661 cites W2117109565 @default.
- W3180992661 cites W2141411743 @default.
- W3180992661 cites W2144404214 @default.
- W3180992661 cites W2160815625 @default.
- W3180992661 cites W2296167893 @default.
- W3180992661 cites W2491899193 @default.
- W3180992661 cites W2526425061 @default.
- W3180992661 cites W2587210085 @default.
- W3180992661 cites W2746457594 @default.
- W3180992661 cites W2749371885 @default.
- W3180992661 cites W2795050058 @default.
- W3180992661 cites W2802023636 @default.
- W3180992661 cites W2802304149 @default.
- W3180992661 cites W2810535283 @default.
- W3180992661 cites W2891138528 @default.
- W3180992661 cites W2891980359 @default.
- W3180992661 cites W2892009249 @default.
- W3180992661 cites W2911291251 @default.
- W3180992661 cites W2936774411 @default.
- W3180992661 cites W2962780374 @default.
- W3180992661 cites W2962824709 @default.
- W3180992661 cites W2962909949 @default.
- W3180992661 cites W2962959469 @default.
- W3180992661 cites W2962993399 @default.
- W3180992661 cites W2963073614 @default.
- W3180992661 cites W2963242190 @default.
- W3180992661 cites W2963321191 @default.
- W3180992661 cites W2963341071 @default.
- W3180992661 cites W2963420272 @default.
- W3180992661 cites W2963542740 @default.
- W3180992661 cites W2972320711 @default.
- W3180992661 cites W2972451902 @default.
- W3180992661 cites W2973201641 @default.
- W3180992661 cites W2976556660 @default.
- W3180992661 cites W3006827623 @default.
- W3180992661 cites W3008191852 @default.
- W3180992661 cites W3015219411 @default.
- W3180992661 cites W3097777922 @default.
- W3180992661 cites W3102190437 @default.
- W3180992661 cites W3144345593 @default.
- W3180992661 cites W3161273075 @default.
- W3180992661 cites W4253928870 @default.
- W3180992661 doi "https://doi.org/10.1186/s13636-021-00215-6" @default.
- W3180992661 hasPublicationYear "2021" @default.
- W3180992661 type Work @default.
- W3180992661 sameAs 3180992661 @default.
- W3180992661 citedByCount "6" @default.
- W3180992661 countsByYear W31809926612021 @default.
- W3180992661 countsByYear W31809926612022 @default.
- W3180992661 countsByYear W31809926612023 @default.
- W3180992661 crossrefType "journal-article" @default.
- W3180992661 hasAuthorship W3180992661A5000758959 @default.
- W3180992661 hasAuthorship W3180992661A5035094320 @default.
- W3180992661 hasAuthorship W3180992661A5039092855 @default.
- W3180992661 hasAuthorship W3180992661A5078449382 @default.
- W3180992661 hasAuthorship W3180992661A5079663771 @default.
- W3180992661 hasAuthorship W3180992661A5085411179 @default.
- W3180992661 hasBestOaLocation W31809926611 @default.
- W3180992661 hasConcept C104317684 @default.
- W3180992661 hasConcept C111919701 @default.
- W3180992661 hasConcept C127413603 @default.
- W3180992661 hasConcept C154945302 @default.
- W3180992661 hasConcept C170154142 @default.
- W3180992661 hasConcept C18555067 @default.
- W3180992661 hasConcept C185592680 @default.
- W3180992661 hasConcept C2779803651 @default.
- W3180992661 hasConcept C28490314 @default.
- W3180992661 hasConcept C37736160 @default.
- W3180992661 hasConcept C39890363 @default.
- W3180992661 hasConcept C41008148 @default.
- W3180992661 hasConcept C53016008 @default.
- W3180992661 hasConcept C55493867 @default.
- W3180992661 hasConcept C63479239 @default.
- W3180992661 hasConcept C74296488 @default.
- W3180992661 hasConcept C76155785 @default.
- W3180992661 hasConcept C94915269 @default.
- W3180992661 hasConceptScore W3180992661C104317684 @default.
- W3180992661 hasConceptScore W3180992661C111919701 @default.