Matches in SemOpenAlex for { <https://semopenalex.org/work/W2968831808> ?p ?o ?g. }
- W2968831808 endingPage "21" @default.
- W2968831808 startingPage "1" @default.
- W2968831808 abstract "In recent times, sequence-to-sequence (seq2seq) models have gained a lot of popularity and provide state-of-the-art performance in a wide variety of tasks, such as machine translation, headline generation, text summarization, speech-to-text conversion, and image caption generation. The underlying framework for all these models is usually a deep neural network comprising an encoder and a decoder. Although simple encoder-decoder models produce competitive results, many researchers have proposed additional improvements over these seq2seq models, e.g., using an attention-based model over the input, pointer-generation models, and self-attention models. However, such seq2seq models suffer from two common problems: 1) exposure bias and 2) inconsistency between train/test measurement. Recently, a completely novel point of view has emerged in addressing these two problems in seq2seq models, leveraging methods from reinforcement learning (RL). In this survey, we consider seq2seq problems from the RL point of view and provide a formulation combining the power of RL methods in decision-making with seq2seq models that enable remembering long-term memories. We present some of the most recent frameworks that combine the concepts from RL and deep neural networks. Our work aims to provide insights into some of the problems that inherently arise with current approaches and how we can address them with better RL models. We also provide the source code for implementing most of the RL models discussed in this paper to support the complex task of abstractive text summarization and provide some targeted experiments for these RL models, both in terms of performance and training time." @default.
- W2968831808 created "2019-08-22" @default.
- W2968831808 creator A5001022750 @default.
- W2968831808 creator A5012282263 @default.
- W2968831808 creator A5019616632 @default.
- W2968831808 creator A5035052603 @default.
- W2968831808 date "2019-01-01" @default.
- W2968831808 modified "2023-10-14" @default.
- W2968831808 title "Deep Reinforcement Learning for Sequence-to-Sequence Models" @default.
- W2968831808 cites W1494198834 @default.
- W2968831808 cites W1536680647 @default.
- W2968831808 cites W1840435438 @default.
- W2968831808 cites W1895577753 @default.
- W2968831808 cites W1902237438 @default.
- W2968831808 cites W1905882502 @default.
- W2968831808 cites W1931639407 @default.
- W2968831808 cites W1933349210 @default.
- W2968831808 cites W1947481528 @default.
- W2968831808 cites W1956340063 @default.
- W2968831808 cites W1991418309 @default.
- W2968831808 cites W1999156278 @default.
- W2968831808 cites W2064675550 @default.
- W2968831808 cites W2068730032 @default.
- W2968831808 cites W2096765155 @default.
- W2968831808 cites W2102003408 @default.
- W2968831808 cites W2107598941 @default.
- W2968831808 cites W2112796928 @default.
- W2968831808 cites W2119717200 @default.
- W2968831808 cites W2139501017 @default.
- W2968831808 cites W2141559645 @default.
- W2968831808 cites W2143612262 @default.
- W2968831808 cites W2145339207 @default.
- W2968831808 cites W2158349948 @default.
- W2968831808 cites W2158847908 @default.
- W2968831808 cites W2250539671 @default.
- W2968831808 cites W2250726251 @default.
- W2968831808 cites W2251199578 @default.
- W2968831808 cites W2251957808 @default.
- W2968831808 cites W2257979135 @default.
- W2968831808 cites W2425121537 @default.
- W2968831808 cites W2467173223 @default.
- W2968831808 cites W2506483933 @default.
- W2968831808 cites W2512290434 @default.
- W2968831808 cites W2546950329 @default.
- W2968831808 cites W2600702321 @default.
- W2968831808 cites W2606974598 @default.
- W2968831808 cites W2609482285 @default.
- W2968831808 cites W2610891036 @default.
- W2968831808 cites W2741375528 @default.
- W2968831808 cites W2759135454 @default.
- W2968831808 cites W2798524681 @default.
- W2968831808 cites W2913640436 @default.
- W2968831808 cites W2949376505 @default.
- W2968831808 cites W2962701888 @default.
- W2968831808 cites W2962826786 @default.
- W2968831808 cites W2962849707 @default.
- W2968831808 cites W2962897543 @default.
- W2968831808 cites W2962937869 @default.
- W2968831808 cites W2962939608 @default.
- W2968831808 cites W2962953307 @default.
- W2968831808 cites W2962965405 @default.
- W2968831808 cites W2962972512 @default.
- W2968831808 cites W2962985882 @default.
- W2968831808 cites W2963084599 @default.
- W2968831808 cites W2963167310 @default.
- W2968831808 cites W2963206148 @default.
- W2968831808 cites W2963211739 @default.
- W2968831808 cites W2963212250 @default.
- W2968831808 cites W2963321993 @default.
- W2968831808 cites W2963339397 @default.
- W2968831808 cites W2963351776 @default.
- W2968831808 cites W2963385935 @default.
- W2968831808 cites W2963410018 @default.
- W2968831808 cites W2963435215 @default.
- W2968831808 cites W2963463964 @default.
- W2968831808 cites W2963545005 @default.
- W2968831808 cites W2963727906 @default.
- W2968831808 cites W2963730239 @default.
- W2968831808 cites W2963748441 @default.
- W2968831808 cites W2963846996 @default.
- W2968831808 cites W2963888230 @default.
- W2968831808 cites W2963899988 @default.
- W2968831808 cites W2963918774 @default.
- W2968831808 cites W2963929190 @default.
- W2968831808 cites W2963938442 @default.
- W2968831808 cites W2963954913 @default.
- W2968831808 cites W2963970666 @default.
- W2968831808 cites W2964032708 @default.
- W2968831808 cites W2964045208 @default.
- W2968831808 cites W2964236999 @default.
- W2968831808 cites W2964241990 @default.
- W2968831808 cites W3100789280 @default.
- W2968831808 cites W32403112 @default.
- W2968831808 cites W4233696721 @default.
- W2968831808 cites W4254816979 @default.
- W2968831808 cites W639708223 @default.
- W2968831808 doi "https://doi.org/10.1109/tnnls.2019.2929141" @default.
- W2968831808 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/31425057" @default.
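
The matches above correspond to a basic-graph-pattern query with the work IRI as a fixed subject. A query of roughly the following shape could reproduce this listing; the endpoint URL `https://semopenalex.org/sparql` and the `GRAPH` wrapping are assumptions about how the service exposes its quads, not something stated in the listing itself:

```sparql
# Sketch: list every predicate/object pair (and its graph) for this work.
# Assumes the SemOpenAlex public SPARQL endpoint; adjust if the service
# uses a different endpoint URL or stores all data in the default graph.
SELECT ?p ?o ?g
WHERE {
  GRAPH ?g {
    <https://semopenalex.org/work/W2968831808> ?p ?o .
  }
}
```

If the triples live only in the default graph (as the uniform `@default` column above suggests), dropping the `GRAPH ?g { ... }` wrapper and selecting just `?p ?o` would return the same predicate/object pairs.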