Matches in SemOpenAlex for { <https://semopenalex.org/work/W2989598077> ?p ?o ?g. }
- W2989598077 abstract "In environments with continuous state and action spaces, state-of-the-art actor-critic reinforcement learning algorithms can solve very complex problems, yet can also fail in environments that seem trivial, but the reason for such failures is still poorly understood. In this paper, we contribute a formal explanation of these failures in the particular case of sparse reward and deterministic environments. First, using a very elementary control problem, we illustrate that the learning process can get stuck in a fixed point corresponding to a poor solution. Then, generalizing from the studied example, we provide a detailed analysis of the underlying mechanisms which results in a new understanding of one of the convergence regimes of these algorithms. The resulting perspective casts a new light on already existing solutions to the issues we have highlighted, and suggests other potential approaches." @default.
- W2989598077 created "2019-12-05" @default.
- W2989598077 creator A5018273862 @default.
- W2989598077 creator A5042850624 @default.
- W2989598077 creator A5071054174 @default.
- W2989598077 date "2019-09-25" @default.
- W2989598077 modified "2023-09-27" @default.
- W2989598077 title "The problem with DDPG: understanding failures in deterministic environments with sparse rewards." @default.
- W2989598077 cites W1757796397 @default.
- W2989598077 cites W1980241579 @default.
- W2989598077 cites W2079247031 @default.
- W2989598077 cites W2101539915 @default.
- W2989598077 cites W2121863487 @default.
- W2989598077 cites W2125074935 @default.
- W2989598077 cites W2137766593 @default.
- W2989598077 cites W2145339207 @default.
- W2989598077 cites W2165150801 @default.
- W2989598077 cites W2173248099 @default.
- W2989598077 cites W2623491082 @default.
- W2989598077 cites W2781726626 @default.
- W2989598077 cites W2787938642 @default.
- W2989598077 cites W2788781499 @default.
- W2989598077 cites W2810785043 @default.
- W2989598077 cites W2902098903 @default.
- W2989598077 cites W2904246096 @default.
- W2989598077 cites W2924131335 @default.
- W2989598077 cites W2924205497 @default.
- W2989598077 cites W2950872548 @default.
- W2989598077 cites W2962821147 @default.
- W2989598077 cites W2963690172 @default.
- W2989598077 cites W2963704132 @default.
- W2989598077 cites W2964174623 @default.
- W2989598077 cites W2966754320 @default.
- W2989598077 hasPublicationYear "2019" @default.
- W2989598077 type Work @default.
- W2989598077 sameAs 2989598077 @default.
- W2989598077 citedByCount "12" @default.
- W2989598077 countsByYear W29895980772020 @default.
- W2989598077 countsByYear W29895980772021 @default.
- W2989598077 crossrefType "posted-content" @default.
- W2989598077 hasAuthorship W2989598077A5018273862 @default.
- W2989598077 hasAuthorship W2989598077A5042850624 @default.
- W2989598077 hasAuthorship W2989598077A5071054174 @default.
- W2989598077 hasConcept C111919701 @default.
- W2989598077 hasConcept C11413529 @default.
- W2989598077 hasConcept C121332964 @default.
- W2989598077 hasConcept C12713177 @default.
- W2989598077 hasConcept C154945302 @default.
- W2989598077 hasConcept C162324750 @default.
- W2989598077 hasConcept C2524010 @default.
- W2989598077 hasConcept C2777303404 @default.
- W2989598077 hasConcept C2780791683 @default.
- W2989598077 hasConcept C28719098 @default.
- W2989598077 hasConcept C33923547 @default.
- W2989598077 hasConcept C41008148 @default.
- W2989598077 hasConcept C48103436 @default.
- W2989598077 hasConcept C50522688 @default.
- W2989598077 hasConcept C62520636 @default.
- W2989598077 hasConcept C80444323 @default.
- W2989598077 hasConcept C97541855 @default.
- W2989598077 hasConcept C98045186 @default.
- W2989598077 hasConceptScore W2989598077C111919701 @default.
- W2989598077 hasConceptScore W2989598077C11413529 @default.
- W2989598077 hasConceptScore W2989598077C121332964 @default.
- W2989598077 hasConceptScore W2989598077C12713177 @default.
- W2989598077 hasConceptScore W2989598077C154945302 @default.
- W2989598077 hasConceptScore W2989598077C162324750 @default.
- W2989598077 hasConceptScore W2989598077C2524010 @default.
- W2989598077 hasConceptScore W2989598077C2777303404 @default.
- W2989598077 hasConceptScore W2989598077C2780791683 @default.
- W2989598077 hasConceptScore W2989598077C28719098 @default.
- W2989598077 hasConceptScore W2989598077C33923547 @default.
- W2989598077 hasConceptScore W2989598077C41008148 @default.
- W2989598077 hasConceptScore W2989598077C48103436 @default.
- W2989598077 hasConceptScore W2989598077C50522688 @default.
- W2989598077 hasConceptScore W2989598077C62520636 @default.
- W2989598077 hasConceptScore W2989598077C80444323 @default.
- W2989598077 hasConceptScore W2989598077C97541855 @default.
- W2989598077 hasConceptScore W2989598077C98045186 @default.
- W2989598077 hasLocation W29895980771 @default.
- W2989598077 hasOpenAccess W2989598077 @default.
- W2989598077 hasPrimaryLocation W29895980771 @default.
- W2989598077 hasRelatedWork W1757796397 @default.
- W2989598077 hasRelatedWork W2121863487 @default.
- W2989598077 hasRelatedWork W2134746140 @default.
- W2989598077 hasRelatedWork W2145339207 @default.
- W2989598077 hasRelatedWork W2155968351 @default.
- W2989598077 hasRelatedWork W2165150801 @default.
- W2989598077 hasRelatedWork W2173248099 @default.
- W2989598077 hasRelatedWork W2186226886 @default.
- W2989598077 hasRelatedWork W2257979135 @default.
- W2989598077 hasRelatedWork W2526500736 @default.
- W2989598077 hasRelatedWork W2736601468 @default.
- W2989598077 hasRelatedWork W2781726626 @default.
- W2989598077 hasRelatedWork W2785542505 @default.
- W2989598077 hasRelatedWork W2963847403 @default.
- W2989598077 hasRelatedWork W2993208870 @default.
- W2989598077 hasRelatedWork W3129322645 @default.
- W2989598077 hasRelatedWork W3138862539 @default.
- W2989598077 hasRelatedWork W3204728906 @default.