Matches in SemOpenAlex for { <https://semopenalex.org/work/W3136456184> ?p ?o ?g. }
- W3136456184 endingPage "5589" @default.
- W3136456184 startingPage "5572" @default.
- W3136456184 abstract "It is difficult to solve complex tasks that involve large state spaces and long-term decision processes by reinforcement learning (RL) algorithms. A common and promising method to address this challenge is to compress a large RL problem into a small one. Towards this goal, the compression should be state-temporal and optimality-preserving (i.e., the optimal policy of the compressed problem should correspond to that of the uncompressed problem). In this paper, we propose a reward-restricted geodesic (RRG) metric, which can be learned by a neural network, to perform state-temporal compression in RL. We prove that compression based on the RRG metric is approximately optimality-preserving for the raw RL problem endowed with temporally abstract actions. With this compression, we design an RRG metric-based reinforcement learning (RRG-RL) algorithm to solve complex tasks. Experiments in both discrete (2D Minecraft) and continuous (Doom) environments demonstrated the superiority of our method over existing RL approaches." @default.
- W3136456184 created "2021-03-29" @default.
- W3136456184 creator A5004579631 @default.
- W3136456184 creator A5019974853 @default.
- W3136456184 creator A5021979312 @default.
- W3136456184 creator A5040875615 @default.
- W3136456184 creator A5048321911 @default.
- W3136456184 date "2022-09-01" @default.
- W3136456184 modified "2023-10-15" @default.
- W3136456184 title "State-Temporal Compression in Reinforcement Learning With the Reward-Restricted Geodesic Metric" @default.
- W3136456184 cites W1504915502 @default.
- W3136456184 cites W1505937442 @default.
- W3136456184 cites W1509780496 @default.
- W3136456184 cites W1996648614 @default.
- W3136456184 cites W2056354534 @default.
- W3136456184 cites W2067635342 @default.
- W3136456184 cites W2071056361 @default.
- W3136456184 cites W2071302132 @default.
- W3136456184 cites W2109910161 @default.
- W3136456184 cites W2119972318 @default.
- W3136456184 cites W2121517924 @default.
- W3136456184 cites W2138621090 @default.
- W3136456184 cites W2145339207 @default.
- W3136456184 cites W2160808139 @default.
- W3136456184 cites W2787066086 @default.
- W3136456184 cites W2926786214 @default.
- W3136456184 cites W2951714068 @default.
- W3136456184 cites W2963011537 @default.
- W3136456184 cites W2963523627 @default.
- W3136456184 cites W2963871073 @default.
- W3136456184 cites W2964227312 @default.
- W3136456184 cites W2997101648 @default.
- W3136456184 doi "https://doi.org/10.1109/tpami.2021.3069005" @default.
- W3136456184 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/33764874" @default.
- W3136456184 hasPublicationYear "2022" @default.
- W3136456184 type Work @default.
- W3136456184 sameAs 3136456184 @default.
- W3136456184 citedByCount "2" @default.
- W3136456184 countsByYear W31364561842022 @default.
- W3136456184 countsByYear W31364561842023 @default.
- W3136456184 crossrefType "journal-article" @default.
- W3136456184 hasAuthorship W3136456184A5004579631 @default.
- W3136456184 hasAuthorship W3136456184A5019974853 @default.
- W3136456184 hasAuthorship W3136456184A5021979312 @default.
- W3136456184 hasAuthorship W3136456184A5040875615 @default.
- W3136456184 hasAuthorship W3136456184A5048321911 @default.
- W3136456184 hasConcept C11413529 @default.
- W3136456184 hasConcept C126255220 @default.
- W3136456184 hasConcept C134306372 @default.
- W3136456184 hasConcept C154945302 @default.
- W3136456184 hasConcept C159985019 @default.
- W3136456184 hasConcept C162324750 @default.
- W3136456184 hasConcept C162478608 @default.
- W3136456184 hasConcept C165818556 @default.
- W3136456184 hasConcept C176217482 @default.
- W3136456184 hasConcept C180016635 @default.
- W3136456184 hasConcept C192562407 @default.
- W3136456184 hasConcept C202474056 @default.
- W3136456184 hasConcept C21547014 @default.
- W3136456184 hasConcept C2781238097 @default.
- W3136456184 hasConcept C33923547 @default.
- W3136456184 hasConcept C41008148 @default.
- W3136456184 hasConcept C48103436 @default.
- W3136456184 hasConcept C97541855 @default.
- W3136456184 hasConceptScore W3136456184C11413529 @default.
- W3136456184 hasConceptScore W3136456184C126255220 @default.
- W3136456184 hasConceptScore W3136456184C134306372 @default.
- W3136456184 hasConceptScore W3136456184C154945302 @default.
- W3136456184 hasConceptScore W3136456184C159985019 @default.
- W3136456184 hasConceptScore W3136456184C162324750 @default.
- W3136456184 hasConceptScore W3136456184C162478608 @default.
- W3136456184 hasConceptScore W3136456184C165818556 @default.
- W3136456184 hasConceptScore W3136456184C176217482 @default.
- W3136456184 hasConceptScore W3136456184C180016635 @default.
- W3136456184 hasConceptScore W3136456184C192562407 @default.
- W3136456184 hasConceptScore W3136456184C202474056 @default.
- W3136456184 hasConceptScore W3136456184C21547014 @default.
- W3136456184 hasConceptScore W3136456184C2781238097 @default.
- W3136456184 hasConceptScore W3136456184C33923547 @default.
- W3136456184 hasConceptScore W3136456184C41008148 @default.
- W3136456184 hasConceptScore W3136456184C48103436 @default.
- W3136456184 hasConceptScore W3136456184C97541855 @default.
- W3136456184 hasFunder F4320321001 @default.
- W3136456184 hasIssue "9" @default.
- W3136456184 hasLocation W31364561841 @default.
- W3136456184 hasLocation W31364561842 @default.
- W3136456184 hasOpenAccess W3136456184 @default.
- W3136456184 hasPrimaryLocation W31364561841 @default.
- W3136456184 hasRelatedWork W2511960118 @default.
- W3136456184 hasRelatedWork W260766989 @default.
- W3136456184 hasRelatedWork W2959276766 @default.
- W3136456184 hasRelatedWork W3074294383 @default.
- W3136456184 hasRelatedWork W3111983280 @default.
- W3136456184 hasRelatedWork W3136456184 @default.
- W3136456184 hasRelatedWork W3139193008 @default.
- W3136456184 hasRelatedWork W3164468573 @default.
- W3136456184 hasRelatedWork W4206669594 @default.
- W3136456184 hasRelatedWork W4295941380 @default.