Matches in SemOpenAlex for { <https://semopenalex.org/work/W3114751253> ?p ?o ?g. }
Showing items 1 to 93 of 93 with 100 items per page.
- W3114751253 abstract "We propose a generic reward shaping approach for improving the rate of convergence in reinforcement learning (RL), called Self Improvement Based REwards, or SIBRE. The approach is designed for use in conjunction with any existing RL algorithm, and consists of rewarding improvement over the agent's own past performance. We prove that SIBRE converges in expectation under the same conditions as the original RL algorithm. The reshaped rewards help discriminate between policies when the original rewards are weakly discriminated or sparse. Experiments on several well-known benchmark environments with different RL algorithms show that SIBRE converges to the optimal policy faster and more stably. We also perform sensitivity analysis with respect to hyper-parameters, in comparison with baseline RL algorithms." @default.
- W3114751253 created "2021-01-05" @default.
- W3114751253 creator A5003269612 @default.
- W3114751253 creator A5045214135 @default.
- W3114751253 creator A5050561161 @default.
- W3114751253 creator A5091821909 @default.
- W3114751253 date "2020-04-21" @default.
- W3114751253 modified "2023-09-27" @default.
- W3114751253 title "SIBRE: Self Improvement Based REwards for Adaptive Feedback in Reinforcement Learning" @default.
- W3114751253 hasPublicationYear "2020" @default.
- W3114751253 type Work @default.
- W3114751253 sameAs 3114751253 @default.
- W3114751253 citedByCount "0" @default.
- W3114751253 crossrefType "posted-content" @default.
- W3114751253 hasAuthorship W3114751253A5003269612 @default.
- W3114751253 hasAuthorship W3114751253A5045214135 @default.
- W3114751253 hasAuthorship W3114751253A5050561161 @default.
- W3114751253 hasAuthorship W3114751253A5091821909 @default.
- W3114751253 hasConcept C111368507 @default.
- W3114751253 hasConcept C11413529 @default.
- W3114751253 hasConcept C119857082 @default.
- W3114751253 hasConcept C126255220 @default.
- W3114751253 hasConcept C12725497 @default.
- W3114751253 hasConcept C127313418 @default.
- W3114751253 hasConcept C127413603 @default.
- W3114751253 hasConcept C13280743 @default.
- W3114751253 hasConcept C154945302 @default.
- W3114751253 hasConcept C162324750 @default.
- W3114751253 hasConcept C185798385 @default.
- W3114751253 hasConcept C205649164 @default.
- W3114751253 hasConcept C21200559 @default.
- W3114751253 hasConcept C24326235 @default.
- W3114751253 hasConcept C26517878 @default.
- W3114751253 hasConcept C2777303404 @default.
- W3114751253 hasConcept C33923547 @default.
- W3114751253 hasConcept C38652104 @default.
- W3114751253 hasConcept C41008148 @default.
- W3114751253 hasConcept C50522688 @default.
- W3114751253 hasConcept C57869625 @default.
- W3114751253 hasConcept C66938386 @default.
- W3114751253 hasConcept C67203356 @default.
- W3114751253 hasConcept C97541855 @default.
- W3114751253 hasConceptScore W3114751253C111368507 @default.
- W3114751253 hasConceptScore W3114751253C11413529 @default.
- W3114751253 hasConceptScore W3114751253C119857082 @default.
- W3114751253 hasConceptScore W3114751253C126255220 @default.
- W3114751253 hasConceptScore W3114751253C12725497 @default.
- W3114751253 hasConceptScore W3114751253C127313418 @default.
- W3114751253 hasConceptScore W3114751253C127413603 @default.
- W3114751253 hasConceptScore W3114751253C13280743 @default.
- W3114751253 hasConceptScore W3114751253C154945302 @default.
- W3114751253 hasConceptScore W3114751253C162324750 @default.
- W3114751253 hasConceptScore W3114751253C185798385 @default.
- W3114751253 hasConceptScore W3114751253C205649164 @default.
- W3114751253 hasConceptScore W3114751253C21200559 @default.
- W3114751253 hasConceptScore W3114751253C24326235 @default.
- W3114751253 hasConceptScore W3114751253C26517878 @default.
- W3114751253 hasConceptScore W3114751253C2777303404 @default.
- W3114751253 hasConceptScore W3114751253C33923547 @default.
- W3114751253 hasConceptScore W3114751253C38652104 @default.
- W3114751253 hasConceptScore W3114751253C41008148 @default.
- W3114751253 hasConceptScore W3114751253C50522688 @default.
- W3114751253 hasConceptScore W3114751253C57869625 @default.
- W3114751253 hasConceptScore W3114751253C66938386 @default.
- W3114751253 hasConceptScore W3114751253C67203356 @default.
- W3114751253 hasConceptScore W3114751253C97541855 @default.
- W3114751253 hasLocation W31147512531 @default.
- W3114751253 hasOpenAccess W3114751253 @default.
- W3114751253 hasPrimaryLocation W31147512531 @default.
- W3114751253 hasRelatedWork W208428353 @default.
- W3114751253 hasRelatedWork W2344349469 @default.
- W3114751253 hasRelatedWork W2804930149 @default.
- W3114751253 hasRelatedWork W2895478303 @default.
- W3114751253 hasRelatedWork W2953220522 @default.
- W3114751253 hasRelatedWork W2966067376 @default.
- W3114751253 hasRelatedWork W2972500268 @default.
- W3114751253 hasRelatedWork W2994972935 @default.
- W3114751253 hasRelatedWork W3019379181 @default.
- W3114751253 hasRelatedWork W3034448784 @default.
- W3114751253 hasRelatedWork W3037924899 @default.
- W3114751253 hasRelatedWork W3091628900 @default.
- W3114751253 hasRelatedWork W3092872653 @default.
- W3114751253 hasRelatedWork W3133533407 @default.
- W3114751253 hasRelatedWork W3167755691 @default.
- W3114751253 hasRelatedWork W3172786397 @default.
- W3114751253 hasRelatedWork W3177009489 @default.
- W3114751253 hasRelatedWork W3177136099 @default.
- W3114751253 hasRelatedWork W3180964409 @default.
- W3114751253 hasRelatedWork W3208111980 @default.
- W3114751253 isParatext "false" @default.
- W3114751253 isRetracted "false" @default.
- W3114751253 magId "3114751253" @default.
- W3114751253 workType "article" @default.