Matches in SemOpenAlex for { <https://semopenalex.org/work/W2996680501> ?p ?o ?g. }
- W2996680501 abstract "Inadequate reliability is a well-known issue for reinforcement learning (RL) algorithms. This problem has gained increasing attention in recent years, and efforts to improve it have grown substantially. To aid RL researchers and production users with the evaluation and improvement of reliability, we propose a novel set of metrics that quantitatively measure different aspects of reliability. In this work, we address variability and risk, both during training and after learning (on a fixed policy). We designed these metrics to be general-purpose, and we also designed complementary statistical tests to enable rigorous comparisons on these metrics. In this paper, we first describe the desired properties of the metrics and their design, the aspects of reliability that they measure, and their applicability to different scenarios. We then describe the statistical tests and make additional practical recommendations for reporting results. Finally, we apply our metrics to a set of common RL algorithms and environments, compare them, and analyze the results." @default.
- W2996680501 created "2019-12-26" @default.
- W2996680501 creator A5002408570 @default.
- W2996680501 creator A5004083108 @default.
- W2996680501 creator A5021342036 @default.
- W2996680501 creator A5067066896 @default.
- W2996680501 creator A5089723214 @default.
- W2996680501 date "2020-04-30" @default.
- W2996680501 modified "2023-09-25" @default.
- W2996680501 title "Measuring the Reliability of Reinforcement Learning Algorithms" @default.
- W2996680501 cites W1607204504 @default.
- W2996680501 cites W2027106436 @default.
- W2996680501 cites W2112688502 @default.
- W2996680501 cites W2117922789 @default.
- W2996680501 cites W2145339207 @default.
- W2996680501 cites W2155027007 @default.
- W2996680501 cites W2159223630 @default.
- W2996680501 cites W2161498332 @default.
- W2996680501 cites W2166362129 @default.
- W2996680501 cites W2167696704 @default.
- W2996680501 cites W2173248099 @default.
- W2996680501 cites W2732547613 @default.
- W2996680501 cites W2736601468 @default.
- W2996680501 cites W2747402019 @default.
- W2996680501 cites W2781726626 @default.
- W2996680501 cites W2809256243 @default.
- W2996680501 cites W2890735789 @default.
- W2996680501 cites W2905342215 @default.
- W2996680501 cites W2962878825 @default.
- W2996680501 cites W2963403143 @default.
- W2996680501 cites W2963641140 @default.
- W2996680501 cites W2964174623 @default.
- W2996680501 cites W2964291307 @default.
- W2996680501 cites W3123675609 @default.
- W2996680501 cites W3146166473 @default.
- W2996680501 hasPublicationYear "2020" @default.
- W2996680501 type Work @default.
- W2996680501 sameAs 2996680501 @default.
- W2996680501 citedByCount "13" @default.
- W2996680501 countsByYear W29966805012019 @default.
- W2996680501 countsByYear W29966805012020 @default.
- W2996680501 countsByYear W29966805012021 @default.
- W2996680501 countsByYear W29966805012022 @default.
- W2996680501 crossrefType "proceedings-article" @default.
- W2996680501 hasAuthorship W2996680501A5002408570 @default.
- W2996680501 hasAuthorship W2996680501A5004083108 @default.
- W2996680501 hasAuthorship W2996680501A5021342036 @default.
- W2996680501 hasAuthorship W2996680501A5067066896 @default.
- W2996680501 hasAuthorship W2996680501A5089723214 @default.
- W2996680501 hasConcept C11413529 @default.
- W2996680501 hasConcept C119857082 @default.
- W2996680501 hasConcept C121332964 @default.
- W2996680501 hasConcept C124101348 @default.
- W2996680501 hasConcept C154945302 @default.
- W2996680501 hasConcept C163258240 @default.
- W2996680501 hasConcept C177264268 @default.
- W2996680501 hasConcept C199360897 @default.
- W2996680501 hasConcept C2780009758 @default.
- W2996680501 hasConcept C41008148 @default.
- W2996680501 hasConcept C43214815 @default.
- W2996680501 hasConcept C62520636 @default.
- W2996680501 hasConcept C97541855 @default.
- W2996680501 hasConceptScore W2996680501C11413529 @default.
- W2996680501 hasConceptScore W2996680501C119857082 @default.
- W2996680501 hasConceptScore W2996680501C121332964 @default.
- W2996680501 hasConceptScore W2996680501C124101348 @default.
- W2996680501 hasConceptScore W2996680501C154945302 @default.
- W2996680501 hasConceptScore W2996680501C163258240 @default.
- W2996680501 hasConceptScore W2996680501C177264268 @default.
- W2996680501 hasConceptScore W2996680501C199360897 @default.
- W2996680501 hasConceptScore W2996680501C2780009758 @default.
- W2996680501 hasConceptScore W2996680501C41008148 @default.
- W2996680501 hasConceptScore W2996680501C43214815 @default.
- W2996680501 hasConceptScore W2996680501C62520636 @default.
- W2996680501 hasConceptScore W2996680501C97541855 @default.
- W2996680501 hasLocation W29966805011 @default.
- W2996680501 hasOpenAccess W2996680501 @default.
- W2996680501 hasPrimaryLocation W29966805011 @default.
- W2996680501 hasRelatedWork W1494217791 @default.
- W2996680501 hasRelatedWork W1532349484 @default.
- W2996680501 hasRelatedWork W1603910443 @default.
- W2996680501 hasRelatedWork W1981838976 @default.
- W2996680501 hasRelatedWork W2012878626 @default.
- W2996680501 hasRelatedWork W2145339207 @default.
- W2996680501 hasRelatedWork W2158782408 @default.
- W2996680501 hasRelatedWork W2173248099 @default.
- W2996680501 hasRelatedWork W2802752250 @default.
- W2996680501 hasRelatedWork W2807906245 @default.
- W2996680501 hasRelatedWork W2936308220 @default.
- W2996680501 hasRelatedWork W2963120839 @default.
- W2996680501 hasRelatedWork W2964043796 @default.
- W2996680501 hasRelatedWork W2978157449 @default.
- W2996680501 hasRelatedWork W2994722919 @default.
- W2996680501 hasRelatedWork W3038530774 @default.
- W2996680501 hasRelatedWork W3047479470 @default.
- W2996680501 hasRelatedWork W3148121023 @default.
- W2996680501 hasRelatedWork W3150597725 @default.
- W2996680501 hasRelatedWork W3156771790 @default.
- W2996680501 isParatext "false" @default.
- W2996680501 isRetracted "false" @default.