Matches in SemOpenAlex for { <https://semopenalex.org/work/W4200631657> ?p ?o ?g. }
Showing items 1 to 66 of 66 with 100 items per page.
- W4200631657 abstract "Reinforcement learning (RL) has gained increasing attraction in the academia and tech industry with launches to a variety of impactful applications and products. Although research is being actively conducted on many fronts (e.g., offline RL, performance, etc.), many RL practitioners face a challenge that has been largely ignored: determine whether a designed Markov Decision Process (MDP) is valid and meaningful. This study proposes a heuristic-based feature analysis method to validate whether an MDP is well formulated. We believe an MDP suitable for applying RL should contain a set of state features that are both sensitive to actions and predictive in rewards. We tested our method in constructed environments showing that our approach can identify certain invalid environment formulations. As far as we know, performing validity analysis for RL problem formulation is a novel direction. We envision that our tool will serve as a motivational example to help practitioners apply RL in real-world problems more easily." @default.
- W4200631657 created "2021-12-31" @default.
- W4200631657 creator A5010176677 @default.
- W4200631657 creator A5089618727 @default.
- W4200631657 date "2021-12-10" @default.
- W4200631657 modified "2023-09-29" @default.
- W4200631657 title "A Validation Tool for Designing Reinforcement Learning Environments" @default.
- W4200631657 doi "https://doi.org/10.48550/arxiv.2112.05519" @default.
- W4200631657 hasPublicationYear "2021" @default.
- W4200631657 type Work @default.
- W4200631657 citedByCount "0" @default.
- W4200631657 crossrefType "posted-content" @default.
- W4200631657 hasAuthorship W4200631657A5010176677 @default.
- W4200631657 hasAuthorship W4200631657A5089618727 @default.
- W4200631657 hasBestOaLocation W42006316571 @default.
- W4200631657 hasConcept C105795698 @default.
- W4200631657 hasConcept C106189395 @default.
- W4200631657 hasConcept C111919701 @default.
- W4200631657 hasConcept C119857082 @default.
- W4200631657 hasConcept C136197465 @default.
- W4200631657 hasConcept C138885662 @default.
- W4200631657 hasConcept C154945302 @default.
- W4200631657 hasConcept C159886148 @default.
- W4200631657 hasConcept C173801870 @default.
- W4200631657 hasConcept C177264268 @default.
- W4200631657 hasConcept C199360897 @default.
- W4200631657 hasConcept C2776401178 @default.
- W4200631657 hasConcept C33923547 @default.
- W4200631657 hasConcept C41008148 @default.
- W4200631657 hasConcept C41895202 @default.
- W4200631657 hasConcept C97541855 @default.
- W4200631657 hasConcept C98045186 @default.
- W4200631657 hasConceptScore W4200631657C105795698 @default.
- W4200631657 hasConceptScore W4200631657C106189395 @default.
- W4200631657 hasConceptScore W4200631657C111919701 @default.
- W4200631657 hasConceptScore W4200631657C119857082 @default.
- W4200631657 hasConceptScore W4200631657C136197465 @default.
- W4200631657 hasConceptScore W4200631657C138885662 @default.
- W4200631657 hasConceptScore W4200631657C154945302 @default.
- W4200631657 hasConceptScore W4200631657C159886148 @default.
- W4200631657 hasConceptScore W4200631657C173801870 @default.
- W4200631657 hasConceptScore W4200631657C177264268 @default.
- W4200631657 hasConceptScore W4200631657C199360897 @default.
- W4200631657 hasConceptScore W4200631657C2776401178 @default.
- W4200631657 hasConceptScore W4200631657C33923547 @default.
- W4200631657 hasConceptScore W4200631657C41008148 @default.
- W4200631657 hasConceptScore W4200631657C41895202 @default.
- W4200631657 hasConceptScore W4200631657C97541855 @default.
- W4200631657 hasConceptScore W4200631657C98045186 @default.
- W4200631657 hasLocation W42006316571 @default.
- W4200631657 hasLocation W42006316572 @default.
- W4200631657 hasOpenAccess W4200631657 @default.
- W4200631657 hasPrimaryLocation W42006316571 @default.
- W4200631657 hasRelatedWork W2910367420 @default.
- W4200631657 hasRelatedWork W2980794631 @default.
- W4200631657 hasRelatedWork W3942861 @default.
- W4200631657 hasRelatedWork W4200631657 @default.
- W4200631657 hasRelatedWork W4226401562 @default.
- W4200631657 hasRelatedWork W4309864858 @default.
- W4200631657 hasRelatedWork W4319083788 @default.
- W4200631657 hasRelatedWork W4323030201 @default.
- W4200631657 hasRelatedWork W4324119149 @default.
- W4200631657 hasRelatedWork W4382239365 @default.
- W4200631657 isParatext "false" @default.
- W4200631657 isRetracted "false" @default.
- W4200631657 workType "article" @default.