Matches in SemOpenAlex for { <https://semopenalex.org/work/W4287208051> ?p ?o ?g. }
Showing items 1 to 57 of 57 with 100 items per page.
- W4287208051 abstract "In this work we explore an auxiliary loss useful for reinforcement learning in environments where strong-performing agents are required to be able to navigate a spatial environment. The auxiliary loss proposed is to minimize the classification error of a neural network classifier that predicts whether or not a pair of states sampled from the agent's current episode trajectory are in order. The classifier takes as input a pair of states as well as the agent's memory. The motivation for this auxiliary loss is that there is a strong correlation between which of a pair of states is more recent in the agent's episode trajectory and which of the two states is spatially closer to the agent. Our hypothesis is that learning features to answer this question encourages the agent to learn and internalize in memory representations of states that facilitate spatial reasoning. We tested this auxiliary loss on a navigation task in a gridworld and achieved a 9.6% increase in accumulative episode reward compared to a strong baseline approach." @default.
- W4287208051 created "2022-07-25" @default.
- W4287208051 creator A5005641582 @default.
- W4287208051 creator A5023346681 @default.
- W4287208051 creator A5038213642 @default.
- W4287208051 creator A5084360449 @default.
- W4287208051 date "2021-04-17" @default.
- W4287208051 modified "2023-09-30" @default.
- W4287208051 title "A Self-Supervised Auxiliary Loss for Deep RL in Partially Observable Settings" @default.
- W4287208051 doi "https://doi.org/10.48550/arxiv.2104.08492" @default.
- W4287208051 hasPublicationYear "2021" @default.
- W4287208051 type Work @default.
- W4287208051 citedByCount "0" @default.
- W4287208051 crossrefType "posted-content" @default.
- W4287208051 hasAuthorship W4287208051A5005641582 @default.
- W4287208051 hasAuthorship W4287208051A5023346681 @default.
- W4287208051 hasAuthorship W4287208051A5038213642 @default.
- W4287208051 hasAuthorship W4287208051A5084360449 @default.
- W4287208051 hasBestOaLocation W42872080511 @default.
- W4287208051 hasConcept C119857082 @default.
- W4287208051 hasConcept C121332964 @default.
- W4287208051 hasConcept C1276947 @default.
- W4287208051 hasConcept C13662910 @default.
- W4287208051 hasConcept C154945302 @default.
- W4287208051 hasConcept C32848918 @default.
- W4287208051 hasConcept C41008148 @default.
- W4287208051 hasConcept C50644808 @default.
- W4287208051 hasConcept C62520636 @default.
- W4287208051 hasConcept C95623464 @default.
- W4287208051 hasConcept C97541855 @default.
- W4287208051 hasConceptScore W4287208051C119857082 @default.
- W4287208051 hasConceptScore W4287208051C121332964 @default.
- W4287208051 hasConceptScore W4287208051C1276947 @default.
- W4287208051 hasConceptScore W4287208051C13662910 @default.
- W4287208051 hasConceptScore W4287208051C154945302 @default.
- W4287208051 hasConceptScore W4287208051C32848918 @default.
- W4287208051 hasConceptScore W4287208051C41008148 @default.
- W4287208051 hasConceptScore W4287208051C50644808 @default.
- W4287208051 hasConceptScore W4287208051C62520636 @default.
- W4287208051 hasConceptScore W4287208051C95623464 @default.
- W4287208051 hasConceptScore W4287208051C97541855 @default.
- W4287208051 hasLocation W42872080511 @default.
- W4287208051 hasOpenAccess W4287208051 @default.
- W4287208051 hasPrimaryLocation W42872080511 @default.
- W4287208051 hasRelatedWork W1479873353 @default.
- W4287208051 hasRelatedWork W2556319748 @default.
- W4287208051 hasRelatedWork W2891961174 @default.
- W4287208051 hasRelatedWork W2961085424 @default.
- W4287208051 hasRelatedWork W3022038857 @default.
- W4287208051 hasRelatedWork W3200179079 @default.
- W4287208051 hasRelatedWork W4211088005 @default.
- W4287208051 hasRelatedWork W4249229055 @default.
- W4287208051 hasRelatedWork W4319083788 @default.
- W4287208051 hasRelatedWork W1629725936 @default.
- W4287208051 isParatext "false" @default.
- W4287208051 isRetracted "false" @default.
- W4287208051 workType "article" @default.