Matches in SemOpenAlex for { <https://semopenalex.org/work/W239795587> ?p ?o ?g. }
Showing items 1 to 81 of 81 with 100 items per page.
- W239795587 endingPage "178" @default.
- W239795587 startingPage "177" @default.
- W239795587 abstract "With the development of deep learning technology, deep reinforcement learning (DRL) has successfully built intelligent agents in sequential decision-making problems through interaction with image-based environments. However, learning from unlimited interaction is impractical and sample inefficient because training an agent requires many trial and error and numerous samples. One response to this problem is sample-efficient DRL, a research area that encourages learning effective state representations in limited interactions with image-based environments. Previous methods could effectively surpass human performance by training an RL agent using self-supervised learning and data augmentation to learn good state representations from a given interaction. However, most of the existing methods only consider similarity of image observations so that they are hard to capture semantic representations. To address these challenges, we propose spatio-temporal and action-based contrastive representation (STACoRe) learning for sample-efficient DRL. STACoRe performs two contrastive learning to learn proper state representations. One uses the agent’s actions as pseudo labels, and the other uses spatio-temporal information. In particular, when performing the action-based contrastive learning, we propose a method that automatically selects data augmentation techniques suitable for each environment for stable model training. We train the model by simultaneously optimizing an action-based contrastive loss function and spatio-temporal contrastive loss functions in an end-to-end manner. This leads to improving sample efficiency for DRL. We use 26 benchmark games in Atari 2600 whose environment interaction is limited to only 100k steps. The experimental results confirm that our method is more sample efficient than existing methods. The code is available at https://github.com/dudwojae/STACoRe." @default.
- W239795587 created "2016-06-24" @default.
- W239795587 creator A5011231520 @default.
- W239795587 date "2003-04-01" @default.
- W239795587 modified "2023-09-25" @default.
- W239795587 title "Un sujet à replacer dans son contexte international (commentaires au texte de Francis Ribeyre) A theme to put in its international context (comments on Ribeyre's paper)" @default.
- W239795587 doi "https://doi.org/10.1016/s1240-1307(03)00048-7" @default.
- W239795587 hasPublicationYear "2003" @default.
- W239795587 type Work @default.
- W239795587 sameAs 239795587 @default.
- W239795587 citedByCount "0" @default.
- W239795587 crossrefType "journal-article" @default.
- W239795587 hasAuthorship W239795587A5011231520 @default.
- W239795587 hasBestOaLocation W2397955871 @default.
- W239795587 hasConcept C103278499 @default.
- W239795587 hasConcept C111919701 @default.
- W239795587 hasConcept C115961682 @default.
- W239795587 hasConcept C119857082 @default.
- W239795587 hasConcept C121332964 @default.
- W239795587 hasConcept C13280743 @default.
- W239795587 hasConcept C154945302 @default.
- W239795587 hasConcept C166957645 @default.
- W239795587 hasConcept C17744445 @default.
- W239795587 hasConcept C185592680 @default.
- W239795587 hasConcept C185798385 @default.
- W239795587 hasConcept C198531522 @default.
- W239795587 hasConcept C199539241 @default.
- W239795587 hasConcept C205649164 @default.
- W239795587 hasConcept C2776359362 @default.
- W239795587 hasConcept C2779343474 @default.
- W239795587 hasConcept C2780791683 @default.
- W239795587 hasConcept C33566652 @default.
- W239795587 hasConcept C41008148 @default.
- W239795587 hasConcept C43617362 @default.
- W239795587 hasConcept C62520636 @default.
- W239795587 hasConcept C94625758 @default.
- W239795587 hasConcept C97541855 @default.
- W239795587 hasConceptScore W239795587C103278499 @default.
- W239795587 hasConceptScore W239795587C111919701 @default.
- W239795587 hasConceptScore W239795587C115961682 @default.
- W239795587 hasConceptScore W239795587C119857082 @default.
- W239795587 hasConceptScore W239795587C121332964 @default.
- W239795587 hasConceptScore W239795587C13280743 @default.
- W239795587 hasConceptScore W239795587C154945302 @default.
- W239795587 hasConceptScore W239795587C166957645 @default.
- W239795587 hasConceptScore W239795587C17744445 @default.
- W239795587 hasConceptScore W239795587C185592680 @default.
- W239795587 hasConceptScore W239795587C185798385 @default.
- W239795587 hasConceptScore W239795587C198531522 @default.
- W239795587 hasConceptScore W239795587C199539241 @default.
- W239795587 hasConceptScore W239795587C205649164 @default.
- W239795587 hasConceptScore W239795587C2776359362 @default.
- W239795587 hasConceptScore W239795587C2779343474 @default.
- W239795587 hasConceptScore W239795587C2780791683 @default.
- W239795587 hasConceptScore W239795587C33566652 @default.
- W239795587 hasConceptScore W239795587C41008148 @default.
- W239795587 hasConceptScore W239795587C43617362 @default.
- W239795587 hasConceptScore W239795587C62520636 @default.
- W239795587 hasConceptScore W239795587C94625758 @default.
- W239795587 hasConceptScore W239795587C97541855 @default.
- W239795587 hasIssue "2" @default.
- W239795587 hasLocation W2397955871 @default.
- W239795587 hasOpenAccess W239795587 @default.
- W239795587 hasPrimaryLocation W2397955871 @default.
- W239795587 hasRelatedWork W1485630101 @default.
- W239795587 hasRelatedWork W2923653485 @default.
- W239795587 hasRelatedWork W3022038857 @default.
- W239795587 hasRelatedWork W3088315509 @default.
- W239795587 hasRelatedWork W3095449511 @default.
- W239795587 hasRelatedWork W3132110306 @default.
- W239795587 hasRelatedWork W3147214434 @default.
- W239795587 hasRelatedWork W3156313910 @default.
- W239795587 hasRelatedWork W4296474751 @default.
- W239795587 hasRelatedWork W4319083788 @default.
- W239795587 hasVolume "11" @default.
- W239795587 isParatext "false" @default.
- W239795587 isRetracted "false" @default.
- W239795587 magId "239795587" @default.
- W239795587 workType "article" @default.