Matches in SemOpenAlex for { <https://semopenalex.org/work/W2945950052> ?p ?o ?g. }
Showing items 1 to 68 of 68, with 100 items per page.
- W2945950052 abstract "Abstract Learning from human demonstration (LfD), among many speedup techniques for reinforcement learning (RL), has seen many successful applications. We consider one LfD technique called human–agent transfer (HAT), where a model of the human demonstrator’s decision function is induced via supervised learning and used as an initial bias for RL. Some recent work in LfD has investigated learning from observations only, that is, when only the demonstrator’s states (and not its actions) are available to the learner. Since the demonstrator’s actions are treated as labels for HAT, supervised learning becomes untenable in their absence. We adapt the idea of learning an inverse dynamics model from the data acquired by the learner’s interactions with the environment and deploy it to fill in the missing actions of the demonstrator. The resulting version of HAT—called state-only HAT (SoHAT) —is experimentally shown to preserve some advantages of HAT in benchmark domains with both discrete and continuous actions. This paper also establishes principled modifications of an existing baseline algorithm—called A3C—to create its HAT and SoHAT variants that are used in our experiments." @default.
- W2945950052 created "2019-05-29" @default.
- W2945950052 creator A5023711364 @default.
- W2945950052 creator A5025954824 @default.
- W2945950052 date "2020-11-27" @default.
- W2945950052 modified "2023-09-27" @default.
- W2945950052 title "Human–agent transfer from observations" @default.
- W2945950052 cites W1971890413 @default.
- W2945950052 cites W1986014385 @default.
- W2945950052 cites W2056584142 @default.
- W2945950052 cites W2063471043 @default.
- W2945950052 cites W2115668428 @default.
- W2945950052 cites W2119717200 @default.
- W2945950052 cites W2145339207 @default.
- W2945950052 cites W2257979135 @default.
- W2945950052 cites W2296673577 @default.
- W2945950052 cites W2580475959 @default.
- W2945950052 cites W2740302738 @default.
- W2945950052 cites W2751530711 @default.
- W2945950052 cites W2921955147 @default.
- W2945950052 cites W2963802910 @default.
- W2945950052 doi "https://doi.org/10.1017/s0269888920000387" @default.
- W2945950052 hasPublicationYear "2020" @default.
- W2945950052 type Work @default.
- W2945950052 sameAs 2945950052 @default.
- W2945950052 citedByCount "0" @default.
- W2945950052 crossrefType "journal-article" @default.
- W2945950052 hasAuthorship W2945950052A5023711364 @default.
- W2945950052 hasAuthorship W2945950052A5025954824 @default.
- W2945950052 hasBestOaLocation W29459500521 @default.
- W2945950052 hasConcept C111919701 @default.
- W2945950052 hasConcept C119857082 @default.
- W2945950052 hasConcept C13280743 @default.
- W2945950052 hasConcept C150899416 @default.
- W2945950052 hasConcept C154945302 @default.
- W2945950052 hasConcept C185798385 @default.
- W2945950052 hasConcept C205649164 @default.
- W2945950052 hasConcept C41008148 @default.
- W2945950052 hasConcept C68339613 @default.
- W2945950052 hasConcept C97541855 @default.
- W2945950052 hasConceptScore W2945950052C111919701 @default.
- W2945950052 hasConceptScore W2945950052C119857082 @default.
- W2945950052 hasConceptScore W2945950052C13280743 @default.
- W2945950052 hasConceptScore W2945950052C150899416 @default.
- W2945950052 hasConceptScore W2945950052C154945302 @default.
- W2945950052 hasConceptScore W2945950052C185798385 @default.
- W2945950052 hasConceptScore W2945950052C205649164 @default.
- W2945950052 hasConceptScore W2945950052C41008148 @default.
- W2945950052 hasConceptScore W2945950052C68339613 @default.
- W2945950052 hasConceptScore W2945950052C97541855 @default.
- W2945950052 hasLocation W29459500521 @default.
- W2945950052 hasOpenAccess W2945950052 @default.
- W2945950052 hasPrimaryLocation W29459500521 @default.
- W2945950052 hasRelatedWork W2053732522 @default.
- W2945950052 hasRelatedWork W2348739446 @default.
- W2945950052 hasRelatedWork W3018421652 @default.
- W2945950052 hasRelatedWork W3022038857 @default.
- W2945950052 hasRelatedWork W4225907548 @default.
- W2945950052 hasRelatedWork W4281382123 @default.
- W2945950052 hasRelatedWork W4288040045 @default.
- W2945950052 hasRelatedWork W4308233397 @default.
- W2945950052 hasRelatedWork W4308262314 @default.
- W2945950052 hasRelatedWork W4319083788 @default.
- W2945950052 hasVolume "36" @default.
- W2945950052 isParatext "false" @default.
- W2945950052 isRetracted "false" @default.
- W2945950052 magId "2945950052" @default.
- W2945950052 workType "article" @default.