Matches in SemOpenAlex for { <https://semopenalex.org/work/W2783154620> ?p ?o ?g. }
Showing items 1 to 86 of 86 with 100 items per page.
- W2783154620 abstract "This paper considers the problem of inverse reinforcement learning in zero-sum stochastic games when expert demonstrations are known to be not optimal. Compared to previous works that decouple agents in the game by assuming optimality in expert strategies, we introduce a new objective function that directly pits experts against Nash Equilibrium strategies, and we design an algorithm to solve for the reward function in the context of inverse reinforcement learning with deep neural networks as model approximations. In our setting the model and algorithm do not decouple by agent. In order to find Nash Equilibrium in large-scale games, we also propose an adversarial training algorithm for zero-sum stochastic games, and show the theoretical appeal of non-existence of local optima in its objective function. In our numerical experiments, we demonstrate that our Nash Equilibrium and inverse reinforcement learning algorithms address games that are not amenable to previous approaches using tabular representations. Moreover, with sub-optimal expert demonstrations our algorithms recover both reward functions and strategies with good quality." @default.
- W2783154620 created "2018-01-26" @default.
- W2783154620 creator A5013049879 @default.
- W2783154620 creator A5021376482 @default.
- W2783154620 date "2018-01-07" @default.
- W2783154620 modified "2023-09-27" @default.
- W2783154620 title "Competitive Multi-agent Inverse Reinforcement Learning with Sub-optimal Demonstrations" @default.
- W2783154620 cites W1522301498 @default.
- W2783154620 cites W1533809857 @default.
- W2783154620 cites W1999874108 @default.
- W2783154620 cites W2069195348 @default.
- W2783154620 cites W2098774185 @default.
- W2783154620 cites W2129241738 @default.
- W2783154620 cites W2290104316 @default.
- W2783154620 cites W2434014514 @default.
- W2783154620 cites W2507470618 @default.
- W2783154620 cites W2736601468 @default.
- W2783154620 cites W2753984105 @default.
- W2783154620 cites W2789386227 @default.
- W2783154620 cites W2962957031 @default.
- W2783154620 cites W38129364 @default.
- W2783154620 hasPublicationYear "2018" @default.
- W2783154620 type Work @default.
- W2783154620 sameAs 2783154620 @default.
- W2783154620 citedByCount "1" @default.
- W2783154620 countsByYear W27831546202021 @default.
- W2783154620 crossrefType "posted-content" @default.
- W2783154620 hasAuthorship W2783154620A5013049879 @default.
- W2783154620 hasAuthorship W2783154620A5021376482 @default.
- W2783154620 hasConcept C126255220 @default.
- W2783154620 hasConcept C136356330 @default.
- W2783154620 hasConcept C14036430 @default.
- W2783154620 hasConcept C151730666 @default.
- W2783154620 hasConcept C154945302 @default.
- W2783154620 hasConcept C207467116 @default.
- W2783154620 hasConcept C2524010 @default.
- W2783154620 hasConcept C2779343474 @default.
- W2783154620 hasConcept C32407928 @default.
- W2783154620 hasConcept C33923547 @default.
- W2783154620 hasConcept C41008148 @default.
- W2783154620 hasConcept C46814582 @default.
- W2783154620 hasConcept C78458016 @default.
- W2783154620 hasConcept C86803240 @default.
- W2783154620 hasConcept C97541855 @default.
- W2783154620 hasConceptScore W2783154620C126255220 @default.
- W2783154620 hasConceptScore W2783154620C136356330 @default.
- W2783154620 hasConceptScore W2783154620C14036430 @default.
- W2783154620 hasConceptScore W2783154620C151730666 @default.
- W2783154620 hasConceptScore W2783154620C154945302 @default.
- W2783154620 hasConceptScore W2783154620C207467116 @default.
- W2783154620 hasConceptScore W2783154620C2524010 @default.
- W2783154620 hasConceptScore W2783154620C2779343474 @default.
- W2783154620 hasConceptScore W2783154620C32407928 @default.
- W2783154620 hasConceptScore W2783154620C33923547 @default.
- W2783154620 hasConceptScore W2783154620C41008148 @default.
- W2783154620 hasConceptScore W2783154620C46814582 @default.
- W2783154620 hasConceptScore W2783154620C78458016 @default.
- W2783154620 hasConceptScore W2783154620C86803240 @default.
- W2783154620 hasConceptScore W2783154620C97541855 @default.
- W2783154620 hasLocation W27831546201 @default.
- W2783154620 hasOpenAccess W2783154620 @default.
- W2783154620 hasPrimaryLocation W27831546201 @default.
- W2783154620 hasRelatedWork W1966224968 @default.
- W2783154620 hasRelatedWork W2180467047 @default.
- W2783154620 hasRelatedWork W2221312193 @default.
- W2783154620 hasRelatedWork W2897975793 @default.
- W2783154620 hasRelatedWork W2902138440 @default.
- W2783154620 hasRelatedWork W2910246453 @default.
- W2783154620 hasRelatedWork W2941095974 @default.
- W2783154620 hasRelatedWork W2964098908 @default.
- W2783154620 hasRelatedWork W2989169948 @default.
- W2783154620 hasRelatedWork W3012430540 @default.
- W2783154620 hasRelatedWork W3042859406 @default.
- W2783154620 hasRelatedWork W3082580287 @default.
- W2783154620 hasRelatedWork W3093099641 @default.
- W2783154620 hasRelatedWork W3094530921 @default.
- W2783154620 hasRelatedWork W3105688066 @default.
- W2783154620 hasRelatedWork W3109571516 @default.
- W2783154620 hasRelatedWork W3110227966 @default.
- W2783154620 hasRelatedWork W3110979110 @default.
- W2783154620 hasRelatedWork W3195418377 @default.
- W2783154620 hasRelatedWork W3207544809 @default.
- W2783154620 isParatext "false" @default.
- W2783154620 isRetracted "false" @default.
- W2783154620 magId "2783154620" @default.
- W2783154620 workType "article" @default.
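
The rows above all share one shape: subject, predicate, object, then the graph name after `@`. As a minimal sketch, assuming that shape holds for every row, the listing can be parsed into records and grouped by predicate. The names `parse_listing_line` and `group_by_predicate` are hypothetical helpers written for this illustration, not part of any SemOpenAlex tooling.

```python
import re

# Assumed row shape, taken from the listing above:
#   - <subject> <predicate> <object> @<graph>.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\w+)\.$')


def parse_listing_line(line):
    """Parse one listing row into a dict, or return None if it is not a row."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    # Objects wrapped in double quotes are literals; strip the quotes.
    is_literal = len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"'
    if is_literal:
        obj = obj[1:-1]
    return {
        "subject": subject,
        "predicate": predicate,
        "object": obj,
        "graph": graph,
        "literal": is_literal,
    }


def group_by_predicate(lines):
    """Collect the objects of all parsed rows under their predicate."""
    grouped = {}
    for line in lines:
        row = parse_listing_line(line)
        if row is not None:
            grouped.setdefault(row["predicate"], []).append(row["object"])
    return grouped


# A few rows copied from the listing above.
sample = [
    'Showing items 1 to 86 of',
    '- W2783154620 created "2018-01-26" @default.',
    '- W2783154620 cites W1522301498 @default.',
    '- W2783154620 cites W1533809857 @default.',
    '- W2783154620 type Work @default.',
]
grouped = group_by_predicate(sample)
print(grouped["cites"])
```

Header lines that do not match the row shape are skipped instead of raising an error, so the whole page text can be passed in unchanged.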