SemOpenAlex

Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387390015> ?p ?o ?g. }
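
The pattern above fixes the subject to the work IRI and leaves the predicate, object, and graph as variables. A minimal sketch of building a comparable SPARQL query string in Python follows. The helper name `build_work_query` is hypothetical, and the query assumes the statements live in the default graph, as the `@default` markers in the listing suggest. The sketch only builds the string and does not contact any endpoint.

```python
def build_work_query(work_iri: str) -> str:
    """Return a SPARQL SELECT query listing every predicate/object of work_iri.

    Hypothetical helper for illustration only. Assumes the statements are in
    the default graph, so no GRAPH clause is used.
    """
    # Reject values that could not be written safely between angle brackets.
    if not work_iri.startswith(("http://", "https://")):
        raise ValueError(f"not an absolute HTTP(S) IRI: {work_iri!r}")
    if any(ch in work_iri for ch in "<> \"\n"):
        raise ValueError(f"IRI contains a forbidden character: {work_iri!r}")
    # Doubled braces are literal braces inside an f-string.
    return f"SELECT ?p ?o WHERE {{ <{work_iri}> ?p ?o . }}"


query = build_work_query("https://semopenalex.org/work/W4387390015")
print(query)
```

The resulting string could be sent to whichever SPARQL endpoint serves this data; the endpoint address is deliberately left out because the listing does not state it.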

W4387390015 abstract "Recently, people have shown that large-scale pre-training from internet-scale data is the key to building generalist models, as witnessed in NLP. To build embodied generalist agents, we and many other researchers hypothesize that such foundation prior is also an indispensable component. However, it is unclear what is the proper concrete form to represent those embodied foundation priors and how they should be used in the downstream task. In this paper, we propose an intuitive and effective set of embodied priors that consist of foundation policy, value, and success reward. The proposed priors are based on the goal-conditioned MDP. To verify their effectiveness, we instantiate an actor-critic method assisted by the priors, called Foundation Actor-Critic (FAC). We name our framework as Foundation Reinforcement Learning (FRL), since it completely relies on embodied foundation priors to explore, learn and reinforce. The benefits of FRL are threefold. (1) Sample efficient. With foundation priors, FAC learns significantly faster than traditional RL. Our evaluation on the Meta-World has proved that FAC can achieve 100% success rates for 7/8 tasks under less than 200k frames, which outperforms the baseline method with careful manual-designed rewards under 1M frames. (2) Robust to noisy priors. Our method tolerates the unavoidable noise in embodied foundation models. We show that FAC works well even under heavy noise or quantization errors. (3) Minimal human intervention: FAC completely learns from the foundation priors, without the need of human-specified dense reward, or providing teleoperated demos. Thus, FAC can be easily scaled up. We believe our FRL framework could enable the future robot to autonomously explore and learn without human intervention in the physical world. In summary, our proposed FRL is a novel and powerful learning paradigm, towards achieving embodied generalist agents." @default.
W4387390015 created "2023-10-06" @default.
W4387390015 creator A5004393324 @default.
W4387390015 creator A5030471578 @default.
W4387390015 creator A5044754993 @default.
W4387390015 creator A5047427786 @default.
W4387390015 creator A5049349154 @default.
W4387390015 creator A5050818431 @default.
W4387390015 creator A5054970130 @default.
W4387390015 date "2023-10-04" @default.
W4387390015 modified "2023-10-07" @default.
W4387390015 title "Foundation Reinforcement Learning: towards Embodied Generalist Agents with Foundation Prior Assistance" @default.
W4387390015 doi "https://doi.org/10.48550/arxiv.2310.02635" @default.
W4387390015 hasPublicationYear "2023" @default.
W4387390015 type Work @default.
W4387390015 citedByCount "0" @default.
W4387390015 crossrefType "posted-content" @default.
W4387390015 hasAuthorship W4387390015A5004393324 @default.
W4387390015 hasAuthorship W4387390015A5030471578 @default.
W4387390015 hasAuthorship W4387390015A5044754993 @default.
W4387390015 hasAuthorship W4387390015A5047427786 @default.
W4387390015 hasAuthorship W4387390015A5049349154 @default.
W4387390015 hasAuthorship W4387390015A5050818431 @default.
W4387390015 hasAuthorship W4387390015A5054970130 @default.
W4387390015 hasBestOaLocation W43873900151 @default.
W4387390015 hasConcept C100609095 @default.
W4387390015 hasConcept C107673813 @default.
W4387390015 hasConcept C119857082 @default.
W4387390015 hasConcept C154945302 @default.
W4387390015 hasConcept C166957645 @default.
W4387390015 hasConcept C177769412 @default.
W4387390015 hasConcept C2780966255 @default.
W4387390015 hasConcept C41008148 @default.
W4387390015 hasConcept C95457728 @default.
W4387390015 hasConcept C97541855 @default.
W4387390015 hasConceptScore W4387390015C100609095 @default.
W4387390015 hasConceptScore W4387390015C107673813 @default.
W4387390015 hasConceptScore W4387390015C119857082 @default.
W4387390015 hasConceptScore W4387390015C154945302 @default.
W4387390015 hasConceptScore W4387390015C166957645 @default.
W4387390015 hasConceptScore W4387390015C177769412 @default.
W4387390015 hasConceptScore W4387390015C2780966255 @default.
W4387390015 hasConceptScore W4387390015C41008148 @default.
W4387390015 hasConceptScore W4387390015C95457728 @default.
W4387390015 hasConceptScore W4387390015C97541855 @default.
W4387390015 hasLocation W43873900151 @default.
W4387390015 hasOpenAccess W4387390015 @default.
W4387390015 hasPrimaryLocation W43873900151 @default.
W4387390015 hasRelatedWork W2091233881 @default.
W4387390015 hasRelatedWork W2124102101 @default.
W4387390015 hasRelatedWork W2333383158 @default.
W4387390015 hasRelatedWork W2352366064 @default.
W4387390015 hasRelatedWork W2380179524 @default.
W4387390015 hasRelatedWork W2754826905 @default.
W4387390015 hasRelatedWork W3166536154 @default.
W4387390015 hasRelatedWork W4250305970 @default.
W4387390015 hasRelatedWork W4283365723 @default.
W4387390015 hasRelatedWork W1570928019 @default.
W4387390015 isParatext "false" @default.
W4387390015 isRetracted "false" @default.
W4387390015 workType "article" @default.
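
Each row in the listing above reads as subject, predicate, object, and graph. The following is a minimal sketch, in Python, of splitting such rows into fields. The row layout is inferred from this listing alone and is not a documented serialization, and the names `Row`, `ROW_RE`, and `parse_row` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Row(NamedTuple):
    subject: str
    predicate: str
    obj: str
    graph: str


# Assumed layout: "<subject> <predicate> <object> @<graph>."
# The object may be a quoted literal that itself contains spaces.
ROW_RE = re.compile(r'^(\S+)\s+(\S+)\s+(.+?)\s+@(\S+?)\.$')


def parse_row(line: str) -> Optional[Row]:
    """Split one listing row into its four fields, or return None if it does not fit."""
    match = ROW_RE.match(line.strip())
    if match is None:
        return None
    subject, predicate, obj, graph = match.groups()
    # Strip the surrounding double quotes from literal values.
    if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
        obj = obj[1:-1]
    return Row(subject, predicate, obj, graph)


print(parse_row('W4387390015 created "2023-10-06" @default.'))
print(parse_row('W4387390015 type Work @default.'))
```

Lines that do not follow the assumed layout, such as the page heading, yield `None` rather than a partial row.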