Matches in SemOpenAlex for { <https://semopenalex.org/work/W3203443004> ?p ?o ?g. }
- W3203443004 abstract "Offline reinforcement learning is used to train policies in scenarios where real-time access to the environment is expensive or impossible. As a natural consequence of these harsh conditions, an agent may lack the resources to fully observe the online environment before taking an action. We dub this situation the resource-constrained setting. This leads to situations where the offline dataset (available for training) can contain fully processed features (using powerful language models, image models, complex sensors, etc.) which are not available when actions are actually taken online. This disconnect leads to an interesting and unexplored problem in offline RL: Is it possible to use a richly processed offline dataset to train a policy which has access to fewer features in the online environment? In this work, we introduce and formalize this novel resource-constrained problem setting. We highlight the performance gap between policies trained using the full offline dataset and policies trained using limited features. We address this performance gap with a policy transfer algorithm which first trains a teacher agent using the offline dataset where features are fully available, and then transfers this knowledge to a student agent that only uses the resource-constrained features. To better capture the challenge of this setting, we propose a data collection procedure: Resource Constrained-Datasets for RL (RC-D4RL). We evaluate our transfer algorithm on RC-D4RL and the popular D4RL benchmarks and observe consistent improvement over the baseline (TD3+BC without transfer). The code for the experiments is available at https://github.com/JayanthRR/RC-OfflineRL." @default.
- W3203443004 created "2021-10-11" @default.
- W3203443004 creator A5017906439 @default.
- W3203443004 creator A5029672764 @default.
- W3203443004 creator A5035432690 @default.
- W3203443004 creator A5046845999 @default.
- W3203443004 creator A5059566652 @default.
- W3203443004 creator A5086993787 @default.
- W3203443004 date "2021-10-07" @default.
- W3203443004 modified "2023-09-27" @default.
- W3203443004 title "Offline RL With Resource Constrained Online Deployment." @default.
- W3203443004 cites W1510402218 @default.
- W3203443004 cites W1757796397 @default.
- W3203443004 cites W192920577 @default.
- W3203443004 cites W2011418219 @default.
- W3203443004 cites W2079247031 @default.
- W3203443004 cites W2097381042 @default.
- W3203443004 cites W2153353285 @default.
- W3203443004 cites W2158782408 @default.
- W3203443004 cites W2168359464 @default.
- W3203443004 cites W2173248099 @default.
- W3203443004 cites W2173379916 @default.
- W3203443004 cites W2212660284 @default.
- W3203443004 cites W2257979135 @default.
- W3203443004 cites W2897337230 @default.
- W3203443004 cites W2944804155 @default.
- W3203443004 cites W2945670763 @default.
- W3203443004 cites W2947150733 @default.
- W3203443004 cites W2949600457 @default.
- W3203443004 cites W2952099252 @default.
- W3203443004 cites W2962902376 @default.
- W3203443004 cites W2962923909 @default.
- W3203443004 cites W2963120839 @default.
- W3203443004 cites W2963704132 @default.
- W3203443004 cites W2964043796 @default.
- W3203443004 cites W2971262355 @default.
- W3203443004 cites W2991355586 @default.
- W3203443004 cites W2995894173 @default.
- W3203443004 cites W3016525976 @default.
- W3203443004 cites W3022566517 @default.
- W3203443004 cites W3025606523 @default.
- W3203443004 cites W3033324992 @default.
- W3203443004 cites W3034607397 @default.
- W3203443004 cites W3043057128 @default.
- W3203443004 cites W3085267010 @default.
- W3203443004 cites W3085735333 @default.
- W3203443004 cites W3128018413 @default.
- W3203443004 cites W3133362425 @default.
- W3203443004 cites W3143499011 @default.
- W3203443004 cites W3162450516 @default.
- W3203443004 cites W3166795773 @default.
- W3203443004 cites W3167222229 @default.
- W3203443004 cites W3169291081 @default.
- W3203443004 cites W3170016383 @default.
- W3203443004 cites W3172360140 @default.
- W3203443004 cites W3177145475 @default.
- W3203443004 hasPublicationYear "2021" @default.
- W3203443004 type Work @default.
- W3203443004 sameAs 3203443004 @default.
- W3203443004 citedByCount "0" @default.
- W3203443004 crossrefType "posted-content" @default.
- W3203443004 hasAuthorship W3203443004A5017906439 @default.
- W3203443004 hasAuthorship W3203443004A5029672764 @default.
- W3203443004 hasAuthorship W3203443004A5035432690 @default.
- W3203443004 hasAuthorship W3203443004A5046845999 @default.
- W3203443004 hasAuthorship W3203443004A5059566652 @default.
- W3203443004 hasAuthorship W3203443004A5086993787 @default.
- W3203443004 hasConcept C105339364 @default.
- W3203443004 hasConcept C111919701 @default.
- W3203443004 hasConcept C11413529 @default.
- W3203443004 hasConcept C115903868 @default.
- W3203443004 hasConcept C119857082 @default.
- W3203443004 hasConcept C120314980 @default.
- W3203443004 hasConcept C150899416 @default.
- W3203443004 hasConcept C154945302 @default.
- W3203443004 hasConcept C177264268 @default.
- W3203443004 hasConcept C190839683 @default.
- W3203443004 hasConcept C196921405 @default.
- W3203443004 hasConcept C199360897 @default.
- W3203443004 hasConcept C205649164 @default.
- W3203443004 hasConcept C206345919 @default.
- W3203443004 hasConcept C2776760102 @default.
- W3203443004 hasConcept C2780102126 @default.
- W3203443004 hasConcept C31258907 @default.
- W3203443004 hasConcept C41008148 @default.
- W3203443004 hasConcept C58640448 @default.
- W3203443004 hasConcept C97541855 @default.
- W3203443004 hasConceptScore W3203443004C105339364 @default.
- W3203443004 hasConceptScore W3203443004C111919701 @default.
- W3203443004 hasConceptScore W3203443004C11413529 @default.
- W3203443004 hasConceptScore W3203443004C115903868 @default.
- W3203443004 hasConceptScore W3203443004C119857082 @default.
- W3203443004 hasConceptScore W3203443004C120314980 @default.
- W3203443004 hasConceptScore W3203443004C150899416 @default.
- W3203443004 hasConceptScore W3203443004C154945302 @default.
- W3203443004 hasConceptScore W3203443004C177264268 @default.
- W3203443004 hasConceptScore W3203443004C190839683 @default.
- W3203443004 hasConceptScore W3203443004C196921405 @default.
- W3203443004 hasConceptScore W3203443004C199360897 @default.
- W3203443004 hasConceptScore W3203443004C205649164 @default.