Matches in SemOpenAlex for { <https://semopenalex.org/work/W2803761331> ?p ?o ?g. }
- W2803761331 abstract "Agents trained in simulation may make errors in the real world due to mismatches between training and execution environments. These mistakes can be dangerous and difficult to discover because the agent cannot predict them a priori. We propose using oracle feedback to learn a predictive model of these blind spots to reduce costly errors in real-world applications. We focus on blind spots in reinforcement learning (RL) that occur due to incomplete state representation: The agent does not have the appropriate features to represent the true state of the world and thus cannot distinguish among numerous states. We formalize the problem of discovering blind spots in RL as a noisy supervised learning problem with class imbalance. We learn models to predict blind spots in unseen regions of the state space by combining techniques for label aggregation, calibration, and supervised learning. The models take into consideration noise emerging from different forms of oracle feedback, including demonstrations and corrections. We evaluate our approach on two domains and show that it achieves higher predictive performance than baseline methods, and that the learned model can be used to selectively query an oracle at execution time to prevent errors. We also empirically analyze the biases of various feedback types and how they influence the discovery of blind spots." @default.
- W2803761331 created "2018-06-01" @default.
- W2803761331 creator A5020666821 @default.
- W2803761331 creator A5028114802 @default.
- W2803761331 creator A5029144357 @default.
- W2803761331 creator A5043228682 @default.
- W2803761331 creator A5044369720 @default.
- W2803761331 date "2018-05-23" @default.
- W2803761331 modified "2023-09-27" @default.
- W2803761331 title "Discovering Blind Spots in Reinforcement Learning" @default.
- W2803761331 cites W1543648998 @default.
- W2803761331 cites W1845972764 @default.
- W2803761331 cites W1986014385 @default.
- W2803761331 cites W2097381042 @default.
- W2803761331 cites W2098441518 @default.
- W2803761331 cites W2122646361 @default.
- W2803761331 cites W2124695578 @default.
- W2803761331 cites W2142261479 @default.
- W2803761331 cites W2156869222 @default.
- W2803761331 cites W2165698076 @default.
- W2803761331 cites W2280163991 @default.
- W2803761331 cites W2335959470 @default.
- W2803761331 cites W2440926996 @default.
- W2803761331 cites W2530944449 @default.
- W2803761331 cites W2583689529 @default.
- W2803761331 cites W2586067474 @default.
- W2803761331 cites W2590953969 @default.
- W2803761331 cites W2626804490 @default.
- W2803761331 cites W2736629007 @default.
- W2803761331 cites W2952606116 @default.
- W2803761331 cites W2964059111 @default.
- W2803761331 cites W9014458 @default.
- W2803761331 hasPublicationYear "2018" @default.
- W2803761331 type Work @default.
- W2803761331 sameAs 2803761331 @default.
- W2803761331 citedByCount "7" @default.
- W2803761331 countsByYear W28037613312018 @default.
- W2803761331 countsByYear W28037613312019 @default.
- W2803761331 countsByYear W28037613312020 @default.
- W2803761331 countsByYear W28037613312021 @default.
- W2803761331 countsByYear W28037613312022 @default.
- W2803761331 crossrefType "posted-content" @default.
- W2803761331 hasAuthorship W2803761331A5020666821 @default.
- W2803761331 hasAuthorship W2803761331A5028114802 @default.
- W2803761331 hasAuthorship W2803761331A5029144357 @default.
- W2803761331 hasAuthorship W2803761331A5043228682 @default.
- W2803761331 hasAuthorship W2803761331A5044369720 @default.
- W2803761331 hasConcept C111472728 @default.
- W2803761331 hasConcept C115903868 @default.
- W2803761331 hasConcept C119857082 @default.
- W2803761331 hasConcept C120665830 @default.
- W2803761331 hasConcept C121332964 @default.
- W2803761331 hasConcept C138885662 @default.
- W2803761331 hasConcept C154945302 @default.
- W2803761331 hasConcept C17744445 @default.
- W2803761331 hasConcept C192209626 @default.
- W2803761331 hasConcept C199539241 @default.
- W2803761331 hasConcept C2776359362 @default.
- W2803761331 hasConcept C41008148 @default.
- W2803761331 hasConcept C55166926 @default.
- W2803761331 hasConcept C64731932 @default.
- W2803761331 hasConcept C75553542 @default.
- W2803761331 hasConcept C94625758 @default.
- W2803761331 hasConcept C97541855 @default.
- W2803761331 hasConceptScore W2803761331C111472728 @default.
- W2803761331 hasConceptScore W2803761331C115903868 @default.
- W2803761331 hasConceptScore W2803761331C119857082 @default.
- W2803761331 hasConceptScore W2803761331C120665830 @default.
- W2803761331 hasConceptScore W2803761331C121332964 @default.
- W2803761331 hasConceptScore W2803761331C138885662 @default.
- W2803761331 hasConceptScore W2803761331C154945302 @default.
- W2803761331 hasConceptScore W2803761331C17744445 @default.
- W2803761331 hasConceptScore W2803761331C192209626 @default.
- W2803761331 hasConceptScore W2803761331C199539241 @default.
- W2803761331 hasConceptScore W2803761331C2776359362 @default.
- W2803761331 hasConceptScore W2803761331C41008148 @default.
- W2803761331 hasConceptScore W2803761331C55166926 @default.
- W2803761331 hasConceptScore W2803761331C64731932 @default.
- W2803761331 hasConceptScore W2803761331C75553542 @default.
- W2803761331 hasConceptScore W2803761331C94625758 @default.
- W2803761331 hasConceptScore W2803761331C97541855 @default.
- W2803761331 hasLocation W28037613311 @default.
- W2803761331 hasOpenAccess W2803761331 @default.
- W2803761331 hasPrimaryLocation W28037613311 @default.
- W2803761331 hasRelatedWork W2128786740 @default.
- W2803761331 hasRelatedWork W2558634851 @default.
- W2803761331 hasRelatedWork W2589501981 @default.
- W2803761331 hasRelatedWork W2804837173 @default.
- W2803761331 hasRelatedWork W2889990052 @default.
- W2803761331 hasRelatedWork W2902567911 @default.
- W2803761331 hasRelatedWork W2903994599 @default.
- W2803761331 hasRelatedWork W2945115303 @default.
- W2803761331 hasRelatedWork W2954142106 @default.
- W2803761331 hasRelatedWork W2962882442 @default.
- W2803761331 hasRelatedWork W2963391602 @default.
- W2803761331 hasRelatedWork W2975503305 @default.
- W2803761331 hasRelatedWork W2982593362 @default.
- W2803761331 hasRelatedWork W3034568050 @default.
- W2803761331 hasRelatedWork W3045585575 @default.
- W2803761331 hasRelatedWork W3134032827 @default.
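
The rows above all follow the shape `- <subject> <predicate> <object> @<graph>.`, so they can be grouped by predicate mechanically. The following is a minimal sketch of that grouping. It assumes the row shape is exactly what this listing shows (it is inferred from the page, not taken from any official SemOpenAlex serialization), and the names `parse_match_line` and `group_by_predicate` are hypothetical helpers introduced only for illustration.

```python
import re
from collections import defaultdict

# Assumed row shape, inferred from the listing above:
#   - <subject> <predicate> <object> @<graph>.
# The object may be a bare identifier or a double-quoted literal.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\s+@(\w+)\.$')


def parse_match_line(line):
    """Return (subject, predicate, object, graph), or None if the line is not a row."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, obj, graph = m.groups()
    # Drop the surrounding quotes from literal values such as titles and dates.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return subject, predicate, obj, graph


def group_by_predicate(lines):
    """Collect object values per predicate, skipping lines that do not parse."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_match_line(line)
        if parsed is not None:
            _, predicate, obj, _ = parsed
            grouped[predicate].append(obj)
    return dict(grouped)


# A few rows copied from the listing above.
sample = [
    'Matches in SemOpenAlex for { <https://semopenalex.org/work/W2803761331> ?p ?o ?g. }',
    '- W2803761331 title "Discovering Blind Spots in Reinforcement Learning" @default.',
    '- W2803761331 cites W1543648998 @default.',
    '- W2803761331 cites W9014458 @default.',
    '- W2803761331 citedByCount "7" @default.',
]
grouped = group_by_predicate(sample)
```

With the sample rows, `grouped["cites"]` holds the two cited work identifiers, and the header line is skipped because it does not start with `- `.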