Matches in SemOpenAlex for { <https://semopenalex.org/work/W76312321> ?p ?o ?g. }
Showing items 1 to 97 of 97 with 100 items per page.
- W76312321 endingPage "361" @default.
- W76312321 startingPage "352" @default.
- W76312321 abstract "Sequential decision tasks with incomplete information are characterized by the exploration problem; namely the trade-off between further exploration for learning more about the environment and immediate exploitation of the accrued information for decision-making. Within artificial intelligence, there has been an increasing interest in studying planning-while-learning algorithms for these decision tasks. In this paper we focus on the exploration problem in reinforcement learning and Q-learning in particular. The existing exploration strategies for Q-learning are of a heuristic nature and they exhibit limited scalability in tasks with large (or infinite) state and action spaces. Efficient experimentation is needed for resolving uncertainties when possible plans are compared (i.e. exploration). The experimentation should be sufficient for selecting with statistical significance a locally optimal plan (i.e. exploitation). For this purpose, we develop a probabilistic hill-climbing algorithm that uses a statistical selection procedure to decide how much exploration is needed for selecting a plan which is, with arbitrarily high probability, arbitrarily close to a locally optimal one. Due to its generality the algorithm can be employed for the exploration strategy of robust Q-learning. An experiment on a relatively complex control task shows that the proposed exploration strategy performs better than a typical exploration strategy." @default.
- W76312321 created "2016-06-24" @default.
- W76312321 creator A5080418197 @default.
- W76312321 date "1995-08-18" @default.
- W76312321 modified "2023-09-23" @default.
- W76312321 title "Probabilistic exploration in planning while learning" @default.
- W76312321 cites W1491843047 @default.
- W76312321 cites W1504212531 @default.
- W76312321 cites W1530998306 @default.
- W76312321 cites W1567249951 @default.
- W76312321 cites W1586504939 @default.
- W76312321 cites W1588206936 @default.
- W76312321 cites W1610678877 @default.
- W76312321 cites W1931792391 @default.
- W76312321 cites W1977567824 @default.
- W76312321 cites W1982997797 @default.
- W76312321 cites W1987780311 @default.
- W76312321 cites W2008140633 @default.
- W76312321 cites W203646419 @default.
- W76312321 cites W2066720893 @default.
- W76312321 cites W2076125241 @default.
- W76312321 cites W2103626435 @default.
- W76312321 cites W2127931805 @default.
- W76312321 cites W2132347775 @default.
- W76312321 cites W2132560759 @default.
- W76312321 cites W2341171179 @default.
- W76312321 cites W23603061 @default.
- W76312321 cites W2913060459 @default.
- W76312321 cites W2999780717 @default.
- W76312321 cites W3011120880 @default.
- W76312321 hasPublicationYear "1995" @default.
- W76312321 type Work @default.
- W76312321 sameAs 76312321 @default.
- W76312321 citedByCount "6" @default.
- W76312321 countsByYear W763123212012 @default.
- W76312321 countsByYear W763123212013 @default.
- W76312321 crossrefType "proceedings-article" @default.
- W76312321 hasAuthorship W76312321A5080418197 @default.
- W76312321 hasConcept C119857082 @default.
- W76312321 hasConcept C127413603 @default.
- W76312321 hasConcept C135450995 @default.
- W76312321 hasConcept C154945302 @default.
- W76312321 hasConcept C15744967 @default.
- W76312321 hasConcept C166957645 @default.
- W76312321 hasConcept C173801870 @default.
- W76312321 hasConcept C201995342 @default.
- W76312321 hasConcept C2776505523 @default.
- W76312321 hasConcept C2780451532 @default.
- W76312321 hasConcept C2780767217 @default.
- W76312321 hasConcept C41008148 @default.
- W76312321 hasConcept C49937458 @default.
- W76312321 hasConcept C542102704 @default.
- W76312321 hasConcept C95457728 @default.
- W76312321 hasConcept C97541855 @default.
- W76312321 hasConceptScore W76312321C119857082 @default.
- W76312321 hasConceptScore W76312321C127413603 @default.
- W76312321 hasConceptScore W76312321C135450995 @default.
- W76312321 hasConceptScore W76312321C154945302 @default.
- W76312321 hasConceptScore W76312321C15744967 @default.
- W76312321 hasConceptScore W76312321C166957645 @default.
- W76312321 hasConceptScore W76312321C173801870 @default.
- W76312321 hasConceptScore W76312321C201995342 @default.
- W76312321 hasConceptScore W76312321C2776505523 @default.
- W76312321 hasConceptScore W76312321C2780451532 @default.
- W76312321 hasConceptScore W76312321C2780767217 @default.
- W76312321 hasConceptScore W76312321C41008148 @default.
- W76312321 hasConceptScore W76312321C49937458 @default.
- W76312321 hasConceptScore W76312321C542102704 @default.
- W76312321 hasConceptScore W76312321C95457728 @default.
- W76312321 hasConceptScore W76312321C97541855 @default.
- W76312321 hasOpenAccess W76312321 @default.
- W76312321 hasRelatedWork W1491843047 @default.
- W76312321 hasRelatedWork W1931792391 @default.
- W76312321 hasRelatedWork W1961401824 @default.
- W76312321 hasRelatedWork W1976764479 @default.
- W76312321 hasRelatedWork W2024877309 @default.
- W76312321 hasRelatedWork W2096001037 @default.
- W76312321 hasRelatedWork W276460289 @default.
- W76312321 hasRelatedWork W2919449706 @default.
- W76312321 hasRelatedWork W2949243561 @default.
- W76312321 hasRelatedWork W2950722223 @default.
- W76312321 hasRelatedWork W2964855005 @default.
- W76312321 hasRelatedWork W2991032634 @default.
- W76312321 hasRelatedWork W3034973310 @default.
- W76312321 hasRelatedWork W3040161731 @default.
- W76312321 hasRelatedWork W3084024636 @default.
- W76312321 hasRelatedWork W3172839753 @default.
- W76312321 hasRelatedWork W3181203407 @default.
- W76312321 hasRelatedWork W3181701811 @default.
- W76312321 hasRelatedWork W3185399930 @default.
- W76312321 hasRelatedWork W3200996868 @default.
- W76312321 isParatext "false" @default.
- W76312321 isRetracted "false" @default.
- W76312321 magId "76312321" @default.
- W76312321 workType "article" @default.