Matches in SemOpenAlex for { <https://semopenalex.org/work/W3142184916> ?p ?o ?g. }
Showing items 1 to 52 of 52 with 100 items per page.
- W3142184916 abstract "In Chapter 2, we talked about the parts of the setup that form the agent and the part that forms the environment. The agent observes the state S_t = s and learns a policy π(a | s) that maps states to actions. The agent uses this policy to take an action A_t = a when in state S_t = s. The system then moves to the next time step, t + 1. The environment responds to the action (A_t = a) by putting the agent in a new state S_{t+1} = s’ and providing feedback to the agent in the form of a reward, R_{t+1}. The agent has no control over what the new state S_{t+1} and reward R_{t+1} will be. The transition (S_t = s, A_t = a) → (R_{t+1} = r, S_{t+1} = s’) is governed by the environment and is known as the transition dynamics. For a given pair (s, a), there could be one or more pairs (r, s’). In a deterministic world, there is a single pair (r, s’) for a fixed combination of (s, a). In stochastic environments, i.e., environments with uncertain outcomes, there could be many pairs (r, s’) for a given (s, a)." @default.
- W3142184916 created "2021-04-13" @default.
- W3142184916 creator A5027838536 @default.
- W3142184916 date "2021-01-01" @default.
- W3142184916 modified "2023-09-25" @default.
- W3142184916 title "Model-Based Algorithms" @default.
- W3142184916 doi "https://doi.org/10.1007/978-1-4842-6809-4_3" @default.
- W3142184916 hasPublicationYear "2021" @default.
- W3142184916 type Work @default.
- W3142184916 sameAs 3142184916 @default.
- W3142184916 citedByCount "0" @default.
- W3142184916 crossrefType "book-chapter" @default.
- W3142184916 hasAuthorship W3142184916A5027838536 @default.
- W3142184916 hasConcept C104317684 @default.
- W3142184916 hasConcept C11413529 @default.
- W3142184916 hasConcept C121332964 @default.
- W3142184916 hasConcept C185592680 @default.
- W3142184916 hasConcept C194232998 @default.
- W3142184916 hasConcept C2780791683 @default.
- W3142184916 hasConcept C41008148 @default.
- W3142184916 hasConcept C48103436 @default.
- W3142184916 hasConcept C55493867 @default.
- W3142184916 hasConcept C62520636 @default.
- W3142184916 hasConcept C80444323 @default.
- W3142184916 hasConceptScore W3142184916C104317684 @default.
- W3142184916 hasConceptScore W3142184916C11413529 @default.
- W3142184916 hasConceptScore W3142184916C121332964 @default.
- W3142184916 hasConceptScore W3142184916C185592680 @default.
- W3142184916 hasConceptScore W3142184916C194232998 @default.
- W3142184916 hasConceptScore W3142184916C2780791683 @default.
- W3142184916 hasConceptScore W3142184916C41008148 @default.
- W3142184916 hasConceptScore W3142184916C48103436 @default.
- W3142184916 hasConceptScore W3142184916C55493867 @default.
- W3142184916 hasConceptScore W3142184916C62520636 @default.
- W3142184916 hasConceptScore W3142184916C80444323 @default.
- W3142184916 hasLocation W31421849161 @default.
- W3142184916 hasOpenAccess W3142184916 @default.
- W3142184916 hasPrimaryLocation W31421849161 @default.
- W3142184916 hasRelatedWork W10166426 @default.
- W3142184916 hasRelatedWork W10850456 @default.
- W3142184916 hasRelatedWork W1315694 @default.
- W3142184916 hasRelatedWork W13860204 @default.
- W3142184916 hasRelatedWork W3947704 @default.
- W3142184916 hasRelatedWork W5947967 @default.
- W3142184916 hasRelatedWork W660320 @default.
- W3142184916 hasRelatedWork W7832550 @default.
- W3142184916 hasRelatedWork W8435521 @default.
- W3142184916 hasRelatedWork W9638675 @default.
- W3142184916 isParatext "false" @default.
- W3142184916 isRetracted "false" @default.
- W3142184916 magId "3142184916" @default.
- W3142184916 workType "book-chapter" @default.
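
The transition dynamics described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and does not come from the chapter itself: the state names, the action name, the probabilities, the rewards, and the helper functions `outcomes`, `is_deterministic`, and `step` are all hypothetical. The table maps each (s, a) pair to its possible (r, s’) outcomes, with one outcome in the deterministic case and several in the stochastic case.

```python
import random

# Hypothetical toy dynamics (illustrative values only):
# (state, action) -> list of (probability, reward, next_state).
DYNAMICS = {
    # Deterministic: a single (r, s') pair for this (s, a).
    ("s0", "right"): [(1.0, 0.0, "s1")],
    # Stochastic: several (r, s') pairs for the same (s, a).
    ("s1", "right"): [(0.8, 1.0, "s2"), (0.2, 0.0, "s1")],
}


def outcomes(state, action):
    """Return every possible (reward, next_state) pair for (state, action)."""
    return [(r, s_next) for _, r, s_next in DYNAMICS[(state, action)]]


def is_deterministic(state, action):
    """True when (state, action) leads to exactly one (reward, next_state)."""
    return len(DYNAMICS[(state, action)]) == 1


def step(state, action, rng):
    """Sample one (reward, next_state); the environment, not the agent, decides."""
    entries = DYNAMICS[(state, action)]
    weights = [p for p, _, _ in entries]
    _, reward, next_state = rng.choices(entries, weights=weights, k=1)[0]
    return reward, next_state


if __name__ == "__main__":
    rng = random.Random(0)
    print(is_deterministic("s0", "right"), outcomes("s0", "right"))
    print(is_deterministic("s1", "right"), outcomes("s1", "right"))
    print(step("s1", "right", rng))
```

Passing the random generator in explicitly keeps the sampling reproducible; the agent only chooses the action, while the sampled reward and next state come from the table.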