Matches in SemOpenAlex for { <https://semopenalex.org/work/W2556244909> ?p ?o ?g. }
- W2556244909 abstract "Reinforcement Learning (RL) is learning through direct experimentation. It does not assume the existence of a teacher that provides examples upon which learning of a task takes place. Instead, in RL experience is the only teacher. With historical roots in the study of conditioned reflexes, RL soon attracted the interest of engineers and computer scientists because of its theoretical relevance and potential applications in fields as diverse as Operational Research and Robotics. Computationally, RL is intended to operate in a learning environment composed of two subjects: the learner and a dynamic process. At successive time steps, the learner makes an observation of the process state, selects an action and applies it back to the process. The goal of the learner is to find an action policy that controls the behavior of this dynamic process, guided by signals (reinforcements) that indicate how well it is performing the required task. These signals are usually associated with some dramatic condition, e.g., the accomplishment of a subtask (reward) or complete failure (punishment), and the learner’s goal is to optimize its behavior based on some performance measure (a function of the received reinforcements). The crucial point is that, in order to do so, the learner must evaluate the conditions (associations between observed states and chosen actions) that lead to rewards or punishments. In other words, it must learn how to assign credit to past actions and states by correctly estimating the costs associated with these events. Starting from basic concepts, this tutorial presents the many flavors of RL algorithms, develops the corresponding mathematical tools, assesses their practical limitations and discusses alternatives that have been proposed for applying RL to realistic tasks, such as those involving large state spaces or partial observability.
It relies on examples and diagrams to illustrate the main points, and provides many references to the specialized literature and to Internet sites where relevant demos and additional information can be obtained." @default.
- W2556244909 created "2016-11-30" @default.
- W2556244909 creator A5002662460 @default.
- W2556244909 creator A5071823258 @default.
- W2556244909 date "1999-01-01" @default.
- W2556244909 modified "2023-09-24" @default.
- W2556244909 title "Aprendizado por Reforco" @default.
- W2556244909 cites W10428601 @default.
- W2556244909 cites W1488778327 @default.
- W2556244909 cites W1491699534 @default.
- W2556244909 cites W1491843047 @default.
- W2556244909 cites W1506145880 @default.
- W2556244909 cites W1522632257 @default.
- W2556244909 cites W1535810436 @default.
- W2556244909 cites W1548889916 @default.
- W2556244909 cites W1576452626 @default.
- W2556244909 cites W1576798893 @default.
- W2556244909 cites W1583380718 @default.
- W2556244909 cites W1594297126 @default.
- W2556244909 cites W1610678877 @default.
- W2556244909 cites W1640646391 @default.
- W2556244909 cites W1657674574 @default.
- W2556244909 cites W1716849269 @default.
- W2556244909 cites W19246841 @default.
- W2556244909 cites W1966089223 @default.
- W2556244909 cites W1979071892 @default.
- W2556244909 cites W1990803481 @default.
- W2556244909 cites W1993192904 @default.
- W2556244909 cites W1994616650 @default.
- W2556244909 cites W2020677283 @default.
- W2556244909 cites W2043003550 @default.
- W2556244909 cites W2048226872 @default.
- W2556244909 cites W2048984163 @default.
- W2556244909 cites W2054940200 @default.
- W2556244909 cites W2061361125 @default.
- W2556244909 cites W2062533874 @default.
- W2556244909 cites W2065356613 @default.
- W2556244909 cites W2084378698 @default.
- W2556244909 cites W2091565802 @default.
- W2556244909 cites W2098432798 @default.
- W2556244909 cites W2100677568 @default.
- W2556244909 cites W2103019454 @default.
- W2556244909 cites W2103626435 @default.
- W2556244909 cites W2106425086 @default.
- W2556244909 cites W2110485445 @default.
- W2556244909 cites W2112032710 @default.
- W2556244909 cites W2113913482 @default.
- W2556244909 cites W2124175081 @default.
- W2556244909 cites W2124434282 @default.
- W2556244909 cites W2124599560 @default.
- W2556244909 cites W2124776405 @default.
- W2556244909 cites W2125074935 @default.
- W2556244909 cites W2125510930 @default.
- W2556244909 cites W2125838338 @default.
- W2556244909 cites W2127290018 @default.
- W2556244909 cites W2138178898 @default.
- W2556244909 cites W2141559645 @default.
- W2556244909 cites W2151742051 @default.
- W2556244909 cites W2165131254 @default.
- W2556244909 cites W2169022337 @default.
- W2556244909 cites W2169982856 @default.
- W2556244909 cites W2170192506 @default.
- W2556244909 cites W2265780932 @default.
- W2556244909 cites W2480285429 @default.
- W2556244909 cites W2610686804 @default.
- W2556244909 cites W2895774836 @default.
- W2556244909 cites W2914331897 @default.
- W2556244909 cites W3011120880 @default.
- W2556244909 cites W3021909058 @default.
- W2556244909 cites W3198350258 @default.
- W2556244909 cites W32198084 @default.
- W2556244909 cites W569084886 @default.
- W2556244909 cites W95627046 @default.
- W2556244909 cites W2131600418 @default.
- W2556244909 hasPublicationYear "1999" @default.
- W2556244909 type Work @default.
- W2556244909 sameAs 2556244909 @default.
- W2556244909 citedByCount "1" @default.
- W2556244909 crossrefType "journal-article" @default.
- W2556244909 hasAuthorship W2556244909A5002662460 @default.
- W2556244909 hasAuthorship W2556244909A5071823258 @default.
- W2556244909 hasConcept C111919701 @default.
- W2556244909 hasConcept C11413529 @default.
- W2556244909 hasConcept C121332964 @default.
- W2556244909 hasConcept C127413603 @default.
- W2556244909 hasConcept C14036430 @default.
- W2556244909 hasConcept C154945302 @default.
- W2556244909 hasConcept C15744967 @default.
- W2556244909 hasConcept C158154518 @default.
- W2556244909 hasConcept C17744445 @default.
- W2556244909 hasConcept C199539241 @default.
- W2556244909 hasConcept C201995342 @default.
- W2556244909 hasConcept C2524010 @default.
- W2556244909 hasConcept C2779295839 @default.
- W2556244909 hasConcept C2780451532 @default.
- W2556244909 hasConcept C2780791683 @default.
- W2556244909 hasConcept C28719098 @default.
- W2556244909 hasConcept C33923547 @default.
- W2556244909 hasConcept C34413123 @default.
- W2556244909 hasConcept C41008148 @default.