Matches in SemOpenAlex for { <https://semopenalex.org/work/W3016330197> ?p ?o ?g. }
- W3016330197 abstract "$Q$-learning is a generic approach that uses a finite discrete state and action domain to estimate action values using tabular or function approximation methods. An intelligent agent eventually learns policies from continuous sensory inputs and encodes these environmental inputs onto a discrete state space. The application of $Q$-learning in a continuous state/action domain is the subject of many studies. This paper uses a tree structure to approximate a $Q$-function in a continuous state domain. The agent selects the discretized action with the maximum $Q$-value, and this discretized action is then extended to a continuous action using an action bias function. Reinforcement learning is difficult for a single agent when the state space is huge. The proposed architecture is also applied to a multiagent system, wherein an individual agent transfers its useful $Q$-values to other agents to accelerate the learning process. Policy is shared between agents by grafting the branches of trees in which $Q$-values are stored onto other trees. Simulation results show that the proposed architecture performs better than tabular $Q$-learning and significantly accelerates the learning process because all agents use the sharing mechanisms to cooperate with each other." @default.
- W3016330197 created "2020-04-24" @default.
- W3016330197 creator A5001081733 @default.
- W3016330197 creator A5061189209 @default.
- W3016330197 creator A5064655192 @default.
- W3016330197 creator A5080915393 @default.
- W3016330197 date "2020-09-01" @default.
- W3016330197 modified "2023-09-24" @default.
- W3016330197 title "Policy Sharing Using Aggregation Trees for ${Q}$-Learning in a Continuous State and Action Spaces" @default.
- W3016330197 cites W1557517019 @default.
- W3016330197 cites W1641379095 @default.
- W3016330197 cites W1783866091 @default.
- W3016330197 cites W1896356261 @default.
- W3016330197 cites W1967769980 @default.
- W3016330197 cites W1975987482 @default.
- W3016330197 cites W2000867768 @default.
- W3016330197 cites W2022795107 @default.
- W3016330197 cites W2072931156 @default.
- W3016330197 cites W2075754841 @default.
- W3016330197 cites W2078196735 @default.
- W3016330197 cites W2080759927 @default.
- W3016330197 cites W2095989982 @default.
- W3016330197 cites W2097082695 @default.
- W3016330197 cites W2099618002 @default.
- W3016330197 cites W2103902778 @default.
- W3016330197 cites W2105715607 @default.
- W3016330197 cites W2121863487 @default.
- W3016330197 cites W2124175081 @default.
- W3016330197 cites W2124422817 @default.
- W3016330197 cites W2124491905 @default.
- W3016330197 cites W2124716489 @default.
- W3016330197 cites W2135637601 @default.
- W3016330197 cites W2147319420 @default.
- W3016330197 cites W2155027007 @default.
- W3016330197 cites W2157080539 @default.
- W3016330197 cites W2188644438 @default.
- W3016330197 cites W2402164873 @default.
- W3016330197 cites W2595292627 @default.
- W3016330197 cites W2782383122 @default.
- W3016330197 cites W2793798239 @default.
- W3016330197 doi "https://doi.org/10.1109/tcds.2019.2926477" @default.
- W3016330197 hasPublicationYear "2020" @default.
- W3016330197 type Work @default.
- W3016330197 sameAs 3016330197 @default.
- W3016330197 citedByCount "0" @default.
- W3016330197 crossrefType "journal-article" @default.
- W3016330197 hasAuthorship W3016330197A5001081733 @default.
- W3016330197 hasAuthorship W3016330197A5061189209 @default.
- W3016330197 hasAuthorship W3016330197A5064655192 @default.
- W3016330197 hasAuthorship W3016330197A5080915393 @default.
- W3016330197 hasConcept C105795698 @default.
- W3016330197 hasConcept C113174947 @default.
- W3016330197 hasConcept C11413529 @default.
- W3016330197 hasConcept C121332964 @default.
- W3016330197 hasConcept C134306372 @default.
- W3016330197 hasConcept C14036430 @default.
- W3016330197 hasConcept C154945302 @default.
- W3016330197 hasConcept C188116033 @default.
- W3016330197 hasConcept C2780791683 @default.
- W3016330197 hasConcept C33923547 @default.
- W3016330197 hasConcept C36503486 @default.
- W3016330197 hasConcept C41008148 @default.
- W3016330197 hasConcept C48103436 @default.
- W3016330197 hasConcept C62520636 @default.
- W3016330197 hasConcept C72434380 @default.
- W3016330197 hasConcept C73000952 @default.
- W3016330197 hasConcept C78458016 @default.
- W3016330197 hasConcept C86803240 @default.
- W3016330197 hasConcept C97541855 @default.
- W3016330197 hasConceptScore W3016330197C105795698 @default.
- W3016330197 hasConceptScore W3016330197C113174947 @default.
- W3016330197 hasConceptScore W3016330197C11413529 @default.
- W3016330197 hasConceptScore W3016330197C121332964 @default.
- W3016330197 hasConceptScore W3016330197C134306372 @default.
- W3016330197 hasConceptScore W3016330197C14036430 @default.
- W3016330197 hasConceptScore W3016330197C154945302 @default.
- W3016330197 hasConceptScore W3016330197C188116033 @default.
- W3016330197 hasConceptScore W3016330197C2780791683 @default.
- W3016330197 hasConceptScore W3016330197C33923547 @default.
- W3016330197 hasConceptScore W3016330197C36503486 @default.
- W3016330197 hasConceptScore W3016330197C41008148 @default.
- W3016330197 hasConceptScore W3016330197C48103436 @default.
- W3016330197 hasConceptScore W3016330197C62520636 @default.
- W3016330197 hasConceptScore W3016330197C72434380 @default.
- W3016330197 hasConceptScore W3016330197C73000952 @default.
- W3016330197 hasConceptScore W3016330197C78458016 @default.
- W3016330197 hasConceptScore W3016330197C86803240 @default.
- W3016330197 hasConceptScore W3016330197C97541855 @default.
- W3016330197 hasFunder F4320322795 @default.
- W3016330197 hasLocation W30163301971 @default.
- W3016330197 hasOpenAccess W3016330197 @default.
- W3016330197 hasPrimaryLocation W30163301971 @default.
- W3016330197 hasRelatedWork W11035765 @default.
- W3016330197 hasRelatedWork W12045665 @default.
- W3016330197 hasRelatedWork W3471107 @default.
- W3016330197 hasRelatedWork W4191668 @default.
- W3016330197 hasRelatedWork W547392 @default.
- W3016330197 hasRelatedWork W5991403 @default.
- W3016330197 hasRelatedWork W6915741 @default.
- W3016330197 hasRelatedWork W8539471 @default.
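
Each row in the listing above has the same shape: a subject, a predicate, an object, and a graph name (here always `default`), matching the `?p ?o ?g` variables in the query pattern. As a rough sketch of how such rows could be grouped by predicate, the snippet below parses the text as it is displayed here. The regular expression and the helper names `parse_line` and `group_by_predicate` are illustrative assumptions, not part of SemOpenAlex or any RDF library, and this is not a general RDF parser.

```python
import re
from collections import defaultdict

# Assumed row shape, taken from the listing as displayed:
#   "- <subject> <predicate> <object> @<graph>."
# This is a hypothetical pattern for this text layout, not an RDF syntax.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\s+@(\S+?)\.\s*$', re.DOTALL)

def parse_line(line):
    """Return (subject, predicate, object, graph), or None if the row does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, obj, graph = m.groups()
    # Quoted objects are literals; strip only the outer pair of quotes.
    if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
        obj = obj[1:-1]
    return subject, predicate, obj, graph

def group_by_predicate(lines):
    """Collect object values per predicate, preserving row order."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            _, predicate, obj, _ = parsed
            grouped[predicate].append(obj)
    return dict(grouped)

# A few rows copied from the listing above.
sample = [
    '- W3016330197 created "2020-04-24" @default.',
    '- W3016330197 creator A5001081733 @default.',
    '- W3016330197 creator A5061189209 @default.',
    '- W3016330197 type Work @default.',
]
grouped = group_by_predicate(sample)
```

Grouping by predicate makes repeated properties such as `creator`, `cites`, or `hasConcept` available as plain lists, while single-valued properties such as `created` become one-element lists.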