Matches in SemOpenAlex for { <https://semopenalex.org/work/W4318962912> ?p ?o ?g. }
- W4318962912 endingPage "296" @default.
- W4318962912 startingPage "281" @default.
- W4318962912 abstract "Deep reinforcement learning (DRL) has achieved remarkable results on tasks with high-dimensional state spaces. However, it suffers from difficult convergence and low sample efficiency when solving problems with a large discrete action space. To meet these challenges, we develop a cooperative modular reinforcement learning (CMRL) method that solves such problems in a distributed manner. A general yet effective task decomposition method is proposed to decompose the complex decision task in a large action space into multiple decision sub-tasks in small action subsets, using a rule-based action division method. The CMRL method, consisting of multiple Critic networks, is proposed to settle the multiple sub-tasks, where each Critic network learns a decomposed value function to obtain the local optimal action in a sub-task. The global optimal action is chosen cooperatively from all local optimal actions. Moreover, we propose a new parallel training mechanism, which trains multiple Critic networks with different models and multi-data in parallel. Mathematical properties are presented to analyze the rationality and superiority of CMRL. Four different simulation experiments are conducted to verify the generality and effectiveness of CMRL for large action space problems. The results show that CMRL achieves superior training efficiency compared with classical and the latest DRL methods while maintaining the accuracy of the solution." @default.
- W4318962912 created "2023-02-03" @default.
- W4318962912 creator A5003952431 @default.
- W4318962912 creator A5019944862 @default.
- W4318962912 creator A5023284677 @default.
- W4318962912 creator A5050501022 @default.
- W4318962912 date "2023-04-01" @default.
- W4318962912 modified "2023-10-18" @default.
- W4318962912 title "Cooperative modular reinforcement learning for large discrete action space problem" @default.
- W4318962912 cites W2050883661 @default.
- W4318962912 cites W2137983211 @default.
- W4318962912 cites W2145339207 @default.
- W4318962912 cites W2165131254 @default.
- W4318962912 cites W2335835108 @default.
- W4318962912 cites W2790604742 @default.
- W4318962912 cites W2793798239 @default.
- W4318962912 cites W2809208194 @default.
- W4318962912 cites W2963317745 @default.
- W4318962912 cites W2964180249 @default.
- W4318962912 cites W2969489794 @default.
- W4318962912 cites W2981038142 @default.
- W4318962912 cites W2988592481 @default.
- W4318962912 cites W3030840723 @default.
- W4318962912 cites W3033682727 @default.
- W4318962912 cites W3044867970 @default.
- W4318962912 cites W3080884797 @default.
- W4318962912 cites W3081211774 @default.
- W4318962912 cites W3089732347 @default.
- W4318962912 cites W3100789280 @default.
- W4318962912 cites W3110125326 @default.
- W4318962912 cites W3139071578 @default.
- W4318962912 cites W3161646632 @default.
- W4318962912 cites W3171909095 @default.
- W4318962912 cites W3175523739 @default.
- W4318962912 cites W4214486317 @default.
- W4318962912 cites W4285213037 @default.
- W4318962912 doi "https://doi.org/10.1016/j.neunet.2023.01.046" @default.
- W4318962912 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/36774866" @default.
- W4318962912 hasPublicationYear "2023" @default.
- W4318962912 type Work @default.
- W4318962912 citedByCount "1" @default.
- W4318962912 countsByYear W43189629122023 @default.
- W4318962912 crossrefType "journal-article" @default.
- W4318962912 hasAuthorship W4318962912A5003952431 @default.
- W4318962912 hasAuthorship W4318962912A5019944862 @default.
- W4318962912 hasAuthorship W4318962912A5023284677 @default.
- W4318962912 hasAuthorship W4318962912A5050501022 @default.
- W4318962912 hasConcept C101468663 @default.
- W4318962912 hasConcept C111919701 @default.
- W4318962912 hasConcept C11413529 @default.
- W4318962912 hasConcept C121332964 @default.
- W4318962912 hasConcept C126255220 @default.
- W4318962912 hasConcept C154945302 @default.
- W4318962912 hasConcept C15744967 @default.
- W4318962912 hasConcept C162324750 @default.
- W4318962912 hasConcept C187736073 @default.
- W4318962912 hasConcept C2777303404 @default.
- W4318962912 hasConcept C2780451532 @default.
- W4318962912 hasConcept C2780767217 @default.
- W4318962912 hasConcept C2780791683 @default.
- W4318962912 hasConcept C33923547 @default.
- W4318962912 hasConcept C41008148 @default.
- W4318962912 hasConcept C50522688 @default.
- W4318962912 hasConcept C542102704 @default.
- W4318962912 hasConcept C62520636 @default.
- W4318962912 hasConcept C97541855 @default.
- W4318962912 hasConceptScore W4318962912C101468663 @default.
- W4318962912 hasConceptScore W4318962912C111919701 @default.
- W4318962912 hasConceptScore W4318962912C11413529 @default.
- W4318962912 hasConceptScore W4318962912C121332964 @default.
- W4318962912 hasConceptScore W4318962912C126255220 @default.
- W4318962912 hasConceptScore W4318962912C154945302 @default.
- W4318962912 hasConceptScore W4318962912C15744967 @default.
- W4318962912 hasConceptScore W4318962912C162324750 @default.
- W4318962912 hasConceptScore W4318962912C187736073 @default.
- W4318962912 hasConceptScore W4318962912C2777303404 @default.
- W4318962912 hasConceptScore W4318962912C2780451532 @default.
- W4318962912 hasConceptScore W4318962912C2780767217 @default.
- W4318962912 hasConceptScore W4318962912C2780791683 @default.
- W4318962912 hasConceptScore W4318962912C33923547 @default.
- W4318962912 hasConceptScore W4318962912C41008148 @default.
- W4318962912 hasConceptScore W4318962912C50522688 @default.
- W4318962912 hasConceptScore W4318962912C542102704 @default.
- W4318962912 hasConceptScore W4318962912C62520636 @default.
- W4318962912 hasConceptScore W4318962912C97541855 @default.
- W4318962912 hasFunder F4320321001 @default.
- W4318962912 hasLocation W43189629121 @default.
- W4318962912 hasLocation W43189629122 @default.
- W4318962912 hasOpenAccess W4318962912 @default.
- W4318962912 hasPrimaryLocation W43189629121 @default.
- W4318962912 hasRelatedWork W2045236383 @default.
- W4318962912 hasRelatedWork W2135769783 @default.
- W4318962912 hasRelatedWork W2803281228 @default.
- W4318962912 hasRelatedWork W2950614095 @default.
- W4318962912 hasRelatedWork W2959276766 @default.
- W4318962912 hasRelatedWork W3074294383 @default.
- W4318962912 hasRelatedWork W4206669594 @default.
- W4318962912 hasRelatedWork W4294567340 @default.
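
The abstract in this record describes splitting a large discrete action space into small subsets, letting one Critic per subset pick a local optimal action, and then choosing a global action from those local choices. The Python sketch below illustrates that idea only under stated assumptions; it is not the authors' implementation. The names divide_actions, local_best and select_action are hypothetical, the division rule (contiguous, near-equal chunks) is an assumed stand-in for the paper's rule-based division, the critics are plain callables rather than neural networks, and taking the local candidate with the highest value is one plausible reading of "cooperatively chosen".

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical stand-in for a Critic network: maps a state to one
# estimated value per action in the critic's own subset.
Critic = Callable[[Sequence[float]], List[float]]


def divide_actions(num_actions: int, num_subsets: int) -> List[List[int]]:
    """Split action indices 0..num_actions-1 into contiguous subsets.

    Assumed rule: near-equal contiguous chunks. The record does not
    state the actual division rule used in the paper.
    """
    if not 1 <= num_subsets <= num_actions:
        raise ValueError("need 1 <= num_subsets <= num_actions")
    base, extra = divmod(num_actions, num_subsets)
    subsets: List[List[int]] = []
    start = 0
    for i in range(num_subsets):
        size = base + (1 if i < extra else 0)  # first `extra` subsets get one more
        subsets.append(list(range(start, start + size)))
        start += size
    return subsets


def local_best(critic: Critic, subset: List[int],
               state: Sequence[float]) -> Tuple[int, float]:
    """Return (global action index, value) of the best action in one subset."""
    values = critic(state)
    if len(values) != len(subset):
        raise ValueError("critic must return one value per action in its subset")
    best = max(range(len(subset)), key=values.__getitem__)
    return subset[best], values[best]


def select_action(critics: Sequence[Critic], subsets: Sequence[List[int]],
                  state: Sequence[float]) -> int:
    """Pick the global action among the local optima (assumed: highest value wins)."""
    candidates = [local_best(c, s, state) for c, s in zip(critics, subsets)]
    action, _ = max(candidates, key=lambda pair: pair[1])
    return action
```

With 10 actions and 3 subsets, divide_actions yields subsets of sizes 4, 3 and 3, and each toy critic only ever scores its own subset, which is the point of the decomposition: no single value function has to rank all actions at once.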