Matches in SemOpenAlex for { <https://semopenalex.org/work/W4287024322> ?p ?o ?g. }
Showing items 1 to 77 of 77 with 100 items per page.
- W4287024322 abstract "Actor-critic (AC) algorithms are known for their efficacy and high performance in solving reinforcement learning problems, but they also suffer from low sampling efficiency. An AC-based policy optimization process is iterative and needs to frequently access the agent-environment system to evaluate and update the policy by rolling out the policy, collecting rewards and states (i.e., samples), and learning from them. It ultimately requires a huge number of samples to learn an optimal policy. To improve sampling efficiency, we propose a strategy to optimize the training dataset that contains significantly fewer samples collected from the AC process. The dataset optimization is made of a best-episode-only operation, a policy parameter-fitness model, and a genetic algorithm module. The optimal policy network trained by the optimized training dataset exhibits superior performance compared to many contemporary AC algorithms in controlling autonomous dynamical systems. Evaluation on standard benchmarks shows that the method improves sampling efficiency, ensures faster convergence to optima, and is more data-efficient than its counterparts." @default.
- W4287024322 created "2022-07-25" @default.
- W4287024322 creator A5008309887 @default.
- W4287024322 creator A5032512088 @default.
- W4287024322 creator A5043976016 @default.
- W4287024322 creator A5079420724 @default.
- W4287024322 date "2021-08-16" @default.
- W4287024322 modified "2023-10-17" @default.
- W4287024322 title "Optimal Actor-Critic Policy with Optimized Training Datasets" @default.
- W4287024322 doi "https://doi.org/10.48550/arxiv.2108.06911" @default.
- W4287024322 hasPublicationYear "2021" @default.
- W4287024322 type Work @default.
- W4287024322 citedByCount "0" @default.
- W4287024322 crossrefType "posted-content" @default.
- W4287024322 hasAuthorship W4287024322A5008309887 @default.
- W4287024322 hasAuthorship W4287024322A5032512088 @default.
- W4287024322 hasAuthorship W4287024322A5043976016 @default.
- W4287024322 hasAuthorship W4287024322A5079420724 @default.
- W4287024322 hasBestOaLocation W42870243221 @default.
- W4287024322 hasConcept C106131492 @default.
- W4287024322 hasConcept C111919701 @default.
- W4287024322 hasConcept C115903868 @default.
- W4287024322 hasConcept C119857082 @default.
- W4287024322 hasConcept C121332964 @default.
- W4287024322 hasConcept C126255220 @default.
- W4287024322 hasConcept C140779682 @default.
- W4287024322 hasConcept C141934464 @default.
- W4287024322 hasConcept C143587482 @default.
- W4287024322 hasConcept C153294291 @default.
- W4287024322 hasConcept C154945302 @default.
- W4287024322 hasConcept C162324750 @default.
- W4287024322 hasConcept C2777211547 @default.
- W4287024322 hasConcept C2777303404 @default.
- W4287024322 hasConcept C31972630 @default.
- W4287024322 hasConcept C33923547 @default.
- W4287024322 hasConcept C41008148 @default.
- W4287024322 hasConcept C50522688 @default.
- W4287024322 hasConcept C8880873 @default.
- W4287024322 hasConcept C97541855 @default.
- W4287024322 hasConcept C98045186 @default.
- W4287024322 hasConceptScore W4287024322C106131492 @default.
- W4287024322 hasConceptScore W4287024322C111919701 @default.
- W4287024322 hasConceptScore W4287024322C115903868 @default.
- W4287024322 hasConceptScore W4287024322C119857082 @default.
- W4287024322 hasConceptScore W4287024322C121332964 @default.
- W4287024322 hasConceptScore W4287024322C126255220 @default.
- W4287024322 hasConceptScore W4287024322C140779682 @default.
- W4287024322 hasConceptScore W4287024322C141934464 @default.
- W4287024322 hasConceptScore W4287024322C143587482 @default.
- W4287024322 hasConceptScore W4287024322C153294291 @default.
- W4287024322 hasConceptScore W4287024322C154945302 @default.
- W4287024322 hasConceptScore W4287024322C162324750 @default.
- W4287024322 hasConceptScore W4287024322C2777211547 @default.
- W4287024322 hasConceptScore W4287024322C2777303404 @default.
- W4287024322 hasConceptScore W4287024322C31972630 @default.
- W4287024322 hasConceptScore W4287024322C33923547 @default.
- W4287024322 hasConceptScore W4287024322C41008148 @default.
- W4287024322 hasConceptScore W4287024322C50522688 @default.
- W4287024322 hasConceptScore W4287024322C8880873 @default.
- W4287024322 hasConceptScore W4287024322C97541855 @default.
- W4287024322 hasConceptScore W4287024322C98045186 @default.
- W4287024322 hasLocation W42870243221 @default.
- W4287024322 hasOpenAccess W4287024322 @default.
- W4287024322 hasPrimaryLocation W42870243221 @default.
- W4287024322 hasRelatedWork W2017766371 @default.
- W4287024322 hasRelatedWork W2099421003 @default.
- W4287024322 hasRelatedWork W2353889708 @default.
- W4287024322 hasRelatedWork W2375660666 @default.
- W4287024322 hasRelatedWork W2557378401 @default.
- W4287024322 hasRelatedWork W2586739852 @default.
- W4287024322 hasRelatedWork W2887515087 @default.
- W4287024322 hasRelatedWork W3022038857 @default.
- W4287024322 hasRelatedWork W3194626037 @default.
- W4287024322 hasRelatedWork W4319083788 @default.
- W4287024322 isParatext "false" @default.
- W4287024322 isRetracted "false" @default.
- W4287024322 workType "article" @default.