Matches in SemOpenAlex for { <https://semopenalex.org/work/W3114356733> ?p ?o ?g. }
- W3114356733 abstract "As reinforcement learning techniques are increasingly applied to real-world decision problems, attention has turned to how these algorithms use potentially sensitive information. We consider the task of training a policy that maximizes reward while minimizing disclosure of certain sensitive state variables through the actions. We give examples of how this setting covers real-world problems in privacy for sequential decision-making. We solve this problem in the policy gradients framework by introducing a regularizer based on the mutual information (MI) between the sensitive state and the actions at a given timestep. We develop a model-based stochastic gradient estimator for optimization of privacy-constrained policies. We also discuss an alternative MI regularizer that serves as an upper bound to our main MI regularizer and can be optimized in a model-free setting. We contrast previous work in differentially-private RL to our mutual-information formulation of information disclosure. Experimental results show that our training method results in policies which hide the sensitive state." @default.
- W3114356733 created "2021-01-05" @default.
- W3114356733 creator A5034127198 @default.
- W3114356733 creator A5091179481 @default.
- W3114356733 date "2020-12-30" @default.
- W3114356733 modified "2023-09-27" @default.
- W3114356733 title "Privacy-Constrained Policies via Mutual Information Regularized Policy Gradients." @default.
- W3114356733 cites W1873763122 @default.
- W3114356733 cites W1970302770 @default.
- W3114356733 cites W2073193337 @default.
- W3114356733 cites W2099129655 @default.
- W3114356733 cites W2114771311 @default.
- W3114356733 cites W2130964760 @default.
- W3114356733 cites W2150230313 @default.
- W3114356733 cites W2162670686 @default.
- W3114356733 cites W2257979135 @default.
- W3114356733 cites W2406823515 @default.
- W3114356733 cites W2564029303 @default.
- W3114356733 cites W2583220279 @default.
- W3114356733 cites W2583515081 @default.
- W3114356733 cites W2784465508 @default.
- W3114356733 cites W2896408693 @default.
- W3114356733 cites W2898621204 @default.
- W3114356733 cites W2904539038 @default.
- W3114356733 cites W2913854057 @default.
- W3114356733 cites W2920362155 @default.
- W3114356733 cites W2947119983 @default.
- W3114356733 cites W2947330612 @default.
- W3114356733 cites W2950152428 @default.
- W3114356733 cites W2950318927 @default.
- W3114356733 cites W2962902376 @default.
- W3114356733 cites W2963535017 @default.
- W3114356733 cites W2963595477 @default.
- W3114356733 cites W2963618951 @default.
- W3114356733 cites W2963750948 @default.
- W3114356733 cites W2963771282 @default.
- W3114356733 cites W2963800509 @default.
- W3114356733 cites W2964121744 @default.
- W3114356733 cites W2970356364 @default.
- W3114356733 cites W2979895842 @default.
- W3114356733 cites W3113688645 @default.
- W3114356733 cites W3150807214 @default.
- W3114356733 cites W3157409643 @default.
- W3114356733 cites W2891767478 @default.
- W3114356733 hasPublicationYear "2020" @default.
- W3114356733 type Work @default.
- W3114356733 sameAs 3114356733 @default.
- W3114356733 citedByCount "0" @default.
- W3114356733 crossrefType "posted-content" @default.
- W3114356733 hasAuthorship W3114356733A5034127198 @default.
- W3114356733 hasAuthorship W3114356733A5091179481 @default.
- W3114356733 hasConcept C105795698 @default.
- W3114356733 hasConcept C11413529 @default.
- W3114356733 hasConcept C119857082 @default.
- W3114356733 hasConcept C126255220 @default.
- W3114356733 hasConcept C137822555 @default.
- W3114356733 hasConcept C152139883 @default.
- W3114356733 hasConcept C154945302 @default.
- W3114356733 hasConcept C162324750 @default.
- W3114356733 hasConcept C185429906 @default.
- W3114356733 hasConcept C187736073 @default.
- W3114356733 hasConcept C2776502983 @default.
- W3114356733 hasConcept C2780451532 @default.
- W3114356733 hasConcept C33923547 @default.
- W3114356733 hasConcept C38652104 @default.
- W3114356733 hasConcept C41008148 @default.
- W3114356733 hasConcept C48103436 @default.
- W3114356733 hasConcept C52622258 @default.
- W3114356733 hasConcept C97541855 @default.
- W3114356733 hasConcept C99221444 @default.
- W3114356733 hasConceptScore W3114356733C105795698 @default.
- W3114356733 hasConceptScore W3114356733C11413529 @default.
- W3114356733 hasConceptScore W3114356733C119857082 @default.
- W3114356733 hasConceptScore W3114356733C126255220 @default.
- W3114356733 hasConceptScore W3114356733C137822555 @default.
- W3114356733 hasConceptScore W3114356733C152139883 @default.
- W3114356733 hasConceptScore W3114356733C154945302 @default.
- W3114356733 hasConceptScore W3114356733C162324750 @default.
- W3114356733 hasConceptScore W3114356733C185429906 @default.
- W3114356733 hasConceptScore W3114356733C187736073 @default.
- W3114356733 hasConceptScore W3114356733C2776502983 @default.
- W3114356733 hasConceptScore W3114356733C2780451532 @default.
- W3114356733 hasConceptScore W3114356733C33923547 @default.
- W3114356733 hasConceptScore W3114356733C38652104 @default.
- W3114356733 hasConceptScore W3114356733C41008148 @default.
- W3114356733 hasConceptScore W3114356733C48103436 @default.
- W3114356733 hasConceptScore W3114356733C52622258 @default.
- W3114356733 hasConceptScore W3114356733C97541855 @default.
- W3114356733 hasConceptScore W3114356733C99221444 @default.
- W3114356733 hasLocation W31143567331 @default.
- W3114356733 hasOpenAccess W3114356733 @default.
- W3114356733 hasPrimaryLocation W31143567331 @default.
- W3114356733 hasRelatedWork W1656041229 @default.
- W3114356733 hasRelatedWork W2395778028 @default.
- W3114356733 hasRelatedWork W2787471386 @default.
- W3114356733 hasRelatedWork W2789561913 @default.
- W3114356733 hasRelatedWork W2897714018 @default.
- W3114356733 hasRelatedWork W2912212713 @default.
- W3114356733 hasRelatedWork W2945408776 @default.
- W3114356733 hasRelatedWork W2946153830 @default.
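
The abstract in this record describes a regularizer based on the mutual information (MI) between the sensitive state and the action at a given timestep. As a rough illustration only, and not the authors' model-based gradient estimator, the sketch below computes a plug-in MI estimate from paired discrete samples and subtracts it from a mean return. The function names, the discrete-sample setting, and the weight `lam` are assumptions made for this example; none of them come from the record.

```python
import math
from collections import Counter


def empirical_mutual_information(sensitive, actions):
    """Plug-in estimate of I(U; A) in nats from paired discrete samples.

    Hypothetical helper: `sensitive` holds sampled values of the sensitive
    state variable and `actions` holds the actions taken at the same timestep.
    """
    if len(sensitive) != len(actions):
        raise ValueError("inputs must have the same length")
    n = len(sensitive)
    if n == 0:
        raise ValueError("need at least one sample")

    joint_counts = Counter(zip(sensitive, actions))
    u_counts = Counter(sensitive)
    a_counts = Counter(actions)

    mi = 0.0
    for (u, a), c in joint_counts.items():
        # p(u, a) * log( p(u, a) / (p(u) * p(a)) ), written with raw counts
        mi += (c / n) * math.log(c * n / (u_counts[u] * a_counts[a]))
    return mi


def regularized_objective(mean_return, mi_estimate, lam=1.0):
    """Reward minus an MI penalty; `lam` is an assumed trade-off weight."""
    return mean_return - lam * mi_estimate
```

With this estimator, actions that copy a uniformly distributed binary sensitive state give an MI of log 2 nats, while actions that are empirically independent of it give 0, so a larger `lam` penalizes policies whose actions reveal more about the sensitive state.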