Matches in SemOpenAlex for { <https://semopenalex.org/work/W3166342932> ?p ?o ?g. }
Showing items 1 to 93 of 93 with 100 items per page.
- W3166342932 abstract "Abstract symbolic reasoning, as required in domains such as mathematics and logic, is a key component of human intelligence. Solvers for these domains have important applications, especially to computer-assisted education. But learning to solve symbolic problems is challenging for machine learning algorithms. Existing models either learn from human solutions or use hand-engineered features, making them expensive to apply in new domains. In this paper, we instead consider symbolic domains as simple environments where states and actions are given as unstructured text, and binary rewards indicate whether a problem is solved. This flexible setup makes it easy to specify new domains, but search and planning become challenging. We introduce four environments inspired by the Mathematics Common Core Curriculum, and observe that existing Reinforcement Learning baselines perform poorly. We then present a novel learning algorithm, Contrastive Policy Learning (ConPoLe) that explicitly optimizes the InfoNCE loss, which lower bounds the mutual information between the current state and next states that continue on a path to the solution. ConPoLe successfully solves all four domains. Moreover, problem representations learned by ConPoLe enable accurate prediction of the categories of problems in a real mathematics curriculum. Our results suggest new directions for reinforcement learning in symbolic domains, as well as applications to mathematics education." @default.
- W3166342932 created "2021-06-22" @default.
- W3166342932 creator A5001961716 @default.
- W3166342932 creator A5033725744 @default.
- W3166342932 creator A5043645877 @default.
- W3166342932 date "2021-06-16" @default.
- W3166342932 modified "2023-09-27" @default.
- W3166342932 title "Contrastive Reinforcement Learning of Symbolic Reasoning Domains" @default.
- W3166342932 cites W128206494 @default.
- W3166342932 cites W1464569014 @default.
- W3166342932 cites W1497731808 @default.
- W3166342932 cites W1527272073 @default.
- W3166342932 cites W1560253649 @default.
- W3166342932 cites W1563218432 @default.
- W3166342932 cites W1971228271 @default.
- W3166342932 cites W1973885939 @default.
- W3166342932 cites W2004286722 @default.
- W3166342932 cites W2051339053 @default.
- W3166342932 cites W2064675550 @default.
- W3166342932 cites W2114501209 @default.
- W3166342932 cites W2117655204 @default.
- W3166342932 cites W2136208491 @default.
- W3166342932 cites W2428112462 @default.
- W3166342932 cites W2618342356 @default.
- W3166342932 cites W2752885492 @default.
- W3166342932 cites W2842511635 @default.
- W3166342932 cites W2946011656 @default.
- W3166342932 cites W2951855880 @default.
- W3166342932 cites W2960567166 @default.
- W3166342932 cites W2962905782 @default.
- W3166342932 cites W2963147113 @default.
- W3166342932 cites W2963407845 @default.
- W3166342932 cites W2964179661 @default.
- W3166342932 cites W3029636270 @default.
- W3166342932 cites W3083835029 @default.
- W3166342932 cites W3089433505 @default.
- W3166342932 cites W3115293622 @default.
- W3166342932 cites W3169291081 @default.
- W3166342932 cites W2922212713 @default.
- W3166342932 cites W3125808012 @default.
- W3166342932 hasPublicationYear "2021" @default.
- W3166342932 type Work @default.
- W3166342932 sameAs 3166342932 @default.
- W3166342932 citedByCount "0" @default.
- W3166342932 crossrefType "posted-content" @default.
- W3166342932 hasAuthorship W3166342932A5001961716 @default.
- W3166342932 hasAuthorship W3166342932A5033725744 @default.
- W3166342932 hasAuthorship W3166342932A5043645877 @default.
- W3166342932 hasConcept C111472728 @default.
- W3166342932 hasConcept C138885662 @default.
- W3166342932 hasConcept C154945302 @default.
- W3166342932 hasConcept C26517878 @default.
- W3166342932 hasConcept C2780586882 @default.
- W3166342932 hasConcept C38652104 @default.
- W3166342932 hasConcept C41008148 @default.
- W3166342932 hasConcept C80444323 @default.
- W3166342932 hasConcept C97541855 @default.
- W3166342932 hasConceptScore W3166342932C111472728 @default.
- W3166342932 hasConceptScore W3166342932C138885662 @default.
- W3166342932 hasConceptScore W3166342932C154945302 @default.
- W3166342932 hasConceptScore W3166342932C26517878 @default.
- W3166342932 hasConceptScore W3166342932C2780586882 @default.
- W3166342932 hasConceptScore W3166342932C38652104 @default.
- W3166342932 hasConceptScore W3166342932C41008148 @default.
- W3166342932 hasConceptScore W3166342932C80444323 @default.
- W3166342932 hasConceptScore W3166342932C97541855 @default.
- W3166342932 hasLocation W31663429321 @default.
- W3166342932 hasOpenAccess W3166342932 @default.
- W3166342932 hasPrimaryLocation W31663429321 @default.
- W3166342932 hasRelatedWork W1587813652 @default.
- W3166342932 hasRelatedWork W1882226547 @default.
- W3166342932 hasRelatedWork W189652262 @default.
- W3166342932 hasRelatedWork W2114451917 @default.
- W3166342932 hasRelatedWork W2149390907 @default.
- W3166342932 hasRelatedWork W2248206634 @default.
- W3166342932 hasRelatedWork W2260846929 @default.
- W3166342932 hasRelatedWork W2270835334 @default.
- W3166342932 hasRelatedWork W2476698163 @default.
- W3166342932 hasRelatedWork W276460289 @default.
- W3166342932 hasRelatedWork W2811111819 @default.
- W3166342932 hasRelatedWork W2945794705 @default.
- W3166342932 hasRelatedWork W2978955179 @default.
- W3166342932 hasRelatedWork W2987047801 @default.
- W3166342932 hasRelatedWork W3003712948 @default.
- W3166342932 hasRelatedWork W3005727454 @default.
- W3166342932 hasRelatedWork W3018946989 @default.
- W3166342932 hasRelatedWork W3037845067 @default.
- W3166342932 hasRelatedWork W3131034247 @default.
- W3166342932 hasRelatedWork W3211963390 @default.
- W3166342932 isParatext "false" @default.
- W3166342932 isRetracted "false" @default.
- W3166342932 magId "3166342932" @default.
- W3166342932 workType "article" @default.
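
The abstract recorded above names the InfoNCE loss as the objective that ConPoLe optimizes. As a rough illustration only, and not the authors' implementation, the sketch below computes that loss for a single state: given compatibility scores between the current state and each candidate next state, the loss is the negative log-softmax of the score of the one candidate that continues on a path to the solution. The function name `info_nce_loss`, the example scores, and the idea that scores come from some learned scoring function are all assumptions made for this example; the listing itself does not specify them.

```python
import math


def info_nce_loss(scores, positive_index):
    """InfoNCE loss for one state (illustrative sketch, not the paper's code).

    scores: compatibility scores between the current state and each
            candidate next state (hypothetical values from some learned model).
    positive_index: index of the candidate that lies on a solution path.
    """
    # Log-sum-exp over all candidates, shifted by the max for numerical stability.
    max_score = max(scores)
    log_partition = max_score + math.log(
        sum(math.exp(s - max_score) for s in scores)
    )
    # Negative log-probability of the positive candidate under a softmax.
    return log_partition - scores[positive_index]


# Hypothetical scores: candidate 0 is the positive, the others are negatives.
example_scores = [2.0, 0.5, -1.0]
example_loss = info_nce_loss(example_scores, positive_index=0)
print(f"InfoNCE loss: {example_loss:.4f}")
```

The loss shrinks as the positive candidate's score rises relative to the negatives, which is the sense in which minimizing it pushes the model to prefer next states that lead toward a solution.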