Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387322626> ?p ?o ?g. }
Showing items 1 to 73 of 73, with 100 items per page.
- W4387322626 abstract "Reward modeling (a.k.a., preference modeling) is instrumental for aligning large language models with human preferences, particularly within the context of reinforcement learning from human feedback (RLHF). While conventional reward models (RMs) have exhibited remarkable scalability, they often struggle with fundamental functionality such as arithmetic computation, code execution, and factual lookup. In this paper, we propose a tool-augmented preference modeling approach, named Themis, to address these limitations by empowering RMs with access to external environments, including calculators and search engines. This approach not only fosters synergy between tool utilization and reward grading but also enhances interpretive capacity and scoring reliability. Our study delves into the integration of external tools into RMs, enabling them to interact with diverse external sources and construct task-specific tool engagement and reasoning traces in an autoregressive manner. We validate our approach across a wide range of domains, incorporating seven distinct external tools. Our experimental results demonstrate a noteworthy overall improvement of 17.7% across eight tasks in preference ranking. Furthermore, our approach outperforms Gopher 280B by 7.3% on the TruthfulQA task in zero-shot evaluation. In human evaluations, RLHF trained with Themis attains an average win rate of 32% when compared to baselines across four distinct tasks. Additionally, we provide a comprehensive collection of tool-related RM datasets, incorporating data from seven distinct tool APIs, totaling 15,000 instances. We anticipate that this publicly available dataset will facilitate and inspire further research advancements in the field." @default.
- W4387322626 created "2023-10-04" @default.
- W4387322626 creator A5000354596 @default.
- W4387322626 creator A5006542450 @default.
- W4387322626 creator A5030810655 @default.
- W4387322626 creator A5058880150 @default.
- W4387322626 creator A5072082849 @default.
- W4387322626 creator A5074864108 @default.
- W4387322626 creator A5075889094 @default.
- W4387322626 date "2023-10-02" @default.
- W4387322626 modified "2023-10-14" @default.
- W4387322626 title "Tool-Augmented Reward Modeling" @default.
- W4387322626 doi "https://doi.org/10.48550/arxiv.2310.01045" @default.
- W4387322626 hasPublicationYear "2023" @default.
- W4387322626 type Work @default.
- W4387322626 citedByCount "0" @default.
- W4387322626 crossrefType "posted-content" @default.
- W4387322626 hasAuthorship W4387322626A5000354596 @default.
- W4387322626 hasAuthorship W4387322626A5006542450 @default.
- W4387322626 hasAuthorship W4387322626A5030810655 @default.
- W4387322626 hasAuthorship W4387322626A5058880150 @default.
- W4387322626 hasAuthorship W4387322626A5072082849 @default.
- W4387322626 hasAuthorship W4387322626A5074864108 @default.
- W4387322626 hasAuthorship W4387322626A5075889094 @default.
- W4387322626 hasBestOaLocation W43873226261 @default.
- W4387322626 hasConcept C119857082 @default.
- W4387322626 hasConcept C151730666 @default.
- W4387322626 hasConcept C154945302 @default.
- W4387322626 hasConcept C162324750 @default.
- W4387322626 hasConcept C187736073 @default.
- W4387322626 hasConcept C189430467 @default.
- W4387322626 hasConcept C202444582 @default.
- W4387322626 hasConcept C2779343474 @default.
- W4387322626 hasConcept C2780451532 @default.
- W4387322626 hasConcept C33923547 @default.
- W4387322626 hasConcept C41008148 @default.
- W4387322626 hasConcept C48044578 @default.
- W4387322626 hasConcept C77088390 @default.
- W4387322626 hasConcept C86803240 @default.
- W4387322626 hasConcept C9652623 @default.
- W4387322626 hasConcept C97541855 @default.
- W4387322626 hasConceptScore W4387322626C119857082 @default.
- W4387322626 hasConceptScore W4387322626C151730666 @default.
- W4387322626 hasConceptScore W4387322626C154945302 @default.
- W4387322626 hasConceptScore W4387322626C162324750 @default.
- W4387322626 hasConceptScore W4387322626C187736073 @default.
- W4387322626 hasConceptScore W4387322626C189430467 @default.
- W4387322626 hasConceptScore W4387322626C202444582 @default.
- W4387322626 hasConceptScore W4387322626C2779343474 @default.
- W4387322626 hasConceptScore W4387322626C2780451532 @default.
- W4387322626 hasConceptScore W4387322626C33923547 @default.
- W4387322626 hasConceptScore W4387322626C41008148 @default.
- W4387322626 hasConceptScore W4387322626C48044578 @default.
- W4387322626 hasConceptScore W4387322626C77088390 @default.
- W4387322626 hasConceptScore W4387322626C86803240 @default.
- W4387322626 hasConceptScore W4387322626C9652623 @default.
- W4387322626 hasConceptScore W4387322626C97541855 @default.
- W4387322626 hasLocation W43873226261 @default.
- W4387322626 hasOpenAccess W4387322626 @default.
- W4387322626 hasPrimaryLocation W43873226261 @default.
- W4387322626 hasRelatedWork W1525643724 @default.
- W4387322626 hasRelatedWork W2302028273 @default.
- W4387322626 hasRelatedWork W2364921833 @default.
- W4387322626 hasRelatedWork W2961085424 @default.
- W4387322626 hasRelatedWork W3074294383 @default.
- W4387322626 hasRelatedWork W4206669594 @default.
- W4387322626 hasRelatedWork W4285805438 @default.
- W4387322626 hasRelatedWork W4294811414 @default.
- W4387322626 hasRelatedWork W4319083788 @default.
- W4387322626 hasRelatedWork W4323349240 @default.
- W4387322626 isParatext "false" @default.
- W4387322626 isRetracted "false" @default.
- W4387322626 workType "article" @default.
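The `{ <https://semopenalex.org/work/W4387322626> ?p ?o ?g. }` pattern in the page header can be reproduced programmatically. A minimal sketch, assuming SemOpenAlex's public SPARQL endpoint at `https://semopenalex.org/sparql` (an assumption; check the SemOpenAlex documentation before relying on it):

```python
# Sketch: fetch all (?p, ?o, ?g) triples for a SemOpenAlex work via SPARQL.
import json
import urllib.parse
import urllib.request

WORK_IRI = "https://semopenalex.org/work/W4387322626"


def build_query(work_iri: str) -> str:
    """Build the header's pattern: every predicate/object pair for the work,
    together with the named graph it comes from."""
    return (
        "SELECT ?p ?o ?g WHERE { "
        f"GRAPH ?g {{ <{work_iri}> ?p ?o . }} "
        "}"
    )


def fetch_triples(work_iri: str,
                  endpoint: str = "https://semopenalex.org/sparql"):
    """POST the query to the endpoint (assumed URL) and return the
    SPARQL-JSON result bindings. Requires network access."""
    data = urllib.parse.urlencode({"query": build_query(work_iri)}).encode()
    req = urllib.request.Request(
        endpoint,
        data=data,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"]["bindings"]
```

Each binding in the result corresponds to one line of the listing above, e.g. a `?p` of `dcterms:title` with `?o` `"Tool-Augmented Reward Modeling"`.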