Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387294204> ?p ?o ?g. }
Showing items 1 to 69 of 69 with 100 items per page.
- W4387294204 abstract "Large language models (LLMs) typically employ sampling or beam search, accompanied by prompts such as Chain-of-Thought (CoT), to boost reasoning and decoding ability. Recent work like Tree-of-Thought (ToT) and Reasoning via Planning (RAP) aim to augment the reasoning capabilities of LLMs by utilizing tree-search algorithms to guide multi-step reasoning. These methods mainly focus on LLMs' reasoning ability during inference and heavily rely on human-designed prompts to activate LLM as a value function, which lacks general applicability and scalability. To address these limitations, we present an AlphaZero-like tree-search framework for LLMs (termed TS-LLM), systematically illustrating how tree-search with a learned value function can guide LLMs' decoding ability. TS-LLM distinguishes itself in two key ways: (1) Leveraging a learned value function, our approach can be generally applied to different tasks beyond reasoning (such as RLHF alignment), and LLMs of any size, without prompting advanced, large-scale models. (2) It can guide LLM's decoding during both inference and training. Empirical evaluations across reasoning, planning, and RLHF alignment tasks validate the effectiveness of TS-LLM, even on trees with a depth of 64." @default.
- W4387294204 created "2023-10-03" @default.
- W4387294204 creator A5026082030 @default.
- W4387294204 creator A5042241049 @default.
- W4387294204 creator A5049802452 @default.
- W4387294204 creator A5077548608 @default.
- W4387294204 creator A5086665145 @default.
- W4387294204 creator A5090720315 @default.
- W4387294204 date "2023-09-29" @default.
- W4387294204 modified "2023-10-04" @default.
- W4387294204 title "Alphazero-like Tree-Search can Guide Large Language Model Decoding and Training" @default.
- W4387294204 doi "https://doi.org/10.48550/arxiv.2309.17179" @default.
- W4387294204 hasPublicationYear "2023" @default.
- W4387294204 type Work @default.
- W4387294204 citedByCount "0" @default.
- W4387294204 crossrefType "posted-content" @default.
- W4387294204 hasAuthorship W4387294204A5026082030 @default.
- W4387294204 hasAuthorship W4387294204A5042241049 @default.
- W4387294204 hasAuthorship W4387294204A5049802452 @default.
- W4387294204 hasAuthorship W4387294204A5077548608 @default.
- W4387294204 hasAuthorship W4387294204A5086665145 @default.
- W4387294204 hasAuthorship W4387294204A5090720315 @default.
- W4387294204 hasBestOaLocation W43872942041 @default.
- W4387294204 hasConcept C113174947 @default.
- W4387294204 hasConcept C11413529 @default.
- W4387294204 hasConcept C119857082 @default.
- W4387294204 hasConcept C134306372 @default.
- W4387294204 hasConcept C14036430 @default.
- W4387294204 hasConcept C154945302 @default.
- W4387294204 hasConcept C2776214188 @default.
- W4387294204 hasConcept C2776291640 @default.
- W4387294204 hasConcept C33923547 @default.
- W4387294204 hasConcept C41008148 @default.
- W4387294204 hasConcept C48044578 @default.
- W4387294204 hasConcept C57273362 @default.
- W4387294204 hasConcept C77088390 @default.
- W4387294204 hasConcept C78458016 @default.
- W4387294204 hasConcept C86803240 @default.
- W4387294204 hasConceptScore W4387294204C113174947 @default.
- W4387294204 hasConceptScore W4387294204C11413529 @default.
- W4387294204 hasConceptScore W4387294204C119857082 @default.
- W4387294204 hasConceptScore W4387294204C134306372 @default.
- W4387294204 hasConceptScore W4387294204C14036430 @default.
- W4387294204 hasConceptScore W4387294204C154945302 @default.
- W4387294204 hasConceptScore W4387294204C2776214188 @default.
- W4387294204 hasConceptScore W4387294204C2776291640 @default.
- W4387294204 hasConceptScore W4387294204C33923547 @default.
- W4387294204 hasConceptScore W4387294204C41008148 @default.
- W4387294204 hasConceptScore W4387294204C48044578 @default.
- W4387294204 hasConceptScore W4387294204C57273362 @default.
- W4387294204 hasConceptScore W4387294204C77088390 @default.
- W4387294204 hasConceptScore W4387294204C78458016 @default.
- W4387294204 hasConceptScore W4387294204C86803240 @default.
- W4387294204 hasLocation W43872942041 @default.
- W4387294204 hasOpenAccess W4387294204 @default.
- W4387294204 hasPrimaryLocation W43872942041 @default.
- W4387294204 hasRelatedWork W1525643724 @default.
- W4387294204 hasRelatedWork W2067938758 @default.
- W4387294204 hasRelatedWork W2302028273 @default.
- W4387294204 hasRelatedWork W2333420780 @default.
- W4387294204 hasRelatedWork W2358034992 @default.
- W4387294204 hasRelatedWork W2364921833 @default.
- W4387294204 hasRelatedWork W2367936931 @default.
- W4387294204 hasRelatedWork W2382623646 @default.
- W4387294204 hasRelatedWork W2961085424 @default.
- W4387294204 hasRelatedWork W3087771547 @default.
- W4387294204 isParatext "false" @default.
- W4387294204 isRetracted "false" @default.
- W4387294204 workType "article" @default.
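
Each item in the listing above follows the same shape: a leading dash, a subject, a predicate, an object (either a quoted literal or a bare identifier), and a graph marker such as `@default.`. A minimal sketch of how such lines could be split into their parts is shown below. The names `Quad` and `parse_line` are hypothetical, introduced only for illustration, and the line pattern is inferred from this listing alone; it is not an official SemOpenAlex format or API.

```python
import re
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """One listing item: subject, predicate, object, graph (hypothetical helper type)."""
    subject: str
    predicate: str
    obj: str
    graph: str
    is_literal: bool  # True when the object was written as a quoted string


# Assumed line shape, inferred from the listing:
#   - <subject> <predicate> <object> @<graph>.
# The object is matched greedily so that quoted literals containing spaces
# (for example the abstract) are captured whole.
_LINE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.+)\s+@(\S+?)\.$')


def parse_line(line: str) -> Optional[Quad]:
    """Parse one listing line; return None for lines that do not fit the pattern."""
    match = _LINE.match(line.strip())
    if match is None:
        return None
    subject, predicate, raw, graph = match.groups()
    # A quoted object is treated as a literal and its surrounding quotes are dropped.
    is_literal = len(raw) >= 2 and raw.startswith('"') and raw.endswith('"')
    obj = raw[1:-1] if is_literal else raw
    return Quad(subject, predicate, obj, graph, is_literal)


if __name__ == "__main__":
    sample = [
        '- W4387294204 created "2023-10-03" @default.',
        '- W4387294204 type Work @default.',
        '- W4387294204 hasRelatedWork W1525643724 @default.',
    ]
    for text in sample:
        print(parse_line(text))
```

Header lines such as the "Matches in SemOpenAlex" line do not start with a dash, so `parse_line` returns `None` for them and they can be filtered out before further processing.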