Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385890378> ?p ?o ?g. }
Showing items 1 to 71 of 71 with 100 items per page.
- W4385890378 abstract "Recent progress in large language models (LLMs) like GPT-4 and PaLM-2 has brought significant advancements in addressing math reasoning problems. In particular, OpenAI's latest version of GPT-4, known as GPT-4 Code Interpreter, shows remarkable performance on challenging math datasets. In this paper, we explore the effect of code on enhancing LLMs' reasoning capability by introducing different constraints on the Code Usage Frequency of GPT-4 Code Interpreter. We found that its success can be largely attributed to its powerful skills in generating and executing code, evaluating the output of code execution, and rectifying its solution when receiving unreasonable outputs. Based on this insight, we propose a novel and effective prompting method, explicit code-based self-verification (CSV), to further boost the mathematical reasoning potential of GPT-4 Code Interpreter. This method employs a zero-shot prompt on GPT-4 Code Interpreter to encourage it to use code to self-verify its answers. In instances where the verification state registers as 'False', the model shall automatically amend its solution, analogous to our approach of rectifying errors during a mathematics examination. Furthermore, we recognize that the states of the verification result indicate the confidence of a solution, which can improve the effectiveness of majority voting. With GPT-4 Code Interpreter and CSV, we achieve an impressive zero-shot accuracy on MATH dataset (53.9% → 84.3%)." @default.
- W4385890378 created "2023-08-17" @default.
- W4385890378 creator A5026906414 @default.
- W4385890378 creator A5032699291 @default.
- W4385890378 creator A5035185924 @default.
- W4385890378 creator A5036568215 @default.
- W4385890378 creator A5041431476 @default.
- W4385890378 creator A5054491507 @default.
- W4385890378 creator A5061386830 @default.
- W4385890378 creator A5065073978 @default.
- W4385890378 creator A5069973483 @default.
- W4385890378 creator A5077104218 @default.
- W4385890378 creator A5092801520 @default.
- W4385890378 date "2023-08-15" @default.
- W4385890378 modified "2023-10-16" @default.
- W4385890378 title "Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification" @default.
- W4385890378 doi "https://doi.org/10.48550/arxiv.2308.07921" @default.
- W4385890378 hasPublicationYear "2023" @default.
- W4385890378 type Work @default.
- W4385890378 citedByCount "0" @default.
- W4385890378 crossrefType "posted-content" @default.
- W4385890378 hasAuthorship W4385890378A5026906414 @default.
- W4385890378 hasAuthorship W4385890378A5032699291 @default.
- W4385890378 hasAuthorship W4385890378A5035185924 @default.
- W4385890378 hasAuthorship W4385890378A5036568215 @default.
- W4385890378 hasAuthorship W4385890378A5041431476 @default.
- W4385890378 hasAuthorship W4385890378A5054491507 @default.
- W4385890378 hasAuthorship W4385890378A5061386830 @default.
- W4385890378 hasAuthorship W4385890378A5065073978 @default.
- W4385890378 hasAuthorship W4385890378A5069973483 @default.
- W4385890378 hasAuthorship W4385890378A5077104218 @default.
- W4385890378 hasAuthorship W4385890378A5092801520 @default.
- W4385890378 hasBestOaLocation W43858903781 @default.
- W4385890378 hasConcept C11413529 @default.
- W4385890378 hasConcept C122783720 @default.
- W4385890378 hasConcept C177264268 @default.
- W4385890378 hasConcept C199360897 @default.
- W4385890378 hasConcept C2524010 @default.
- W4385890378 hasConcept C2776760102 @default.
- W4385890378 hasConcept C33923547 @default.
- W4385890378 hasConcept C41008148 @default.
- W4385890378 hasConcept C80444323 @default.
- W4385890378 hasConcept C90805587 @default.
- W4385890378 hasConcept C94375191 @default.
- W4385890378 hasConceptScore W4385890378C11413529 @default.
- W4385890378 hasConceptScore W4385890378C122783720 @default.
- W4385890378 hasConceptScore W4385890378C177264268 @default.
- W4385890378 hasConceptScore W4385890378C199360897 @default.
- W4385890378 hasConceptScore W4385890378C2524010 @default.
- W4385890378 hasConceptScore W4385890378C2776760102 @default.
- W4385890378 hasConceptScore W4385890378C33923547 @default.
- W4385890378 hasConceptScore W4385890378C41008148 @default.
- W4385890378 hasConceptScore W4385890378C80444323 @default.
- W4385890378 hasConceptScore W4385890378C90805587 @default.
- W4385890378 hasConceptScore W4385890378C94375191 @default.
- W4385890378 hasLocation W43858903781 @default.
- W4385890378 hasOpenAccess W4385890378 @default.
- W4385890378 hasPrimaryLocation W43858903781 @default.
- W4385890378 hasRelatedWork W1515791128 @default.
- W4385890378 hasRelatedWork W2018297885 @default.
- W4385890378 hasRelatedWork W2088766201 @default.
- W4385890378 hasRelatedWork W2161646044 @default.
- W4385890378 hasRelatedWork W2166247150 @default.
- W4385890378 hasRelatedWork W2353965302 @default.
- W4385890378 hasRelatedWork W2384847609 @default.
- W4385890378 hasRelatedWork W4243252198 @default.
- W4385890378 hasRelatedWork W4245681215 @default.
- W4385890378 hasRelatedWork W83884855 @default.
- W4385890378 isParatext "false" @default.
- W4385890378 isRetracted "false" @default.
- W4385890378 workType "article" @default.
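
The abstract above notes that the verification state of a sampled solution indicates its confidence and can improve majority voting. A minimal sketch of that idea follows, assuming each sample is an (answer, verification state) pair. The state labels, the weight values, and the function name are hypothetical choices for illustration; they are not taken from this record or confirmed as the authors' settings.

```python
from collections import defaultdict

# Hypothetical weights per verification state (assumed, not from the source):
# answers whose code-based check passed count more than those that failed.
STATE_WEIGHTS = {"True": 1.0, "False": 0.2}


def weighted_majority_vote(samples, weights=STATE_WEIGHTS):
    """Pick the answer with the largest total weight.

    samples: iterable of (answer, verification_state) pairs.
    """
    scores = defaultdict(float)
    for answer, state in samples:
        # Each sampled solution votes for its answer with its state's weight.
        scores[answer] += weights[state]
    if not scores:
        raise ValueError("no samples to vote on")
    return max(scores, key=scores.get)


# Three unverified votes for "42" (3 * 0.2) lose to one verified vote for "41" (1.0).
samples = [("42", "False"), ("42", "False"), ("42", "False"), ("41", "True")]
print(weighted_majority_vote(samples))
```

With equal weights this reduces to plain majority voting, so the weights alone control how strongly a passed verification outranks raw vote counts.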