Matches in SemOpenAlex for { <https://semopenalex.org/work/W4367189299> ?p ?o ?g. }
Showing items 1 to 61 of 61 with 100 items per page.
- W4367189299 abstract "Large language models (LLMs) excel in many tasks in 2023, but they still face challenges in complex reasoning. Theory-of-mind (ToM) tasks, which require understanding agents' beliefs, goals, and mental states, are essential for common-sense reasoning involving humans, making it crucial to enhance LLM performance in this area. This study measures the ToM performance of GPT-4 and three GPT-3.5 variants (Davinci-2, Davinci-3, GPT-3.5-Turbo), and investigates the effectiveness of in-context learning in improving their ToM comprehension. We evaluated prompts featuring two-shot chain of thought reasoning and step-by-step thinking instructions. We found that LLMs trained with Reinforcement Learning from Human Feedback (RLHF) (all models excluding Davinci-2) improved their ToM accuracy via in-context learning. GPT-4 performed best in zero-shot settings, reaching nearly 80% ToM accuracy, but still fell short of the 87% human accuracy on the test set. However, when supplied with prompts for in-context learning, all RLHF-trained LLMs exceeded 80% ToM accuracy, with GPT-4 reaching 100%. These results demonstrate that appropriate prompting enhances LLM ToM reasoning, and they underscore the context-dependent nature of LLM cognitive capacities." @default.
- W4367189299 created "2023-04-28" @default.
- W4367189299 creator A5019919905 @default.
- W4367189299 creator A5037309858 @default.
- W4367189299 date "2023-04-22" @default.
- W4367189299 modified "2023-10-01" @default.
- W4367189299 title "Boosting Theory-of-Mind Performance in Large Language Models via Prompting" @default.
- W4367189299 doi "https://doi.org/10.48550/arxiv.2304.11490" @default.
- W4367189299 hasPublicationYear "2023" @default.
- W4367189299 type Work @default.
- W4367189299 citedByCount "0" @default.
- W4367189299 crossrefType "posted-content" @default.
- W4367189299 hasAuthorship W4367189299A5019919905 @default.
- W4367189299 hasAuthorship W4367189299A5037309858 @default.
- W4367189299 hasBestOaLocation W43671892991 @default.
- W4367189299 hasConcept C151730666 @default.
- W4367189299 hasConcept C154945302 @default.
- W4367189299 hasConcept C15744967 @default.
- W4367189299 hasConcept C169760540 @default.
- W4367189299 hasConcept C169900460 @default.
- W4367189299 hasConcept C180747234 @default.
- W4367189299 hasConcept C188147891 @default.
- W4367189299 hasConcept C199360897 @default.
- W4367189299 hasConcept C2779343474 @default.
- W4367189299 hasConcept C2779560602 @default.
- W4367189299 hasConcept C41008148 @default.
- W4367189299 hasConcept C46686674 @default.
- W4367189299 hasConcept C511192102 @default.
- W4367189299 hasConcept C86803240 @default.
- W4367189299 hasConcept C97541855 @default.
- W4367189299 hasConceptScore W4367189299C151730666 @default.
- W4367189299 hasConceptScore W4367189299C154945302 @default.
- W4367189299 hasConceptScore W4367189299C15744967 @default.
- W4367189299 hasConceptScore W4367189299C169760540 @default.
- W4367189299 hasConceptScore W4367189299C169900460 @default.
- W4367189299 hasConceptScore W4367189299C180747234 @default.
- W4367189299 hasConceptScore W4367189299C188147891 @default.
- W4367189299 hasConceptScore W4367189299C199360897 @default.
- W4367189299 hasConceptScore W4367189299C2779343474 @default.
- W4367189299 hasConceptScore W4367189299C2779560602 @default.
- W4367189299 hasConceptScore W4367189299C41008148 @default.
- W4367189299 hasConceptScore W4367189299C46686674 @default.
- W4367189299 hasConceptScore W4367189299C511192102 @default.
- W4367189299 hasConceptScore W4367189299C86803240 @default.
- W4367189299 hasConceptScore W4367189299C97541855 @default.
- W4367189299 hasLocation W43671892991 @default.
- W4367189299 hasOpenAccess W4367189299 @default.
- W4367189299 hasPrimaryLocation W43671892991 @default.
- W4367189299 hasRelatedWork W1997301722 @default.
- W4367189299 hasRelatedWork W2046791759 @default.
- W4367189299 hasRelatedWork W2059742394 @default.
- W4367189299 hasRelatedWork W2079468847 @default.
- W4367189299 hasRelatedWork W2105897147 @default.
- W4367189299 hasRelatedWork W2134694734 @default.
- W4367189299 hasRelatedWork W2171282200 @default.
- W4367189299 hasRelatedWork W2595423808 @default.
- W4367189299 hasRelatedWork W4231534512 @default.
- W4367189299 hasRelatedWork W4367048644 @default.
- W4367189299 isParatext "false" @default.
- W4367189299 isRetracted "false" @default.
- W4367189299 workType "article" @default.