Matches in SemOpenAlex for { <https://semopenalex.org/work/W4378474098> ?p ?o ?g. }
Showing items 1 to 75 of 75 with 100 items per page.
- W4378474098 abstract "The information stored in large language models (LLMs) falls out of date quickly, and retraining from scratch is often not an option. This has recently given rise to a range of techniques for injecting new facts through updating model weights. Current evaluation paradigms are extremely limited, mainly validating the recall of edited facts, but changing one fact should cause rippling changes to the model's related beliefs. If we edit the UK Prime Minister to now be Rishi Sunak, then we should get a different answer to Who is married to the British Prime Minister? In this work, we present a benchmark MQuAKE (Multi-hop Question Answering for Knowledge Editing) comprising multi-hop questions that assess whether edited models correctly answer questions where the answer should change as an entailed consequence of edited facts. While we find that current knowledge-editing approaches can recall edited facts accurately, they fail catastrophically on the constructed multi-hop questions. We thus propose a simple memory-based approach, MeLLo, which stores all edited facts externally while prompting the language model iteratively to generate answers that are consistent with the edited facts. While MQuAKE remains challenging, we show that MeLLo scales well with LLMs (up to 175B) and outperforms previous model editors by a large margin." @default.
- W4378474098 created "2023-05-27" @default.
- W4378474098 creator A5018694855 @default.
- W4378474098 creator A5034055494 @default.
- W4378474098 creator A5042601761 @default.
- W4378474098 creator A5046006076 @default.
- W4378474098 creator A5051064208 @default.
- W4378474098 date "2023-05-24" @default.
- W4378474098 modified "2023-09-25" @default.
- W4378474098 title "MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions" @default.
- W4378474098 doi "https://doi.org/10.48550/arxiv.2305.14795" @default.
- W4378474098 hasPublicationYear "2023" @default.
- W4378474098 type Work @default.
- W4378474098 citedByCount "0" @default.
- W4378474098 crossrefType "posted-content" @default.
- W4378474098 hasAuthorship W4378474098A5018694855 @default.
- W4378474098 hasAuthorship W4378474098A5034055494 @default.
- W4378474098 hasAuthorship W4378474098A5042601761 @default.
- W4378474098 hasAuthorship W4378474098A5046006076 @default.
- W4378474098 hasAuthorship W4378474098A5051064208 @default.
- W4378474098 hasBestOaLocation W43784740981 @default.
- W4378474098 hasConcept C100660578 @default.
- W4378474098 hasConcept C119857082 @default.
- W4378474098 hasConcept C137293760 @default.
- W4378474098 hasConcept C154945302 @default.
- W4378474098 hasConcept C15744967 @default.
- W4378474098 hasConcept C17744445 @default.
- W4378474098 hasConcept C180747234 @default.
- W4378474098 hasConcept C188147891 @default.
- W4378474098 hasConcept C199360897 @default.
- W4378474098 hasConcept C199539241 @default.
- W4378474098 hasConcept C204321447 @default.
- W4378474098 hasConcept C25906391 @default.
- W4378474098 hasConcept C2781235140 @default.
- W4378474098 hasConcept C2993486354 @default.
- W4378474098 hasConcept C31258907 @default.
- W4378474098 hasConcept C41008148 @default.
- W4378474098 hasConcept C44291984 @default.
- W4378474098 hasConcept C774472 @default.
- W4378474098 hasConcept C94625758 @default.
- W4378474098 hasConceptScore W4378474098C100660578 @default.
- W4378474098 hasConceptScore W4378474098C119857082 @default.
- W4378474098 hasConceptScore W4378474098C137293760 @default.
- W4378474098 hasConceptScore W4378474098C154945302 @default.
- W4378474098 hasConceptScore W4378474098C15744967 @default.
- W4378474098 hasConceptScore W4378474098C17744445 @default.
- W4378474098 hasConceptScore W4378474098C180747234 @default.
- W4378474098 hasConceptScore W4378474098C188147891 @default.
- W4378474098 hasConceptScore W4378474098C199360897 @default.
- W4378474098 hasConceptScore W4378474098C199539241 @default.
- W4378474098 hasConceptScore W4378474098C204321447 @default.
- W4378474098 hasConceptScore W4378474098C25906391 @default.
- W4378474098 hasConceptScore W4378474098C2781235140 @default.
- W4378474098 hasConceptScore W4378474098C2993486354 @default.
- W4378474098 hasConceptScore W4378474098C31258907 @default.
- W4378474098 hasConceptScore W4378474098C41008148 @default.
- W4378474098 hasConceptScore W4378474098C44291984 @default.
- W4378474098 hasConceptScore W4378474098C774472 @default.
- W4378474098 hasConceptScore W4378474098C94625758 @default.
- W4378474098 hasLocation W43784740981 @default.
- W4378474098 hasOpenAccess W4378474098 @default.
- W4378474098 hasPrimaryLocation W43784740981 @default.
- W4378474098 hasRelatedWork W128392744 @default.
- W4378474098 hasRelatedWork W207304934 @default.
- W4378474098 hasRelatedWork W20999564 @default.
- W4378474098 hasRelatedWork W2747680751 @default.
- W4378474098 hasRelatedWork W3107474891 @default.
- W4378474098 hasRelatedWork W3207693618 @default.
- W4378474098 hasRelatedWork W3217585508 @default.
- W4378474098 hasRelatedWork W4288334313 @default.
- W4378474098 hasRelatedWork W4377703168 @default.
- W4378474098 hasRelatedWork W2972820682 @default.
- W4378474098 isParatext "false" @default.
- W4378474098 isRetracted "false" @default.
- W4378474098 workType "article" @default.
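
The rows above all follow the same shape: a leading dash, then subject, predicate, object, and a graph label ending in a period. As a rough illustration of how such a listing could be loaded for further processing, here is a minimal Python sketch. The function name `parse_listing` and the regular expression are assumptions made for this example; they are not part of SemOpenAlex or any official client, and the pattern assumes each row is on a single line.

```python
import re
from collections import defaultdict

# Assumed row shape: "- <subject> <predicate> <object> @<graph>."
# The object may contain spaces (e.g. a quoted title), so it is matched greedily.
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')


def parse_listing(lines):
    """Group object values by predicate for the rows of one listing.

    Hypothetical helper: rows that do not match the assumed shape
    (headers, pagination text) are skipped.
    """
    props = defaultdict(list)
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue
        _subject, predicate, obj, _graph = match.groups()
        # Strip the surrounding double quotes from literal values.
        if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
            obj = obj[1:-1]
        props[predicate].append(obj)
    return dict(props)


sample = [
    'Showing items 1 to 75 of 75 with 100 items per page.',
    '- W4378474098 created "2023-05-27" @default.',
    '- W4378474098 creator A5018694855 @default.',
    '- W4378474098 creator A5034055494 @default.',
    '- W4378474098 title "MQuAKE: Assessing Knowledge Editing in Language Models via Multi-Hop Questions" @default.',
]

record = parse_listing(sample)
print(record["creator"])
print(record["title"][0])
```

Multi-valued predicates such as `creator` or `hasConcept` end up as lists in row order, while single-valued ones such as `title` become one-element lists, so callers can treat every predicate uniformly.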