SemOpenAlex

Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387156634> ?p ?o ?g. }
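The pattern above matches every statement whose subject is this work. A minimal sketch of a roughly equivalent SPARQL query string follows, assuming the graph variable ?g can be dropped because every row below sits in the default graph. The helper name build_work_query is hypothetical, and no endpoint URL or client library is assumed; the sketch only builds the query text.

```python
def build_work_query(work_id: str) -> str:
    """Build a SPARQL query listing every predicate/object pair for one work."""
    # Assumption: work IRIs follow the prefix seen in the pattern above.
    iri = f"https://semopenalex.org/work/{work_id}"
    # Doubled braces are literal braces inside an f-string.
    return f"SELECT ?p ?o WHERE {{ <{iri}> ?p ?o . }}"


if __name__ == "__main__":
    print(build_work_query("W4387156634"))
```

The resulting string could be submitted to any SPARQL endpoint that serves this data; each solution would correspond to one row of the listing below.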

W4387156634 abstract "Recent years have witnessed remarkable progress made in large language models (LLMs). Such advancements, while garnering significant attention, have concurrently elicited various concerns. The potential of these models is undeniably vast; however, they may yield texts that are imprecise, misleading, or even detrimental. Consequently, it becomes paramount to employ alignment techniques to ensure these models to exhibit behaviors consistent with human values. This survey endeavors to furnish an extensive exploration of alignment methodologies designed for LLMs, in conjunction with the extant capability research in this domain. Adopting the lens of AI alignment, we categorize the prevailing methods and emergent proposals for the alignment of LLMs into outer and inner alignment. We also probe into salient issues including the models' interpretability, and potential vulnerabilities to adversarial attacks. To assess LLM alignment, we present a wide variety of benchmarks and evaluation methodologies. After discussing the state of alignment research for LLMs, we finally cast a vision toward the future, contemplating the promising avenues of research that lie ahead. Our aspiration for this survey extends beyond merely spurring research interests in this realm. We also envision bridging the gap between the AI alignment research community and the researchers engrossed in the capability exploration of LLMs for both capable and safe LLMs." @default.
W4387156634 created "2023-09-30" @default.
W4387156634 creator A5005792080 @default.
W4387156634 creator A5021293751 @default.
W4387156634 creator A5050068243 @default.
W4387156634 creator A5055232825 @default.
W4387156634 creator A5060381970 @default.
W4387156634 creator A5068887242 @default.
W4387156634 creator A5071712748 @default.
W4387156634 creator A5074106607 @default.
W4387156634 creator A5086903789 @default.
W4387156634 date "2023-09-26" @default.
W4387156634 modified "2023-10-17" @default.
W4387156634 title "Large Language Model Alignment: A Survey" @default.
W4387156634 doi "https://doi.org/10.48550/arxiv.2309.15025" @default.
W4387156634 hasPublicationYear "2023" @default.
W4387156634 type Work @default.
W4387156634 citedByCount "0" @default.
W4387156634 crossrefType "posted-content" @default.
W4387156634 hasAuthorship W4387156634A5005792080 @default.
W4387156634 hasAuthorship W4387156634A5021293751 @default.
W4387156634 hasAuthorship W4387156634A5050068243 @default.
W4387156634 hasAuthorship W4387156634A5055232825 @default.
W4387156634 hasAuthorship W4387156634A5060381970 @default.
W4387156634 hasAuthorship W4387156634A5068887242 @default.
W4387156634 hasAuthorship W4387156634A5071712748 @default.
W4387156634 hasAuthorship W4387156634A5074106607 @default.
W4387156634 hasAuthorship W4387156634A5086903789 @default.
W4387156634 hasBestOaLocation W43871566341 @default.
W4387156634 hasConcept C127413603 @default.
W4387156634 hasConcept C136197465 @default.
W4387156634 hasConcept C154945302 @default.
W4387156634 hasConcept C17744445 @default.
W4387156634 hasConcept C199539241 @default.
W4387156634 hasConcept C2522767166 @default.
W4387156634 hasConcept C2778757428 @default.
W4387156634 hasConcept C2780719617 @default.
W4387156634 hasConcept C2781067378 @default.
W4387156634 hasConcept C37736160 @default.
W4387156634 hasConcept C41008148 @default.
W4387156634 hasConcept C539667460 @default.
W4387156634 hasConcept C55587333 @default.
W4387156634 hasConcept C94124525 @default.
W4387156634 hasConceptScore W4387156634C127413603 @default.
W4387156634 hasConceptScore W4387156634C136197465 @default.
W4387156634 hasConceptScore W4387156634C154945302 @default.
W4387156634 hasConceptScore W4387156634C17744445 @default.
W4387156634 hasConceptScore W4387156634C199539241 @default.
W4387156634 hasConceptScore W4387156634C2522767166 @default.
W4387156634 hasConceptScore W4387156634C2778757428 @default.
W4387156634 hasConceptScore W4387156634C2780719617 @default.
W4387156634 hasConceptScore W4387156634C2781067378 @default.
W4387156634 hasConceptScore W4387156634C37736160 @default.
W4387156634 hasConceptScore W4387156634C41008148 @default.
W4387156634 hasConceptScore W4387156634C539667460 @default.
W4387156634 hasConceptScore W4387156634C55587333 @default.
W4387156634 hasConceptScore W4387156634C94124525 @default.
W4387156634 hasLocation W43871566341 @default.
W4387156634 hasOpenAccess W4387156634 @default.
W4387156634 hasPrimaryLocation W43871566341 @default.
W4387156634 hasRelatedWork W2100224582 @default.
W4387156634 hasRelatedWork W2736015634 @default.
W4387156634 hasRelatedWork W2748952813 @default.
W4387156634 hasRelatedWork W2899084033 @default.
W4387156634 hasRelatedWork W2901810203 @default.
W4387156634 hasRelatedWork W2924512726 @default.
W4387156634 hasRelatedWork W3124582951 @default.
W4387156634 hasRelatedWork W3198184493 @default.
W4387156634 hasRelatedWork W4283759846 @default.
W4387156634 hasRelatedWork W4384648009 @default.
W4387156634 isParatext "false" @default.
W4387156634 isRetracted "false" @default.
W4387156634 workType "article" @default.