Matches in SemOpenAlex for { <https://semopenalex.org/work/W4379256047> ?p ?o ?g. }
Showing items 1 to 87 of 87 with 100 items per page.
- W4379256047 abstract "Aligning language models (LMs) with preferences is an important problem in natural language generation. A key challenge is that preferences are typically provided at the *sequence level* while LM training and generation both occur at the *token level*. There is, therefore, a *granularity mismatch* between the preference and the LM training losses, which may complicate the learning problem. In this paper, we address this issue by developing an alternate training process, where we iterate between grounding the sequence-level preference into token-level training guidance, and improving the LM with the learned guidance. For guidance learning, we design a framework that extends the pairwise-preference learning in imitation learning to both variable-length LM generation and the utilization of the preference among multiple generations. For LM training, based on the amount of supervised data, we present two *minimalist* learning objectives that utilize the learned guidance. In experiments, our method performs competitively on two distinct representative LM tasks -- discrete-prompt generation and text summarization." @default.
- W4379256047 created "2023-06-04" @default.
- W4379256047 creator A5010394308 @default.
- W4379256047 creator A5015122283 @default.
- W4379256047 creator A5032046813 @default.
- W4379256047 creator A5036611759 @default.
- W4379256047 creator A5056212977 @default.
- W4379256047 creator A5067027007 @default.
- W4379256047 date "2023-06-01" @default.
- W4379256047 modified "2023-10-13" @default.
- W4379256047 title "Preference-grounded Token-level Guidance for Language Model Fine-tuning" @default.
- W4379256047 doi "https://doi.org/10.48550/arxiv.2306.00398" @default.
- W4379256047 hasPublicationYear "2023" @default.
- W4379256047 type Work @default.
- W4379256047 citedByCount "0" @default.
- W4379256047 crossrefType "posted-content" @default.
- W4379256047 hasAuthorship W4379256047A5010394308 @default.
- W4379256047 hasAuthorship W4379256047A5015122283 @default.
- W4379256047 hasAuthorship W4379256047A5032046813 @default.
- W4379256047 hasAuthorship W4379256047A5036611759 @default.
- W4379256047 hasAuthorship W4379256047A5056212977 @default.
- W4379256047 hasAuthorship W4379256047A5067027007 @default.
- W4379256047 hasBestOaLocation W43792560471 @default.
- W4379256047 hasConcept C105795698 @default.
- W4379256047 hasConcept C111919701 @default.
- W4379256047 hasConcept C119857082 @default.
- W4379256047 hasConcept C126388530 @default.
- W4379256047 hasConcept C154945302 @default.
- W4379256047 hasConcept C15744967 @default.
- W4379256047 hasConcept C170858558 @default.
- W4379256047 hasConcept C181204326 @default.
- W4379256047 hasConcept C184898388 @default.
- W4379256047 hasConcept C195324797 @default.
- W4379256047 hasConcept C204321447 @default.
- W4379256047 hasConcept C26517878 @default.
- W4379256047 hasConcept C2776187449 @default.
- W4379256047 hasConcept C2778112365 @default.
- W4379256047 hasConcept C2781249084 @default.
- W4379256047 hasConcept C33923547 @default.
- W4379256047 hasConcept C38652104 @default.
- W4379256047 hasConcept C40506919 @default.
- W4379256047 hasConcept C41008148 @default.
- W4379256047 hasConcept C48145219 @default.
- W4379256047 hasConcept C54355233 @default.
- W4379256047 hasConcept C77805123 @default.
- W4379256047 hasConcept C86803240 @default.
- W4379256047 hasConcept C98045186 @default.
- W4379256047 hasConceptScore W4379256047C105795698 @default.
- W4379256047 hasConceptScore W4379256047C111919701 @default.
- W4379256047 hasConceptScore W4379256047C119857082 @default.
- W4379256047 hasConceptScore W4379256047C126388530 @default.
- W4379256047 hasConceptScore W4379256047C154945302 @default.
- W4379256047 hasConceptScore W4379256047C15744967 @default.
- W4379256047 hasConceptScore W4379256047C170858558 @default.
- W4379256047 hasConceptScore W4379256047C181204326 @default.
- W4379256047 hasConceptScore W4379256047C184898388 @default.
- W4379256047 hasConceptScore W4379256047C195324797 @default.
- W4379256047 hasConceptScore W4379256047C204321447 @default.
- W4379256047 hasConceptScore W4379256047C26517878 @default.
- W4379256047 hasConceptScore W4379256047C2776187449 @default.
- W4379256047 hasConceptScore W4379256047C2778112365 @default.
- W4379256047 hasConceptScore W4379256047C2781249084 @default.
- W4379256047 hasConceptScore W4379256047C33923547 @default.
- W4379256047 hasConceptScore W4379256047C38652104 @default.
- W4379256047 hasConceptScore W4379256047C40506919 @default.
- W4379256047 hasConceptScore W4379256047C41008148 @default.
- W4379256047 hasConceptScore W4379256047C48145219 @default.
- W4379256047 hasConceptScore W4379256047C54355233 @default.
- W4379256047 hasConceptScore W4379256047C77805123 @default.
- W4379256047 hasConceptScore W4379256047C86803240 @default.
- W4379256047 hasConceptScore W4379256047C98045186 @default.
- W4379256047 hasLocation W43792560471 @default.
- W4379256047 hasOpenAccess W4379256047 @default.
- W4379256047 hasPrimaryLocation W43792560471 @default.
- W4379256047 hasRelatedWork W1495108544 @default.
- W4379256047 hasRelatedWork W1517524280 @default.
- W4379256047 hasRelatedWork W2091301346 @default.
- W4379256047 hasRelatedWork W2150160875 @default.
- W4379256047 hasRelatedWork W2366403280 @default.
- W4379256047 hasRelatedWork W3111372071 @default.
- W4379256047 hasRelatedWork W3148229873 @default.
- W4379256047 hasRelatedWork W4242223894 @default.
- W4379256047 hasRelatedWork W4306886878 @default.
- W4379256047 hasRelatedWork W4323520239 @default.
- W4379256047 isParatext "false" @default.
- W4379256047 isRetracted "false" @default.
- W4379256047 workType "article" @default.