Matches in SemOpenAlex for { <https://semopenalex.org/work/W2891543607> ?p ?o ?g. }
- W2891543607 abstract "Multi-step-ahead prediction in language models is challenging because of a discrepancy between training and testing. At test time, a language model must make predictions given its own past predictions as input, instead of the past targets that are provided during training. This difference, known as exposure bias, can lead to errors compounding along a generated sequence at test time. To improve generalization in neural language models and address compounding errors, we propose a curriculum-learning-based method that gradually changes an initially deterministic teacher policy into an increasingly stochastic one, which we refer to as Nearest-Neighbor Replacement Sampling. A chosen input at a given timestep is replaced with a sampled nearest neighbor of the past target, with a truncated probability proportional to the cosine similarity between the original word and its top $k$ most similar words. This allows the teacher to explore alternatives when the teacher provides a sub-optimal policy or when the initial policy is difficult for the learner to model. The proposed strategy is straightforward, online, and requires little additional memory. We report our main findings on two language modelling benchmarks and find that the proposed approach performs particularly well when used in conjunction with scheduled sampling, which also attempts to mitigate compounding errors in language models." @default.
- W2891543607 created "2018-09-27" @default.
- W2891543607 creator A5073503574 @default.
- W2891543607 creator A5083742715 @default.
- W2891543607 date "2018-09-16" @default.
- W2891543607 modified "2023-10-17" @default.
- W2891543607 title "Curriculum-Based Neighborhood Sampling For Sequence Prediction." @default.
- W2891543607 cites W112666333 @default.
- W2891543607 cites W179875071 @default.
- W2891543607 cites W1821462560 @default.
- W2891543607 cites W2100677568 @default.
- W2891543607 cites W2119717200 @default.
- W2891543607 cites W2152588577 @default.
- W2891543607 cites W2174424190 @default.
- W2891543607 cites W2176263492 @default.
- W2891543607 cites W2487501366 @default.
- W2891543607 cites W2508728158 @default.
- W2891543607 cites W2578330760 @default.
- W2891543607 cites W2593634001 @default.
- W2891543607 cites W2785523195 @default.
- W2891543607 cites W2949888546 @default.
- W2891543607 cites W2950304420 @default.
- W2891543607 cites W2952840881 @default.
- W2891543607 cites W2962957031 @default.
- W2891543607 hasPublicationYear "2018" @default.
- W2891543607 type Work @default.
- W2891543607 sameAs 2891543607 @default.
- W2891543607 citedByCount "1" @default.
- W2891543607 countsByYear W28915436072018 @default.
- W2891543607 crossrefType "posted-content" @default.
- W2891543607 hasAuthorship W2891543607A5073503574 @default.
- W2891543607 hasAuthorship W2891543607A5083742715 @default.
- W2891543607 hasConcept C103278499 @default.
- W2891543607 hasConcept C106131492 @default.
- W2891543607 hasConcept C113238511 @default.
- W2891543607 hasConcept C115961682 @default.
- W2891543607 hasConcept C119857082 @default.
- W2891543607 hasConcept C134306372 @default.
- W2891543607 hasConcept C137293760 @default.
- W2891543607 hasConcept C140779682 @default.
- W2891543607 hasConcept C154945302 @default.
- W2891543607 hasConcept C162324750 @default.
- W2891543607 hasConcept C177148314 @default.
- W2891543607 hasConcept C185592680 @default.
- W2891543607 hasConcept C187736073 @default.
- W2891543607 hasConcept C198531522 @default.
- W2891543607 hasConcept C2524010 @default.
- W2891543607 hasConcept C2778112365 @default.
- W2891543607 hasConcept C2780451532 @default.
- W2891543607 hasConcept C31972630 @default.
- W2891543607 hasConcept C33923547 @default.
- W2891543607 hasConcept C41008148 @default.
- W2891543607 hasConcept C43617362 @default.
- W2891543607 hasConcept C54355233 @default.
- W2891543607 hasConcept C86803240 @default.
- W2891543607 hasConcept C90805587 @default.
- W2891543607 hasConceptScore W2891543607C103278499 @default.
- W2891543607 hasConceptScore W2891543607C106131492 @default.
- W2891543607 hasConceptScore W2891543607C113238511 @default.
- W2891543607 hasConceptScore W2891543607C115961682 @default.
- W2891543607 hasConceptScore W2891543607C119857082 @default.
- W2891543607 hasConceptScore W2891543607C134306372 @default.
- W2891543607 hasConceptScore W2891543607C137293760 @default.
- W2891543607 hasConceptScore W2891543607C140779682 @default.
- W2891543607 hasConceptScore W2891543607C154945302 @default.
- W2891543607 hasConceptScore W2891543607C162324750 @default.
- W2891543607 hasConceptScore W2891543607C177148314 @default.
- W2891543607 hasConceptScore W2891543607C185592680 @default.
- W2891543607 hasConceptScore W2891543607C187736073 @default.
- W2891543607 hasConceptScore W2891543607C198531522 @default.
- W2891543607 hasConceptScore W2891543607C2524010 @default.
- W2891543607 hasConceptScore W2891543607C2778112365 @default.
- W2891543607 hasConceptScore W2891543607C2780451532 @default.
- W2891543607 hasConceptScore W2891543607C31972630 @default.
- W2891543607 hasConceptScore W2891543607C33923547 @default.
- W2891543607 hasConceptScore W2891543607C41008148 @default.
- W2891543607 hasConceptScore W2891543607C43617362 @default.
- W2891543607 hasConceptScore W2891543607C54355233 @default.
- W2891543607 hasConceptScore W2891543607C86803240 @default.
- W2891543607 hasConceptScore W2891543607C90805587 @default.
- W2891543607 hasLocation W28915436071 @default.
- W2891543607 hasOpenAccess W2891543607 @default.
- W2891543607 hasPrimaryLocation W28915436071 @default.
- W2891543607 hasRelatedWork W1521579603 @default.
- W2891543607 hasRelatedWork W2140407154 @default.
- W2891543607 hasRelatedWork W2151924253 @default.
- W2891543607 hasRelatedWork W2236209865 @default.
- W2891543607 hasRelatedWork W2484734762 @default.
- W2891543607 hasRelatedWork W2553246593 @default.
- W2891543607 hasRelatedWork W2795900505 @default.
- W2891543607 hasRelatedWork W2828541159 @default.
- W2891543607 hasRelatedWork W2888738308 @default.
- W2891543607 hasRelatedWork W2906094846 @default.
- W2891543607 hasRelatedWork W2965018299 @default.
- W2891543607 hasRelatedWork W2970789589 @default.
- W2891543607 hasRelatedWork W2972449768 @default.
- W2891543607 hasRelatedWork W2991952982 @default.
- W2891543607 hasRelatedWork W3011423783 @default.
- W2891543607 hasRelatedWork W3046140942 @default.
- W2891543607 hasRelatedWork W3115278650 @default.
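
The replacement step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. The linear schedule, the `max_rate` value, the clipping of negative similarities to zero, and all function names are assumptions made for illustration. The abstract specifies only that a past target is swapped for a sampled neighbor, with a probability proportional to cosine similarity and truncated to the top $k$ words, and that the teacher starts deterministic and becomes more stochastic.

```python
import math
import random


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_neighbors(word, embeddings, k):
    """Return the k words most similar to `word`, paired with their similarity."""
    scored = [(other, cosine(embeddings[word], vec))
              for other, vec in embeddings.items() if other != word]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


def replacement_rate(step, total_steps, max_rate=0.5):
    """Assumed linear curriculum: 0 (deterministic teacher) rising to max_rate."""
    return max_rate * min(1.0, step / max(1, total_steps))


def replace_with_neighbors(targets, embeddings, k, rate, rng):
    """Swap each past target for a sampled top-k neighbor with probability `rate`."""
    inputs = []
    for word in targets:
        neighbors = top_k_neighbors(word, embeddings, k)
        # Assumption: weights are the cosine similarities, clipped at zero,
        # and sampling is restricted (truncated) to the top k neighbors.
        weights = [max(sim, 0.0) for _, sim in neighbors]
        if rng.random() < rate and sum(weights) > 0:
            candidates = [w for w, _ in neighbors]
            inputs.append(rng.choices(candidates, weights=weights, k=1)[0])
        else:
            inputs.append(word)
    return inputs
```

With a rate of 0 the function returns the targets unchanged, which corresponds to ordinary teacher forcing. Raising the rate over training steps via `replacement_rate` gives the gradual shift toward a stochastic teacher.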