Matches in SemOpenAlex for { <https://semopenalex.org/work/W4324016655> ?p ?o ?g. }
Showing items 1 to 53 of 53, with 100 items per page.
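
For reference, a query of the shape shown in the header can be issued directly against the public SemOpenAlex SPARQL endpoint to reproduce this listing. The endpoint URL below is an assumption and should be checked against the SemOpenAlex documentation; this is a minimal sketch of the match pattern, not the exact query the browser page runs:

```sparql
# Minimal sketch: list every triple about this work, with its graph,
# matching the pattern { <.../work/W4324016655> ?p ?o ?g } above.
# Endpoint assumed: https://semopenalex.org/sparql
SELECT ?p ?o ?g
WHERE {
  GRAPH ?g {
    <https://semopenalex.org/work/W4324016655> ?p ?o .
  }
}
LIMIT 100
```
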
- W4324016655 abstract "Recent research demonstrates the effectiveness of using fine-tuned language models (LM) for dense retrieval. However, dense retrievers are hard to train, typically requiring heavily engineered fine-tuning pipelines to realize their full potential. In this paper, we identify and address two underlying problems of dense retrievers: i) fragility to training data noise and ii) requiring large batches to robustly learn the embedding space. We use the recently proposed Condenser pre-training architecture, which learns to condense information into the dense vector through LM pre-training. On top of it, we propose coCondenser, which adds an unsupervised corpus-level contrastive loss to warm up the passage embedding space. Retrieval experiments on MS-MARCO, Natural Question, and Trivia QA datasets show that coCondenser removes the need for heavy data engineering such as augmentation, synthesis, or filtering, as well as the need for large batch training. It shows comparable performance to RocketQA, a state-of-the-art, heavily engineered system, using simple small batch fine-tuning." @default.
- W4324016655 created "2023-03-14" @default.
- W4324016655 creator A5009879041 @default.
- W4324016655 creator A5060885692 @default.
- W4324016655 date "2021-08-12" @default.
- W4324016655 modified "2023-09-27" @default.
- W4324016655 title "Unsupervised Corpus Aware Language Model Pre-training for Dense Passage Retrieval" @default.
- W4324016655 doi "https://doi.org/10.48550/arxiv.2108.05540" @default.
- W4324016655 hasPublicationYear "2021" @default.
- W4324016655 type Work @default.
- W4324016655 citedByCount "0" @default.
- W4324016655 crossrefType "posted-content" @default.
- W4324016655 hasAuthorship W4324016655A5009879041 @default.
- W4324016655 hasAuthorship W4324016655A5060885692 @default.
- W4324016655 hasBestOaLocation W43240166551 @default.
- W4324016655 hasConcept C111919701 @default.
- W4324016655 hasConcept C119857082 @default.
- W4324016655 hasConcept C121332964 @default.
- W4324016655 hasConcept C137293760 @default.
- W4324016655 hasConcept C153294291 @default.
- W4324016655 hasConcept C154945302 @default.
- W4324016655 hasConcept C204321447 @default.
- W4324016655 hasConcept C2777211547 @default.
- W4324016655 hasConcept C2778572836 @default.
- W4324016655 hasConcept C41008148 @default.
- W4324016655 hasConcept C41608201 @default.
- W4324016655 hasConceptScore W4324016655C111919701 @default.
- W4324016655 hasConceptScore W4324016655C119857082 @default.
- W4324016655 hasConceptScore W4324016655C121332964 @default.
- W4324016655 hasConceptScore W4324016655C137293760 @default.
- W4324016655 hasConceptScore W4324016655C153294291 @default.
- W4324016655 hasConceptScore W4324016655C154945302 @default.
- W4324016655 hasConceptScore W4324016655C204321447 @default.
- W4324016655 hasConceptScore W4324016655C2777211547 @default.
- W4324016655 hasConceptScore W4324016655C2778572836 @default.
- W4324016655 hasConceptScore W4324016655C41008148 @default.
- W4324016655 hasConceptScore W4324016655C41608201 @default.
- W4324016655 hasLocation W43240166551 @default.
- W4324016655 hasOpenAccess W4324016655 @default.
- W4324016655 hasPrimaryLocation W43240166551 @default.
- W4324016655 hasRelatedWork W142374489 @default.
- W4324016655 hasRelatedWork W1538473846 @default.
- W4324016655 hasRelatedWork W1563618553 @default.
- W4324016655 hasRelatedWork W1569841287 @default.
- W4324016655 hasRelatedWork W1803932089 @default.
- W4324016655 hasRelatedWork W2148757832 @default.
- W4324016655 hasRelatedWork W2351428524 @default.
- W4324016655 hasRelatedWork W2359001871 @default.
- W4324016655 hasRelatedWork W3107474891 @default.
- W4324016655 hasRelatedWork W61293283 @default.
- W4324016655 isParatext "false" @default.
- W4324016655 isRetracted "false" @default.
- W4324016655 workType "article" @default.
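
The hasRelatedWork triples above reference other works by ID only. As a follow-up sketch, they could be resolved to titles with a second query. The soa: namespace for hasRelatedWork and the use of dcterms:title for work titles are assumptions inferred from this listing; both should be verified against the SemOpenAlex ontology:

```sparql
# Sketch: resolve the related works of W4324016655 to their titles.
# The soa: prefix and dcterms:title are assumptions, not confirmed
# ontology details; OPTIONAL keeps rows whose title predicate differs.
PREFIX soa: <https://semopenalex.org/ontology/>
PREFIX dcterms: <http://purl.org/dc/terms/>

SELECT ?related ?title
WHERE {
  <https://semopenalex.org/work/W4324016655> soa:hasRelatedWork ?related .
  OPTIONAL { ?related dcterms:title ?title }
}
```
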