Matches in SemOpenAlex for { <https://semopenalex.org/work/W4288072953> ?p ?o ?g. }
Showing items 1 to 98 of 98 with 100 items per page.
- W4288072953 abstract "Designing pre-training objectives that more closely resemble the downstream tasks for pre-trained language models can lead to better performance at the fine-tuning stage, especially in the ad-hoc retrieval area. Existing pre-training approaches tailored for IR tried to incorporate weak supervised signals, such as query-likelihood based sampling, to construct pseudo query-document pairs from the raw textual corpus. However, these signals rely heavily on the sampling method. For example, the query likelihood model may lead to much noise in the constructed pre-training data. († This work was done during an internship at Huawei.) In this paper, we propose to leverage the large-scale hyperlinks and anchor texts to pre-train the language model for ad-hoc retrieval. Since the anchor texts are created by webmasters and can usually summarize the target document, it can help to build more accurate and reliable pre-training samples than a specific algorithm. Considering different views of the downstream ad-hoc retrieval, we devise four pre-training tasks based on the hyperlinks. We then pre-train the Transformer model to predict the pair-wise preference, jointly with the Masked Language Model objective. Experimental results on two large-scale ad-hoc retrieval datasets show the significant improvement of our model compared with the existing methods." @default.
- W4288072953 created "2022-07-28" @default.
- W4288072953 creator A5000839824 @default.
- W4288072953 creator A5002846606 @default.
- W4288072953 creator A5010558184 @default.
- W4288072953 creator A5025631695 @default.
- W4288072953 creator A5026098526 @default.
- W4288072953 creator A5044171766 @default.
- W4288072953 creator A5046173951 @default.
- W4288072953 date "2021-10-26" @default.
- W4288072953 modified "2023-10-14" @default.
- W4288072953 title "Pre-training for Ad-hoc Retrieval" @default.
- W4288072953 cites W1489893579 @default.
- W4288072953 cites W1755289444 @default.
- W4288072953 cites W1863141190 @default.
- W4288072953 cites W1973430024 @default.
- W4288072953 cites W1982889956 @default.
- W4288072953 cites W2014415866 @default.
- W4288072953 cites W2025356973 @default.
- W4288072953 cites W2078560217 @default.
- W4288072953 cites W2080825533 @default.
- W4288072953 cites W2089098674 @default.
- W4288072953 cites W2100176907 @default.
- W4288072953 cites W2111713978 @default.
- W4288072953 cites W2116930689 @default.
- W4288072953 cites W2119119231 @default.
- W4288072953 cites W2136542423 @default.
- W4288072953 cites W2144578941 @default.
- W4288072953 cites W2159665776 @default.
- W4288072953 cites W2171161922 @default.
- W4288072953 cites W2171710828 @default.
- W4288072953 cites W2270070752 @default.
- W4288072953 cites W2536015822 @default.
- W4288072953 cites W2539671052 @default.
- W4288072953 cites W2610935556 @default.
- W4288072953 cites W2648699835 @default.
- W4288072953 cites W2945127593 @default.
- W4288072953 cites W2951434086 @default.
- W4288072953 cites W2953356739 @default.
- W4288072953 cites W2962739339 @default.
- W4288072953 cites W2979826702 @default.
- W4288072953 cites W2998007616 @default.
- W4288072953 cites W3034751553 @default.
- W4288072953 cites W3035089734 @default.
- W4288072953 cites W3093709503 @default.
- W4288072953 cites W3098417575 @default.
- W4288072953 cites W3115195983 @default.
- W4288072953 cites W3144312515 @default.
- W4288072953 cites W3147292006 @default.
- W4288072953 cites W3152562554 @default.
- W4288072953 cites W3153912254 @default.
- W4288072953 cites W3155865710 @default.
- W4288072953 doi "https://doi.org/10.1145/3459637.3482286" @default.
- W4288072953 hasPublicationYear "2021" @default.
- W4288072953 type Work @default.
- W4288072953 citedByCount "10" @default.
- W4288072953 countsByYear W42880729532022 @default.
- W4288072953 countsByYear W42880729532023 @default.
- W4288072953 crossrefType "proceedings-article" @default.
- W4288072953 hasAuthorship W4288072953A5000839824 @default.
- W4288072953 hasAuthorship W4288072953A5002846606 @default.
- W4288072953 hasAuthorship W4288072953A5010558184 @default.
- W4288072953 hasAuthorship W4288072953A5025631695 @default.
- W4288072953 hasAuthorship W4288072953A5026098526 @default.
- W4288072953 hasAuthorship W4288072953A5044171766 @default.
- W4288072953 hasAuthorship W4288072953A5046173951 @default.
- W4288072953 hasBestOaLocation W42880729532 @default.
- W4288072953 hasConcept C137293760 @default.
- W4288072953 hasConcept C153083717 @default.
- W4288072953 hasConcept C154945302 @default.
- W4288072953 hasConcept C204321447 @default.
- W4288072953 hasConcept C23123220 @default.
- W4288072953 hasConcept C41008148 @default.
- W4288072953 hasConceptScore W4288072953C137293760 @default.
- W4288072953 hasConceptScore W4288072953C153083717 @default.
- W4288072953 hasConceptScore W4288072953C154945302 @default.
- W4288072953 hasConceptScore W4288072953C204321447 @default.
- W4288072953 hasConceptScore W4288072953C23123220 @default.
- W4288072953 hasConceptScore W4288072953C41008148 @default.
- W4288072953 hasFunder F4320321001 @default.
- W4288072953 hasLocation W42880729531 @default.
- W4288072953 hasLocation W42880729532 @default.
- W4288072953 hasLocation W42880729533 @default.
- W4288072953 hasOpenAccess W4288072953 @default.
- W4288072953 hasPrimaryLocation W42880729531 @default.
- W4288072953 hasRelatedWork W1989705153 @default.
- W4288072953 hasRelatedWork W2086064646 @default.
- W4288072953 hasRelatedWork W2115485936 @default.
- W4288072953 hasRelatedWork W2119135658 @default.
- W4288072953 hasRelatedWork W2153015554 @default.
- W4288072953 hasRelatedWork W2293457016 @default.
- W4288072953 hasRelatedWork W2357241418 @default.
- W4288072953 hasRelatedWork W2359001871 @default.
- W4288072953 hasRelatedWork W2496228846 @default.
- W4288072953 hasRelatedWork W2789919619 @default.
- W4288072953 isParatext "false" @default.
- W4288072953 isRetracted "false" @default.
- W4288072953 workType "article" @default.