Matches in SemOpenAlex for { <https://semopenalex.org/work/W3166144525> ?p ?o ?g. }
Showing items 1 to 86 of 86 with 100 items per page.
- W3166144525 abstract "Contrastive learning has been applied successfully to learn vector representations of text. Previous research demonstrated that learning high-quality representations benefits from batch-wise contrastive loss with a large number of negatives. In practice, the technique of in-batch negative is used, where for each example in a batch, other batch examples' positives will be taken as its negatives, avoiding encoding extra negatives. This, however, still conditions each example's loss on all batch examples and requires fitting the entire large batch into GPU memory. This paper introduces a gradient caching technique that decouples backpropagation between contrastive loss and the encoder, removing encoder backward pass data dependency along the batch dimension. As a result, gradients can be computed for one subset of the batch at a time, leading to almost constant memory usage." @default.
- W3166144525 created "2021-06-22" @default.
- W3166144525 creator A5009879041 @default.
- W3166144525 creator A5019539533 @default.
- W3166144525 creator A5060885692 @default.
- W3166144525 creator A5086509214 @default.
- W3166144525 date "2021-01-18" @default.
- W3166144525 modified "2023-09-24" @default.
- W3166144525 title "Scaling Deep Contrastive Learning Batch Size under Memory Limited Setup" @default.
- W3166144525 cites W2120861206 @default.
- W3166144525 cites W2153579005 @default.
- W3166144525 cites W2338908902 @default.
- W3166144525 cites W2891927334 @default.
- W3166144525 cites W2912924812 @default.
- W3166144525 cites W2951434086 @default.
- W3166144525 cites W2963341956 @default.
- W3166144525 cites W2963684275 @default.
- W3166144525 cites W2996064239 @default.
- W3166144525 cites W3005680577 @default.
- W3166144525 cites W3033406728 @default.
- W3166144525 cites W3092683697 @default.
- W3166144525 cites W3099700870 @default.
- W3166144525 cites W3100345210 @default.
- W3166144525 cites W3115295967 @default.
- W3166144525 cites W3129831491 @default.
- W3166144525 doi "https://doi.org/10.48550/arxiv.2101.06983" @default.
- W3166144525 hasPublicationYear "2021" @default.
- W3166144525 type Work @default.
- W3166144525 sameAs 3166144525 @default.
- W3166144525 citedByCount "1" @default.
- W3166144525 countsByYear W31661445252021 @default.
- W3166144525 crossrefType "posted-content" @default.
- W3166144525 hasAuthorship W3166144525A5009879041 @default.
- W3166144525 hasAuthorship W3166144525A5019539533 @default.
- W3166144525 hasAuthorship W3166144525A5060885692 @default.
- W3166144525 hasAuthorship W3166144525A5086509214 @default.
- W3166144525 hasBestOaLocation W31661445251 @default.
- W3166144525 hasConcept C111919701 @default.
- W3166144525 hasConcept C11413529 @default.
- W3166144525 hasConcept C118505674 @default.
- W3166144525 hasConcept C125411270 @default.
- W3166144525 hasConcept C153180895 @default.
- W3166144525 hasConcept C154945302 @default.
- W3166144525 hasConcept C172658912 @default.
- W3166144525 hasConcept C19768560 @default.
- W3166144525 hasConcept C199360897 @default.
- W3166144525 hasConcept C202444582 @default.
- W3166144525 hasConcept C2524010 @default.
- W3166144525 hasConcept C2777027219 @default.
- W3166144525 hasConcept C33676613 @default.
- W3166144525 hasConcept C33923547 @default.
- W3166144525 hasConcept C41008148 @default.
- W3166144525 hasConcept C99844830 @default.
- W3166144525 hasConceptScore W3166144525C111919701 @default.
- W3166144525 hasConceptScore W3166144525C11413529 @default.
- W3166144525 hasConceptScore W3166144525C118505674 @default.
- W3166144525 hasConceptScore W3166144525C125411270 @default.
- W3166144525 hasConceptScore W3166144525C153180895 @default.
- W3166144525 hasConceptScore W3166144525C154945302 @default.
- W3166144525 hasConceptScore W3166144525C172658912 @default.
- W3166144525 hasConceptScore W3166144525C19768560 @default.
- W3166144525 hasConceptScore W3166144525C199360897 @default.
- W3166144525 hasConceptScore W3166144525C202444582 @default.
- W3166144525 hasConceptScore W3166144525C2524010 @default.
- W3166144525 hasConceptScore W3166144525C2777027219 @default.
- W3166144525 hasConceptScore W3166144525C33676613 @default.
- W3166144525 hasConceptScore W3166144525C33923547 @default.
- W3166144525 hasConceptScore W3166144525C41008148 @default.
- W3166144525 hasConceptScore W3166144525C99844830 @default.
- W3166144525 hasLocation W31661445251 @default.
- W3166144525 hasOpenAccess W3166144525 @default.
- W3166144525 hasPrimaryLocation W31661445251 @default.
- W3166144525 hasRelatedWork W1516526594 @default.
- W3166144525 hasRelatedWork W2080795652 @default.
- W3166144525 hasRelatedWork W2099017514 @default.
- W3166144525 hasRelatedWork W2103071823 @default.
- W3166144525 hasRelatedWork W2171183738 @default.
- W3166144525 hasRelatedWork W2353534548 @default.
- W3166144525 hasRelatedWork W2355476020 @default.
- W3166144525 hasRelatedWork W2364375978 @default.
- W3166144525 hasRelatedWork W2368072106 @default.
- W3166144525 hasRelatedWork W2547835662 @default.
- W3166144525 isParatext "false" @default.
- W3166144525 isRetracted "false" @default.
- W3166144525 magId "3166144525" @default.
- W3166144525 workType "article" @default.
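
The gradient caching idea summarized in the abstract above can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single shared linear encoder `W`, hand-derived gradients in pure Python, and a softmax contrastive loss in which each query's positive is the passage at the same index and all other in-batch passages act as negatives. The names `encode`, `loss_and_rep_grads`, and `grad_cache_step` are hypothetical and chosen only for this example.

```python
import math

def encode(W, xs):
    """Hypothetical linear encoder: one representation W @ x per input x."""
    return [[sum(w * v for w, v in zip(row, x)) for row in W] for x in xs]

def loss_and_rep_grads(Q, P):
    """Contrastive loss with in-batch negatives, plus its gradients
    with respect to the representations (not the encoder weights)."""
    n, d = len(Q), len(Q[0])
    gQ = [[0.0] * d for _ in range(n)]
    gP = [[0.0] * d for _ in range(n)]
    loss = 0.0
    for i in range(n):
        scores = [sum(a * b for a, b in zip(Q[i], P[j])) for j in range(n)]
        m = max(scores)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in scores]
        z = sum(exps)
        loss -= math.log(exps[i] / z)
        for j in range(n):
            # d loss / d score_ij = softmax_j - 1[i == j]
            c = exps[j] / z - (1.0 if i == j else 0.0)
            for k in range(d):
                gQ[i][k] += c * P[j][k]
                gP[j][k] += c * Q[i][k]
    return loss, gQ, gP

def grad_cache_step(W, xs, ys, chunk):
    """Return (loss, dloss/dW), touching the encoder one chunk at a time."""
    # Step 1: forward pass per chunk, keeping only the representations.
    # In an autograd framework this pass would run without building a graph.
    Q, P = [], []
    for s in range(0, len(xs), chunk):
        Q += encode(W, xs[s:s + chunk])
        P += encode(W, ys[s:s + chunk])
    # Step 2: full-batch loss on the representations; cache their gradients.
    loss, gQ, gP = loss_and_rep_grads(Q, P)
    # Step 3: per chunk, push the cached gradients through the encoder and
    # accumulate. For rep = W @ x, d rep[r] / d W[r][c] = x[c].
    gW = [[0.0] * len(W[0]) for _ in W]
    for s in range(0, len(xs), chunk):
        for inputs, grads in ((xs, gQ), (ys, gP)):
            for x, g in zip(inputs[s:s + chunk], grads[s:s + chunk]):
                for r in range(len(W)):
                    for c in range(len(W[0])):
                        gW[r][c] += g[r] * x[c]
    return loss, gW
```

Under these assumptions the accumulated weight gradient does not depend on the chunk size, which is the property that lets the encoder backward pass run on one sub-batch at a time. With a real neural encoder, the memory saving comes from storing activations for only one chunk during step 3; the toy linear encoder here has no activations to store, so it shows only the arithmetic.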