Matches in SemOpenAlex for { <https://semopenalex.org/work/W3212686290> ?p ?o ?g. }
- W3212686290 abstract "Existing unsupervised document hashing methods are mostly built on generative models. Because long dependency structures are difficult to capture, these methods rarely model the raw documents directly and instead model features extracted from them (e.g., bag-of-words (BOW), TFIDF). In this paper, we propose to learn hash codes from BERT embeddings, motivated by their tremendous success on downstream tasks. As a first attempt, we modify existing generative hashing models to accommodate the BERT embeddings. However, little improvement is observed over the codes learned from the old BOW or TFIDF features. We attribute this to the reconstruction requirement in generative hashing, which forces irrelevant information that is abundant in the BERT embeddings to also be compressed into the codes. To remedy this issue, a new unsupervised hashing paradigm is further proposed based on the mutual information (MI) maximization principle. Specifically, the method first constructs appropriate global and local codes from the documents and then seeks to maximize their mutual information. Experimental results on three benchmark datasets demonstrate that the proposed method is able to generate hash codes that outperform existing ones learned from BOW features by a substantial margin." @default.
- W3212686290 created "2021-11-22" @default.
- W3212686290 creator A5021091982 @default.
- W3212686290 creator A5043729465 @default.
- W3212686290 creator A5044777170 @default.
- W3212686290 creator A5051649145 @default.
- W3212686290 creator A5059804502 @default.
- W3212686290 creator A5064662358 @default.
- W3212686290 date "2021-09-07" @default.
- W3212686290 modified "2023-10-17" @default.
- W3212686290 title "Refining BERT Embeddings for Document Hashing via Mutual Information Maximization" @default.
- W3212686290 cites W1552847225 @default.
- W3212686290 cites W1832693441 @default.
- W3212686290 cites W1909320841 @default.
- W3212686290 cites W2118509786 @default.
- W3212686290 cites W2158899491 @default.
- W3212686290 cites W2187089797 @default.
- W3212686290 cites W2242818861 @default.
- W3212686290 cites W2250539671 @default.
- W3212686290 cites W2547875792 @default.
- W3212686290 cites W2740797857 @default.
- W3212686290 cites W2803832867 @default.
- W3212686290 cites W2842511635 @default.
- W3212686290 cites W2906971874 @default.
- W3212686290 cites W2913932916 @default.
- W3212686290 cites W2951873722 @default.
- W3212686290 cites W2962919781 @default.
- W3212686290 cites W2963012544 @default.
- W3212686290 cites W2963341956 @default.
- W3212686290 cites W2963403868 @default.
- W3212686290 cites W2963782635 @default.
- W3212686290 cites W2963800509 @default.
- W3212686290 cites W2964121744 @default.
- W3212686290 cites W2965373594 @default.
- W3212686290 cites W2994710732 @default.
- W3212686290 cites W3034541796 @default.
- W3212686290 cites W3035452479 @default.
- W3212686290 cites W3099146195 @default.
- W3212686290 cites W3105816068 @default.
- W3212686290 cites W3162493078 @default.
- W3212686290 cites W3165138014 @default.
- W3212686290 doi "https://doi.org/10.48550/arxiv.2109.02867" @default.
- W3212686290 hasPublicationYear "2021" @default.
- W3212686290 type Work @default.
- W3212686290 sameAs 3212686290 @default.
- W3212686290 citedByCount "0" @default.
- W3212686290 crossrefType "posted-content" @default.
- W3212686290 hasAuthorship W3212686290A5021091982 @default.
- W3212686290 hasAuthorship W3212686290A5043729465 @default.
- W3212686290 hasAuthorship W3212686290A5044777170 @default.
- W3212686290 hasAuthorship W3212686290A5051649145 @default.
- W3212686290 hasAuthorship W3212686290A5059804502 @default.
- W3212686290 hasAuthorship W3212686290A5064662358 @default.
- W3212686290 hasBestOaLocation W32126862901 @default.
- W3212686290 hasConcept C119857082 @default.
- W3212686290 hasConcept C124101348 @default.
- W3212686290 hasConcept C126255220 @default.
- W3212686290 hasConcept C13280743 @default.
- W3212686290 hasConcept C152139883 @default.
- W3212686290 hasConcept C154945302 @default.
- W3212686290 hasConcept C167966045 @default.
- W3212686290 hasConcept C185798385 @default.
- W3212686290 hasConcept C205649164 @default.
- W3212686290 hasConcept C2776330181 @default.
- W3212686290 hasConcept C33923547 @default.
- W3212686290 hasConcept C38652104 @default.
- W3212686290 hasConcept C39890363 @default.
- W3212686290 hasConcept C41008148 @default.
- W3212686290 hasConcept C80444323 @default.
- W3212686290 hasConcept C99138194 @default.
- W3212686290 hasConceptScore W3212686290C119857082 @default.
- W3212686290 hasConceptScore W3212686290C124101348 @default.
- W3212686290 hasConceptScore W3212686290C126255220 @default.
- W3212686290 hasConceptScore W3212686290C13280743 @default.
- W3212686290 hasConceptScore W3212686290C152139883 @default.
- W3212686290 hasConceptScore W3212686290C154945302 @default.
- W3212686290 hasConceptScore W3212686290C167966045 @default.
- W3212686290 hasConceptScore W3212686290C185798385 @default.
- W3212686290 hasConceptScore W3212686290C205649164 @default.
- W3212686290 hasConceptScore W3212686290C2776330181 @default.
- W3212686290 hasConceptScore W3212686290C33923547 @default.
- W3212686290 hasConceptScore W3212686290C38652104 @default.
- W3212686290 hasConceptScore W3212686290C39890363 @default.
- W3212686290 hasConceptScore W3212686290C41008148 @default.
- W3212686290 hasConceptScore W3212686290C80444323 @default.
- W3212686290 hasConceptScore W3212686290C99138194 @default.
- W3212686290 hasLocation W32126862901 @default.
- W3212686290 hasOpenAccess W3212686290 @default.
- W3212686290 hasPrimaryLocation W32126862901 @default.
- W3212686290 hasRelatedWork W1534961803 @default.
- W3212686290 hasRelatedWork W2573350151 @default.
- W3212686290 hasRelatedWork W2785532149 @default.
- W3212686290 hasRelatedWork W2810557583 @default.
- W3212686290 hasRelatedWork W2888836568 @default.
- W3212686290 hasRelatedWork W2953081648 @default.
- W3212686290 hasRelatedWork W2994891734 @default.
- W3212686290 hasRelatedWork W2999091639 @default.
- W3212686290 hasRelatedWork W4206213633 @default.
- W3212686290 hasRelatedWork W2310403681 @default.
- W3212686290 isParatext "false" @default.
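
The abstract in this record refers to document "hash codes" used for retrieval. As a rough illustration of that idea only, the sketch below binarizes small real-valued vectors by their sign and compares the resulting codes by Hamming distance. It is not the paper's method: the mutual information objective and the BERT encoder are not shown, and the vectors, the zero threshold, and the names `binarize`, `hamming`, and `nearest` are all assumptions made for this example.

```python
from typing import Dict, List


def binarize(embedding: List[float]) -> List[int]:
    """Turn a real-valued vector into a binary code by thresholding at zero (assumed rule)."""
    return [1 if x > 0 else 0 for x in embedding]


def hamming(a: List[int], b: List[int]) -> int:
    """Count the positions at which two equal-length binary codes differ."""
    if len(a) != len(b):
        raise ValueError("codes must have the same length")
    return sum(x != y for x, y in zip(a, b))


def nearest(query: str, codes: Dict[str, List[int]]) -> str:
    """Return the name of the other document whose code is closest to the query's code."""
    candidates = [name for name in codes if name != query]
    return min(candidates, key=lambda name: hamming(codes[query], codes[name]))


# Toy 4-dimensional vectors standing in for document embeddings (made-up values).
docs = {
    "doc_a": [0.9, -0.2, 0.4, -0.7],
    "doc_b": [0.8, -0.1, 0.3, -0.9],
    "doc_c": [-0.5, 0.6, -0.3, 0.2],
}

codes = {name: binarize(vec) for name, vec in docs.items()}

if __name__ == "__main__":
    for name, code in codes.items():
        print(name, code)
    print("nearest to doc_a:", nearest("doc_a", codes))
```

Short binary codes like these make similarity search cheap, because comparing two documents reduces to counting differing bits.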