Matches in SemOpenAlex for { <https://semopenalex.org/work/W4378805087> ?p ?o ?g. }
Showing items 1 to 80 of 80, with 100 items per page.
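For reference, a result set like the one listed below can be retrieved programmatically. The sketch is a minimal example, assuming the public SemOpenAlex SPARQL endpoint at https://semopenalex.org/sparql and the third-party SPARQLWrapper package; neither is part of this listing.

```python
# Minimal sketch, assuming the SemOpenAlex SPARQL endpoint and SPARQLWrapper
# (pip install sparqlwrapper). Mirrors the { <work> ?p ?o ?g } pattern above.
from SPARQLWrapper import SPARQLWrapper, JSON

ENDPOINT = "https://semopenalex.org/sparql"  # assumed endpoint URL
WORK = "https://semopenalex.org/work/W4378805087"

query = f"""
SELECT ?p ?o ?g
WHERE {{
  GRAPH ?g {{ <{WORK}> ?p ?o . }}
}}
LIMIT 100
"""

sparql = SPARQLWrapper(ENDPOINT)
sparql.setQuery(query)
sparql.setReturnFormat(JSON)

results = sparql.query().convert()
for b in results["results"]["bindings"]:
    # Print each predicate/object pair, as in the listing below.
    print(b["p"]["value"], b["o"]["value"])
```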
- W4378805087 endingPage "12" @default.
- W4378805087 startingPage "1" @default.
- W4378805087 abstract "As a fundamental topic in bridging the gap between vision and language, cross-modal retrieval aims to establish the correspondence between fragments, i.e., subregions in images and words in texts. Whereas earlier methods focus on learning a visual-semantic embedding that maps images and sentences into a shared embedding space, more recent methods tend to learn the correspondences between words and regions via cross-modal attention. However, such attention-based approaches often suffer from semantic misalignment between subfragments for two reasons: 1) without modeling the relationship between subfragments and the semantics of the entire image or sentence, they struggle to distinguish images or sentences that contain multiple semantically identical fragments, and 2) they spread attention evenly over all subfragments, including nonvisual words and many redundant regions, which likewise leads to semantic misalignment. To solve these problems, this article proposes a bidirectional correct attention network (BCAN), which introduces a notion of relevance between subfragments and the semantics of the entire image or sentence, and designs a correct attention mechanism that models both local and global similarity between images and sentences to correct attention weights placed on the wrong fragments. Specifically, this relevance concept is used to address semantic misalignment from two aspects, through two independent units in the correct attention mechanism. The global correct unit (GCU) incorporates the global image-sentence similarity into the attention mechanism to address misalignment caused by focusing attention on relevant subfragments in irrelevant pairs (RI), while the local correct unit (LCU) considers the difference in attention weights between two attention steps to address misalignment caused by focusing attention on irrelevant subfragments in relevant pairs (IR). Extensive experiments on the large-scale MS-COCO and Flickr30K datasets show that the proposed method outperforms all attention-based methods and is competitive with the state of the art. Code and a pretrained model are publicly available at: https://github.com/liuyyy111/BCAN." @default.
  (An illustrative sketch of the attention-correction idea described in this abstract appears after the listing.)
- W4378805087 created "2023-06-01" @default.
- W4378805087 creator A5038040039 @default.
- W4378805087 creator A5046380673 @default.
- W4378805087 creator A5055846696 @default.
- W4378805087 creator A5057287134 @default.
- W4378805087 creator A5060233481 @default.
- W4378805087 date "2023-01-01" @default.
- W4378805087 modified "2023-09-23" @default.
- W4378805087 title "BCAN: Bidirectional Correct Attention Network for Cross-Modal Retrieval" @default.
- W4378805087 doi "https://doi.org/10.1109/tnnls.2023.3276796" @default.
- W4378805087 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/37256811" @default.
- W4378805087 hasPublicationYear "2023" @default.
- W4378805087 type Work @default.
- W4378805087 citedByCount "0" @default.
- W4378805087 crossrefType "journal-article" @default.
- W4378805087 hasAuthorship W4378805087A5038040039 @default.
- W4378805087 hasAuthorship W4378805087A5046380673 @default.
- W4378805087 hasAuthorship W4378805087A5055846696 @default.
- W4378805087 hasAuthorship W4378805087A5057287134 @default.
- W4378805087 hasAuthorship W4378805087A5060233481 @default.
- W4378805087 hasConcept C103278499 @default.
- W4378805087 hasConcept C115961682 @default.
- W4378805087 hasConcept C120665830 @default.
- W4378805087 hasConcept C121332964 @default.
- W4378805087 hasConcept C153180895 @default.
- W4378805087 hasConcept C154945302 @default.
- W4378805087 hasConcept C1667742 @default.
- W4378805087 hasConcept C174348530 @default.
- W4378805087 hasConcept C184337299 @default.
- W4378805087 hasConcept C185592680 @default.
- W4378805087 hasConcept C188027245 @default.
- W4378805087 hasConcept C192209626 @default.
- W4378805087 hasConcept C199360897 @default.
- W4378805087 hasConcept C204321447 @default.
- W4378805087 hasConcept C2993807640 @default.
- W4378805087 hasConcept C31258907 @default.
- W4378805087 hasConcept C41008148 @default.
- W4378805087 hasConcept C41608201 @default.
- W4378805087 hasConcept C71139939 @default.
- W4378805087 hasConcept C86034646 @default.
- W4378805087 hasConceptScore W4378805087C103278499 @default.
- W4378805087 hasConceptScore W4378805087C115961682 @default.
- W4378805087 hasConceptScore W4378805087C120665830 @default.
- W4378805087 hasConceptScore W4378805087C121332964 @default.
- W4378805087 hasConceptScore W4378805087C153180895 @default.
- W4378805087 hasConceptScore W4378805087C154945302 @default.
- W4378805087 hasConceptScore W4378805087C1667742 @default.
- W4378805087 hasConceptScore W4378805087C174348530 @default.
- W4378805087 hasConceptScore W4378805087C184337299 @default.
- W4378805087 hasConceptScore W4378805087C185592680 @default.
- W4378805087 hasConceptScore W4378805087C188027245 @default.
- W4378805087 hasConceptScore W4378805087C192209626 @default.
- W4378805087 hasConceptScore W4378805087C199360897 @default.
- W4378805087 hasConceptScore W4378805087C204321447 @default.
- W4378805087 hasConceptScore W4378805087C2993807640 @default.
- W4378805087 hasConceptScore W4378805087C31258907 @default.
- W4378805087 hasConceptScore W4378805087C41008148 @default.
- W4378805087 hasConceptScore W4378805087C41608201 @default.
- W4378805087 hasConceptScore W4378805087C71139939 @default.
- W4378805087 hasConceptScore W4378805087C86034646 @default.
- W4378805087 hasLocation W43788050871 @default.
- W4378805087 hasLocation W43788050872 @default.
- W4378805087 hasOpenAccess W4378805087 @default.
- W4378805087 hasPrimaryLocation W43788050871 @default.
- W4378805087 hasRelatedWork W1541271503 @default.
- W4378805087 hasRelatedWork W1555966012 @default.
- W4378805087 hasRelatedWork W1938514538 @default.
- W4378805087 hasRelatedWork W2374255027 @default.
- W4378805087 hasRelatedWork W2900794075 @default.
- W4378805087 hasRelatedWork W2916492174 @default.
- W4378805087 hasRelatedWork W2998028709 @default.
- W4378805087 hasRelatedWork W3183633970 @default.
- W4378805087 hasRelatedWork W3208309985 @default.
- W4378805087 hasRelatedWork W4281690070 @default.
- W4378805087 isParatext "false" @default.
- W4378805087 isRetracted "false" @default.
- W4378805087 workType "article" @default.
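As referenced in the abstract item above, the following is an illustrative PyTorch sketch of a two-step cross-modal attention with a global and a local correction term. It is not the authors' implementation (that is at https://github.com/liuyyy111/BCAN); the tensor shapes, the sigmoid gate, and the specific correction formulas are assumptions made only to make the GCU/LCU idea concrete.

```python
# Illustrative sketch only; shapes and correction formulas are assumptions,
# not the BCAN authors' code.
import torch
import torch.nn.functional as F

def cross_modal_attention(regions, words):
    """Attend each word to image regions.

    regions: (n_regions, d) image region features
    words:   (n_words, d)  word features
    Returns attention weights (n_words, n_regions) and per-word context (n_words, d).
    """
    sim = words @ regions.t()          # word-region similarities
    attn = F.softmax(sim, dim=-1)      # attention over regions
    ctx = attn @ regions               # weighted region context per word
    return attn, ctx

def corrected_attention(regions, words, lam_global=1.0, lam_local=1.0):
    # Step 1: plain cross-modal attention.
    attn1, ctx1 = cross_modal_attention(regions, words)

    # Global correction (GCU-like, illustrative): a gate from the overall
    # image-sentence similarity, so an irrelevant pair (RI case) gets a
    # lower final matching score even if some fragments align well.
    global_sim = F.cosine_similarity(regions.mean(0), words.mean(0), dim=0)
    gate = torch.sigmoid(lam_global * global_sim)

    # Step 2: attend again with words refined by the step-1 context.
    refined_words = F.normalize(words + ctx1, dim=-1)
    attn2, _ = cross_modal_attention(regions, refined_words)

    # Local correction (LCU-like, illustrative): down-weight fragments whose
    # attention changed a lot between the two steps (IR case: unstable alignment).
    delta = (attn2 - attn1).abs()
    attn = attn2 * torch.exp(-lam_local * delta)
    attn = attn / attn.sum(dim=-1, keepdim=True)
    ctx = attn @ regions

    # Image-sentence matching score, scaled by the global gate.
    score = gate * F.cosine_similarity(ctx, words, dim=-1).mean()
    return attn, ctx, score

if __name__ == "__main__":
    torch.manual_seed(0)
    regions = torch.randn(36, 256)   # e.g., 36 detected regions
    words = torch.randn(12, 256)     # e.g., a 12-word sentence
    attn, ctx, score = corrected_attention(regions, words)
    print(attn.shape, ctx.shape, float(score))  # (12, 36), (12, 256), scalar
```

Note the division of labor in the sketch: the local correction reshapes the attention map itself, while the global correction only rescales the final matching score, which is one simple way to keep the two units independent.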