Matches in SemOpenAlex for { <https://semopenalex.org/work/W4310648403> ?p ?o ?g. }
Showing items 1 to 66 of 66 with 100 items per page.
- W4310648403 abstract "Existing research for image text retrieval mainly relies on sentence-level supervision to distinguish matched and mismatched sentences for a query image. However, semantic mismatch between an image and sentences usually happens in finer grain, i.e., phrase level. In this paper, we explore to introduce additional phrase-level supervision for the better identification of mismatched units in the text. In practice, multi-grained semantic labels are automatically constructed for a query image in both sentence-level and phrase-level. We construct text scene graphs for the matched sentences and extract entities and triples as the phrase-level labels. In order to integrate both supervision of sentence-level and phrase-level, we propose Semantic Structure Aware Multimodal Transformer (SSAMT) for multi-modal representation learning. Inside the SSAMT, we utilize different kinds of attention mechanisms to enforce interactions of multi-grain semantic units in both sides of vision and language. For the training, we propose multi-scale matching losses from both global and local perspectives, and penalize mismatched phrases. Experimental results on MS-COCO and Flickr30K show the effectiveness of our approach compared to some state-of-the-art models." @default.
- W4310648403 created "2022-12-14" @default.
- W4310648403 creator A5002477413 @default.
- W4310648403 creator A5011504177 @default.
- W4310648403 creator A5031910872 @default.
- W4310648403 creator A5047968661 @default.
- W4310648403 creator A5075969066 @default.
- W4310648403 creator A5078689432 @default.
- W4310648403 creator A5088834359 @default.
- W4310648403 date "2021-09-12" @default.
- W4310648403 modified "2023-09-27" @default.
- W4310648403 title "Constructing Phrase-level Semantic Labels to Form Multi-Grained Supervision for Image-Text Retrieval" @default.
- W4310648403 doi "https://doi.org/10.48550/arxiv.2109.05523" @default.
- W4310648403 hasPublicationYear "2021" @default.
- W4310648403 type Work @default.
- W4310648403 citedByCount "0" @default.
- W4310648403 crossrefType "posted-content" @default.
- W4310648403 hasAuthorship W4310648403A5002477413 @default.
- W4310648403 hasAuthorship W4310648403A5011504177 @default.
- W4310648403 hasAuthorship W4310648403A5031910872 @default.
- W4310648403 hasAuthorship W4310648403A5047968661 @default.
- W4310648403 hasAuthorship W4310648403A5075969066 @default.
- W4310648403 hasAuthorship W4310648403A5078689432 @default.
- W4310648403 hasAuthorship W4310648403A5088834359 @default.
- W4310648403 hasBestOaLocation W43106484031 @default.
- W4310648403 hasConcept C121332964 @default.
- W4310648403 hasConcept C154945302 @default.
- W4310648403 hasConcept C165801399 @default.
- W4310648403 hasConcept C174348530 @default.
- W4310648403 hasConcept C204321447 @default.
- W4310648403 hasConcept C23123220 @default.
- W4310648403 hasConcept C2776224158 @default.
- W4310648403 hasConcept C2777530160 @default.
- W4310648403 hasConcept C31258907 @default.
- W4310648403 hasConcept C41008148 @default.
- W4310648403 hasConcept C62520636 @default.
- W4310648403 hasConcept C66322947 @default.
- W4310648403 hasConceptScore W4310648403C121332964 @default.
- W4310648403 hasConceptScore W4310648403C154945302 @default.
- W4310648403 hasConceptScore W4310648403C165801399 @default.
- W4310648403 hasConceptScore W4310648403C174348530 @default.
- W4310648403 hasConceptScore W4310648403C204321447 @default.
- W4310648403 hasConceptScore W4310648403C23123220 @default.
- W4310648403 hasConceptScore W4310648403C2776224158 @default.
- W4310648403 hasConceptScore W4310648403C2777530160 @default.
- W4310648403 hasConceptScore W4310648403C31258907 @default.
- W4310648403 hasConceptScore W4310648403C41008148 @default.
- W4310648403 hasConceptScore W4310648403C62520636 @default.
- W4310648403 hasConceptScore W4310648403C66322947 @default.
- W4310648403 hasLocation W43106484031 @default.
- W4310648403 hasLocation W43106484032 @default.
- W4310648403 hasOpenAccess W4310648403 @default.
- W4310648403 hasPrimaryLocation W43106484031 @default.
- W4310648403 hasRelatedWork W159132833 @default.
- W4310648403 hasRelatedWork W1597901428 @default.
- W4310648403 hasRelatedWork W2086064646 @default.
- W4310648403 hasRelatedWork W2108705322 @default.
- W4310648403 hasRelatedWork W2115634880 @default.
- W4310648403 hasRelatedWork W2369308426 @default.
- W4310648403 hasRelatedWork W2399585936 @default.
- W4310648403 hasRelatedWork W2889818188 @default.
- W4310648403 hasRelatedWork W4385873483 @default.
- W4310648403 hasRelatedWork W4385877744 @default.
- W4310648403 isParatext "false" @default.
- W4310648403 isRetracted "false" @default.
- W4310648403 workType "article" @default.