Matches in SemOpenAlex for { <https://semopenalex.org/work/W4207063631> ?p ?o ?g. }
- W4207063631 endingPage "10" @default.
- W4207063631 startingPage "1" @default.
- W4207063631 abstract "People can accurately describe an image by constantly referring to the visual information and key text information of the image. Inspired by this idea, we propose the VTR-PTM (Visual-Text Reference Pretraining Model) for image captioning. First, based on the pretraining model (BERT/UniLM), we design the dual-stream input mode of image reference and text reference and use two different mask modes (bidirectional and sequence to sequence) to realize the VTR-PTM suitable for generation tasks. Second, the target dataset is used to fine-tune the VTR-PTM. To the best of our knowledge, VTR-PTM is the first reported pretraining model to use visual-text references in the learning process. To evaluate the model, we conduct several experiments on the benchmark datasets of image captioning, including MS COCO and Visual Genome, and achieve significant improvements on most metrics. The code is available at https://github.com/lpfworld/VTR-PTM." @default.
- W4207063631 created "2022-01-26" @default.
- W4207063631 creator A5038819483 @default.
- W4207063631 creator A5057319158 @default.
- W4207063631 creator A5064681122 @default.
- W4207063631 creator A5082606335 @default.
- W4207063631 creator A5082691066 @default.
- W4207063631 date "2022-01-21" @default.
- W4207063631 modified "2023-09-26" @default.
- W4207063631 title "Visual-Text Reference Pretraining Model for Image Captioning" @default.
- W4207063631 cites W1861492603 @default.
- W4207063631 cites W1897761818 @default.
- W4207063631 cites W1956340063 @default.
- W4207063631 cites W1969616664 @default.
- W4207063631 cites W2101105183 @default.
- W4207063631 cites W2185175083 @default.
- W4207063631 cites W2277195237 @default.
- W4207063631 cites W2463955103 @default.
- W4207063631 cites W2506483933 @default.
- W4207063631 cites W2740715668 @default.
- W4207063631 cites W2745461083 @default.
- W4207063631 cites W2768975974 @default.
- W4207063631 cites W2887585070 @default.
- W4207063631 cites W2963758027 @default.
- W4207063631 cites W2970231061 @default.
- W4207063631 cites W2986670728 @default.
- W4207063631 cites W2997591391 @default.
- W4207063631 cites W2998356391 @default.
- W4207063631 cites W639708223 @default.
- W4207063631 doi "https://doi.org/10.1155/2022/9400999" @default.
- W4207063631 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/35096050" @default.
- W4207063631 hasPublicationYear "2022" @default.
- W4207063631 type Work @default.
- W4207063631 citedByCount "1" @default.
- W4207063631 countsByYear W42070636312022 @default.
- W4207063631 crossrefType "journal-article" @default.
- W4207063631 hasAuthorship W4207063631A5038819483 @default.
- W4207063631 hasAuthorship W4207063631A5057319158 @default.
- W4207063631 hasAuthorship W4207063631A5064681122 @default.
- W4207063631 hasAuthorship W4207063631A5082606335 @default.
- W4207063631 hasAuthorship W4207063631A5082691066 @default.
- W4207063631 hasBestOaLocation W42070636311 @default.
- W4207063631 hasConcept C115961682 @default.
- W4207063631 hasConcept C13280743 @default.
- W4207063631 hasConcept C153180895 @default.
- W4207063631 hasConcept C154945302 @default.
- W4207063631 hasConcept C157657479 @default.
- W4207063631 hasConcept C177264268 @default.
- W4207063631 hasConcept C185798385 @default.
- W4207063631 hasConcept C199360897 @default.
- W4207063631 hasConcept C204321447 @default.
- W4207063631 hasConcept C205649164 @default.
- W4207063631 hasConcept C26517878 @default.
- W4207063631 hasConcept C2776760102 @default.
- W4207063631 hasConcept C2778112365 @default.
- W4207063631 hasConcept C31972630 @default.
- W4207063631 hasConcept C36464697 @default.
- W4207063631 hasConcept C38652104 @default.
- W4207063631 hasConcept C41008148 @default.
- W4207063631 hasConcept C54355233 @default.
- W4207063631 hasConcept C86803240 @default.
- W4207063631 hasConcept C98045186 @default.
- W4207063631 hasConceptScore W4207063631C115961682 @default.
- W4207063631 hasConceptScore W4207063631C13280743 @default.
- W4207063631 hasConceptScore W4207063631C153180895 @default.
- W4207063631 hasConceptScore W4207063631C154945302 @default.
- W4207063631 hasConceptScore W4207063631C157657479 @default.
- W4207063631 hasConceptScore W4207063631C177264268 @default.
- W4207063631 hasConceptScore W4207063631C185798385 @default.
- W4207063631 hasConceptScore W4207063631C199360897 @default.
- W4207063631 hasConceptScore W4207063631C204321447 @default.
- W4207063631 hasConceptScore W4207063631C205649164 @default.
- W4207063631 hasConceptScore W4207063631C26517878 @default.
- W4207063631 hasConceptScore W4207063631C2776760102 @default.
- W4207063631 hasConceptScore W4207063631C2778112365 @default.
- W4207063631 hasConceptScore W4207063631C31972630 @default.
- W4207063631 hasConceptScore W4207063631C36464697 @default.
- W4207063631 hasConceptScore W4207063631C38652104 @default.
- W4207063631 hasConceptScore W4207063631C41008148 @default.
- W4207063631 hasConceptScore W4207063631C54355233 @default.
- W4207063631 hasConceptScore W4207063631C86803240 @default.
- W4207063631 hasConceptScore W4207063631C98045186 @default.
- W4207063631 hasLocation W42070636311 @default.
- W4207063631 hasLocation W42070636312 @default.
- W4207063631 hasLocation W42070636313 @default.
- W4207063631 hasLocation W42070636314 @default.
- W4207063631 hasOpenAccess W4207063631 @default.
- W4207063631 hasPrimaryLocation W42070636311 @default.
- W4207063631 hasRelatedWork W2130228941 @default.
- W4207063631 hasRelatedWork W2161229648 @default.
- W4207063631 hasRelatedWork W2181305951 @default.
- W4207063631 hasRelatedWork W2547835662 @default.
- W4207063631 hasRelatedWork W2971866238 @default.
- W4207063631 hasRelatedWork W2993674027 @default.
- W4207063631 hasRelatedWork W3093267690 @default.
- W4207063631 hasRelatedWork W4307074315 @default.
- W4207063631 hasRelatedWork W4307856881 @default.
- W4207063631 hasRelatedWork W4320016117 @default.