Matches in SemOpenAlex for { <https://semopenalex.org/work/W2912274075> ?p ?o ?g. }
- W2912274075 abstract "Visual Grounding (VG) aims to locate the most relevant region in an image based on a flexible natural language query rather than a pre-defined label, so it can be a more useful technique than object detection in practice. Most state-of-the-art methods in VG operate in a two-stage manner: in the first stage, an object detector is adopted to generate a set of object proposals from the input image, and the second stage is simply formulated as a cross-modal matching problem that finds the best match between the language query and all region proposals. This is rather inefficient because there might be hundreds of proposals produced in the first stage that need to be compared in the second stage, not to mention that this strategy performs inaccurately. In this paper, we propose a simple, intuitive and much more elegant one-stage detection based method that joins the region proposal and matching stages into a single detection network. The detection is conditioned on the input query with a stack of novel Relation-to-Attention modules that transform the image-to-query relationship into a relation map, which is used to predict the bounding box directly without proposing large numbers of useless region proposals. During inference, our approach is about 20x ~ 30x faster than previous methods and, remarkably, it achieves 18% ~ 41% absolute performance improvement on top of the state-of-the-art results on several benchmark datasets. We release our code and all the pre-trained models at https://github.com/openblack/rvg." @default.
- W2912274075 created "2019-02-21" @default.
- W2912274075 creator A5010906829 @default.
- W2912274075 creator A5015244667 @default.
- W2912274075 creator A5022714797 @default.
- W2912274075 creator A5027344351 @default.
- W2912274075 creator A5032352025 @default.
- W2912274075 creator A5060958969 @default.
- W2912274075 creator A5065964089 @default.
- W2912274075 date "2019-02-11" @default.
- W2912274075 modified "2023-09-26" @default.
- W2912274075 title "You Only Look & Listen Once: Towards Fast and Accurate Visual Grounding" @default.
- W2912274075 cites W1514535095 @default.
- W2912274075 cites W1522301498 @default.
- W2912274075 cites W1536680647 @default.
- W2912274075 cites W1689711448 @default.
- W2912274075 cites W1861492603 @default.
- W2912274075 cites W1895577753 @default.
- W2912274075 cites W1924770834 @default.
- W2912274075 cites W2010181071 @default.
- W2912274075 cites W2068730032 @default.
- W2912274075 cites W2088049833 @default.
- W2912274075 cites W2117539524 @default.
- W2912274075 cites W2153579005 @default.
- W2912274075 cites W2163605009 @default.
- W2912274075 cites W2172806452 @default.
- W2912274075 cites W2194775991 @default.
- W2912274075 cites W2251512949 @default.
- W2912274075 cites W2489434015 @default.
- W2912274075 cites W2558535589 @default.
- W2912274075 cites W2571175805 @default.
- W2912274075 cites W2597655663 @default.
- W2912274075 cites W2601564443 @default.
- W2912274075 cites W2613718673 @default.
- W2912274075 cites W2770129969 @default.
- W2912274075 cites W2770201307 @default.
- W2912274075 cites W2779827764 @default.
- W2912274075 cites W2799263800 @default.
- W2912274075 cites W2950893734 @default.
- W2912274075 cites W2951714314 @default.
- W2912274075 cites W2962764817 @default.
- W2912274075 cites W2963037989 @default.
- W2912274075 cites W2963109634 @default.
- W2912274075 cites W2963383024 @default.
- W2912274075 cites W2963403868 @default.
- W2912274075 cites W2963656855 @default.
- W2912274075 cites W2963668159 @default.
- W2912274075 cites W2963735856 @default.
- W2912274075 cites W2963907629 @default.
- W2912274075 cites W2964080601 @default.
- W2912274075 cites W2964265128 @default.
- W2912274075 cites W2964303913 @default.
- W2912274075 cites W2964308564 @default.
- W2912274075 cites W2964345792 @default.
- W2912274075 cites W3098232790 @default.
- W2912274075 cites W3106250896 @default.
- W2912274075 doi "https://doi.org/10.48550/arxiv.1902.04213" @default.
- W2912274075 hasPublicationYear "2019" @default.
- W2912274075 type Work @default.
- W2912274075 sameAs 2912274075 @default.
- W2912274075 citedByCount "1" @default.
- W2912274075 countsByYear W29122740752020 @default.
- W2912274075 crossrefType "posted-content" @default.
- W2912274075 hasAuthorship W2912274075A5010906829 @default.
- W2912274075 hasAuthorship W2912274075A5015244667 @default.
- W2912274075 hasAuthorship W2912274075A5022714797 @default.
- W2912274075 hasAuthorship W2912274075A5027344351 @default.
- W2912274075 hasAuthorship W2912274075A5032352025 @default.
- W2912274075 hasAuthorship W2912274075A5060958969 @default.
- W2912274075 hasAuthorship W2912274075A5065964089 @default.
- W2912274075 hasBestOaLocation W29122740751 @default.
- W2912274075 hasConcept C105795698 @default.
- W2912274075 hasConcept C115961682 @default.
- W2912274075 hasConcept C124101348 @default.
- W2912274075 hasConcept C13280743 @default.
- W2912274075 hasConcept C147037132 @default.
- W2912274075 hasConcept C153180895 @default.
- W2912274075 hasConcept C154945302 @default.
- W2912274075 hasConcept C165064840 @default.
- W2912274075 hasConcept C177264268 @default.
- W2912274075 hasConcept C185798385 @default.
- W2912274075 hasConcept C199360897 @default.
- W2912274075 hasConcept C205649164 @default.
- W2912274075 hasConcept C25343380 @default.
- W2912274075 hasConcept C2776151529 @default.
- W2912274075 hasConcept C2776214188 @default.
- W2912274075 hasConcept C2776760102 @default.
- W2912274075 hasConcept C2781238097 @default.
- W2912274075 hasConcept C33923547 @default.
- W2912274075 hasConcept C41008148 @default.
- W2912274075 hasConcept C63584917 @default.
- W2912274075 hasConcept C76155785 @default.
- W2912274075 hasConcept C94915269 @default.
- W2912274075 hasConceptScore W2912274075C105795698 @default.
- W2912274075 hasConceptScore W2912274075C115961682 @default.
- W2912274075 hasConceptScore W2912274075C124101348 @default.
- W2912274075 hasConceptScore W2912274075C13280743 @default.
- W2912274075 hasConceptScore W2912274075C147037132 @default.
- W2912274075 hasConceptScore W2912274075C153180895 @default.
- W2912274075 hasConceptScore W2912274075C154945302 @default.