Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387561098> ?p ?o ?g. }
Showing items 1 to 89 of 89 with 100 items per page.
- W4387561098 abstract "In this paper, we present our solution to a Multi-modal Algorithmic Reasoning Task: SMART-101 Challenge. Different from the traditional visual question-answering datasets, this challenge evaluates the abstraction, deduction, and generalization abilities of neural networks in solving visuolinguistic puzzles designed specifically for children in the 6-8 age group. We employed a divide-and-conquer approach. At the data level, inspired by the challenge paper, we categorized the whole questions into eight types and utilized the llama-2-chat model to directly generate the type for each question in a zero-shot manner. Additionally, we trained a yolov7 model on the icon45 dataset for object detection and combined it with the OCR method to recognize and locate objects and text within the images. At the model level, we utilized the BLIP-2 model and added eight adapters to the image encoder VIT-G to adaptively extract visual features for different question types. We fed the pre-constructed question templates as input and generated answers using the flan-t5-xxl decoder. Under the puzzle splits configuration, we achieved an accuracy score of 26.5 on the validation set and 24.30 on the private test set." @default.
- W4387561098 created "2023-10-12" @default.
- W4387561098 creator A5016375169 @default.
- W4387561098 creator A5019095831 @default.
- W4387561098 creator A5037595310 @default.
- W4387561098 creator A5049692788 @default.
- W4387561098 creator A5050022243 @default.
- W4387561098 creator A5088010841 @default.
- W4387561098 date "2023-10-10" @default.
- W4387561098 modified "2023-10-13" @default.
- W4387561098 title "Solution for SMART-101 Challenge of ICCV Multi-modal Algorithmic Reasoning Task 2023" @default.
- W4387561098 doi "https://doi.org/10.48550/arxiv.2310.06440" @default.
- W4387561098 hasPublicationYear "2023" @default.
- W4387561098 type Work @default.
- W4387561098 citedByCount "0" @default.
- W4387561098 crossrefType "posted-content" @default.
- W4387561098 hasAuthorship W4387561098A5016375169 @default.
- W4387561098 hasAuthorship W4387561098A5019095831 @default.
- W4387561098 hasAuthorship W4387561098A5037595310 @default.
- W4387561098 hasAuthorship W4387561098A5049692788 @default.
- W4387561098 hasAuthorship W4387561098A5050022243 @default.
- W4387561098 hasAuthorship W4387561098A5088010841 @default.
- W4387561098 hasBestOaLocation W43875610981 @default.
- W4387561098 hasConcept C111472728 @default.
- W4387561098 hasConcept C111919701 @default.
- W4387561098 hasConcept C115961682 @default.
- W4387561098 hasConcept C118505674 @default.
- W4387561098 hasConcept C119857082 @default.
- W4387561098 hasConcept C124304363 @default.
- W4387561098 hasConcept C134306372 @default.
- W4387561098 hasConcept C138885662 @default.
- W4387561098 hasConcept C153180895 @default.
- W4387561098 hasConcept C154945302 @default.
- W4387561098 hasConcept C162324750 @default.
- W4387561098 hasConcept C169903167 @default.
- W4387561098 hasConcept C177148314 @default.
- W4387561098 hasConcept C177264268 @default.
- W4387561098 hasConcept C185592680 @default.
- W4387561098 hasConcept C187736073 @default.
- W4387561098 hasConcept C188027245 @default.
- W4387561098 hasConcept C199360897 @default.
- W4387561098 hasConcept C2777508537 @default.
- W4387561098 hasConcept C2780451532 @default.
- W4387561098 hasConcept C2781238097 @default.
- W4387561098 hasConcept C33923547 @default.
- W4387561098 hasConcept C41008148 @default.
- W4387561098 hasConcept C58489278 @default.
- W4387561098 hasConcept C71139939 @default.
- W4387561098 hasConceptScore W4387561098C111472728 @default.
- W4387561098 hasConceptScore W4387561098C111919701 @default.
- W4387561098 hasConceptScore W4387561098C115961682 @default.
- W4387561098 hasConceptScore W4387561098C118505674 @default.
- W4387561098 hasConceptScore W4387561098C119857082 @default.
- W4387561098 hasConceptScore W4387561098C124304363 @default.
- W4387561098 hasConceptScore W4387561098C134306372 @default.
- W4387561098 hasConceptScore W4387561098C138885662 @default.
- W4387561098 hasConceptScore W4387561098C153180895 @default.
- W4387561098 hasConceptScore W4387561098C154945302 @default.
- W4387561098 hasConceptScore W4387561098C162324750 @default.
- W4387561098 hasConceptScore W4387561098C169903167 @default.
- W4387561098 hasConceptScore W4387561098C177148314 @default.
- W4387561098 hasConceptScore W4387561098C177264268 @default.
- W4387561098 hasConceptScore W4387561098C185592680 @default.
- W4387561098 hasConceptScore W4387561098C187736073 @default.
- W4387561098 hasConceptScore W4387561098C188027245 @default.
- W4387561098 hasConceptScore W4387561098C199360897 @default.
- W4387561098 hasConceptScore W4387561098C2777508537 @default.
- W4387561098 hasConceptScore W4387561098C2780451532 @default.
- W4387561098 hasConceptScore W4387561098C2781238097 @default.
- W4387561098 hasConceptScore W4387561098C33923547 @default.
- W4387561098 hasConceptScore W4387561098C41008148 @default.
- W4387561098 hasConceptScore W4387561098C58489278 @default.
- W4387561098 hasConceptScore W4387561098C71139939 @default.
- W4387561098 hasLocation W43875610981 @default.
- W4387561098 hasOpenAccess W4387561098 @default.
- W4387561098 hasPrimaryLocation W43875610981 @default.
- W4387561098 hasRelatedWork W1990237101 @default.
- W4387561098 hasRelatedWork W2045155990 @default.
- W4387561098 hasRelatedWork W2183488467 @default.
- W4387561098 hasRelatedWork W2187490799 @default.
- W4387561098 hasRelatedWork W2790141222 @default.
- W4387561098 hasRelatedWork W3170838353 @default.
- W4387561098 hasRelatedWork W4300172249 @default.
- W4387561098 hasRelatedWork W4309907966 @default.
- W4387561098 hasRelatedWork W4313163053 @default.
- W4387561098 hasRelatedWork W4387327236 @default.
- W4387561098 isParatext "false" @default.
- W4387561098 isRetracted "false" @default.
- W4387561098 workType "article" @default.