Matches in SemOpenAlex for { <https://semopenalex.org/work/W4288055341> ?p ?o ?g. }
Showing items 1 to 65 of 65 with 100 items per page.
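The pattern in the header line can be written out as a complete SPARQL query. The sketch below only builds the query string. It drops the `?g` graph variable on the assumption that every match sits in the default graph, as the `@default` suffix on each row suggests. The helper name `build_query` is hypothetical, and no endpoint is contacted.

```python
# Work IRI copied from the header line of this listing.
WORK_IRI = "https://semopenalex.org/work/W4288055341"


def build_query(iri: str) -> str:
    """Return a SELECT query listing every predicate/object pair for one subject.

    Assumption: the data lives in the default graph, so a plain triple
    pattern is used instead of a GRAPH ?g { ... } clause.
    """
    return f"SELECT ?p ?o WHERE {{ <{iri}> ?p ?o . }}"


if __name__ == "__main__":
    print(build_query(WORK_IRI))
```

If the data turned out to live in named graphs, the triple pattern would need to be wrapped in `GRAPH ?g { ... }` and `?g` added to the projection.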
- W4288055341 abstract "Document Visual Question Answering (VQA) aims to understand visually-rich documents to answer questions in natural language, which is an emerging research topic for both Natural Language Processing and Computer Vision. In this work, we introduce a new Document VQA dataset, named TAT-DQA, which consists of 3,067 document pages comprising semi-structured table(s) and unstructured text as well as 16,558 question-answer pairs by extending the TAT-QA dataset. These documents are sampled from real-world financial reports and contain lots of numbers, which means discrete reasoning capability is demanded to answer questions on this dataset. Based on TAT-DQA, we further develop a novel model named MHST that takes into account the information in multi-modalities, including text, layout and visual image, to intelligently address different types of questions with corresponding strategies, i.e., extraction or reasoning. Extensive experiments show that the MHST model significantly outperforms the baseline methods, demonstrating its effectiveness. However, the performance still lags far behind that of expert humans. We expect that our new TAT-DQA dataset would facilitate the research on deep understanding of visually-rich documents combining vision and language, especially for scenarios that require discrete reasoning. Also, we hope the proposed model would inspire researchers to design more advanced Document VQA models in future." @default.
- W4288055341 created "2022-07-28" @default.
- W4288055341 creator A5003475089 @default.
- W4288055341 creator A5029052244 @default.
- W4288055341 creator A5039239180 @default.
- W4288055341 creator A5051925942 @default.
- W4288055341 creator A5055838753 @default.
- W4288055341 creator A5089404640 @default.
- W4288055341 date "2022-07-24" @default.
- W4288055341 modified "2023-10-17" @default.
- W4288055341 title "Towards Complex Document Understanding By Discrete Reasoning" @default.
- W4288055341 doi "https://doi.org/10.48550/arxiv.2207.11871" @default.
- W4288055341 hasPublicationYear "2022" @default.
- W4288055341 type Work @default.
- W4288055341 citedByCount "0" @default.
- W4288055341 crossrefType "posted-content" @default.
- W4288055341 hasAuthorship W4288055341A5003475089 @default.
- W4288055341 hasAuthorship W4288055341A5029052244 @default.
- W4288055341 hasAuthorship W4288055341A5039239180 @default.
- W4288055341 hasAuthorship W4288055341A5051925942 @default.
- W4288055341 hasAuthorship W4288055341A5055838753 @default.
- W4288055341 hasAuthorship W4288055341A5089404640 @default.
- W4288055341 hasBestOaLocation W42880553411 @default.
- W4288055341 hasConcept C124101348 @default.
- W4288055341 hasConcept C144024400 @default.
- W4288055341 hasConcept C154945302 @default.
- W4288055341 hasConcept C195324797 @default.
- W4288055341 hasConcept C195807954 @default.
- W4288055341 hasConcept C204321447 @default.
- W4288055341 hasConcept C23123220 @default.
- W4288055341 hasConcept C2522767166 @default.
- W4288055341 hasConcept C2779903281 @default.
- W4288055341 hasConcept C36289849 @default.
- W4288055341 hasConcept C41008148 @default.
- W4288055341 hasConcept C44291984 @default.
- W4288055341 hasConcept C45235069 @default.
- W4288055341 hasConceptScore W4288055341C124101348 @default.
- W4288055341 hasConceptScore W4288055341C144024400 @default.
- W4288055341 hasConceptScore W4288055341C154945302 @default.
- W4288055341 hasConceptScore W4288055341C195324797 @default.
- W4288055341 hasConceptScore W4288055341C195807954 @default.
- W4288055341 hasConceptScore W4288055341C204321447 @default.
- W4288055341 hasConceptScore W4288055341C23123220 @default.
- W4288055341 hasConceptScore W4288055341C2522767166 @default.
- W4288055341 hasConceptScore W4288055341C2779903281 @default.
- W4288055341 hasConceptScore W4288055341C36289849 @default.
- W4288055341 hasConceptScore W4288055341C41008148 @default.
- W4288055341 hasConceptScore W4288055341C44291984 @default.
- W4288055341 hasConceptScore W4288055341C45235069 @default.
- W4288055341 hasLocation W42880553411 @default.
- W4288055341 hasOpenAccess W4288055341 @default.
- W4288055341 hasPrimaryLocation W42880553411 @default.
- W4288055341 hasRelatedWork W10275467 @default.
- W4288055341 hasRelatedWork W11238763 @default.
- W4288055341 hasRelatedWork W11990303 @default.
- W4288055341 hasRelatedWork W14310100 @default.
- W4288055341 hasRelatedWork W14942459 @default.
- W4288055341 hasRelatedWork W4281541 @default.
- W4288055341 hasRelatedWork W6633509 @default.
- W4288055341 hasRelatedWork W6904866 @default.
- W4288055341 hasRelatedWork W855327 @default.
- W4288055341 hasRelatedWork W9662544 @default.
- W4288055341 isParatext "false" @default.
- W4288055341 isRetracted "false" @default.
- W4288055341 workType "article" @default.
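Each row above follows the shape `- subject predicate object @graph.`, so the listing can be grouped by predicate with a small parser. This is a minimal sketch that assumes that row shape holds for every line. The names `parse_row` and `group_by_predicate` are hypothetical and not part of any SemOpenAlex tooling.

```python
import re
from collections import defaultdict

# One row: "- <subject> <predicate> <object> @<graph>."
# The object may be a bare identifier or a double-quoted literal with spaces.
ROW = re.compile(r'^- (\S+) (\S+) (.+) @(\S+)\.$')


def parse_row(line):
    """Split one listing row into (subject, predicate, object, graph, is_literal)."""
    m = ROW.match(line.strip())
    if m is None:
        return None  # header lines and blank lines do not match
    subject, predicate, obj, graph = m.groups()
    # Quoted objects are literals; strip the surrounding quotes.
    is_literal = len(obj) >= 2 and obj.startswith('"') and obj.endswith('"')
    if is_literal:
        obj = obj[1:-1]
    return subject, predicate, obj, graph, is_literal


def group_by_predicate(lines):
    """Collect object values per predicate, preserving row order."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_row(line)
        if parsed is not None:
            grouped[parsed[1]].append(parsed[2])
    return dict(grouped)


if __name__ == "__main__":
    sample = [
        '- W4288055341 created "2022-07-28" @default.',
        '- W4288055341 creator A5003475089 @default.',
        '- W4288055341 creator A5029052244 @default.',
        '- W4288055341 type Work @default.',
    ]
    for predicate, values in group_by_predicate(sample).items():
        print(predicate, values)
```

Grouping this way makes it easy to check, for example, that the number of `creator` values matches the number of `hasAuthorship` values.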