Matches in SemOpenAlex for { <https://semopenalex.org/work/W4313163001> ?p ?o ?g. }
Showing items 1 to 64 of 64 with 100 items per page.
- W4313163001 endingPage "13" @default.
- W4313163001 startingPage "1" @default.
- W4313163001 abstract "Document intelligence as a relatively new research topic supports many business applications. Its main task is to automatically read, understand, and analyze documents. However, due to the diversity of formats (invoices, reports, forms, etc.) and layouts in documents, it is difficult to make machines understand documents. In this paper, we present the GraphDoc, a multimodal graph attention-based model for various document understanding tasks. GraphDoc is pre-trained in a multimodal framework by utilizing text, layout, and image information simultaneously. In a document, a text block relies heavily on its surrounding contexts, accordingly we inject the graph structure into the attention mechanism to form a graph attention layer so that each input node can only attend to its neighborhoods. The input nodes of each graph attention layer are composed of textual, visual, and positional features from semantically meaningful regions in a document image. We do the multimodal feature fusion of each node by the gate fusion layer. The contextualization between each node is modeled by the graph attention layer. GraphDoc learns a generic representation from only 320k unlabeled documents via the Masked Sentence Modeling task. Extensive experimental results on the publicly available datasets show that GraphDoc achieves state-of-the-art performance, which demonstrates the effectiveness of our proposed method. The code is available at https://github.com/ZZR8066/GraphDoc." @default.
- W4313163001 created "2023-01-06" @default.
- W4313163001 creator A5006572116 @default.
- W4313163001 creator A5039254301 @default.
- W4313163001 creator A5052953606 @default.
- W4313163001 creator A5066595711 @default.
- W4313163001 creator A5066831817 @default.
- W4313163001 date "2022-01-01" @default.
- W4313163001 modified "2023-10-16" @default.
- W4313163001 title "Multimodal Pre-training Based on Graph Attention Network for Document Understanding" @default.
- W4313163001 doi "https://doi.org/10.1109/tmm.2022.3214102" @default.
- W4313163001 hasPublicationYear "2022" @default.
- W4313163001 type Work @default.
- W4313163001 citedByCount "7" @default.
- W4313163001 countsByYear W43131630012022 @default.
- W4313163001 countsByYear W43131630012023 @default.
- W4313163001 crossrefType "journal-article" @default.
- W4313163001 hasAuthorship W4313163001A5006572116 @default.
- W4313163001 hasAuthorship W4313163001A5039254301 @default.
- W4313163001 hasAuthorship W4313163001A5052953606 @default.
- W4313163001 hasAuthorship W4313163001A5066595711 @default.
- W4313163001 hasAuthorship W4313163001A5066831817 @default.
- W4313163001 hasBestOaLocation W43131630012 @default.
- W4313163001 hasConcept C127413603 @default.
- W4313163001 hasConcept C132525143 @default.
- W4313163001 hasConcept C154945302 @default.
- W4313163001 hasConcept C204321447 @default.
- W4313163001 hasConcept C23123220 @default.
- W4313163001 hasConcept C2777530160 @default.
- W4313163001 hasConcept C41008148 @default.
- W4313163001 hasConcept C59404180 @default.
- W4313163001 hasConcept C62611344 @default.
- W4313163001 hasConcept C66938386 @default.
- W4313163001 hasConcept C80444323 @default.
- W4313163001 hasConceptScore W4313163001C127413603 @default.
- W4313163001 hasConceptScore W4313163001C132525143 @default.
- W4313163001 hasConceptScore W4313163001C154945302 @default.
- W4313163001 hasConceptScore W4313163001C204321447 @default.
- W4313163001 hasConceptScore W4313163001C23123220 @default.
- W4313163001 hasConceptScore W4313163001C2777530160 @default.
- W4313163001 hasConceptScore W4313163001C41008148 @default.
- W4313163001 hasConceptScore W4313163001C59404180 @default.
- W4313163001 hasConceptScore W4313163001C62611344 @default.
- W4313163001 hasConceptScore W4313163001C66938386 @default.
- W4313163001 hasConceptScore W4313163001C80444323 @default.
- W4313163001 hasLocation W43131630011 @default.
- W4313163001 hasLocation W43131630012 @default.
- W4313163001 hasOpenAccess W4313163001 @default.
- W4313163001 hasPrimaryLocation W43131630011 @default.
- W4313163001 hasRelatedWork W2009831055 @default.
- W4313163001 hasRelatedWork W2083530853 @default.
- W4313163001 hasRelatedWork W2146114872 @default.
- W4313163001 hasRelatedWork W2375873920 @default.
- W4313163001 hasRelatedWork W2392060890 @default.
- W4313163001 hasRelatedWork W2392760275 @default.
- W4313163001 hasRelatedWork W2393172683 @default.
- W4313163001 hasRelatedWork W2968752923 @default.
- W4313163001 hasRelatedWork W2982905616 @default.
- W4313163001 hasRelatedWork W4285218279 @default.
- W4313163001 isParatext "false" @default.
- W4313163001 isRetracted "false" @default.
- W4313163001 workType "article" @default.