Matches in SemOpenAlex for { <https://semopenalex.org/work/W2577020101> ?p ?o ?g. }
Showing items 1 to 98 of 98 with 100 items per page.
- W2577020101 abstract "Identifying and extracting figures and tables along with their captions from scholarly articles is important both as a way of providing tools for article summarization, and as part of larger systems that seek to gain deeper, semantic understanding of these articles. While many off-the-shelf tools exist that can extract embedded images from these documents, e.g. PDFBox, Poppler, etc., these tools are unable to extract tables, captions, and figures composed of vector graphics. Our proposed approach analyzes the structure of individual pages of a document by detecting chunks of body text, and locates the areas wherein figures or tables could reside by reasoning about the empty regions within that text. This method can extract a wide variety of figures because it does not make strong assumptions about the format of the figures embedded in the document, as long as they can be differentiated from the main article's text. Our algorithm also demonstrates a caption-to-figure matching component that is effective even in cases where individual captions are adjacent to multiple figures. Our contribution also includes methods for leveraging particular consistency and formatting assumptions to identify titles, body text and captions within each article. We introduce a new dataset of 150 computer science papers along with ground truth labels for the locations of the figures, tables and captions within them. Our algorithm achieves 96% precision at 92% recall when tested against this dataset, surpassing previous state of the art. We release our dataset, code, and evaluation scripts on our project website for enabling future research." @default.
- W2577020101 created "2017-01-26" @default.
- W2577020101 creator A5020085226 @default.
- W2577020101 creator A5046849436 @default.
- W2577020101 date "2015-04-01" @default.
- W2577020101 modified "2023-10-01" @default.
- W2577020101 title "Looking Beyond Text: Extracting Figures, Tables and Captions from Computer Science Papers" @default.
- W2577020101 cites W1500351990 @default.
- W2577020101 cites W1519192496 @default.
- W2577020101 cites W1848279521 @default.
- W2577020101 cites W2001642682 @default.
- W2577020101 cites W2023704244 @default.
- W2577020101 cites W2031489346 @default.
- W2577020101 cites W2091344457 @default.
- W2577020101 cites W2092772700 @default.
- W2577020101 cites W2105734423 @default.
- W2577020101 cites W2107092590 @default.
- W2577020101 cites W2139069711 @default.
- W2577020101 cites W2295508865 @default.
- W2577020101 hasPublicationYear "2015" @default.
- W2577020101 type Work @default.
- W2577020101 sameAs 2577020101 @default.
- W2577020101 citedByCount "14" @default.
- W2577020101 countsByYear W25770201012015 @default.
- W2577020101 countsByYear W25770201012016 @default.
- W2577020101 countsByYear W25770201012017 @default.
- W2577020101 countsByYear W25770201012018 @default.
- W2577020101 countsByYear W25770201012019 @default.
- W2577020101 countsByYear W25770201012020 @default.
- W2577020101 countsByYear W25770201012021 @default.
- W2577020101 crossrefType "proceedings-article" @default.
- W2577020101 hasAuthorship W2577020101A5020085226 @default.
- W2577020101 hasAuthorship W2577020101A5046849436 @default.
- W2577020101 hasConcept C105795698 @default.
- W2577020101 hasConcept C111919701 @default.
- W2577020101 hasConcept C124101348 @default.
- W2577020101 hasConcept C136764020 @default.
- W2577020101 hasConcept C154945302 @default.
- W2577020101 hasConcept C165064840 @default.
- W2577020101 hasConcept C170858558 @default.
- W2577020101 hasConcept C199360897 @default.
- W2577020101 hasConcept C204321447 @default.
- W2577020101 hasConcept C23123220 @default.
- W2577020101 hasConcept C2776436953 @default.
- W2577020101 hasConcept C2777737414 @default.
- W2577020101 hasConcept C33923547 @default.
- W2577020101 hasConcept C41008148 @default.
- W2577020101 hasConcept C45235069 @default.
- W2577020101 hasConcept C61423126 @default.
- W2577020101 hasConcept C68476402 @default.
- W2577020101 hasConcept C81669768 @default.
- W2577020101 hasConcept C88006597 @default.
- W2577020101 hasConceptScore W2577020101C105795698 @default.
- W2577020101 hasConceptScore W2577020101C111919701 @default.
- W2577020101 hasConceptScore W2577020101C124101348 @default.
- W2577020101 hasConceptScore W2577020101C136764020 @default.
- W2577020101 hasConceptScore W2577020101C154945302 @default.
- W2577020101 hasConceptScore W2577020101C165064840 @default.
- W2577020101 hasConceptScore W2577020101C170858558 @default.
- W2577020101 hasConceptScore W2577020101C199360897 @default.
- W2577020101 hasConceptScore W2577020101C204321447 @default.
- W2577020101 hasConceptScore W2577020101C23123220 @default.
- W2577020101 hasConceptScore W2577020101C2776436953 @default.
- W2577020101 hasConceptScore W2577020101C2777737414 @default.
- W2577020101 hasConceptScore W2577020101C33923547 @default.
- W2577020101 hasConceptScore W2577020101C41008148 @default.
- W2577020101 hasConceptScore W2577020101C45235069 @default.
- W2577020101 hasConceptScore W2577020101C61423126 @default.
- W2577020101 hasConceptScore W2577020101C68476402 @default.
- W2577020101 hasConceptScore W2577020101C81669768 @default.
- W2577020101 hasConceptScore W2577020101C88006597 @default.
- W2577020101 hasLocation W25770201011 @default.
- W2577020101 hasOpenAccess W2577020101 @default.
- W2577020101 hasPrimaryLocation W25770201011 @default.
- W2577020101 hasRelatedWork W1559499673 @default.
- W2577020101 hasRelatedWork W1603719052 @default.
- W2577020101 hasRelatedWork W194585701 @default.
- W2577020101 hasRelatedWork W1991869139 @default.
- W2577020101 hasRelatedWork W2001642682 @default.
- W2577020101 hasRelatedWork W2023704244 @default.
- W2577020101 hasRelatedWork W2027929866 @default.
- W2577020101 hasRelatedWork W2042014073 @default.
- W2577020101 hasRelatedWork W2053604034 @default.
- W2577020101 hasRelatedWork W2103558289 @default.
- W2577020101 hasRelatedWork W2168065722 @default.
- W2577020101 hasRelatedWork W2194775991 @default.
- W2577020101 hasRelatedWork W2416987009 @default.
- W2577020101 hasRelatedWork W2786480153 @default.
- W2577020101 hasRelatedWork W2896630878 @default.
- W2577020101 hasRelatedWork W2964346820 @default.
- W2577020101 hasRelatedWork W3046577738 @default.
- W2577020101 hasRelatedWork W3093351578 @default.
- W2577020101 hasRelatedWork W3114290877 @default.
- W2577020101 hasRelatedWork W3210140424 @default.
- W2577020101 isParatext "false" @default.
- W2577020101 isRetracted "false" @default.
- W2577020101 magId "2577020101" @default.
- W2577020101 workType "article" @default.
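
Each row above follows the same shape: a leading dash, the subject, a predicate, an object, and the graph name after `@`. As a minimal sketch, assuming the listing has been saved as plain text with one row per line, the rows can be grouped by predicate as shown below. The `ROW` pattern and the `parse_rows` helper are hypothetical names introduced for illustration; they are not part of SemOpenAlex or any library.

```python
import re
from collections import defaultdict

# Assumed row shape (taken from the listing above):
#   - SUBJECT PREDICATE OBJECT @GRAPH.
ROW = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*?)\s+@(\S+)\.$')

def parse_rows(lines):
    """Group objects by predicate, stripping quotes from literal values."""
    grouped = defaultdict(list)
    for line in lines:
        match = ROW.match(line.strip())
        if match is None:
            continue  # skip the header lines and anything that is not a row
        _subject, predicate, obj, _graph = match.groups()
        # Quoted literals such as "2015-04-01" lose their surrounding quotes.
        if len(obj) >= 2 and obj[0] == '"' and obj[-1] == '"':
            obj = obj[1:-1]
        grouped[predicate].append(obj)
    return dict(grouped)

sample = [
    '- W2577020101 date "2015-04-01" @default.',
    '- W2577020101 cites W1500351990 @default.',
    '- W2577020101 cites W1519192496 @default.',
    '- W2577020101 citedByCount "14" @default.',
]
rows = parse_rows(sample)
print(rows["cites"])
```

Grouping by predicate keeps repeated properties such as `cites`, `hasConcept`, and `hasRelatedWork` together as lists, while single-valued properties such as `date` become one-element lists.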