Matches in SemOpenAlex for { <https://semopenalex.org/work/W4385991037> ?p ?o ?g. }
Showing items 1 to 80 of 80 with 100 items per page.
- W4385991037 endingPage "386" @default.
- W4385991037 startingPage "371" @default.
- W4385991037 abstract "Benchmark datasets for table structure recognition (TSR) must be carefully processed to ensure they are annotated consistently. However, even if a dataset’s annotations are self-consistent, there may be significant inconsistency across datasets, which can harm the performance of models trained and evaluated on them. In this work, we show that aligning these benchmarks—removing both errors and inconsistency between them—improves model performance significantly. We demonstrate this through a data-centric approach where we adopt one model architecture, the Table Transformer (TATR), that we hold fixed throughout. Baseline exact match accuracy for TATR evaluated on the ICDAR-2013 benchmark is 65% when trained on PubTables-1M, 42% when trained on FinTabNet, and 69% combined. After reducing annotation mistakes and inter-dataset inconsistency, performance of TATR evaluated on ICDAR-2013 increases substantially to 75% when trained on PubTables-1M, 65% when trained on FinTabNet, and 81% combined. We show through ablations over the modification steps that canonicalization of the table annotations has a significantly positive effect on performance, while other choices balance necessary trade-offs that arise when deciding a benchmark dataset’s final composition. Overall we believe our work has significant implications for benchmark design for TSR and potentially other tasks as well. Dataset processing and training code will be released at https://github.com/microsoft/table-transformer ." @default.
- W4385991037 created "2023-08-19" @default.
- W4385991037 creator A5028532226 @default.
- W4385991037 creator A5078308260 @default.
- W4385991037 creator A5087414923 @default.
- W4385991037 date "2023-01-01" @default.
- W4385991037 modified "2023-10-14" @default.
- W4385991037 title "Aligning Benchmark Datasets for Table Structure Recognition" @default.
- W4385991037 cites W2022351003 @default.
- W4385991037 cites W2034841618 @default.
- W4385991037 cites W2142218344 @default.
- W4385991037 cites W2167460663 @default.
- W4385991037 cites W2749859009 @default.
- W4385991037 cites W2786162033 @default.
- W4385991037 cites W3003931580 @default.
- W4385991037 cites W3034997246 @default.
- W4385991037 cites W3092026060 @default.
- W4385991037 cites W3096609285 @default.
- W4385991037 cites W3107064625 @default.
- W4385991037 cites W3118722740 @default.
- W4385991037 cites W3167404434 @default.
- W4385991037 cites W3190766843 @default.
- W4385991037 cites W3217518891 @default.
- W4385991037 cites W4223519356 @default.
- W4385991037 cites W4226189435 @default.
- W4385991037 cites W4312554912 @default.
- W4385991037 doi "https://doi.org/10.1007/978-3-031-41734-4_23" @default.
- W4385991037 hasPublicationYear "2023" @default.
- W4385991037 type Work @default.
- W4385991037 citedByCount "0" @default.
- W4385991037 crossrefType "book-chapter" @default.
- W4385991037 hasAuthorship W4385991037A5028532226 @default.
- W4385991037 hasAuthorship W4385991037A5078308260 @default.
- W4385991037 hasAuthorship W4385991037A5087414923 @default.
- W4385991037 hasConcept C119857082 @default.
- W4385991037 hasConcept C121332964 @default.
- W4385991037 hasConcept C124101348 @default.
- W4385991037 hasConcept C13280743 @default.
- W4385991037 hasConcept C153180895 @default.
- W4385991037 hasConcept C154945302 @default.
- W4385991037 hasConcept C165801399 @default.
- W4385991037 hasConcept C185798385 @default.
- W4385991037 hasConcept C205649164 @default.
- W4385991037 hasConcept C2776321320 @default.
- W4385991037 hasConcept C41008148 @default.
- W4385991037 hasConcept C45235069 @default.
- W4385991037 hasConcept C62520636 @default.
- W4385991037 hasConcept C66322947 @default.
- W4385991037 hasConceptScore W4385991037C119857082 @default.
- W4385991037 hasConceptScore W4385991037C121332964 @default.
- W4385991037 hasConceptScore W4385991037C124101348 @default.
- W4385991037 hasConceptScore W4385991037C13280743 @default.
- W4385991037 hasConceptScore W4385991037C153180895 @default.
- W4385991037 hasConceptScore W4385991037C154945302 @default.
- W4385991037 hasConceptScore W4385991037C165801399 @default.
- W4385991037 hasConceptScore W4385991037C185798385 @default.
- W4385991037 hasConceptScore W4385991037C205649164 @default.
- W4385991037 hasConceptScore W4385991037C2776321320 @default.
- W4385991037 hasConceptScore W4385991037C41008148 @default.
- W4385991037 hasConceptScore W4385991037C45235069 @default.
- W4385991037 hasConceptScore W4385991037C62520636 @default.
- W4385991037 hasConceptScore W4385991037C66322947 @default.
- W4385991037 hasLocation W43859910371 @default.
- W4385991037 hasOpenAccess W4385991037 @default.
- W4385991037 hasPrimaryLocation W43859910371 @default.
- W4385991037 hasRelatedWork W2130974462 @default.
- W4385991037 hasRelatedWork W2218034408 @default.
- W4385991037 hasRelatedWork W2263699433 @default.
- W4385991037 hasRelatedWork W2358755282 @default.
- W4385991037 hasRelatedWork W2361861616 @default.
- W4385991037 hasRelatedWork W2377979023 @default.
- W4385991037 hasRelatedWork W2378211422 @default.
- W4385991037 hasRelatedWork W2392921965 @default.
- W4385991037 hasRelatedWork W2625833328 @default.
- W4385991037 hasRelatedWork W4321353415 @default.
- W4385991037 isParatext "false" @default.
- W4385991037 isRetracted "false" @default.
- W4385991037 workType "book-chapter" @default.
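
The entries above all follow the same shape: a subject, a predicate, an object (either a quoted literal or a bare identifier), and a graph label. A minimal sketch of reading that shape into tuples is given below. The regular expression and the helper names `parse_line` and `group_by_predicate` are assumptions made for illustration, inferred only from the lines in this listing; they are not part of SemOpenAlex or any of its tooling.

```python
import re
from collections import defaultdict

# Assumed line shape, inferred from this listing only:
#   - <subject> <predicate> <object> @<graph>.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*)\s+@(\S+?)\.$')


def parse_line(line):
    """Return (subject, predicate, object, graph), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, obj, graph = m.groups()
    # Quoted objects are treated as literals and unquoted; bare tokens are kept as identifiers.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return subject, predicate, obj, graph


def group_by_predicate(lines):
    """Collect object values per predicate, skipping lines that do not parse."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            _, predicate, obj, _ = parsed
            grouped[predicate].append(obj)
    return dict(grouped)


# A few lines copied from the listing above.
sample = [
    '- W4385991037 startingPage "371" @default.',
    '- W4385991037 endingPage "386" @default.',
    '- W4385991037 creator A5028532226 @default.',
    '- W4385991037 creator A5078308260 @default.',
]
result = group_by_predicate(sample)
print(result)
```

Grouping by predicate keeps multi-valued properties such as `creator` and `cites` together as lists, while single-valued ones such as `startingPage` become one-element lists.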