Matches in SemOpenAlex for { <https://semopenalex.org/work/W2022158118> ?p ?o ?g. }
Showing items 1 to 76 of 76 with 100 items per page.
- W2022158118 endingPage "509" @default.
- W2022158118 startingPage "491" @default.
- W2022158118 abstract "Many web sources provide access to an underlying database containing structured data. These data can be usually accessed in HTML form only, which makes it difficult for software programs to obtain them in structured form. Nevertheless, web sources usually encode data records using a consistent template or layout, and the implicit regularities in the template can be used to automatically infer the structure and extract the data. In this paper, we propose a set of novel techniques to address this problem. While several previous works have addressed the same problem, most of them require multiple input pages while our method requires only one. In addition, previous methods make some assumptions about how data records are encoded into web pages, which do not always hold in real websites. Finally, we have also tested our techniques with a high number of real web sources and we have found them to be very effective." @default.
- W2022158118 created "2016-06-24" @default.
- W2022158118 creator A5020728818 @default.
- W2022158118 creator A5031134938 @default.
- W2022158118 creator A5056798345 @default.
- W2022158118 creator A5068360111 @default.
- W2022158118 creator A5070668196 @default.
- W2022158118 date "2008-02-01" @default.
- W2022158118 modified "2023-09-28" @default.
- W2022158118 title "Extracting lists of data records from semi-structured web pages" @default.
- W2022158118 cites W1489992655 @default.
- W2022158118 cites W1492472380 @default.
- W2022158118 cites W1616576116 @default.
- W2022158118 cites W1821155018 @default.
- W2022158118 cites W1822119045 @default.
- W2022158118 cites W1946389870 @default.
- W2022158118 cites W2005646337 @default.
- W2022158118 cites W2049365470 @default.
- W2022158118 cites W2053045757 @default.
- W2022158118 cites W2065568440 @default.
- W2022158118 cites W2069388662 @default.
- W2022158118 cites W2073384429 @default.
- W2022158118 cites W2084801987 @default.
- W2022158118 cites W2096496923 @default.
- W2022158118 cites W2123718938 @default.
- W2022158118 cites W2128613007 @default.
- W2022158118 cites W2136072238 @default.
- W2022158118 cites W2138405339 @default.
- W2022158118 cites W2150721933 @default.
- W2022158118 cites W2160196229 @default.
- W2022158118 cites W2168358004 @default.
- W2022158118 doi "https://doi.org/10.1016/j.datak.2007.10.002" @default.
- W2022158118 hasPublicationYear "2008" @default.
- W2022158118 type Work @default.
- W2022158118 sameAs 2022158118 @default.
- W2022158118 citedByCount "60" @default.
- W2022158118 countsByYear W20221581182012 @default.
- W2022158118 countsByYear W20221581182013 @default.
- W2022158118 countsByYear W20221581182014 @default.
- W2022158118 countsByYear W20221581182015 @default.
- W2022158118 countsByYear W20221581182016 @default.
- W2022158118 countsByYear W20221581182017 @default.
- W2022158118 countsByYear W20221581182018 @default.
- W2022158118 crossrefType "journal-article" @default.
- W2022158118 hasAuthorship W2022158118A5020728818 @default.
- W2022158118 hasAuthorship W2022158118A5031134938 @default.
- W2022158118 hasAuthorship W2022158118A5056798345 @default.
- W2022158118 hasAuthorship W2022158118A5068360111 @default.
- W2022158118 hasAuthorship W2022158118A5070668196 @default.
- W2022158118 hasConcept C136764020 @default.
- W2022158118 hasConcept C23123220 @default.
- W2022158118 hasConcept C41008148 @default.
- W2022158118 hasConceptScore W2022158118C136764020 @default.
- W2022158118 hasConceptScore W2022158118C23123220 @default.
- W2022158118 hasConceptScore W2022158118C41008148 @default.
- W2022158118 hasIssue "2" @default.
- W2022158118 hasLocation W20221581181 @default.
- W2022158118 hasOpenAccess W2022158118 @default.
- W2022158118 hasPrimaryLocation W20221581181 @default.
- W2022158118 hasRelatedWork W2115485936 @default.
- W2022158118 hasRelatedWork W2119135658 @default.
- W2022158118 hasRelatedWork W2119214692 @default.
- W2022158118 hasRelatedWork W2144190808 @default.
- W2022158118 hasRelatedWork W2357241418 @default.
- W2022158118 hasRelatedWork W2366644548 @default.
- W2022158118 hasRelatedWork W2376314740 @default.
- W2022158118 hasRelatedWork W2384888906 @default.
- W2022158118 hasRelatedWork W2469626427 @default.
- W2022158118 hasRelatedWork W2748952813 @default.
- W2022158118 hasVolume "64" @default.
- W2022158118 isParatext "false" @default.
- W2022158118 isRetracted "false" @default.
- W2022158118 magId "2022158118" @default.
- W2022158118 workType "article" @default.
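
Each item above follows the shape `- subject predicate object @graph.`, matching the `?p ?o ?g` pattern in the query. A minimal sketch of reading such lines into a predicate-to-values mapping could look like the following. The names `LINE_RE`, `parse_line`, and `group_by_predicate` are hypothetical helpers written for this illustration and are not part of any SemOpenAlex API; the line shape is assumed from this listing only.

```python
import re
from collections import defaultdict

# Assumed line shape, inferred from the listing (not an official format):
#   - <subject> <predicate> <object> @<graph>.
# where <object> is either a double-quoted literal or a bare identifier.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(?:"(.*)"|(\S+))\s+@(\S+)\.$')


def parse_line(line):
    """Return (subject, predicate, object, graph), or None if the line does not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, literal, identifier, graph = m.groups()
    obj = literal if literal is not None else identifier
    return subject, predicate, obj, graph


def group_by_predicate(lines):
    """Collect object values per predicate, skipping lines that do not parse."""
    grouped = defaultdict(list)
    for line in lines:
        parsed = parse_line(line)
        if parsed is not None:
            _, predicate, obj, _ = parsed
            grouped[predicate].append(obj)
    return dict(grouped)


# A few lines copied from the listing above.
sample = [
    '- W2022158118 startingPage "491" @default.',
    '- W2022158118 endingPage "509" @default.',
    '- W2022158118 creator A5020728818 @default.',
    '- W2022158118 creator A5031134938 @default.',
]
props = group_by_predicate(sample)
```

Repeated predicates such as `creator` or `cites` accumulate into lists, while single-valued ones such as `startingPage` end up as one-element lists; header lines like the item count simply fail to match and are skipped.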