Matches in SemOpenAlex for { <https://semopenalex.org/work/W2104896715> ?p ?o ?g. }
Showing items 1 to 75 of 75 with 100 items per page.
- W2104896715 endingPage "1333" @default.
- W2104896715 startingPage "1327" @default.
- W2104896715 abstract "It is often desirable to extract structured information from raw web pages for better information browsing, query answering, and pattern mining. Many such Information Extraction (IE) technologies are costly and applying them at the web-scale is impractical. In this paper, we propose a novel prioritization approach where candidate pages from the corpus are ordered according to their expected contribution to the extraction results and those with higher estimated potential are extracted earlier. Systems employing this approach can stop the extraction process at any time when the resource gets scarce (i.e., not all pages in the corpus can be processed), without worrying about wasting extraction effort on unimportant pages. More specifically, we define a novel notion to measure the value of extraction results and design various mechanisms for estimating a candidate page’s contribution to this value. We further design and build the Extraction Prioritization (EP) system with efficient scoring and scheduling algorithms, and experimentally demonstrate that EP significantly outperforms the naive approach and is more flexible than the classifier approach." @default.
- W2104896715 created "2016-06-24" @default.
- W2104896715 creator A5043275971 @default.
- W2104896715 creator A5048368267 @default.
- W2104896715 date "2010-07-05" @default.
- W2104896715 modified "2023-10-02" @default.
- W2104896715 title "Prioritization of Domain-Specific Web Information Extraction" @default.
- W2104896715 cites W1493490255 @default.
- W2104896715 cites W2069897123 @default.
- W2104896715 cites W2094728533 @default.
- W2104896715 cites W2096891167 @default.
- W2104896715 cites W2103224511 @default.
- W2104896715 cites W2122410182 @default.
- W2104896715 cites W2128384372 @default.
- W2104896715 cites W2162829716 @default.
- W2104896715 doi "https://doi.org/10.1609/aaai.v24i1.7500" @default.
- W2104896715 hasPublicationYear "2010" @default.
- W2104896715 type Work @default.
- W2104896715 sameAs 2104896715 @default.
- W2104896715 citedByCount "6" @default.
- W2104896715 countsByYear W21048967152012 @default.
- W2104896715 countsByYear W21048967152014 @default.
- W2104896715 countsByYear W21048967152015 @default.
- W2104896715 countsByYear W21048967152018 @default.
- W2104896715 crossrefType "journal-article" @default.
- W2104896715 hasAuthorship W2104896715A5043275971 @default.
- W2104896715 hasAuthorship W2104896715A5048368267 @default.
- W2104896715 hasBestOaLocation W21048967151 @default.
- W2104896715 hasConcept C124101348 @default.
- W2104896715 hasConcept C136764020 @default.
- W2104896715 hasConcept C153604712 @default.
- W2104896715 hasConcept C154945302 @default.
- W2104896715 hasConcept C162324750 @default.
- W2104896715 hasConcept C195807954 @default.
- W2104896715 hasConcept C21959979 @default.
- W2104896715 hasConcept C23123220 @default.
- W2104896715 hasConcept C2777615720 @default.
- W2104896715 hasConcept C41008148 @default.
- W2104896715 hasConcept C539667460 @default.
- W2104896715 hasConcept C65603577 @default.
- W2104896715 hasConcept C95623464 @default.
- W2104896715 hasConceptScore W2104896715C124101348 @default.
- W2104896715 hasConceptScore W2104896715C136764020 @default.
- W2104896715 hasConceptScore W2104896715C153604712 @default.
- W2104896715 hasConceptScore W2104896715C154945302 @default.
- W2104896715 hasConceptScore W2104896715C162324750 @default.
- W2104896715 hasConceptScore W2104896715C195807954 @default.
- W2104896715 hasConceptScore W2104896715C21959979 @default.
- W2104896715 hasConceptScore W2104896715C23123220 @default.
- W2104896715 hasConceptScore W2104896715C2777615720 @default.
- W2104896715 hasConceptScore W2104896715C41008148 @default.
- W2104896715 hasConceptScore W2104896715C539667460 @default.
- W2104896715 hasConceptScore W2104896715C65603577 @default.
- W2104896715 hasConceptScore W2104896715C95623464 @default.
- W2104896715 hasIssue "1" @default.
- W2104896715 hasLocation W21048967151 @default.
- W2104896715 hasOpenAccess W2104896715 @default.
- W2104896715 hasPrimaryLocation W21048967151 @default.
- W2104896715 hasRelatedWork W102721276 @default.
- W2104896715 hasRelatedWork W131325339 @default.
- W2104896715 hasRelatedWork W1528934735 @default.
- W2104896715 hasRelatedWork W1788528807 @default.
- W2104896715 hasRelatedWork W2104896715 @default.
- W2104896715 hasRelatedWork W2140915678 @default.
- W2104896715 hasRelatedWork W2153799433 @default.
- W2104896715 hasRelatedWork W2393978999 @default.
- W2104896715 hasRelatedWork W2725657302 @default.
- W2104896715 hasRelatedWork W2803765572 @default.
- W2104896715 hasVolume "24" @default.
- W2104896715 isParatext "false" @default.
- W2104896715 isRetracted "false" @default.
- W2104896715 magId "2104896715" @default.
- W2104896715 workType "article" @default.