Matches in SemOpenAlex for { <https://semopenalex.org/work/W1510007506> ?p ?o ?g. }
Showing items 1 to 72 of 72 with 100 items per page.
- W1510007506 endingPage "48" @default.
- W1510007506 startingPage "39" @default.
- W1510007506 abstract "Web information extraction is a fundamental issue for web information management and integration. A common approach is to use wrappers to extract data from web pages or documents. However, a critical issue in wrapper development is how to generate extraction rules. In this paper, we propose a novel two-phase rule generation and optimization (2P-RULE) approach for wrapper generation. 2P-RULE consists of an internal rule optimization (IRO) process and an external rule optimization (ERO) process. In IRO, a user first creates, through a GUI, a mapping from useful values in a web page to a schema that the user specifies according to the target web information. Based on the mapping, the system automatically generates a rule list for the schema. In ERO, the user can create multiple mappings to generate further rule lists. All the acquired rule lists are merged and refined into one optimized rule list, which is expressed in XQuery as the final extraction rules. Experiments show that our 2P-RULE approach is suitable for extracting information from web pages with complex nested structure, and can also achieve better precision and recall." @default.
- W1510007506 created "2016-06-24" @default.
- W1510007506 creator A5017141575 @default.
- W1510007506 creator A5049750015 @default.
- W1510007506 date "2006-01-01" @default.
- W1510007506 modified "2023-09-25" @default.
- W1510007506 title "A two-phase rule generation and optimization approach for wrapper generation" @default.
- W1510007506 cites W1530996180 @default.
- W1510007506 cites W1542272110 @default.
- W1510007506 cites W1543195558 @default.
- W1510007506 cites W1602270052 @default.
- W1510007506 cites W1603317483 @default.
- W1510007506 cites W1927338256 @default.
- W1510007506 cites W2005646337 @default.
- W1510007506 cites W2023673418 @default.
- W1510007506 cites W2145125025 @default.
- W1510007506 cites W2148210463 @default.
- W1510007506 cites W2150721933 @default.
- W1510007506 cites W2162340487 @default.
- W1510007506 hasPublicationYear "2006" @default.
- W1510007506 type Work @default.
- W1510007506 sameAs 1510007506 @default.
- W1510007506 citedByCount "2" @default.
- W1510007506 countsByYear W15100075062012 @default.
- W1510007506 crossrefType "proceedings-article" @default.
- W1510007506 hasAuthorship W1510007506A5017141575 @default.
- W1510007506 hasAuthorship W1510007506A5049750015 @default.
- W1510007506 hasConcept C124101348 @default.
- W1510007506 hasConcept C149271511 @default.
- W1510007506 hasConcept C154945302 @default.
- W1510007506 hasConcept C195807954 @default.
- W1510007506 hasConcept C23123220 @default.
- W1510007506 hasConcept C41008148 @default.
- W1510007506 hasConcept C52146309 @default.
- W1510007506 hasConcept C81669768 @default.
- W1510007506 hasConceptScore W1510007506C124101348 @default.
- W1510007506 hasConceptScore W1510007506C149271511 @default.
- W1510007506 hasConceptScore W1510007506C154945302 @default.
- W1510007506 hasConceptScore W1510007506C195807954 @default.
- W1510007506 hasConceptScore W1510007506C23123220 @default.
- W1510007506 hasConceptScore W1510007506C41008148 @default.
- W1510007506 hasConceptScore W1510007506C52146309 @default.
- W1510007506 hasConceptScore W1510007506C81669768 @default.
- W1510007506 hasLocation W15100075061 @default.
- W1510007506 hasOpenAccess W1510007506 @default.
- W1510007506 hasPrimaryLocation W15100075061 @default.
- W1510007506 hasRelatedWork W1529623952 @default.
- W1510007506 hasRelatedWork W1531514915 @default.
- W1510007506 hasRelatedWork W1555562505 @default.
- W1510007506 hasRelatedWork W1560727857 @default.
- W1510007506 hasRelatedWork W1667470981 @default.
- W1510007506 hasRelatedWork W1976108192 @default.
- W1510007506 hasRelatedWork W1981598089 @default.
- W1510007506 hasRelatedWork W1984404521 @default.
- W1510007506 hasRelatedWork W2055504454 @default.
- W1510007506 hasRelatedWork W207733427 @default.
- W1510007506 hasRelatedWork W2103036553 @default.
- W1510007506 hasRelatedWork W2152149667 @default.
- W1510007506 hasRelatedWork W3114965203 @default.
- W1510007506 hasRelatedWork W96996829 @default.
- W1510007506 hasRelatedWork W2099892409 @default.
- W1510007506 hasRelatedWork W2780684447 @default.
- W1510007506 hasRelatedWork W2861898291 @default.
- W1510007506 hasRelatedWork W3002479177 @default.
- W1510007506 hasRelatedWork W3107724327 @default.
- W1510007506 hasRelatedWork W3113691287 @default.
- W1510007506 isParatext "false" @default.
- W1510007506 isRetracted "false" @default.
- W1510007506 magId "1510007506" @default.
- W1510007506 workType "article" @default.
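
The listing above follows a regular `- subject predicate object @graph.` shape, so it can be grouped by predicate with a few lines of Python. The following is only a minimal sketch: the regular expression and the `parse_listing` helper are assumptions made for illustration and are not part of SemOpenAlex or any of its tools.

```python
import re
from collections import defaultdict

# Assumed line shape (inferred from the listing, not an official grammar):
#   "- <subject> <predicate> <object> @<graph>."
LINE_RE = re.compile(r'^- (\S+) (\S+) (.+) @(\w+)\.$')


def parse_listing(text):
    """Group objects by predicate for every line that matches LINE_RE."""
    by_predicate = defaultdict(list)
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip non-triple lines such as the "Showing items" header
        _subject, predicate, obj, _graph = match.groups()
        # Quoted literals lose their surrounding quotes; identifiers stay as-is.
        if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
            obj = obj[1:-1]
        by_predicate[predicate].append(obj)
    return dict(by_predicate)


sample = """\
- W1510007506 startingPage "39" @default.
- W1510007506 endingPage "48" @default.
- W1510007506 cites W1530996180 @default.
- W1510007506 cites W1542272110 @default.
"""
parsed = parse_listing(sample)
print(parsed["cites"])
```

Grouping by predicate keeps multi-valued properties such as `cites` and `hasRelatedWork` together as lists, while single-valued properties such as `startingPage` become one-element lists.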