Matches in SemOpenAlex for { <https://semopenalex.org/work/W2154445423> ?p ?o ?g. }
Showing items 1 to 78 of 78 with 100 items per page.
- W2154445423 abstract "Web forums have become an important data resource for many web applications, but extracting structured data from unstructured web forum pages is still a challenging task due to both complex page layout designs and unrestricted user created posts. In this paper, we study the problem of structured data extraction from various web forum sites. Our target is to find a solution as general as possible to extract structured data, such as post title, post author, post time, and post content from any forum site. In contrast to most existing information extraction methods, which only leverage the knowledge inside an individual page, we incorporate both page-level and site-level knowledge and employ Markov logic networks (MLNs) to effectively integrate all useful evidence by learning their importance automatically. Site-level knowledge includes (1) the linkages among different object pages, such as list pages and post pages, and (2) the interrelationships of pages belonging to the same object. The experimental results on 20 forums show a very encouraging information extraction performance, and demonstrate the ability of the proposed approach on various forums. We also show that the performance is limited if only page-level knowledge is used, while when incorporating the site-level knowledge both precision and recall can be significantly improved." @default.
- W2154445423 created "2016-06-24" @default.
- W2154445423 creator A5002360117 @default.
- W2154445423 creator A5003713785 @default.
- W2154445423 creator A5035186499 @default.
- W2154445423 creator A5053966682 @default.
- W2154445423 creator A5069272117 @default.
- W2154445423 creator A5071798264 @default.
- W2154445423 date "2009-04-20" @default.
- W2154445423 modified "2023-10-01" @default.
- W2154445423 title "Incorporating site-level knowledge to extract structured data from web forums" @default.
- W2154445423 cites W1973483159 @default.
- W2154445423 cites W1977970897 @default.
- W2154445423 cites W2023094000 @default.
- W2154445423 cites W2034797903 @default.
- W2154445423 cites W2088600132 @default.
- W2154445423 cites W2096496923 @default.
- W2154445423 cites W2100779529 @default.
- W2154445423 cites W2105803339 @default.
- W2154445423 cites W2116216325 @default.
- W2154445423 cites W2129712609 @default.
- W2154445423 cites W2132613313 @default.
- W2154445423 cites W2135479443 @default.
- W2154445423 cites W2143309843 @default.
- W2154445423 cites W2157316480 @default.
- W2154445423 cites W2171472464 @default.
- W2154445423 cites W3149154678 @default.
- W2154445423 doi "https://doi.org/10.1145/1526709.1526735" @default.
- W2154445423 hasPublicationYear "2009" @default.
- W2154445423 type Work @default.
- W2154445423 sameAs 2154445423 @default.
- W2154445423 citedByCount "58" @default.
- W2154445423 countsByYear W21544454232012 @default.
- W2154445423 countsByYear W21544454232013 @default.
- W2154445423 countsByYear W21544454232014 @default.
- W2154445423 countsByYear W21544454232015 @default.
- W2154445423 countsByYear W21544454232016 @default.
- W2154445423 countsByYear W21544454232017 @default.
- W2154445423 countsByYear W21544454232018 @default.
- W2154445423 countsByYear W21544454232019 @default.
- W2154445423 countsByYear W21544454232020 @default.
- W2154445423 countsByYear W21544454232021 @default.
- W2154445423 crossrefType "proceedings-article" @default.
- W2154445423 hasAuthorship W2154445423A5002360117 @default.
- W2154445423 hasAuthorship W2154445423A5003713785 @default.
- W2154445423 hasAuthorship W2154445423A5035186499 @default.
- W2154445423 hasAuthorship W2154445423A5053966682 @default.
- W2154445423 hasAuthorship W2154445423A5069272117 @default.
- W2154445423 hasAuthorship W2154445423A5071798264 @default.
- W2154445423 hasConcept C110875604 @default.
- W2154445423 hasConcept C136764020 @default.
- W2154445423 hasConcept C23123220 @default.
- W2154445423 hasConcept C2522767166 @default.
- W2154445423 hasConcept C2984519610 @default.
- W2154445423 hasConcept C41008148 @default.
- W2154445423 hasConceptScore W2154445423C110875604 @default.
- W2154445423 hasConceptScore W2154445423C136764020 @default.
- W2154445423 hasConceptScore W2154445423C23123220 @default.
- W2154445423 hasConceptScore W2154445423C2522767166 @default.
- W2154445423 hasConceptScore W2154445423C2984519610 @default.
- W2154445423 hasConceptScore W2154445423C41008148 @default.
- W2154445423 hasLocation W21544454231 @default.
- W2154445423 hasOpenAccess W2154445423 @default.
- W2154445423 hasPrimaryLocation W21544454231 @default.
- W2154445423 hasRelatedWork W1605876250 @default.
- W2154445423 hasRelatedWork W2101955803 @default.
- W2154445423 hasRelatedWork W2119214692 @default.
- W2154445423 hasRelatedWork W2144190808 @default.
- W2154445423 hasRelatedWork W2357241418 @default.
- W2154445423 hasRelatedWork W2366644548 @default.
- W2154445423 hasRelatedWork W2376314740 @default.
- W2154445423 hasRelatedWork W2384888906 @default.
- W2154445423 hasRelatedWork W2469626427 @default.
- W2154445423 hasRelatedWork W2748952813 @default.
- W2154445423 isParatext "false" @default.
- W2154445423 isRetracted "false" @default.
- W2154445423 magId "2154445423" @default.
- W2154445423 workType "article" @default.