Matches in SemOpenAlex for { <https://semopenalex.org/work/W2900969767> ?p ?o ?g. }
Showing items 1 to 66 of 66 with 100 items per page.
- W2900969767 abstract "Web pages are becoming more complex than ever, as they are generated by Content Management Systems (CMS). Thus, analyzing them, i.e. automatically identifying and classifying different elements from Web pages, such as main content, menus, among others, becomes difficult. A solution to this issue is provided by Web page segmentation which refers to the process of dividing a Web page into visually and semantically coherent segments called blocks. The quality of a Web page segmenter is measured by its correctness and its genericity, i.e. the variety of Web page types it is able to segment. Our research focuses on enhancing this quality and measuring it in a fair and accurate way. We first propose a conceptual model for segmentation, as well as Block-o-Matic (BoM), our Web page segmenter. We propose an evaluation model that takes the content as well as the geometry of blocks into account in order to measure the correctness of a segmentation algorithm according to a predefined ground truth. The quality of four state of the art algorithms is experimentally tested on four types of pages. Our evaluation framework allows testing any segmenter, i.e. measuring their quality. The results show that BoM presents the best performance among the four segmentation algorithms tested, and also that the performance of segmenters depends on the type of page to segment. We present two applications of BoM. Pagelyzer uses BoM for comparing two Web pages versions and decides if they are similar or not. It is the main contribution of our team to the European project Scape (FP7-IP). We also developed a migration tool of Web pages from HTML4 format to HTML5 format in the context of Web archives." @default.
- W2900969767 created "2018-11-29" @default.
- W2900969767 creator A5087918849 @default.
- W2900969767 date "2015-01-22" @default.
- W2900969767 modified "2023-09-25" @default.
- W2900969767 title "Web page segmentation, evaluation and applications" @default.
- W2900969767 hasPublicationYear "2015" @default.
- W2900969767 type Work @default.
- W2900969767 sameAs 2900969767 @default.
- W2900969767 citedByCount "1" @default.
- W2900969767 countsByYear W29009697672016 @default.
- W2900969767 crossrefType "dissertation" @default.
- W2900969767 hasAuthorship W2900969767A5087918849 @default.
- W2900969767 hasConcept C11413529 @default.
- W2900969767 hasConcept C124101348 @default.
- W2900969767 hasConcept C136764020 @default.
- W2900969767 hasConcept C146849305 @default.
- W2900969767 hasConcept C154945302 @default.
- W2900969767 hasConcept C21959979 @default.
- W2900969767 hasConcept C23123220 @default.
- W2900969767 hasConcept C2524010 @default.
- W2900969767 hasConcept C2777210771 @default.
- W2900969767 hasConcept C33923547 @default.
- W2900969767 hasConcept C41008148 @default.
- W2900969767 hasConcept C55439883 @default.
- W2900969767 hasConcept C89600930 @default.
- W2900969767 hasConceptScore W2900969767C11413529 @default.
- W2900969767 hasConceptScore W2900969767C124101348 @default.
- W2900969767 hasConceptScore W2900969767C136764020 @default.
- W2900969767 hasConceptScore W2900969767C146849305 @default.
- W2900969767 hasConceptScore W2900969767C154945302 @default.
- W2900969767 hasConceptScore W2900969767C21959979 @default.
- W2900969767 hasConceptScore W2900969767C23123220 @default.
- W2900969767 hasConceptScore W2900969767C2524010 @default.
- W2900969767 hasConceptScore W2900969767C2777210771 @default.
- W2900969767 hasConceptScore W2900969767C33923547 @default.
- W2900969767 hasConceptScore W2900969767C41008148 @default.
- W2900969767 hasConceptScore W2900969767C55439883 @default.
- W2900969767 hasConceptScore W2900969767C89600930 @default.
- W2900969767 hasLocation W29009697671 @default.
- W2900969767 hasOpenAccess W2900969767 @default.
- W2900969767 hasPrimaryLocation W29009697671 @default.
- W2900969767 hasRelatedWork W111499458 @default.
- W2900969767 hasRelatedWork W1488090274 @default.
- W2900969767 hasRelatedWork W1507164096 @default.
- W2900969767 hasRelatedWork W1517229911 @default.
- W2900969767 hasRelatedWork W1544683773 @default.
- W2900969767 hasRelatedWork W180238871 @default.
- W2900969767 hasRelatedWork W1966102147 @default.
- W2900969767 hasRelatedWork W1991738584 @default.
- W2900969767 hasRelatedWork W2057398832 @default.
- W2900969767 hasRelatedWork W2057510760 @default.
- W2900969767 hasRelatedWork W2149879421 @default.
- W2900969767 hasRelatedWork W2157316480 @default.
- W2900969767 hasRelatedWork W2158026699 @default.
- W2900969767 hasRelatedWork W2160775200 @default.
- W2900969767 hasRelatedWork W2189277793 @default.
- W2900969767 hasRelatedWork W2351711590 @default.
- W2900969767 hasRelatedWork W2351888253 @default.
- W2900969767 hasRelatedWork W2549918794 @default.
- W2900969767 hasRelatedWork W1551745536 @default.
- W2900969767 hasRelatedWork W2000745862 @default.
- W2900969767 isParatext "false" @default.
- W2900969767 isRetracted "false" @default.
- W2900969767 magId "2900969767" @default.
- W2900969767 workType "dissertation" @default.