Matches in SemOpenAlex for { <https://semopenalex.org/work/W2015415022> ?p ?o ?g. }
Showing items 1 to 83 of 83, with 100 items per page.
- W2015415022 abstract "One of the common approaches to extracting high-quality knowledge from Web sources is to exploit the redundancy of the published information. Therefore, a Web Mining System not only has to search for relevant Web pages but also has to somehow determine whether two pages describe the same entity in order to extract as much knowledge as possible about it. It has been shown that statistical clustering techniques are in general a suitable means to achieve this task by grouping documents that are supposed to contain similar information. However, when data is given in tabular form - which is for instance a typical way of describing items in online shops - existing document clustering algorithms show limited performance as documents containing tabular descriptions typically share a very common set of tokens although they describe different entities. In this paper we therefore propose a new document clustering approach that exploits hyperlinks and document metadata to extract candidates for entity names. These candidate names are subsequently used to cluster the documents and further improve these names, which are finally used to determine whether two documents describe the same entity. The detailed evaluation of our approach in two popular example domains showed its high accuracy in terms of precision and recall (F-Measure > 0.9)." @default.
- W2015415022 created "2016-06-24" @default.
- W2015415022 creator A5005545405 @default.
- W2015415022 creator A5037543382 @default.
- W2015415022 creator A5083667930 @default.
- W2015415022 date "2007-10-28" @default.
- W2015415022 modified "2023-09-27" @default.
- W2015415022 title "Clustering web documents with tables for information extraction" @default.
- W2015415022 cites W2100684772 @default.
- W2015415022 doi "https://doi.org/10.1145/1298406.1298438" @default.
- W2015415022 hasPublicationYear "2007" @default.
- W2015415022 type Work @default.
- W2015415022 sameAs 2015415022 @default.
- W2015415022 citedByCount "4" @default.
- W2015415022 crossrefType "proceedings-article" @default.
- W2015415022 hasAuthorship W2015415022A5005545405 @default.
- W2015415022 hasAuthorship W2015415022A5037543382 @default.
- W2015415022 hasAuthorship W2015415022A5083667930 @default.
- W2015415022 hasBestOaLocation W20154150222 @default.
- W2015415022 hasConcept C111919701 @default.
- W2015415022 hasConcept C124101348 @default.
- W2015415022 hasConcept C136764020 @default.
- W2015415022 hasConcept C13743948 @default.
- W2015415022 hasConcept C152124472 @default.
- W2015415022 hasConcept C154945302 @default.
- W2015415022 hasConcept C162324750 @default.
- W2015415022 hasConcept C165696696 @default.
- W2015415022 hasConcept C177264268 @default.
- W2015415022 hasConcept C177937566 @default.
- W2015415022 hasConcept C187736073 @default.
- W2015415022 hasConcept C199360897 @default.
- W2015415022 hasConcept C21959979 @default.
- W2015415022 hasConcept C23123220 @default.
- W2015415022 hasConcept C2780451532 @default.
- W2015415022 hasConcept C30088001 @default.
- W2015415022 hasConcept C38652104 @default.
- W2015415022 hasConcept C41008148 @default.
- W2015415022 hasConcept C4554734 @default.
- W2015415022 hasConcept C73555534 @default.
- W2015415022 hasConcept C81669768 @default.
- W2015415022 hasConcept C93518851 @default.
- W2015415022 hasConcept C96711827 @default.
- W2015415022 hasConceptScore W2015415022C111919701 @default.
- W2015415022 hasConceptScore W2015415022C124101348 @default.
- W2015415022 hasConceptScore W2015415022C136764020 @default.
- W2015415022 hasConceptScore W2015415022C13743948 @default.
- W2015415022 hasConceptScore W2015415022C152124472 @default.
- W2015415022 hasConceptScore W2015415022C154945302 @default.
- W2015415022 hasConceptScore W2015415022C162324750 @default.
- W2015415022 hasConceptScore W2015415022C165696696 @default.
- W2015415022 hasConceptScore W2015415022C177264268 @default.
- W2015415022 hasConceptScore W2015415022C177937566 @default.
- W2015415022 hasConceptScore W2015415022C187736073 @default.
- W2015415022 hasConceptScore W2015415022C199360897 @default.
- W2015415022 hasConceptScore W2015415022C21959979 @default.
- W2015415022 hasConceptScore W2015415022C23123220 @default.
- W2015415022 hasConceptScore W2015415022C2780451532 @default.
- W2015415022 hasConceptScore W2015415022C30088001 @default.
- W2015415022 hasConceptScore W2015415022C38652104 @default.
- W2015415022 hasConceptScore W2015415022C41008148 @default.
- W2015415022 hasConceptScore W2015415022C4554734 @default.
- W2015415022 hasConceptScore W2015415022C73555534 @default.
- W2015415022 hasConceptScore W2015415022C81669768 @default.
- W2015415022 hasConceptScore W2015415022C93518851 @default.
- W2015415022 hasConceptScore W2015415022C96711827 @default.
- W2015415022 hasLocation W20154150221 @default.
- W2015415022 hasLocation W20154150222 @default.
- W2015415022 hasOpenAccess W2015415022 @default.
- W2015415022 hasPrimaryLocation W20154150221 @default.
- W2015415022 hasRelatedWork W1486482441 @default.
- W2015415022 hasRelatedWork W1963973829 @default.
- W2015415022 hasRelatedWork W2008345209 @default.
- W2015415022 hasRelatedWork W2051135816 @default.
- W2015415022 hasRelatedWork W2130476896 @default.
- W2015415022 hasRelatedWork W2146990843 @default.
- W2015415022 hasRelatedWork W235114465 @default.
- W2015415022 hasRelatedWork W2387844018 @default.
- W2015415022 hasRelatedWork W2546867392 @default.
- W2015415022 hasRelatedWork W3216588747 @default.
- W2015415022 isParatext "false" @default.
- W2015415022 isRetracted "false" @default.
- W2015415022 magId "2015415022" @default.
- W2015415022 workType "article" @default.