Matches in SemOpenAlex for { <https://semopenalex.org/work/W2541508047> ?p ?o ?g. }
Showing items 1 to 87 of 87 with 100 items per page.
- W2541508047 abstract "As online forums contain a vast amount of information that can aid in the early detection of fraud and extremist activities, accurate and efficient information extraction from forum sites is very important. In this paper, we discuss the limitations of existing works in the extraction of information from generic web sites and forum sites. We also identify the need for better suited, generalized and lightweight algorithms to carry out a more accurate and efficient information extraction while eliminating noisy data from forum sites. In this paper, we propose three generalized and lightweight algorithms to carry out accurate thread and post content extraction from web forums. We evaluate our algorithms based on two strict criteria and to the granularity of the (DOM tree) node level correctness. We consider a thread or post as successfully extracted by our algorithms only if (i) all the contents in its text and anchor nodes are extracted correctly, and (ii) each content node is grouped correctly according to its respective thread or post. Our experiments on ten different forum sites show that our proposed thread extraction algorithm achieves an average recall and precision rate of 100% and 98.66%, respectively, while our core post extraction algorithm achieves an average recall and precision rate of 99.74% and 99.79%, respectively." @default.
- W2541508047 created "2016-11-04" @default.
- W2541508047 creator A5040321131 @default.
- W2541508047 creator A5060355906 @default.
- W2541508047 creator A5068572741 @default.
- W2541508047 date "2013-12-01" @default.
- W2541508047 modified "2023-09-26" @default.
- W2541508047 title "Generalized and lightweight algorithms for automated web forum content extraction" @default.
- W2541508047 cites W1489992655 @default.
- W2541508047 cites W1963744784 @default.
- W2541508047 cites W1973483159 @default.
- W2541508047 cites W1976373002 @default.
- W2541508047 cites W2023094000 @default.
- W2541508047 cites W2066636486 @default.
- W2541508047 cites W2071984559 @default.
- W2541508047 cites W2088600132 @default.
- W2541508047 cites W2096478255 @default.
- W2541508047 cites W2100779529 @default.
- W2541508047 cites W2102189859 @default.
- W2541508047 cites W2103901064 @default.
- W2541508047 cites W2105803339 @default.
- W2541508047 cites W2116216325 @default.
- W2541508047 cites W2128836931 @default.
- W2541508047 cites W2129342550 @default.
- W2541508047 cites W2132613313 @default.
- W2541508047 cites W2135479443 @default.
- W2541508047 cites W2154445423 @default.
- W2541508047 cites W2158051716 @default.
- W2541508047 cites W4232071090 @default.
- W2541508047 cites W4250847188 @default.
- W2541508047 cites W4255854114 @default.
- W2541508047 cites W938539187 @default.
- W2541508047 doi "https://doi.org/10.1109/iccic.2013.6724259" @default.
- W2541508047 hasPublicationYear "2013" @default.
- W2541508047 type Work @default.
- W2541508047 sameAs 2541508047 @default.
- W2541508047 citedByCount "3" @default.
- W2541508047 countsByYear W25415080472017 @default.
- W2541508047 countsByYear W25415080472019 @default.
- W2541508047 countsByYear W25415080472023 @default.
- W2541508047 crossrefType "proceedings-article" @default.
- W2541508047 hasAuthorship W2541508047A5040321131 @default.
- W2541508047 hasAuthorship W2541508047A5060355906 @default.
- W2541508047 hasAuthorship W2541508047A5068572741 @default.
- W2541508047 hasConcept C111919701 @default.
- W2541508047 hasConcept C11413529 @default.
- W2541508047 hasConcept C119857082 @default.
- W2541508047 hasConcept C124101348 @default.
- W2541508047 hasConcept C138101251 @default.
- W2541508047 hasConcept C154945302 @default.
- W2541508047 hasConcept C177774035 @default.
- W2541508047 hasConcept C195807954 @default.
- W2541508047 hasConcept C23123220 @default.
- W2541508047 hasConcept C2987098735 @default.
- W2541508047 hasConcept C41008148 @default.
- W2541508047 hasConcept C55439883 @default.
- W2541508047 hasConcept C81669768 @default.
- W2541508047 hasConceptScore W2541508047C111919701 @default.
- W2541508047 hasConceptScore W2541508047C11413529 @default.
- W2541508047 hasConceptScore W2541508047C119857082 @default.
- W2541508047 hasConceptScore W2541508047C124101348 @default.
- W2541508047 hasConceptScore W2541508047C138101251 @default.
- W2541508047 hasConceptScore W2541508047C154945302 @default.
- W2541508047 hasConceptScore W2541508047C177774035 @default.
- W2541508047 hasConceptScore W2541508047C195807954 @default.
- W2541508047 hasConceptScore W2541508047C23123220 @default.
- W2541508047 hasConceptScore W2541508047C2987098735 @default.
- W2541508047 hasConceptScore W2541508047C41008148 @default.
- W2541508047 hasConceptScore W2541508047C55439883 @default.
- W2541508047 hasConceptScore W2541508047C81669768 @default.
- W2541508047 hasLocation W25415080471 @default.
- W2541508047 hasOpenAccess W2541508047 @default.
- W2541508047 hasPrimaryLocation W25415080471 @default.
- W2541508047 hasRelatedWork W1482030660 @default.
- W2541508047 hasRelatedWork W1788528807 @default.
- W2541508047 hasRelatedWork W1976628154 @default.
- W2541508047 hasRelatedWork W2001121861 @default.
- W2541508047 hasRelatedWork W2052457213 @default.
- W2541508047 hasRelatedWork W2153799433 @default.
- W2541508047 hasRelatedWork W2367301249 @default.
- W2541508047 hasRelatedWork W2379157006 @default.
- W2541508047 hasRelatedWork W2393978999 @default.
- W2541508047 hasRelatedWork W2725657302 @default.
- W2541508047 isParatext "false" @default.
- W2541508047 isRetracted "false" @default.
- W2541508047 magId "2541508047" @default.
- W2541508047 workType "article" @default.
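
The abstract above describes a strict evaluation: a thread or post counts as successfully extracted only if all of its text and anchor node contents are extracted correctly and each content node is grouped under the right thread or post. The following is a minimal sketch of how precision and recall could be computed under such a criterion. It is not the authors' implementation; the data layout (a dict from item id to a set of node contents), matching items by id, and the name `strict_precision_recall` are all assumptions made for illustration.

```python
from typing import Dict, FrozenSet, Tuple

# Hypothetical representation: each thread or post id maps to the set of
# content-node strings (text and anchor nodes) grouped under it.
Extraction = Dict[str, FrozenSet[str]]


def strict_precision_recall(
    extracted: Extraction, gold: Extraction
) -> Tuple[float, float]:
    """Count an item as correct only if its node set equals the gold set exactly.

    Assumption: extracted and gold items are matched by a shared id.
    """
    correct = sum(
        1 for item_id, nodes in extracted.items() if gold.get(item_id) == nodes
    )
    precision = correct / len(extracted) if extracted else 0.0
    recall = correct / len(gold) if gold else 0.0
    return precision, recall


# Made-up example data, not taken from the paper's experiments.
gold = {
    "post-1": frozenset({"Hello", "link:profile"}),
    "post-2": frozenset({"Reply text"}),
}
extracted = {
    "post-1": frozenset({"Hello", "link:profile"}),          # exact match
    "post-2": frozenset({"Reply text", "Advertisement"}),    # extra noisy node
    "post-3": frozenset({"Sidebar"}),                        # spurious item
}

precision, recall = strict_precision_recall(extracted, gold)
```

In this toy case only `post-1` matches exactly, so one of three extracted items is correct and one of two gold items is recovered. An exact set comparison is used because a single missing, extra, or misgrouped node would fail the all-or-nothing criterion the abstract describes.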