Matches in SemOpenAlex for { <https://semopenalex.org/work/W2334694400> ?p ?o ?g. }
Showing items 1 to 96 of 96 with 100 items per page.
- W2334694400 abstract "With the rapid development of Internet technology, Web has become a huge information source with massive amounts of data. But these data are usually embedded in the semi-structured pages. In order to use these data effectively, the primary problem is to extract the data and store them in structured form. Most of current approaches use a single classifier to extract web data, but relying on a single classifier is not sufficient and different classifier has different performance for the same problem. In this paper, we use the method of ensemble learning for web data extraction. Firstly, we parse the page as a Dom tree, identify the main data regions, and construct feature sets of text nodes in the region. Secondly, we choose multiple kinds of base classifiers (SVM, KNN and Random Forest) to build classification models and then use the linear method to integrate results of each classification model. Finally, we combine integration results with heuristic rules to get the final extraction results. The experiment results show that our approach outperforms the baseline approaches and has a good robustness." @default.
- W2334694400 created "2016-06-24" @default.
- W2334694400 creator A5004836810 @default.
- W2334694400 creator A5027992834 @default.
- W2334694400 creator A5035411921 @default.
- W2334694400 date "2015-06-30" @default.
- W2334694400 modified "2023-09-25" @default.
- W2334694400 title "Web Data Extraction Based on Ensemble Learning" @default.
- W2334694400 cites W1524234997 @default.
- W2334694400 cites W165838864 @default.
- W2334694400 cites W1965759148 @default.
- W2334694400 cites W1975846642 @default.
- W2334694400 cites W2015551056 @default.
- W2334694400 cites W2062543449 @default.
- W2334694400 cites W2104086170 @default.
- W2334694400 cites W2121751007 @default.
- W2334694400 cites W2122111042 @default.
- W2334694400 cites W2128341918 @default.
- W2334694400 cites W2134150392 @default.
- W2334694400 cites W2135479443 @default.
- W2334694400 cites W2135850590 @default.
- W2334694400 cites W2143309843 @default.
- W2334694400 cites W2150721933 @default.
- W2334694400 cites W2156909104 @default.
- W2334694400 cites W2160196229 @default.
- W2334694400 doi "https://doi.org/10.14257/ijdta.2015.8.3.27" @default.
- W2334694400 hasPublicationYear "2015" @default.
- W2334694400 type Work @default.
- W2334694400 sameAs 2334694400 @default.
- W2334694400 citedByCount "1" @default.
- W2334694400 countsByYear W23346944002017 @default.
- W2334694400 crossrefType "journal-article" @default.
- W2334694400 hasAuthorship W2334694400A5004836810 @default.
- W2334694400 hasAuthorship W2334694400A5027992834 @default.
- W2334694400 hasAuthorship W2334694400A5035411921 @default.
- W2334694400 hasConcept C104317684 @default.
- W2334694400 hasConcept C110875604 @default.
- W2334694400 hasConcept C119857082 @default.
- W2334694400 hasConcept C12267149 @default.
- W2334694400 hasConcept C124101348 @default.
- W2334694400 hasConcept C136764020 @default.
- W2334694400 hasConcept C153180895 @default.
- W2334694400 hasConcept C154945302 @default.
- W2334694400 hasConcept C169258074 @default.
- W2334694400 hasConcept C185592680 @default.
- W2334694400 hasConcept C186644900 @default.
- W2334694400 hasConcept C21959979 @default.
- W2334694400 hasConcept C41008148 @default.
- W2334694400 hasConcept C55493867 @default.
- W2334694400 hasConcept C63479239 @default.
- W2334694400 hasConcept C84525736 @default.
- W2334694400 hasConcept C95623464 @default.
- W2334694400 hasConceptScore W2334694400C104317684 @default.
- W2334694400 hasConceptScore W2334694400C110875604 @default.
- W2334694400 hasConceptScore W2334694400C119857082 @default.
- W2334694400 hasConceptScore W2334694400C12267149 @default.
- W2334694400 hasConceptScore W2334694400C124101348 @default.
- W2334694400 hasConceptScore W2334694400C136764020 @default.
- W2334694400 hasConceptScore W2334694400C153180895 @default.
- W2334694400 hasConceptScore W2334694400C154945302 @default.
- W2334694400 hasConceptScore W2334694400C169258074 @default.
- W2334694400 hasConceptScore W2334694400C185592680 @default.
- W2334694400 hasConceptScore W2334694400C186644900 @default.
- W2334694400 hasConceptScore W2334694400C21959979 @default.
- W2334694400 hasConceptScore W2334694400C41008148 @default.
- W2334694400 hasConceptScore W2334694400C55493867 @default.
- W2334694400 hasConceptScore W2334694400C63479239 @default.
- W2334694400 hasConceptScore W2334694400C84525736 @default.
- W2334694400 hasConceptScore W2334694400C95623464 @default.
- W2334694400 hasLocation W23346944001 @default.
- W2334694400 hasOpenAccess W2334694400 @default.
- W2334694400 hasPrimaryLocation W23346944001 @default.
- W2334694400 hasRelatedWork W1855811457 @default.
- W2334694400 hasRelatedWork W2056594684 @default.
- W2334694400 hasRelatedWork W2074469456 @default.
- W2334694400 hasRelatedWork W2082130842 @default.
- W2334694400 hasRelatedWork W2096734410 @default.
- W2334694400 hasRelatedWork W2112566102 @default.
- W2334694400 hasRelatedWork W2126686675 @default.
- W2334694400 hasRelatedWork W2143774383 @default.
- W2334694400 hasRelatedWork W2188465890 @default.
- W2334694400 hasRelatedWork W2340646070 @default.
- W2334694400 hasRelatedWork W2428321108 @default.
- W2334694400 hasRelatedWork W2783253624 @default.
- W2334694400 hasRelatedWork W2784762165 @default.
- W2334694400 hasRelatedWork W2789160147 @default.
- W2334694400 hasRelatedWork W2924463347 @default.
- W2334694400 hasRelatedWork W2960727058 @default.
- W2334694400 hasRelatedWork W3003566741 @default.
- W2334694400 hasRelatedWork W3006542432 @default.
- W2334694400 hasRelatedWork W3095317568 @default.
- W2334694400 hasRelatedWork W3189679720 @default.
- W2334694400 isParatext "false" @default.
- W2334694400 isRetracted "false" @default.
- W2334694400 magId "2334694400" @default.
- W2334694400 workType "article" @default.
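
The abstract above says the results of several base classifiers (SVM, KNN and Random Forest) are integrated with "the linear method". The record does not give the weights, the label set, or the heuristic rules, so the following is only a minimal sketch of one plausible reading: a weighted linear combination of per-class probabilities. The label names, weights, and probability values are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
from typing import Dict

# Hypothetical labels for text nodes in a data region (assumption, not from the record).
LABELS = ["title", "price", "other"]


def linear_ensemble(
    prob_by_model: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> Dict[str, float]:
    """Combine per-class probabilities from several models as a weighted sum."""
    combined = {label: 0.0 for label in LABELS}
    for model, probs in prob_by_model.items():
        w = weights[model]
        for label in LABELS:
            # A label missing from a model's output counts as probability 0.
            combined[label] += w * probs.get(label, 0.0)
    return combined


def predict_label(
    prob_by_model: Dict[str, Dict[str, float]],
    weights: Dict[str, float],
) -> str:
    """Return the label with the highest combined score."""
    combined = linear_ensemble(prob_by_model, weights)
    return max(combined, key=combined.get)


# Made-up classifier outputs for a single text node (illustrative only).
node_probs = {
    "svm": {"title": 0.6, "price": 0.3, "other": 0.1},
    "knn": {"title": 0.2, "price": 0.7, "other": 0.1},
    "random_forest": {"title": 0.3, "price": 0.6, "other": 0.1},
}
# Hypothetical weights; the record does not say how they were chosen.
weights = {"svm": 0.4, "knn": 0.3, "random_forest": 0.3}

if __name__ == "__main__":
    print(linear_ensemble(node_probs, weights))
    print(predict_label(node_probs, weights))
```

In this made-up case the SVM alone would pick "title", but the weighted sum favors "price" because the other two models agree on it. The heuristic-rule step mentioned in the abstract is not sketched here, since the record gives no detail about those rules.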