Matches in SemOpenAlex for { <https://semopenalex.org/work/W2783809651> ?p ?o ?g. }
Showing items 1 to 86 of 86 with 100 items per page.
- W2783809651 abstract "Web page keyword extraction is widely used in web text classification, text clustering, and information retrieval. However, keyword extraction for Chinese web pages still needs to be improved and applied, especially in the medical field. This paper proposes an improved TF-IDF algorithm based on WF-TF-IDF to extract keywords from Chinese medical web pages. The WF-TF-IDF algorithm considers three factors: word frequency in the title, word frequency in the description, and the distribution of words across categories in the corpus. We first perform data preprocessing, which includes web page denoising, regular expression processing, Chinese word segmentation, synonym replacement, and stop word filtering. We then extract keywords from the preprocessed data and filter meaningless words out of the extracted keywords according to their part of speech. The experimental results show that the WF-TF-IDF algorithm improves the precision and recall rates by about 7% compared to the traditional TF-IDF algorithm." @default.
- W2783809651 created "2018-01-26" @default.
- W2783809651 creator A5003079947 @default.
- W2783809651 creator A5040215906 @default.
- W2783809651 creator A5081605311 @default.
- W2783809651 date "2017-10-01" @default.
- W2783809651 modified "2023-10-17" @default.
- W2783809651 title "The Keyword Extraction of Chinese Medical Web Page Based on WF-TF-IDF Algorithm" @default.
- W2783809651 cites W1993031820 @default.
- W2783809651 cites W2062079906 @default.
- W2783809651 cites W2064418625 @default.
- W2783809651 cites W2511698004 @default.
- W2783809651 cites W2575605699 @default.
- W2783809651 cites W1972709099 @default.
- W2783809651 doi "https://doi.org/10.1109/cyberc.2017.40" @default.
- W2783809651 hasPublicationYear "2017" @default.
- W2783809651 type Work @default.
- W2783809651 sameAs 2783809651 @default.
- W2783809651 citedByCount "19" @default.
- W2783809651 countsByYear W27838096512018 @default.
- W2783809651 countsByYear W27838096512019 @default.
- W2783809651 countsByYear W27838096512020 @default.
- W2783809651 countsByYear W27838096512021 @default.
- W2783809651 countsByYear W27838096512022 @default.
- W2783809651 countsByYear W27838096512023 @default.
- W2783809651 crossrefType "proceedings-article" @default.
- W2783809651 hasAuthorship W2783809651A5003079947 @default.
- W2783809651 hasAuthorship W2783809651A5040215906 @default.
- W2783809651 hasAuthorship W2783809651A5081605311 @default.
- W2783809651 hasConcept C10551718 @default.
- W2783809651 hasConcept C121332964 @default.
- W2783809651 hasConcept C136764020 @default.
- W2783809651 hasConcept C154945302 @default.
- W2783809651 hasConcept C188338183 @default.
- W2783809651 hasConcept C204321447 @default.
- W2783809651 hasConcept C21959979 @default.
- W2783809651 hasConcept C23123220 @default.
- W2783809651 hasConcept C2524010 @default.
- W2783809651 hasConcept C2780288562 @default.
- W2783809651 hasConcept C2987098735 @default.
- W2783809651 hasConcept C33923547 @default.
- W2783809651 hasConcept C34736171 @default.
- W2783809651 hasConcept C41008148 @default.
- W2783809651 hasConcept C61797465 @default.
- W2783809651 hasConcept C62520636 @default.
- W2783809651 hasConcept C73555534 @default.
- W2783809651 hasConcept C81669768 @default.
- W2783809651 hasConcept C81758059 @default.
- W2783809651 hasConcept C90805587 @default.
- W2783809651 hasConceptScore W2783809651C10551718 @default.
- W2783809651 hasConceptScore W2783809651C121332964 @default.
- W2783809651 hasConceptScore W2783809651C136764020 @default.
- W2783809651 hasConceptScore W2783809651C154945302 @default.
- W2783809651 hasConceptScore W2783809651C188338183 @default.
- W2783809651 hasConceptScore W2783809651C204321447 @default.
- W2783809651 hasConceptScore W2783809651C21959979 @default.
- W2783809651 hasConceptScore W2783809651C23123220 @default.
- W2783809651 hasConceptScore W2783809651C2524010 @default.
- W2783809651 hasConceptScore W2783809651C2780288562 @default.
- W2783809651 hasConceptScore W2783809651C2987098735 @default.
- W2783809651 hasConceptScore W2783809651C33923547 @default.
- W2783809651 hasConceptScore W2783809651C34736171 @default.
- W2783809651 hasConceptScore W2783809651C41008148 @default.
- W2783809651 hasConceptScore W2783809651C61797465 @default.
- W2783809651 hasConceptScore W2783809651C62520636 @default.
- W2783809651 hasConceptScore W2783809651C73555534 @default.
- W2783809651 hasConceptScore W2783809651C81669768 @default.
- W2783809651 hasConceptScore W2783809651C81758059 @default.
- W2783809651 hasConceptScore W2783809651C90805587 @default.
- W2783809651 hasLocation W27838096511 @default.
- W2783809651 hasOpenAccess W2783809651 @default.
- W2783809651 hasPrimaryLocation W27838096511 @default.
- W2783809651 hasRelatedWork W1886880132 @default.
- W2783809651 hasRelatedWork W1997256148 @default.
- W2783809651 hasRelatedWork W2111538853 @default.
- W2783809651 hasRelatedWork W2766722902 @default.
- W2783809651 hasRelatedWork W2788941900 @default.
- W2783809651 hasRelatedWork W2978993618 @default.
- W2783809651 hasRelatedWork W2984228201 @default.
- W2783809651 hasRelatedWork W3114946501 @default.
- W2783809651 hasRelatedWork W4285169119 @default.
- W2783809651 hasRelatedWork W4313188041 @default.
- W2783809651 isParatext "false" @default.
- W2783809651 isRetracted "false" @default.
- W2783809651 magId "2783809651" @default.
- W2783809651 workType "article" @default.
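
The abstract above describes WF-TF-IDF only in outline, so the following is a minimal sketch of the general idea rather than the paper's formula: a plain TF-IDF score multiplied by a boost for occurrences in the page title and description. The function names, the boost form, and the weights `title_w` and `desc_w` are all assumptions made for illustration; the category-distribution factor and the preprocessing steps (segmentation, stop word and part-of-speech filtering) are omitted, and tokens are assumed to be already segmented.

```python
import math
from collections import Counter


def wf_tf_idf(doc_tokens, title_tokens, desc_tokens, corpus,
              title_w=2.0, desc_w=1.0):
    """Score each term of one page (illustrative sketch, not the paper's formula).

    doc_tokens, title_tokens, desc_tokens: lists of already segmented words.
    corpus: list of sets, each holding the distinct words of one document.
    title_w, desc_w: hypothetical boost weights chosen for this example.
    """
    n_docs = len(corpus)
    total = len(doc_tokens)
    tf = Counter(doc_tokens)
    title_counts = Counter(title_tokens)
    desc_counts = Counter(desc_tokens)

    scores = {}
    for term, count in tf.items():
        # Document frequency: number of corpus documents containing the term.
        df = sum(1 for doc in corpus if term in doc)
        # Smoothed inverse document frequency, always positive.
        idf = math.log((1 + n_docs) / (1 + df)) + 1.0
        # Assumed boost: grows with occurrences in the title and description.
        boost = 1.0 + title_w * title_counts[term] + desc_w * desc_counts[term]
        scores[term] = (count / total) * idf * boost
    return scores


def top_keywords(scores, k):
    """Return the k highest-scoring terms, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [term for term, _ in ranked[:k]]


if __name__ == "__main__":
    corpus = [{"fever", "cough"}, {"fever", "diet"}, {"cough", "sleep"}]
    scores = wf_tf_idf(["fever", "cough", "fever"], ["fever"], [], corpus)
    print(top_keywords(scores, 2))
```

In this toy corpus "fever" and "cough" have the same document frequency, so the ranking is driven by term frequency and the title boost alone; with real pages the weights would need to be tuned against labeled keywords.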