Matches in SemOpenAlex for { <https://semopenalex.org/work/W4387340385> ?p ?o ?g. }
- W4387340385 endingPage "121852" @default.
- W4387340385 startingPage "121852" @default.
- W4387340385 abstract "Text classification first needs to convert the text into embedding vectors. Static word embedding models such as Word2vec ignore the position of a word and the differences in its role across documents, while dynamic word embedding models such as BERT consume a large amount of time. To address this, an improved word embedding model based on pre-trained Word2vec is proposed, which achieves better classification accuracy and much lower classification time than BERT. First, the concept of Term Document Frequency (TDF) is proposed on the basis of TF-IDF, and the TF-IDF-TDF of each word in different documents is calculated. Then, positional encoding is added. Finally, to reduce the misleading effect of words with low importance, a filter is designed that sets the embedding vectors of low-importance words to zero. Because the sequence length that a deep learning model can handle is limited, and any text exceeding the Maximum Sequence Length (MSL) set by the model is directly truncated and discarded, an adaptive segmentation model is proposed, which sets different segmentation strategies for different texts according to the length of the text and the MSL. To maintain the continuity of adjacent text after segmentation, an adjacent-segment-vector-attended co-attention network is designed. In addition, multi-channel convolution and a capsule network are designed to further extract deep hidden features. Multiple comparative experiments show that the proposed model achieves the best Accuracy and Micro-F1 on five long text baseline datasets and six short text baseline datasets. In addition, when the MSL is not set too large compared with the document length in the dataset, the classification results of the proposed model are not affected by it." @default.
- W4387340385 created "2023-10-05" @default.
- W4387340385 creator A5027216830 @default.
- W4387340385 creator A5030627445 @default.
- W4387340385 creator A5047242742 @default.
- W4387340385 creator A5087988450 @default.
- W4387340385 creator A5060223542 @default.
- W4387340385 date "2024-03-01" @default.
- W4387340385 modified "2023-10-18" @default.
- W4387340385 title "Text classification with improved word embedding and adaptive segmentation" @default.
- W4387340385 cites W1978394996 @default.
- W4387340385 cites W1982589161 @default.
- W4387340385 cites W2108564850 @default.
- W4387340385 cites W2112796928 @default.
- W4387340385 cites W2115477592 @default.
- W4387340385 cites W2128420091 @default.
- W4387340385 cites W2142972908 @default.
- W4387340385 cites W2250539671 @default.
- W4387340385 cites W2470673105 @default.
- W4387340385 cites W2788347302 @default.
- W4387340385 cites W2962739339 @default.
- W4387340385 cites W2962946486 @default.
- W4387340385 cites W3007595536 @default.
- W4387340385 cites W3010081751 @default.
- W4387340385 cites W3081427626 @default.
- W4387340385 cites W3088261078 @default.
- W4387340385 cites W3099742251 @default.
- W4387340385 cites W3107577028 @default.
- W4387340385 cites W3128494784 @default.
- W4387340385 cites W3145531119 @default.
- W4387340385 cites W3153562715 @default.
- W4387340385 cites W3153720139 @default.
- W4387340385 cites W3195483305 @default.
- W4387340385 cites W3209451874 @default.
- W4387340385 cites W3213432680 @default.
- W4387340385 cites W4205462621 @default.
- W4387340385 cites W4206052922 @default.
- W4387340385 cites W4220977484 @default.
- W4387340385 cites W4239510810 @default.
- W4387340385 cites W4283735109 @default.
- W4387340385 cites W4285405572 @default.
- W4387340385 cites W4288760206 @default.
- W4387340385 cites W4289170424 @default.
- W4387340385 doi "https://doi.org/10.1016/j.eswa.2023.121852" @default.
- W4387340385 hasPublicationYear "2024" @default.
- W4387340385 type Work @default.
- W4387340385 citedByCount "0" @default.
- W4387340385 crossrefType "journal-article" @default.
- W4387340385 hasAuthorship W4387340385A5027216830 @default.
- W4387340385 hasAuthorship W4387340385A5030627445 @default.
- W4387340385 hasAuthorship W4387340385A5047242742 @default.
- W4387340385 hasAuthorship W4387340385A5060223542 @default.
- W4387340385 hasAuthorship W4387340385A5087988450 @default.
- W4387340385 hasBestOaLocation W43873403851 @default.
- W4387340385 hasConcept C106131492 @default.
- W4387340385 hasConcept C153180895 @default.
- W4387340385 hasConcept C154945302 @default.
- W4387340385 hasConcept C177264268 @default.
- W4387340385 hasConcept C199360897 @default.
- W4387340385 hasConcept C2524010 @default.
- W4387340385 hasConcept C2776461190 @default.
- W4387340385 hasConcept C2777462759 @default.
- W4387340385 hasConcept C2778112365 @default.
- W4387340385 hasConcept C28490314 @default.
- W4387340385 hasConcept C31972630 @default.
- W4387340385 hasConcept C33923547 @default.
- W4387340385 hasConcept C41008148 @default.
- W4387340385 hasConcept C41608201 @default.
- W4387340385 hasConcept C54355233 @default.
- W4387340385 hasConcept C86803240 @default.
- W4387340385 hasConcept C89600930 @default.
- W4387340385 hasConcept C90805587 @default.
- W4387340385 hasConcept C98501671 @default.
- W4387340385 hasConceptScore W4387340385C106131492 @default.
- W4387340385 hasConceptScore W4387340385C153180895 @default.
- W4387340385 hasConceptScore W4387340385C154945302 @default.
- W4387340385 hasConceptScore W4387340385C177264268 @default.
- W4387340385 hasConceptScore W4387340385C199360897 @default.
- W4387340385 hasConceptScore W4387340385C2524010 @default.
- W4387340385 hasConceptScore W4387340385C2776461190 @default.
- W4387340385 hasConceptScore W4387340385C2777462759 @default.
- W4387340385 hasConceptScore W4387340385C2778112365 @default.
- W4387340385 hasConceptScore W4387340385C28490314 @default.
- W4387340385 hasConceptScore W4387340385C31972630 @default.
- W4387340385 hasConceptScore W4387340385C33923547 @default.
- W4387340385 hasConceptScore W4387340385C41008148 @default.
- W4387340385 hasConceptScore W4387340385C41608201 @default.
- W4387340385 hasConceptScore W4387340385C54355233 @default.
- W4387340385 hasConceptScore W4387340385C86803240 @default.
- W4387340385 hasConceptScore W4387340385C89600930 @default.
- W4387340385 hasConceptScore W4387340385C90805587 @default.
- W4387340385 hasConceptScore W4387340385C98501671 @default.
- W4387340385 hasFunder F4320321940 @default.
- W4387340385 hasFunder F4320324174 @default.
- W4387340385 hasFunder F4320326674 @default.
- W4387340385 hasFunder F4320335777 @default.
- W4387340385 hasLocation W43873403851 @default.
- W4387340385 hasOpenAccess W4387340385 @default.
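
The abstract of this work mentions two steps that are easy to picture in code: zeroing the embedding vectors of low-importance words, and splitting a text that exceeds the Maximum Sequence Length (MSL) instead of truncating it. The following is a minimal sketch under stated assumptions, not the authors' method. The abstract does not define TDF or the actual segmentation strategies, so the importance weights are taken as given inputs, the near-equal split rule is an assumption, and the function names `zero_low_importance` and `split_into_segments` are hypothetical.

```python
import math
from typing import List, Sequence


def zero_low_importance(
    vectors: Sequence[Sequence[float]],
    weights: Sequence[float],
    threshold: float,
) -> List[List[float]]:
    """Replace the vector of every word whose weight is below `threshold` with zeros.

    `weights` stands in for a per-word importance score (the abstract uses
    TF-IDF-TDF, whose formula is not given there, so it is an input here).
    """
    return [
        list(vec) if weight >= threshold else [0.0] * len(vec)
        for vec, weight in zip(vectors, weights)
    ]


def split_into_segments(tokens: Sequence[str], msl: int) -> List[List[str]]:
    """Split `tokens` into near-equal segments, each at most `msl` tokens long.

    Assumed rule (not from the abstract): use the fewest segments that fit,
    and balance their lengths so that no token is discarded.
    """
    if msl <= 0:
        raise ValueError("msl must be positive")
    if len(tokens) <= msl:
        return [list(tokens)]
    n_segments = math.ceil(len(tokens) / msl)
    size = math.ceil(len(tokens) / n_segments)  # never exceeds msl
    return [list(tokens[i:i + size]) for i in range(0, len(tokens), size)]
```

Under these assumptions, a 10-token text with an MSL of 4 is split into segments of 4, 4, and 2 tokens, and a text no longer than the MSL is returned as a single segment.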