Matches in SemOpenAlex for { <https://semopenalex.org/work/W4225256521> ?p ?o ?g. }
- W4225256521 endingPage "16" @default.
- W4225256521 startingPage "1" @default.
- W4225256521 abstract "With the increasing number of online social posts, review comments, and digital documentations, the Arabic text classification (ATC) task has been hugely required for many spontaneous natural language processing (NLP) applications, especially within the coronavirus pandemics. The variations in the meaning of the same Arabic words could directly affect the performance of any AI-based framework. This work aims to identify the effectiveness of machine learning (ML) algorithms through preprocessing and representation techniques. This effectiveness is measured via different AI-based classification techniques. Basically, the ATC process is influenced by several factors such as stemming in preprocessing, method of feature extraction and selection, nature of datasets, and classification algorithm. To improve the overall classification performance, preprocessing techniques are mainly used to convert each Arabic word into its root and decrease the representation dimension among the datasets. Feature extraction and selection always play crucial roles to represent the Arabic text in a meaningful way and improve the classification accuracy rate. The selected classifiers in this study are performed based on various feature selection algorithms. The overall classification evaluation results are compared using different classifiers such as multinomial Naive Bayes (MNB), Bernoulli Naive Bayes (BNB), Stochastic Gradient Descent (SGD), Support Vector Classifier (SVC), Logistic Regression (LR), and Linear SVC. All of these AI classifiers are evaluated using five balanced and unbalanced benchmark datasets: BBC Arabic corpus, CNN Arabic corpus, Open-Source Arabic corpus (OSAc), ArCovidVac, and AlKhaleej. The evaluation results show that the classification performance strongly depends on the preprocessing technique, representation methods and classification technique, and the nature of datasets used. For the considered benchmark datasets, the linear SVC has outperformed other classifiers overall when prominent features are selected." @default.
- W4225256521 created "2022-05-04" @default.
- W4225256521 creator A5001918184 @default.
- W4225256521 creator A5005785792 @default.
- W4225256521 creator A5022092645 @default.
- W4225256521 creator A5043393776 @default.
- W4225256521 creator A5048357267 @default.
- W4225256521 creator A5053263315 @default.
- W4225256521 creator A5076766601 @default.
- W4225256521 creator A5089641116 @default.
- W4225256521 date "2022-04-30" @default.
- W4225256521 modified "2023-10-17" @default.
- W4225256521 title "Arabic Document Classification: Performance Investigation of Preprocessing and Representation Techniques" @default.
- W4225256521 cites W1900288795 @default.
- W4225256521 cites W1984545377 @default.
- W4225256521 cites W2009988448 @default.
- W4225256521 cites W2077679384 @default.
- W4225256521 cites W2083308811 @default.
- W4225256521 cites W2118020653 @default.
- W4225256521 cites W2121469350 @default.
- W4225256521 cites W2137763598 @default.
- W4225256521 cites W2165612380 @default.
- W4225256521 cites W2204491699 @default.
- W4225256521 cites W2344555666 @default.
- W4225256521 cites W2406386374 @default.
- W4225256521 cites W2493916176 @default.
- W4225256521 cites W2569607704 @default.
- W4225256521 cites W2590061102 @default.
- W4225256521 cites W2749830926 @default.
- W4225256521 cites W2786553709 @default.
- W4225256521 cites W2793956967 @default.
- W4225256521 cites W2803393527 @default.
- W4225256521 cites W2809504579 @default.
- W4225256521 cites W2889308070 @default.
- W4225256521 cites W2900867770 @default.
- W4225256521 cites W2918227608 @default.
- W4225256521 cites W2922032333 @default.
- W4225256521 cites W292284645 @default.
- W4225256521 cites W2944085222 @default.
- W4225256521 cites W2946844671 @default.
- W4225256521 cites W2948589047 @default.
- W4225256521 cites W2962739339 @default.
- W4225256521 cites W2964498586 @default.
- W4225256521 cites W2964992537 @default.
- W4225256521 cites W2999969809 @default.
- W4225256521 cites W3001043111 @default.
- W4225256521 cites W3003603507 @default.
- W4225256521 cites W3007882656 @default.
- W4225256521 cites W3033750579 @default.
- W4225256521 cites W3041604685 @default.
- W4225256521 cites W3049714437 @default.
- W4225256521 cites W3095278059 @default.
- W4225256521 cites W3105625590 @default.
- W4225256521 cites W3110602624 @default.
- W4225256521 cites W3129831852 @default.
- W4225256521 cites W3186641415 @default.
- W4225256521 cites W3186749889 @default.
- W4225256521 cites W3201438617 @default.
- W4225256521 cites W3202829237 @default.
- W4225256521 cites W4205755349 @default.
- W4225256521 cites W4206764328 @default.
- W4225256521 cites W4206945699 @default.
- W4225256521 cites W4211260480 @default.
- W4225256521 cites W4220916830 @default.
- W4225256521 cites W4236122429 @default.
- W4225256521 cites W4248563703 @default.
- W4225256521 cites W4256133489 @default.
- W4225256521 doi "https://doi.org/10.1155/2022/3720358" @default.
- W4225256521 hasPublicationYear "2022" @default.
- W4225256521 type Work @default.
- W4225256521 citedByCount "9" @default.
- W4225256521 countsByYear W42252565212022 @default.
- W4225256521 countsByYear W42252565212023 @default.
- W4225256521 crossrefType "journal-article" @default.
- W4225256521 hasAuthorship W4225256521A5001918184 @default.
- W4225256521 hasAuthorship W4225256521A5005785792 @default.
- W4225256521 hasAuthorship W4225256521A5022092645 @default.
- W4225256521 hasAuthorship W4225256521A5043393776 @default.
- W4225256521 hasAuthorship W4225256521A5048357267 @default.
- W4225256521 hasAuthorship W4225256521A5053263315 @default.
- W4225256521 hasAuthorship W4225256521A5076766601 @default.
- W4225256521 hasAuthorship W4225256521A5089641116 @default.
- W4225256521 hasBestOaLocation W42252565211 @default.
- W4225256521 hasConcept C119857082 @default.
- W4225256521 hasConcept C12267149 @default.
- W4225256521 hasConcept C148483581 @default.
- W4225256521 hasConcept C153180895 @default.
- W4225256521 hasConcept C154945302 @default.
- W4225256521 hasConcept C204321447 @default.
- W4225256521 hasConcept C34736171 @default.
- W4225256521 hasConcept C41008148 @default.
- W4225256521 hasConcept C52001869 @default.
- W4225256521 hasConcept C52622490 @default.
- W4225256521 hasConcept C95623464 @default.
- W4225256521 hasConceptScore W4225256521C119857082 @default.
- W4225256521 hasConceptScore W4225256521C12267149 @default.
- W4225256521 hasConceptScore W4225256521C148483581 @default.
- W4225256521 hasConceptScore W4225256521C153180895 @default.