Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386320423> ?p ?o ?g. }
- W4386320423 endingPage "99100" @default.
- W4386320423 startingPage "99083" @default.
- W4386320423 abstract "The term Text Mining, which refers to the set of techniques used for the extraction, cleaning and processing of the information in texts, has become useful for providing valuable information to other algorithms and is widely used with statistical and machine learning methods. By enabling the extraction of useful insights from textual data, Text Mining has become a potent tool in decision-making and knowledge discovery across many areas, including health care, government, education and industry. R is a mature open-source programming environment that has grown beyond its initial scope of statistical computing and graphics and is now used in nearly all the Data Science Knowledge Area Groups. The objective of this paper is to present a review and benchmarking analysis of packages for text mining techniques with R in computational systems. The paper reviews thirteen different packages, comparing them on execution time and memory usage, for which new tests have been specifically designed. The tests target the most common tasks carried out when analyzing texts, and the comparisons allow R users to know which packages are best for each task and to improve their performance. The text mining package (tm) stands out particularly in Tokenization and Stemming techniques, while fastTextR is the best choice for Topic Modeling and Normalization. For the Term Frequency–Inverse Document Frequency (TF-IDF) technique, the textir package is a clear choice. The choice among the other packages depends on whether the technique is applied to a document-term matrix (DTM) or to plain text. In addition, some packages perform better in runtime than in memory usage and vice versa, making the choice more difficult. Packages such as udpipe can achieve better results working in parallel. Future work will include the same analysis for parallel computing, hybrid approaches, and novel algorithms." @default.
- W4386320423 created "2023-09-01" @default.
- W4386320423 creator A5003197810 @default.
- W4386320423 creator A5031598229 @default.
- W4386320423 creator A5039218778 @default.
- W4386320423 creator A5076953540 @default.
- W4386320423 creator A5083718621 @default.
- W4386320423 creator A5091983834 @default.
- W4386320423 date "2023-01-01" @default.
- W4386320423 modified "2023-09-27" @default.
- W4386320423 title "A Comparative Study on R Packages for Text Mining" @default.
- W4386320423 cites W1753361524 @default.
- W4386320423 cites W1973193168 @default.
- W4386320423 cites W1979076595 @default.
- W4386320423 cites W1983578042 @default.
- W4386320423 cites W2002639244 @default.
- W4386320423 cites W2008495066 @default.
- W4386320423 cites W2016089260 @default.
- W4386320423 cites W2037450062 @default.
- W4386320423 cites W2042670859 @default.
- W4386320423 cites W2060711947 @default.
- W4386320423 cites W2091273188 @default.
- W4386320423 cites W2102451297 @default.
- W4386320423 cites W2130851608 @default.
- W4386320423 cites W2134967412 @default.
- W4386320423 cites W2151790155 @default.
- W4386320423 cites W2152311353 @default.
- W4386320423 cites W2157963336 @default.
- W4386320423 cites W2158997610 @default.
- W4386320423 cites W2166047860 @default.
- W4386320423 cites W2315432931 @default.
- W4386320423 cites W2406594196 @default.
- W4386320423 cites W2465750682 @default.
- W4386320423 cites W2741029840 @default.
- W4386320423 cites W2770602608 @default.
- W4386320423 cites W2964049346 @default.
- W4386320423 cites W3046858608 @default.
- W4386320423 cites W3091359788 @default.
- W4386320423 cites W3145974016 @default.
- W4386320423 cites W4211036025 @default.
- W4386320423 cites W4285022034 @default.
- W4386320423 doi "https://doi.org/10.1109/access.2023.3310818" @default.
- W4386320423 hasPublicationYear "2023" @default.
- W4386320423 type Work @default.
- W4386320423 citedByCount "0" @default.
- W4386320423 crossrefType "journal-article" @default.
- W4386320423 hasAuthorship W4386320423A5003197810 @default.
- W4386320423 hasAuthorship W4386320423A5031598229 @default.
- W4386320423 hasAuthorship W4386320423A5039218778 @default.
- W4386320423 hasAuthorship W4386320423A5076953540 @default.
- W4386320423 hasAuthorship W4386320423A5083718621 @default.
- W4386320423 hasAuthorship W4386320423A5091983834 @default.
- W4386320423 hasBestOaLocation W43863204231 @default.
- W4386320423 hasConcept C120567893 @default.
- W4386320423 hasConcept C121684516 @default.
- W4386320423 hasConcept C124101348 @default.
- W4386320423 hasConcept C144133560 @default.
- W4386320423 hasConcept C154945302 @default.
- W4386320423 hasConcept C162853370 @default.
- W4386320423 hasConcept C165141518 @default.
- W4386320423 hasConcept C176982825 @default.
- W4386320423 hasConcept C195807954 @default.
- W4386320423 hasConcept C21442007 @default.
- W4386320423 hasConcept C23123220 @default.
- W4386320423 hasConcept C2522767166 @default.
- W4386320423 hasConcept C2779500292 @default.
- W4386320423 hasConcept C41008148 @default.
- W4386320423 hasConcept C71472368 @default.
- W4386320423 hasConcept C86251818 @default.
- W4386320423 hasConceptScore W4386320423C120567893 @default.
- W4386320423 hasConceptScore W4386320423C121684516 @default.
- W4386320423 hasConceptScore W4386320423C124101348 @default.
- W4386320423 hasConceptScore W4386320423C144133560 @default.
- W4386320423 hasConceptScore W4386320423C154945302 @default.
- W4386320423 hasConceptScore W4386320423C162853370 @default.
- W4386320423 hasConceptScore W4386320423C165141518 @default.
- W4386320423 hasConceptScore W4386320423C176982825 @default.
- W4386320423 hasConceptScore W4386320423C195807954 @default.
- W4386320423 hasConceptScore W4386320423C21442007 @default.
- W4386320423 hasConceptScore W4386320423C23123220 @default.
- W4386320423 hasConceptScore W4386320423C2522767166 @default.
- W4386320423 hasConceptScore W4386320423C2779500292 @default.
- W4386320423 hasConceptScore W4386320423C41008148 @default.
- W4386320423 hasConceptScore W4386320423C71472368 @default.
- W4386320423 hasConceptScore W4386320423C86251818 @default.
- W4386320423 hasFunder F4320313831 @default.
- W4386320423 hasLocation W43863204231 @default.
- W4386320423 hasOpenAccess W4386320423 @default.
- W4386320423 hasPrimaryLocation W43863204231 @default.
- W4386320423 hasRelatedWork W1663435917 @default.
- W4386320423 hasRelatedWork W1970516776 @default.
- W4386320423 hasRelatedWork W2181341562 @default.
- W4386320423 hasRelatedWork W2725657302 @default.
- W4386320423 hasRelatedWork W2749535755 @default.
- W4386320423 hasRelatedWork W2778788850 @default.
- W4386320423 hasRelatedWork W2966650678 @default.
- W4386320423 hasRelatedWork W4381093986 @default.
- W4386320423 hasRelatedWork W2168979046 @default.