Matches in SemOpenAlex for { <https://semopenalex.org/work/W77888723> ?p ?o ?g. }
- W77888723 abstract "Globalization has led to a significant increase in the information flow between geographically remote locations with the realization of a common global market. When building a web site for use by various industries, developers need to deal with a wide range of users from different countries. Thus, a multilingual system must be implemented in order to provide the proper environment for those applications. Different languages, such as English, Malay and Spanish, can be written in the same script (Roman). The issue is how to produce reliable features from a web page that is to undergo language identification. Incorrectly identifying the language will result in garbled translations and faulty or incomplete analyses. The aim of this study is to enhance the effectiveness of the feature selection method for web page language identification. A letter weighting method as feature selection embedded with fuzzy Adaptive Resonance Theory Map (ARTMAP) and simplified entropy embedded with a decision tree are proposed to identify the language of a web page. The methodology contains four major stages, namely data preparation, data preprocessing, feature selection and identification. Data are collected from news websites and then fed into preprocessing to filter out noise. Feature selection removes unnecessary attributes from the data to yield a proper feature representation. Language identification determines which predefined language the data belong to. The scripts of languages such as Arabic, Hanzi, Roman, Indic and Cyrillic were used for the performance evaluation of web page language identification. Standard measurements such as T-test, f-fold cross-validation, precision, recall and F1 were applied to the results of the analysis. From the experimental analysis, it is observed that simplified entropy outperforms N-grams, entropy and letter weighting feature selection, with accuracies of 98.90%, 81.35%, 96.08% and 93.16%, respectively. The findings indicate that the proposed letter weighting and simplified entropy feature selection methods for web page language identification give promising results in terms of accuracy and retrieval performance at the letter representation level of web pages." @default.
- W77888723 created "2016-06-24" @default.
- W77888723 creator A5075266789 @default.
- W77888723 date "2010-02-01" @default.
- W77888723 modified "2023-09-27" @default.
- W77888723 title "Feature selection method of web page language identification" @default.
- W77888723 hasPublicationYear "2010" @default.
- W77888723 type Work @default.
- W77888723 sameAs 77888723 @default.
- W77888723 citedByCount "1" @default.
- W77888723 countsByYear W778887232013 @default.
- W77888723 crossrefType "dissertation" @default.
- W77888723 hasAuthorship W77888723A5075266789 @default.
- W77888723 hasConcept C116834253 @default.
- W77888723 hasConcept C124101348 @default.
- W77888723 hasConcept C129792486 @default.
- W77888723 hasConcept C136764020 @default.
- W77888723 hasConcept C138885662 @default.
- W77888723 hasConcept C148483581 @default.
- W77888723 hasConcept C154945302 @default.
- W77888723 hasConcept C195324797 @default.
- W77888723 hasConcept C199360897 @default.
- W77888723 hasConcept C204321447 @default.
- W77888723 hasConcept C21959979 @default.
- W77888723 hasConcept C23123220 @default.
- W77888723 hasConcept C2776401178 @default.
- W77888723 hasConcept C41008148 @default.
- W77888723 hasConcept C41895202 @default.
- W77888723 hasConcept C59822182 @default.
- W77888723 hasConcept C61423126 @default.
- W77888723 hasConcept C86803240 @default.
- W77888723 hasConceptScore W77888723C116834253 @default.
- W77888723 hasConceptScore W77888723C124101348 @default.
- W77888723 hasConceptScore W77888723C129792486 @default.
- W77888723 hasConceptScore W77888723C136764020 @default.
- W77888723 hasConceptScore W77888723C138885662 @default.
- W77888723 hasConceptScore W77888723C148483581 @default.
- W77888723 hasConceptScore W77888723C154945302 @default.
- W77888723 hasConceptScore W77888723C195324797 @default.
- W77888723 hasConceptScore W77888723C199360897 @default.
- W77888723 hasConceptScore W77888723C204321447 @default.
- W77888723 hasConceptScore W77888723C21959979 @default.
- W77888723 hasConceptScore W77888723C23123220 @default.
- W77888723 hasConceptScore W77888723C2776401178 @default.
- W77888723 hasConceptScore W77888723C41008148 @default.
- W77888723 hasConceptScore W77888723C41895202 @default.
- W77888723 hasConceptScore W77888723C59822182 @default.
- W77888723 hasConceptScore W77888723C61423126 @default.
- W77888723 hasConceptScore W77888723C86803240 @default.
- W77888723 hasLocation W778887231 @default.
- W77888723 hasOpenAccess W77888723 @default.
- W77888723 hasPrimaryLocation W778887231 @default.
- W77888723 hasRelatedWork W1543768931 @default.
- W77888723 hasRelatedWork W1569293908 @default.
- W77888723 hasRelatedWork W1593855322 @default.
- W77888723 hasRelatedWork W1977746397 @default.
- W77888723 hasRelatedWork W1991565788 @default.
- W77888723 hasRelatedWork W2010746350 @default.
- W77888723 hasRelatedWork W2044907962 @default.
- W77888723 hasRelatedWork W2064597428 @default.
- W77888723 hasRelatedWork W2082286200 @default.
- W77888723 hasRelatedWork W2091340087 @default.
- W77888723 hasRelatedWork W2105604080 @default.
- W77888723 hasRelatedWork W2148311256 @default.
- W77888723 hasRelatedWork W2149685069 @default.
- W77888723 hasRelatedWork W2253768319 @default.
- W77888723 hasRelatedWork W2354865300 @default.
- W77888723 hasRelatedWork W2407166915 @default.
- W77888723 hasRelatedWork W2510348187 @default.
- W77888723 hasRelatedWork W2607557087 @default.
- W77888723 hasRelatedWork W3137110532 @default.
- W77888723 hasRelatedWork W34787480 @default.
- W77888723 isParatext "false" @default.
- W77888723 isRetracted "false" @default.
- W77888723 magId "77888723" @default.
- W77888723 workType "dissertation" @default.