Matches in SemOpenAlex for { <https://semopenalex.org/work/W2624072012> ?p ?o ?g. }
Showing items 1 to 97 of 97 with 100 items per page.
- W2624072012 endingPage "1172" @default.
- W2624072012 startingPage "1160" @default.
- W2624072012 abstract "Identifying the language of a text is an important step for several natural language processing applications. State-of-the-art language identification (LID) systems perform very well when discriminating between unrelated languages on standard datasets. However, the LID task has a bottleneck when discriminating between similar languages or language varieties. Furthermore, LID has also proven to be very challenging when dealing with short texts such as the ones from Twitter. In this paper, we propose the use of smoothed n-gram language models to classify tweets in both Brazilian and European Portuguese variants. Word and character n-gram language models were combined and evaluated through five different classifiers. We have compared the smoothed n-gram language models together with the Term Frequency and Inverse Document Frequency weighting scheme. This paper also proposes an ensemble model, in which the class labels output were combined using majority voting and algebraic combiners. The best configuration reached accuracy of 92.71% using an ensemble model, which combines Lidstone (0.1) character 6-gram, Good–Turing word unigram, and Witten–Bell word bigram models, together with the Log-Likelihood Ratio estimation method." @default.
- W2624072012 created "2017-06-15" @default.
- W2624072012 creator A5018359147 @default.
- W2624072012 creator A5025998064 @default.
- W2624072012 creator A5033149946 @default.
- W2624072012 creator A5045903801 @default.
- W2624072012 creator A5055046285 @default.
- W2624072012 date "2017-12-01" @default.
- W2624072012 modified "2023-09-27" @default.
- W2624072012 title "Smoothed n-gram based models for tweet language identification: A case study of the Brazilian and European Portuguese national varieties" @default.
- W2624072012 cites W2015525779 @default.
- W2624072012 cites W2097927681 @default.
- W2624072012 cites W2142774925 @default.
- W2624072012 cites W2158994553 @default.
- W2624072012 cites W2167917621 @default.
- W2624072012 cites W2215376118 @default.
- W2624072012 cites W2306706380 @default.
- W2624072012 cites W2318083288 @default.
- W2624072012 cites W4239510810 @default.
- W2624072012 doi "https://doi.org/10.1016/j.asoc.2017.05.065" @default.
- W2624072012 hasPublicationYear "2017" @default.
- W2624072012 type Work @default.
- W2624072012 sameAs 2624072012 @default.
- W2624072012 citedByCount "18" @default.
- W2624072012 countsByYear W26240720122017 @default.
- W2624072012 countsByYear W26240720122018 @default.
- W2624072012 countsByYear W26240720122019 @default.
- W2624072012 countsByYear W26240720122020 @default.
- W2624072012 countsByYear W26240720122021 @default.
- W2624072012 countsByYear W26240720122022 @default.
- W2624072012 crossrefType "journal-article" @default.
- W2624072012 hasAuthorship W2624072012A5018359147 @default.
- W2624072012 hasAuthorship W2624072012A5025998064 @default.
- W2624072012 hasAuthorship W2624072012A5033149946 @default.
- W2624072012 hasAuthorship W2624072012A5045903801 @default.
- W2624072012 hasAuthorship W2624072012A5055046285 @default.
- W2624072012 hasConcept C108757681 @default.
- W2624072012 hasConcept C116834253 @default.
- W2624072012 hasConcept C117884012 @default.
- W2624072012 hasConcept C129353971 @default.
- W2624072012 hasConcept C129792486 @default.
- W2624072012 hasConcept C137293760 @default.
- W2624072012 hasConcept C137546455 @default.
- W2624072012 hasConcept C138885662 @default.
- W2624072012 hasConcept C154945302 @default.
- W2624072012 hasConcept C189430467 @default.
- W2624072012 hasConcept C195324797 @default.
- W2624072012 hasConcept C204321447 @default.
- W2624072012 hasConcept C39608478 @default.
- W2624072012 hasConcept C41008148 @default.
- W2624072012 hasConcept C41895202 @default.
- W2624072012 hasConcept C59822182 @default.
- W2624072012 hasConcept C83479923 @default.
- W2624072012 hasConcept C86803240 @default.
- W2624072012 hasConcept C90805587 @default.
- W2624072012 hasConceptScore W2624072012C108757681 @default.
- W2624072012 hasConceptScore W2624072012C116834253 @default.
- W2624072012 hasConceptScore W2624072012C117884012 @default.
- W2624072012 hasConceptScore W2624072012C129353971 @default.
- W2624072012 hasConceptScore W2624072012C129792486 @default.
- W2624072012 hasConceptScore W2624072012C137293760 @default.
- W2624072012 hasConceptScore W2624072012C137546455 @default.
- W2624072012 hasConceptScore W2624072012C138885662 @default.
- W2624072012 hasConceptScore W2624072012C154945302 @default.
- W2624072012 hasConceptScore W2624072012C189430467 @default.
- W2624072012 hasConceptScore W2624072012C195324797 @default.
- W2624072012 hasConceptScore W2624072012C204321447 @default.
- W2624072012 hasConceptScore W2624072012C39608478 @default.
- W2624072012 hasConceptScore W2624072012C41008148 @default.
- W2624072012 hasConceptScore W2624072012C41895202 @default.
- W2624072012 hasConceptScore W2624072012C59822182 @default.
- W2624072012 hasConceptScore W2624072012C83479923 @default.
- W2624072012 hasConceptScore W2624072012C86803240 @default.
- W2624072012 hasConceptScore W2624072012C90805587 @default.
- W2624072012 hasFunder F4320321091 @default.
- W2624072012 hasFunder F4320322025 @default.
- W2624072012 hasFunder F4320323678 @default.
- W2624072012 hasLocation W26240720121 @default.
- W2624072012 hasOpenAccess W2624072012 @default.
- W2624072012 hasPrimaryLocation W26240720121 @default.
- W2624072012 hasRelatedWork W100305897 @default.
- W2624072012 hasRelatedWork W138710363 @default.
- W2624072012 hasRelatedWork W1857365372 @default.
- W2624072012 hasRelatedWork W2098128378 @default.
- W2624072012 hasRelatedWork W2150143935 @default.
- W2624072012 hasRelatedWork W2392645474 @default.
- W2624072012 hasRelatedWork W2624072012 @default.
- W2624072012 hasRelatedWork W2774532642 @default.
- W2624072012 hasRelatedWork W4206038897 @default.
- W2624072012 hasRelatedWork W4377970538 @default.
- W2624072012 hasVolume "61" @default.
- W2624072012 isParatext "false" @default.
- W2624072012 isRetracted "false" @default.
- W2624072012 magId "2624072012" @default.
- W2624072012 workType "article" @default.
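
The abstract above describes classifying tweets with smoothed character n-gram language models and a log-likelihood comparison. The following is a minimal sketch of that general idea, not the authors' implementation: the class name `LidstoneCharNgram`, the helper `classify`, the padding symbols, and the four toy sentences are all hypothetical. It uses character trigrams so the toy data stays meaningful, whereas the abstract reports a Lidstone (0.1) character 6-gram model, and it omits the word-level models and the ensemble step.

```python
import math
from collections import Counter


class LidstoneCharNgram:
    """Character n-gram language model with Lidstone (additive) smoothing."""

    def __init__(self, n=3, gamma=0.1):
        self.n = n                  # n-gram order (assumed; the abstract reports 6)
        self.gamma = gamma          # Lidstone smoothing constant
        self.ngrams = Counter()     # counts of history + next character
        self.histories = Counter()  # counts of the (n-1)-character history
        self.vocab = set()          # characters seen in training

    def _pad(self, text):
        # Hypothetical start/end markers so every character has a full history.
        return "^" * (self.n - 1) + text + "$"

    def fit(self, texts):
        for text in texts:
            padded = self._pad(text)
            self.vocab.update(padded)
            for i in range(self.n - 1, len(padded)):
                history = padded[i - self.n + 1:i]
                self.ngrams[history + padded[i]] += 1
                self.histories[history] += 1
        return self

    def log_prob(self, text, vocab_size):
        # Sum of log P(char | history) with
        # P = (count(history + char) + gamma) / (count(history) + gamma * V).
        padded = self._pad(text)
        total = 0.0
        for i in range(self.n - 1, len(padded)):
            history = padded[i - self.n + 1:i]
            num = self.ngrams[history + padded[i]] + self.gamma
            den = self.histories[history] + self.gamma * vocab_size
            total += math.log(num / den)
        return total


def classify(text, models):
    """Return the label whose model gives the highest log-likelihood."""
    # Shared vocabulary size, plus one slot for characters never seen in training.
    vocab = set().union(*(m.vocab for m in models.values()))
    vocab_size = len(vocab) + 1
    scores = {label: m.log_prob(text, vocab_size) for label, m in models.items()}
    return max(scores, key=scores.get), scores


# Toy training data, invented for illustration only.
train_br = ["você está fazendo o quê", "a gente vai pegar o ônibus"]
train_pt = ["tu estás a fazer o quê", "nós vamos apanhar o autocarro"]

models = {
    "pt-BR": LidstoneCharNgram(n=3, gamma=0.1).fit(train_br),
    "pt-PT": LidstoneCharNgram(n=3, gamma=0.1).fit(train_pt),
}

label, scores = classify("a gente vai pegar o ônibus", models)
# The difference of the two scores is the log-likelihood ratio between the varieties.
llr = scores["pt-BR"] - scores["pt-PT"]
print(label, round(llr, 2))
```

Scoring both models against the same shared vocabulary size keeps the two smoothed distributions comparable. A real system would train on a large labelled tweet corpus and tune the n-gram order and smoothing constant on held-out data.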