Matches in SemOpenAlex for { <https://semopenalex.org/work/W4386790580> ?p ?o ?g. }
Showing items 1 to 57 of 57 with 100 items per page.
- W4386790580 abstract "Natural language processing (NLP) is a branch of artificial intelligence that teaches computers to understand human language. Researchers can feed text in a particular language, of any length and granularity (characters, words, or sentences), into an algorithm to extract a numerical summary of its context. To obtain a word array from Thai text, a tokenization step is needed to split the text into words, because Thai sentences are written continuously without spaces between words. In general, different tokenizers can produce different sets of words from a single sentence, which leads to uncontrolled variation in accuracy in NLP and related tasks. This research introduces a method that reconciles the differing outputs of Thai tokenizers by aligning their tokenization results with neural network encoders. Bi-LSTM and DistilBERT encoders trained with triplet hard loss transform sets of words into a new domain in which the vectors of similar sentences lie significantly closer together. Finally, twenty-eight classifiers are built from two types of encoders and seven different tokenizers, with and without the proposed method, for comparison and analysis. To demonstrate that the proposed approach can serve as a pre-training method for other tasks, sentiment datasets are used to measure classification accuracy and to investigate the similarity of results across all classifiers." @default.
- W4386790580 created "2023-09-16" @default.
- W4386790580 creator A5005100428 @default.
- W4386790580 date "2023-09-15" @default.
- W4386790580 modified "2023-09-26" @default.
- W4386790580 title "Thai tokenizer invariant classification based on Bi-LSTM and DistilBERT encoders" @default.
- W4386790580 doi "https://doi.org/10.58837/chula.the.2021.113" @default.
- W4386790580 hasPublicationYear "2023" @default.
- W4386790580 type Work @default.
- W4386790580 citedByCount "0" @default.
- W4386790580 crossrefType "dissertation" @default.
- W4386790580 hasAuthorship W4386790580A5005100428 @default.
- W4386790580 hasBestOaLocation W43867905801 @default.
- W4386790580 hasConcept C111919701 @default.
- W4386790580 hasConcept C118505674 @default.
- W4386790580 hasConcept C138885662 @default.
- W4386790580 hasConcept C154945302 @default.
- W4386790580 hasConcept C176982825 @default.
- W4386790580 hasConcept C190470478 @default.
- W4386790580 hasConcept C204321447 @default.
- W4386790580 hasConcept C2777530160 @default.
- W4386790580 hasConcept C28490314 @default.
- W4386790580 hasConcept C33923547 @default.
- W4386790580 hasConcept C37914503 @default.
- W4386790580 hasConcept C41008148 @default.
- W4386790580 hasConcept C41895202 @default.
- W4386790580 hasConcept C90805587 @default.
- W4386790580 hasConceptScore W4386790580C111919701 @default.
- W4386790580 hasConceptScore W4386790580C118505674 @default.
- W4386790580 hasConceptScore W4386790580C138885662 @default.
- W4386790580 hasConceptScore W4386790580C154945302 @default.
- W4386790580 hasConceptScore W4386790580C176982825 @default.
- W4386790580 hasConceptScore W4386790580C190470478 @default.
- W4386790580 hasConceptScore W4386790580C204321447 @default.
- W4386790580 hasConceptScore W4386790580C2777530160 @default.
- W4386790580 hasConceptScore W4386790580C28490314 @default.
- W4386790580 hasConceptScore W4386790580C33923547 @default.
- W4386790580 hasConceptScore W4386790580C37914503 @default.
- W4386790580 hasConceptScore W4386790580C41008148 @default.
- W4386790580 hasConceptScore W4386790580C41895202 @default.
- W4386790580 hasConceptScore W4386790580C90805587 @default.
- W4386790580 hasLocation W43867905801 @default.
- W4386790580 hasOpenAccess W4386790580 @default.
- W4386790580 hasPrimaryLocation W43867905801 @default.
- W4386790580 hasRelatedWork W159132833 @default.
- W4386790580 hasRelatedWork W2106036226 @default.
- W4386790580 hasRelatedWork W2125145484 @default.
- W4386790580 hasRelatedWork W2903246208 @default.
- W4386790580 hasRelatedWork W2936858556 @default.
- W4386790580 hasRelatedWork W2969773591 @default.
- W4386790580 hasRelatedWork W3024381485 @default.
- W4386790580 hasRelatedWork W4225619937 @default.
- W4386790580 hasRelatedWork W4323341682 @default.
- W4386790580 hasRelatedWork W4385572842 @default.
- W4386790580 isParatext "false" @default.
- W4386790580 isRetracted "false" @default.
- W4386790580 workType "dissertation" @default.
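
The abstract above names "triplet hard loss" as the training objective but gives no formula. The following is a minimal sketch of one common reading, the batch-hard variant: for each anchor, take the farthest same-label embedding and the nearest different-label embedding, and penalize the gap. Everything here is an assumption for illustration and is not taken from the dissertation: the function names, the Euclidean distance, the margin of 1.0, and the idea that a label identifies a sentence whose different tokenizations should map to nearby vectors.

```python
import math


def euclidean(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def batch_hard_triplet_loss(embeddings, labels, margin=1.0):
    """Batch-hard triplet loss (illustrative sketch, not the author's code).

    For each anchor i:
      hardest positive = largest distance to another embedding with the same label
      hardest negative = smallest distance to an embedding with a different label
      loss_i = max(0, hardest_positive - hardest_negative + margin)
    Anchors with no positive or no negative in the batch are skipped.
    Returns the mean loss over the remaining anchors (0.0 if none remain).
    """
    losses = []
    for i, (anchor, label) in enumerate(zip(embeddings, labels)):
        positives = [
            euclidean(anchor, other)
            for j, (other, other_label) in enumerate(zip(embeddings, labels))
            if other_label == label and j != i
        ]
        negatives = [
            euclidean(anchor, other)
            for other, other_label in zip(embeddings, labels)
            if other_label != label
        ]
        if not positives or not negatives:
            continue
        losses.append(max(0.0, max(positives) - min(negatives) + margin))
    return sum(losses) / len(losses) if losses else 0.0


# Hypothetical 2-D embeddings: label 0 and label 1 stand for two sentences,
# each encoded from two different tokenizations.
well_separated = [[0.0, 0.0], [0.0, 1.0], [5.0, 0.0], [5.0, 1.0]]
interleaved = [[0.0, 0.0], [3.0, 0.0], [1.0, 0.0], [4.0, 0.0]]
sentence_ids = [0, 0, 1, 1]

loss_separated = batch_hard_triplet_loss(well_separated, sentence_ids)
loss_interleaved = batch_hard_triplet_loss(interleaved, sentence_ids)
```

In this toy setup the well-separated batch incurs no loss, while the interleaved batch is penalized because each anchor sits closer to a different sentence than to its own alternative tokenization. A real training loop would compute the same quantity on encoder outputs inside a deep learning framework.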