Matches in SemOpenAlex for { <https://semopenalex.org/work/W2561899685> ?p ?o ?g. }
Showing items 1 to 86 of 86 with 100 items per page.
- W2561899685 abstract "Objectives: A framework involving Scansnap SV600 scanner and Google Optical character recognition (OCR) for creating parallel corpus which is a very essential component of Statistical Machine Translation (SMT). Methods and Analysis: Training a language model for a SMT system highly depends on the availability of a parallel corpus. An efficacious approach for collecting parallel sentences is the predominant step in an MT system. However, the creation of a parallel corpus requires extensive knowledge in both languages which is a time consuming process. Due to these limitations, making the documents digital becomes very difficult and which in turn affects the quality of machine translation systems. In this paper, we propose a faster and efficient way of generating English to Indian languages parallel corpus with less human involvement. With the help of a special type of scanner called Scansnap SV600 and Google OCR and a little linguistic knowledge, we can create a parallel corpus for any language pair, provided there should be paper documents with parallel sentences. Findings: It was possible to generate 40 parallel sentences in 1 hour time with this approach. Sophisticated morphological tools were used for changing the morphology of the text generated and thereby increase the size of the corpus. An additional benefit of this is to make ancient scriptures or other manuscripts in digital format which can then be referred by the coming generation to keep up the traditions of a nation or a society. Novelty: Time required for creating parallel corpus is reduced by incorporating Google OCR and book scanner." @default.
- W2561899685 created "2017-01-06" @default.
- W2561899685 creator A5002233877 @default.
- W2561899685 creator A5017904569 @default.
- W2561899685 creator A5037692978 @default.
- W2561899685 creator A5056488545 @default.
- W2561899685 creator A5070140601 @default.
- W2561899685 date "2016-12-08" @default.
- W2561899685 modified "2023-10-05" @default.
- W2561899685 title "A Fast and Efficient Framework for Creating Parallel Corpus" @default.
- W2561899685 cites W1964952992 @default.
- W2561899685 doi "https://doi.org/10.17485/ijst/2016/v9i45/106520" @default.
- W2561899685 hasPublicationYear "2016" @default.
- W2561899685 type Work @default.
- W2561899685 sameAs 2561899685 @default.
- W2561899685 citedByCount "0" @default.
- W2561899685 crossrefType "journal-article" @default.
- W2561899685 hasAuthorship W2561899685A5002233877 @default.
- W2561899685 hasAuthorship W2561899685A5017904569 @default.
- W2561899685 hasAuthorship W2561899685A5037692978 @default.
- W2561899685 hasAuthorship W2561899685A5056488545 @default.
- W2561899685 hasAuthorship W2561899685A5070140601 @default.
- W2561899685 hasBestOaLocation W25618996851 @default.
- W2561899685 hasConcept C115961682 @default.
- W2561899685 hasConcept C121332964 @default.
- W2561899685 hasConcept C137293760 @default.
- W2561899685 hasConcept C138885662 @default.
- W2561899685 hasConcept C154945302 @default.
- W2561899685 hasConcept C168167062 @default.
- W2561899685 hasConcept C203005215 @default.
- W2561899685 hasConcept C204321447 @default.
- W2561899685 hasConcept C2524010 @default.
- W2561899685 hasConcept C27206212 @default.
- W2561899685 hasConcept C2778738651 @default.
- W2561899685 hasConcept C2779751349 @default.
- W2561899685 hasConcept C2780861071 @default.
- W2561899685 hasConcept C2985367798 @default.
- W2561899685 hasConcept C33923547 @default.
- W2561899685 hasConcept C41008148 @default.
- W2561899685 hasConcept C546480517 @default.
- W2561899685 hasConcept C97355855 @default.
- W2561899685 hasConceptScore W2561899685C115961682 @default.
- W2561899685 hasConceptScore W2561899685C121332964 @default.
- W2561899685 hasConceptScore W2561899685C137293760 @default.
- W2561899685 hasConceptScore W2561899685C138885662 @default.
- W2561899685 hasConceptScore W2561899685C154945302 @default.
- W2561899685 hasConceptScore W2561899685C168167062 @default.
- W2561899685 hasConceptScore W2561899685C203005215 @default.
- W2561899685 hasConceptScore W2561899685C204321447 @default.
- W2561899685 hasConceptScore W2561899685C2524010 @default.
- W2561899685 hasConceptScore W2561899685C27206212 @default.
- W2561899685 hasConceptScore W2561899685C2778738651 @default.
- W2561899685 hasConceptScore W2561899685C2779751349 @default.
- W2561899685 hasConceptScore W2561899685C2780861071 @default.
- W2561899685 hasConceptScore W2561899685C2985367798 @default.
- W2561899685 hasConceptScore W2561899685C33923547 @default.
- W2561899685 hasConceptScore W2561899685C41008148 @default.
- W2561899685 hasConceptScore W2561899685C546480517 @default.
- W2561899685 hasConceptScore W2561899685C97355855 @default.
- W2561899685 hasLocation W25618996851 @default.
- W2561899685 hasOpenAccess W2561899685 @default.
- W2561899685 hasPrimaryLocation W25618996851 @default.
- W2561899685 hasRelatedWork W1524280836 @default.
- W2561899685 hasRelatedWork W1993141938 @default.
- W2561899685 hasRelatedWork W2017180565 @default.
- W2561899685 hasRelatedWork W2049992851 @default.
- W2561899685 hasRelatedWork W2104907655 @default.
- W2561899685 hasRelatedWork W2151197196 @default.
- W2561899685 hasRelatedWork W2168801328 @default.
- W2561899685 hasRelatedWork W2250929565 @default.
- W2561899685 hasRelatedWork W2294373863 @default.
- W2561899685 hasRelatedWork W2362815773 @default.
- W2561899685 hasRelatedWork W2379329769 @default.
- W2561899685 hasRelatedWork W2394256012 @default.
- W2561899685 hasRelatedWork W2618692990 @default.
- W2561899685 hasRelatedWork W2786343958 @default.
- W2561899685 hasRelatedWork W2791377086 @default.
- W2561899685 hasRelatedWork W2957227267 @default.
- W2561899685 hasRelatedWork W2962853934 @default.
- W2561899685 hasRelatedWork W3103037747 @default.
- W2561899685 hasRelatedWork W3115711567 @default.
- W2561899685 hasRelatedWork W3211839983 @default.
- W2561899685 isParatext "false" @default.
- W2561899685 isRetracted "false" @default.
- W2561899685 magId "2561899685" @default.
- W2561899685 workType "article" @default.