Matches in SemOpenAlex for { <https://semopenalex.org/work/W2384651694> ?p ?o ?g. }
Showing items 1 to 87 of 87 with 100 items per page.
- W2384651694 abstract "Bilingual parallel corpora can be used in many NLP applications, but it is not easy to acquire large-scale corpora automatically. This paper proposes an effective solution for mining high-quality bilingual parallel corpora from web pages and analyses the key technologies of obtaining candidate mixed-language web pages and sentence alignment. We have extracted 1.67 million parallel sentences with an average accuracy of 93.75%; the accuracy of the first 1 million sentences is 96%. This paper also proposes a sentence re-ranking method and a domain information retrieval method for applying the web data to the training of SMT models. Experiments conducted on the IWSLT tasks show gains of 2 to 5 BLEU over the baseline." @default.
- W2384651694 created "2016-06-24" @default.
- W2384651694 creator A5076550702 @default.
- W2384651694 date "2010-01-01" @default.
- W2384651694 modified "2023-09-27" @default.
- W2384651694 title "Mining Parallel Corpora from Web and Its Application in Machine Translation" @default.
- W2384651694 hasPublicationYear "2010" @default.
- W2384651694 type Work @default.
- W2384651694 sameAs 2384651694 @default.
- W2384651694 citedByCount "0" @default.
- W2384651694 crossrefType "journal-article" @default.
- W2384651694 hasAuthorship W2384651694A5076550702 @default.
- W2384651694 hasConcept C104317684 @default.
- W2384651694 hasConcept C105580179 @default.
- W2384651694 hasConcept C111368507 @default.
- W2384651694 hasConcept C12725497 @default.
- W2384651694 hasConcept C127313418 @default.
- W2384651694 hasConcept C134306372 @default.
- W2384651694 hasConcept C136764020 @default.
- W2384651694 hasConcept C149364088 @default.
- W2384651694 hasConcept C154945302 @default.
- W2384651694 hasConcept C185592680 @default.
- W2384651694 hasConcept C189430467 @default.
- W2384651694 hasConcept C203005215 @default.
- W2384651694 hasConcept C204321447 @default.
- W2384651694 hasConcept C21959979 @default.
- W2384651694 hasConcept C23123220 @default.
- W2384651694 hasConcept C26517878 @default.
- W2384651694 hasConcept C2777530160 @default.
- W2384651694 hasConcept C2985367798 @default.
- W2384651694 hasConcept C33923547 @default.
- W2384651694 hasConcept C36503486 @default.
- W2384651694 hasConcept C38652104 @default.
- W2384651694 hasConcept C41008148 @default.
- W2384651694 hasConcept C55493867 @default.
- W2384651694 hasConcept C622187 @default.
- W2384651694 hasConceptScore W2384651694C104317684 @default.
- W2384651694 hasConceptScore W2384651694C105580179 @default.
- W2384651694 hasConceptScore W2384651694C111368507 @default.
- W2384651694 hasConceptScore W2384651694C12725497 @default.
- W2384651694 hasConceptScore W2384651694C127313418 @default.
- W2384651694 hasConceptScore W2384651694C134306372 @default.
- W2384651694 hasConceptScore W2384651694C136764020 @default.
- W2384651694 hasConceptScore W2384651694C149364088 @default.
- W2384651694 hasConceptScore W2384651694C154945302 @default.
- W2384651694 hasConceptScore W2384651694C185592680 @default.
- W2384651694 hasConceptScore W2384651694C189430467 @default.
- W2384651694 hasConceptScore W2384651694C203005215 @default.
- W2384651694 hasConceptScore W2384651694C204321447 @default.
- W2384651694 hasConceptScore W2384651694C21959979 @default.
- W2384651694 hasConceptScore W2384651694C23123220 @default.
- W2384651694 hasConceptScore W2384651694C26517878 @default.
- W2384651694 hasConceptScore W2384651694C2777530160 @default.
- W2384651694 hasConceptScore W2384651694C2985367798 @default.
- W2384651694 hasConceptScore W2384651694C33923547 @default.
- W2384651694 hasConceptScore W2384651694C36503486 @default.
- W2384651694 hasConceptScore W2384651694C38652104 @default.
- W2384651694 hasConceptScore W2384651694C41008148 @default.
- W2384651694 hasConceptScore W2384651694C55493867 @default.
- W2384651694 hasConceptScore W2384651694C622187 @default.
- W2384651694 hasLocation W23846516941 @default.
- W2384651694 hasOpenAccess W2384651694 @default.
- W2384651694 hasPrimaryLocation W23846516941 @default.
- W2384651694 hasRelatedWork W1988332774 @default.
- W2384651694 hasRelatedWork W2050690204 @default.
- W2384651694 hasRelatedWork W2083720130 @default.
- W2384651694 hasRelatedWork W2105673178 @default.
- W2384651694 hasRelatedWork W2131526336 @default.
- W2384651694 hasRelatedWork W2143927888 @default.
- W2384651694 hasRelatedWork W2159872955 @default.
- W2384651694 hasRelatedWork W2161891324 @default.
- W2384651694 hasRelatedWork W2251076467 @default.
- W2384651694 hasRelatedWork W2251366179 @default.
- W2384651694 hasRelatedWork W2378676934 @default.
- W2384651694 hasRelatedWork W2468033056 @default.
- W2384651694 hasRelatedWork W2548103837 @default.
- W2384651694 hasRelatedWork W2584080049 @default.
- W2384651694 hasRelatedWork W2740274673 @default.
- W2384651694 hasRelatedWork W2994054137 @default.
- W2384651694 hasRelatedWork W3204963350 @default.
- W2384651694 hasRelatedWork W80085545 @default.
- W2384651694 hasRelatedWork W970670907 @default.
- W2384651694 hasRelatedWork W2524010971 @default.
- W2384651694 isParatext "false" @default.
- W2384651694 isRetracted "false" @default.
- W2384651694 magId "2384651694" @default.
- W2384651694 workType "article" @default.