Matches in SemOpenAlex for { <https://semopenalex.org/work/W2157668051> ?p ?o ?g. }
- W2157668051 abstract "We present the development of a tool for compiling specialized comparable corpora whose quality approaches that of a manually compiled corpus. Comparability is assessed on three levels: domain, topic, and type of discourse. Domain and topic can be filtered through the keywords used in web search, but detecting the type of discourse requires a broader linguistic analysis. The first step of our work is to automate the detection of the type of discourse found in a scientific domain (science and popular science) in French and Japanese. First, a contrastive stylistic analysis of the two types of discourse is carried out in both languages. This analysis leads to a reusable, generic, and robust typology. Machine learning algorithms are then applied to the typology, using shallow parsing. We obtain good results, with an average precision of 80% and an average recall of 70%, which demonstrate the efficiency of this typology. This classification tool is then integrated into a corpus compilation tool, a text-collection processing chain implemented with the IBM UIMA system. Starting from two collections of specialized web documents in French and Japanese, this tool creates the corresponding corpus." @default.
- W2157668051 created "2016-06-24" @default.
- W2157668051 creator A5018688029 @default.
- W2157668051 creator A5026850687 @default.
- W2157668051 creator A5062228577 @default.
- W2157668051 date "2009-01-01" @default.
- W2157668051 modified "2023-09-30" @default.
- W2157668051 title "Compilation of specialized comparable corpora in French and Japanese" @default.
- W2157668051 cites W1489992655 @default.
- W2157668051 cites W1509874444 @default.
- W2157668051 cites W1540550673 @default.
- W2157668051 cites W1604792744 @default.
- W2157668051 cites W1803293993 @default.
- W2157668051 cites W187110561 @default.
- W2157668051 cites W1972871876 @default.
- W2157668051 cites W2011248508 @default.
- W2157668051 cites W2023972959 @default.
- W2157668051 cites W2040759639 @default.
- W2157668051 cites W2041232209 @default.
- W2157668051 cites W2073865254 @default.
- W2157668051 cites W2093585241 @default.
- W2157668051 cites W2096797897 @default.
- W2157668051 cites W2102749417 @default.
- W2157668051 cites W2118020653 @default.
- W2157668051 cites W2122141987 @default.
- W2157668051 cites W2125055259 @default.
- W2157668051 cites W2189289576 @default.
- W2157668051 cites W2245509969 @default.
- W2157668051 cites W2503755628 @default.
- W2157668051 cites W2949108231 @default.
- W2157668051 cites W437205222 @default.
- W2157668051 cites W4627272 @default.
- W2157668051 cites W635275769 @default.
- W2157668051 cites W650585419 @default.
- W2157668051 cites W2462019851 @default.
- W2157668051 doi "https://doi.org/10.3115/1690339.1690353" @default.
- W2157668051 hasPublicationYear "2009" @default.
- W2157668051 type Work @default.
- W2157668051 sameAs 2157668051 @default.
- W2157668051 citedByCount "9" @default.
- W2157668051 countsByYear W21576680512012 @default.
- W2157668051 countsByYear W21576680512013 @default.
- W2157668051 crossrefType "proceedings-article" @default.
- W2157668051 hasAuthorship W2157668051A5018688029 @default.
- W2157668051 hasAuthorship W2157668051A5026850687 @default.
- W2157668051 hasAuthorship W2157668051A5062228577 @default.
- W2157668051 hasBestOaLocation W21576680511 @default.
- W2157668051 hasConcept C111472728 @default.
- W2157668051 hasConcept C114614502 @default.
- W2157668051 hasConcept C134306372 @default.
- W2157668051 hasConcept C138885662 @default.
- W2157668051 hasConcept C154945302 @default.
- W2157668051 hasConcept C166957645 @default.
- W2157668051 hasConcept C171250308 @default.
- W2157668051 hasConcept C186644900 @default.
- W2157668051 hasConcept C192562407 @default.
- W2157668051 hasConcept C197947376 @default.
- W2157668051 hasConcept C204321447 @default.
- W2157668051 hasConcept C23123220 @default.
- W2157668051 hasConcept C2779530757 @default.
- W2157668051 hasConcept C33923547 @default.
- W2157668051 hasConcept C36503486 @default.
- W2157668051 hasConcept C41008148 @default.
- W2157668051 hasConcept C70388272 @default.
- W2157668051 hasConcept C75795011 @default.
- W2157668051 hasConcept C95457728 @default.
- W2157668051 hasConceptScore W2157668051C111472728 @default.
- W2157668051 hasConceptScore W2157668051C114614502 @default.
- W2157668051 hasConceptScore W2157668051C134306372 @default.
- W2157668051 hasConceptScore W2157668051C138885662 @default.
- W2157668051 hasConceptScore W2157668051C154945302 @default.
- W2157668051 hasConceptScore W2157668051C166957645 @default.
- W2157668051 hasConceptScore W2157668051C171250308 @default.
- W2157668051 hasConceptScore W2157668051C186644900 @default.
- W2157668051 hasConceptScore W2157668051C192562407 @default.
- W2157668051 hasConceptScore W2157668051C197947376 @default.
- W2157668051 hasConceptScore W2157668051C204321447 @default.
- W2157668051 hasConceptScore W2157668051C23123220 @default.
- W2157668051 hasConceptScore W2157668051C2779530757 @default.
- W2157668051 hasConceptScore W2157668051C33923547 @default.
- W2157668051 hasConceptScore W2157668051C36503486 @default.
- W2157668051 hasConceptScore W2157668051C41008148 @default.
- W2157668051 hasConceptScore W2157668051C70388272 @default.
- W2157668051 hasConceptScore W2157668051C75795011 @default.
- W2157668051 hasConceptScore W2157668051C95457728 @default.
- W2157668051 hasLocation W21576680511 @default.
- W2157668051 hasOpenAccess W2157668051 @default.
- W2157668051 hasPrimaryLocation W21576680511 @default.
- W2157668051 hasRelatedWork W1592893681 @default.
- W2157668051 hasRelatedWork W1806995473 @default.
- W2157668051 hasRelatedWork W1978971213 @default.
- W2157668051 hasRelatedWork W1992419927 @default.
- W2157668051 hasRelatedWork W2167662847 @default.
- W2157668051 hasRelatedWork W2502722637 @default.
- W2157668051 hasRelatedWork W3107474891 @default.
- W2157668051 hasRelatedWork W1551406738 @default.
- W2157668051 hasRelatedWork W2594281132 @default.
- W2157668051 hasRelatedWork W2594596051 @default.
- W2157668051 isParatext "false" @default.
- W2157668051 isRetracted "false" @default.
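
The entries above all follow the shape `- subject predicate object @graph.`, matching the `?p ?o ?g` pattern in the header line. As a minimal sketch, assuming that shape holds for every line, the listing could be split into quads as follows. The `Quad` type, `LINE_RE` pattern, and `parse_line` helper are hypothetical names introduced here for illustration. They are not part of SemOpenAlex or of any RDF library.

```python
import re
from typing import NamedTuple, Optional


class Quad(NamedTuple):
    """One row of the listing: subject, predicate, object, named graph."""
    subject: str
    predicate: str
    obj: str
    graph: str


# Assumed line shape: "- S P O @G." where O may contain spaces.
LINE_RE = re.compile(r'^-\s+(\S+)\s+(\S+)\s+(.*)\s+@(\S+?)\.?$')


def parse_line(line: str) -> Optional[Quad]:
    """Parse one listing line; return None for lines that do not match."""
    m = LINE_RE.match(line.strip())
    if m is None:
        return None
    subject, predicate, obj, graph = m.groups()
    # Literal objects are wrapped in double quotes; strip one outer pair.
    if len(obj) >= 2 and obj.startswith('"') and obj.endswith('"'):
        obj = obj[1:-1]
    return Quad(subject, predicate, obj, graph)


if __name__ == "__main__":
    print(parse_line('- W2157668051 citedByCount "9" @default.'))
```

Lines that do not start with `- `, such as the header, return `None`, so a caller can filter them out before grouping the quads by predicate.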