Matches in SemOpenAlex for { <https://semopenalex.org/work/W4288099416> ?p ?o ?g. }
Showing items 1 to 51 of 51 with 100 items per page.
- W4288099416 abstract "Sentence embeddings are commonly used in text clustering and semantic retrieval tasks. State-of-the-art sentence representation methods are based on artificial neural networks fine-tuned on large collections of manually labeled sentence pairs. Sufficient amount of annotated data is available for high-resource languages such as English or Chinese. In less popular languages, multilingual models have to be used, which offer lower performance. In this publication, we address this problem by proposing a method for training effective language-specific sentence encoders without manually labeled data. Our approach is to automatically construct a dataset of paraphrase pairs from sentence-aligned bilingual text corpora. We then use the collected data to fine-tune a Transformer language model with an additional recurrent pooling layer. Our sentence encoder can be trained in less than a day on a single graphics card, achieving high performance on a diverse set of sentence-level tasks. We evaluate our method on eight linguistic tasks in Polish, comparing it with the best available multilingual sentence encoders." @default.
- W4288099416 created "2022-07-28" @default.
- W4288099416 creator A5037566627 @default.
- W4288099416 date "2022-07-26" @default.
- W4288099416 modified "2023-10-18" @default.
- W4288099416 title "Training Effective Neural Sentence Encoders from Automatically Mined Paraphrases" @default.
- W4288099416 doi "https://doi.org/10.48550/arxiv.2207.12759" @default.
- W4288099416 hasPublicationYear "2022" @default.
- W4288099416 type Work @default.
- W4288099416 citedByCount "0" @default.
- W4288099416 crossrefType "posted-content" @default.
- W4288099416 hasAuthorship W4288099416A5037566627 @default.
- W4288099416 hasBestOaLocation W42880994161 @default.
- W4288099416 hasConcept C111919701 @default.
- W4288099416 hasConcept C118505674 @default.
- W4288099416 hasConcept C121332964 @default.
- W4288099416 hasConcept C154945302 @default.
- W4288099416 hasConcept C165801399 @default.
- W4288099416 hasConcept C204321447 @default.
- W4288099416 hasConcept C2777530160 @default.
- W4288099416 hasConcept C2780922921 @default.
- W4288099416 hasConcept C41008148 @default.
- W4288099416 hasConcept C62520636 @default.
- W4288099416 hasConcept C66322947 @default.
- W4288099416 hasConceptScore W4288099416C111919701 @default.
- W4288099416 hasConceptScore W4288099416C118505674 @default.
- W4288099416 hasConceptScore W4288099416C121332964 @default.
- W4288099416 hasConceptScore W4288099416C154945302 @default.
- W4288099416 hasConceptScore W4288099416C165801399 @default.
- W4288099416 hasConceptScore W4288099416C204321447 @default.
- W4288099416 hasConceptScore W4288099416C2777530160 @default.
- W4288099416 hasConceptScore W4288099416C2780922921 @default.
- W4288099416 hasConceptScore W4288099416C41008148 @default.
- W4288099416 hasConceptScore W4288099416C62520636 @default.
- W4288099416 hasConceptScore W4288099416C66322947 @default.
- W4288099416 hasLocation W42880994161 @default.
- W4288099416 hasOpenAccess W4288099416 @default.
- W4288099416 hasPrimaryLocation W42880994161 @default.
- W4288099416 hasRelatedWork W11511616 @default.
- W4288099416 hasRelatedWork W12732426 @default.
- W4288099416 hasRelatedWork W2061806 @default.
- W4288099416 hasRelatedWork W2155528 @default.
- W4288099416 hasRelatedWork W8300060 @default.
- W4288099416 hasRelatedWork W867563 @default.
- W4288099416 hasRelatedWork W8895266 @default.
- W4288099416 hasRelatedWork W8912579 @default.
- W4288099416 hasRelatedWork W7571534 @default.
- W4288099416 hasRelatedWork W8411197 @default.
- W4288099416 isParatext "false" @default.
- W4288099416 isRetracted "false" @default.
- W4288099416 workType "article" @default.