Matches in SemOpenAlex for { <https://semopenalex.org/work/W3162962341> ?p ?o ?g. }
- W3162962341 abstract "Learning code representations has found many uses in software engineering, such as code classification, code search, comment generation, and bug prediction. Although representations of code as tokens, syntax trees, dependency graphs, paths in trees, or combinations of these variants have been proposed, existing learning techniques have a major limitation: the models are often trained on datasets labeled for specific downstream tasks, so the code representations may not be suitable for other tasks. Even though some techniques generate representations from unlabeled code, they are far from satisfactory when applied to downstream tasks. To overcome this limitation, this paper proposes InferCode, which adapts the self-supervised learning idea from natural language processing to the abstract syntax trees (ASTs) of code. The novelty lies in training code representations by predicting subtrees automatically identified from the contexts of ASTs. With InferCode, subtrees in ASTs are treated as the labels for training the code representations without any human labelling effort or the overhead of expensive graph construction, and the trained representations are no longer tied to any specific downstream tasks or code units. We have trained an instance of the InferCode model using a Tree-Based Convolutional Neural Network (TBCNN) as the encoder on a large set of Java code. This pre-trained model can then be applied to downstream unsupervised tasks such as code clustering, code clone detection, and cross-language code search, or be reused under a transfer learning scheme to continue training the model weights for supervised tasks such as code classification and method name prediction. Compared to prior techniques applied to the same downstream tasks, such as code2vec, code2seq, and ASTNN, our pre-trained InferCode model achieves higher performance by a significant margin on most of the tasks, including those involving different programming languages. The implementation of InferCode and the trained embeddings are available at https://github.com/bdqnghi/infercode." @default.
- W3162962341 created "2021-05-24" @default.
- W3162962341 creator A5012201218 @default.
- W3162962341 creator A5020785018 @default.
- W3162962341 creator A5033649567 @default.
- W3162962341 date "2021-05-01" @default.
- W3162962341 modified "2023-10-16" @default.
- W3162962341 title "InferCode: Self-Supervised Learning of Code Representations by Predicting Subtrees" @default.
- W3162962341 cites W1523794535 @default.
- W3162962341 cites W1966948031 @default.
- W3162962341 cites W2000431947 @default.
- W3162962341 cites W2079887492 @default.
- W3162962341 cites W2128782367 @default.
- W3162962341 cites W2131774270 @default.
- W3162962341 cites W2136922672 @default.
- W3162962341 cites W2161160262 @default.
- W3162962341 cites W2282866165 @default.
- W3162962341 cites W2511803001 @default.
- W3162962341 cites W2735043505 @default.
- W3162962341 cites W2741705590 @default.
- W3162962341 cites W2743316948 @default.
- W3162962341 cites W2794601162 @default.
- W3162962341 cites W2795013376 @default.
- W3162962341 cites W2805788202 @default.
- W3162962341 cites W2806718802 @default.
- W3162962341 cites W2883359218 @default.
- W3162962341 cites W2884276923 @default.
- W3162962341 cites W2888557792 @default.
- W3162962341 cites W2913939497 @default.
- W3162962341 cites W2937836435 @default.
- W3162962341 cites W2954552517 @default.
- W3162962341 cites W2955426500 @default.
- W3162962341 cites W2962824366 @default.
- W3162962341 cites W2962894772 @default.
- W3162962341 cites W2963502184 @default.
- W3162962341 cites W2963804993 @default.
- W3162962341 cites W2963814513 @default.
- W3162962341 cites W2963872035 @default.
- W3162962341 cites W2964150020 @default.
- W3162962341 cites W2994865335 @default.
- W3162962341 cites W2999760805 @default.
- W3162962341 cites W3043078865 @default.
- W3162962341 cites W3105535951 @default.
- W3162962341 cites W3121414853 @default.
- W3162962341 cites W4301168982 @default.
- W3162962341 cites W2945377946 @default.
- W3162962341 doi "https://doi.org/10.1109/icse43902.2021.00109" @default.
- W3162962341 hasPublicationYear "2021" @default.
- W3162962341 type Work @default.
- W3162962341 sameAs 3162962341 @default.
- W3162962341 citedByCount "49" @default.
- W3162962341 countsByYear W31629623412020 @default.
- W3162962341 countsByYear W31629623412021 @default.
- W3162962341 countsByYear W31629623412022 @default.
- W3162962341 countsByYear W31629623412023 @default.
- W3162962341 crossrefType "proceedings-article" @default.
- W3162962341 hasAuthorship W3162962341A5012201218 @default.
- W3162962341 hasAuthorship W3162962341A5020785018 @default.
- W3162962341 hasAuthorship W3162962341A5033649567 @default.
- W3162962341 hasBestOaLocation W31629623412 @default.
- W3162962341 hasConcept C119857082 @default.
- W3162962341 hasConcept C137287247 @default.
- W3162962341 hasConcept C150292731 @default.
- W3162962341 hasConcept C154945302 @default.
- W3162962341 hasConcept C177264268 @default.
- W3162962341 hasConcept C195324797 @default.
- W3162962341 hasConcept C199360897 @default.
- W3162962341 hasConcept C204321447 @default.
- W3162962341 hasConcept C2776760102 @default.
- W3162962341 hasConcept C2777904410 @default.
- W3162962341 hasConcept C41008148 @default.
- W3162962341 hasConcept C43126263 @default.
- W3162962341 hasConcept C529173508 @default.
- W3162962341 hasConcept C58646249 @default.
- W3162962341 hasConcept C60048249 @default.
- W3162962341 hasConceptScore W3162962341C119857082 @default.
- W3162962341 hasConceptScore W3162962341C137287247 @default.
- W3162962341 hasConceptScore W3162962341C150292731 @default.
- W3162962341 hasConceptScore W3162962341C154945302 @default.
- W3162962341 hasConceptScore W3162962341C177264268 @default.
- W3162962341 hasConceptScore W3162962341C195324797 @default.
- W3162962341 hasConceptScore W3162962341C199360897 @default.
- W3162962341 hasConceptScore W3162962341C204321447 @default.
- W3162962341 hasConceptScore W3162962341C2776760102 @default.
- W3162962341 hasConceptScore W3162962341C2777904410 @default.
- W3162962341 hasConceptScore W3162962341C41008148 @default.
- W3162962341 hasConceptScore W3162962341C43126263 @default.
- W3162962341 hasConceptScore W3162962341C529173508 @default.
- W3162962341 hasConceptScore W3162962341C58646249 @default.
- W3162962341 hasConceptScore W3162962341C60048249 @default.
- W3162962341 hasFunder F4320311687 @default.
- W3162962341 hasFunder F4320320006 @default.
- W3162962341 hasLocation W31629623411 @default.
- W3162962341 hasLocation W31629623412 @default.
- W3162962341 hasLocation W31629623413 @default.
- W3162962341 hasLocation W31629623414 @default.
- W3162962341 hasOpenAccess W3162962341 @default.
- W3162962341 hasPrimaryLocation W31629623411 @default.
- W3162962341 hasRelatedWork W20188161 @default.
- W3162962341 hasRelatedWork W2097696338 @default.