Matches in SemOpenAlex for { <https://semopenalex.org/work/W3086007799> ?p ?o ?g. }
- W3086007799 abstract "Pre-trained models for programming language have achieved dramatic empirical improvements on a variety of code-related tasks such as code search, code completion, code summarization, etc. However, existing pre-trained models regard a code snippet as a sequence of tokens, while ignoring the inherent structure of code, which provides crucial code semantics and would enhance the code understanding process. We present GraphCodeBERT, a pre-trained model for programming language that considers the inherent structure of code. Instead of taking syntactic-level structure of code like abstract syntax tree (AST), we use data flow in the pre-training stage, which is a semantic-level structure of code that encodes the relation of where-the-value-comes-from between variables. Such a semantic-level structure is neat and does not bring an unnecessarily deep hierarchy of AST, the property of which makes the model more efficient. We develop GraphCodeBERT based on Transformer. In addition to using the task of masked language modeling, we introduce two structure-aware pre-training tasks. One is to predict code structure edges, and the other is to align representations between source code and code structure. We implement the model in an efficient way with a graph-guided masked attention function to incorporate the code structure. We evaluate our model on four tasks, including code search, clone detection, code translation, and code refinement. Results show that code structure and newly introduced pre-training tasks can improve GraphCodeBERT and achieves state-of-the-art performance on the four downstream tasks. We further show that the model prefers structure-level attentions over token-level attentions in the task of code search." @default.
- W3086007799 created "2020-09-21" @default.
- W3086007799 creator A5004928013 @default.
- W3086007799 creator A5007059246 @default.
- W3086007799 creator A5007389688 @default.
- W3086007799 creator A5010323000 @default.
- W3086007799 creator A5017205177 @default.
- W3086007799 creator A5020154435 @default.
- W3086007799 creator A5021352035 @default.
- W3086007799 creator A5028710085 @default.
- W3086007799 creator A5031659418 @default.
- W3086007799 creator A5034770439 @default.
- W3086007799 creator A5042018181 @default.
- W3086007799 creator A5042646731 @default.
- W3086007799 creator A5052381023 @default.
- W3086007799 creator A5052842216 @default.
- W3086007799 creator A5059021264 @default.
- W3086007799 creator A5060116992 @default.
- W3086007799 creator A5060364305 @default.
- W3086007799 creator A5085284883 @default.
- W3086007799 date "2020-09-17" @default.
- W3086007799 modified "2023-10-06" @default.
- W3086007799 title "GraphCodeBERT: Pre-training Code Representations with Data Flow" @default.
- W3086007799 cites W1972141422 @default.
- W3086007799 cites W2065053490 @default.
- W3086007799 cites W2074032109 @default.
- W3086007799 cites W2083878868 @default.
- W3086007799 cites W2128782367 @default.
- W3086007799 cites W2153653739 @default.
- W3086007799 cites W2162006472 @default.
- W3086007799 cites W2247864914 @default.
- W3086007799 cites W2511803001 @default.
- W3086007799 cites W2610002206 @default.
- W3086007799 cites W2741705590 @default.
- W3086007799 cites W2787560479 @default.
- W3086007799 cites W2804660315 @default.
- W3086007799 cites W2884276923 @default.
- W3086007799 cites W2887364112 @default.
- W3086007799 cites W2890194927 @default.
- W3086007799 cites W2892197424 @default.
- W3086007799 cites W2914120296 @default.
- W3086007799 cites W2950813464 @default.
- W3086007799 cites W2953071394 @default.
- W3086007799 cites W2955426500 @default.
- W3086007799 cites W2963341956 @default.
- W3086007799 cites W2963403868 @default.
- W3086007799 cites W2963499994 @default.
- W3086007799 cites W2963617989 @default.
- W3086007799 cites W2965373594 @default.
- W3086007799 cites W2972082064 @default.
- W3086007799 cites W2973529529 @default.
- W3086007799 cites W2981852735 @default.
- W3086007799 cites W2995259046 @default.
- W3086007799 cites W2995333547 @default.
- W3086007799 cites W2996086147 @default.
- W3086007799 cites W2997275048 @default.
- W3086007799 cites W3008088841 @default.
- W3086007799 cites W3008282111 @default.
- W3086007799 cites W3008999375 @default.
- W3086007799 cites W3014797428 @default.
- W3086007799 cites W3025993830 @default.
- W3086007799 cites W3035300716 @default.
- W3086007799 cites W3035882142 @default.
- W3086007799 doi "https://doi.org/10.48550/arxiv.2009.08366" @default.
- W3086007799 hasPublicationYear "2020" @default.
- W3086007799 type Work @default.
- W3086007799 sameAs 3086007799 @default.
- W3086007799 citedByCount "22" @default.
- W3086007799 countsByYear W30860077992020 @default.
- W3086007799 countsByYear W30860077992021 @default.
- W3086007799 countsByYear W30860077992023 @default.
- W3086007799 crossrefType "posted-content" @default.
- W3086007799 hasAuthorship W3086007799A5004928013 @default.
- W3086007799 hasAuthorship W3086007799A5007059246 @default.
- W3086007799 hasAuthorship W3086007799A5007389688 @default.
- W3086007799 hasAuthorship W3086007799A5010323000 @default.
- W3086007799 hasAuthorship W3086007799A5017205177 @default.
- W3086007799 hasAuthorship W3086007799A5020154435 @default.
- W3086007799 hasAuthorship W3086007799A5021352035 @default.
- W3086007799 hasAuthorship W3086007799A5028710085 @default.
- W3086007799 hasAuthorship W3086007799A5031659418 @default.
- W3086007799 hasAuthorship W3086007799A5034770439 @default.
- W3086007799 hasAuthorship W3086007799A5042018181 @default.
- W3086007799 hasAuthorship W3086007799A5042646731 @default.
- W3086007799 hasAuthorship W3086007799A5052381023 @default.
- W3086007799 hasAuthorship W3086007799A5052842216 @default.
- W3086007799 hasAuthorship W3086007799A5059021264 @default.
- W3086007799 hasAuthorship W3086007799A5060116992 @default.
- W3086007799 hasAuthorship W3086007799A5060364305 @default.
- W3086007799 hasAuthorship W3086007799A5085284883 @default.
- W3086007799 hasBestOaLocation W30860077991 @default.
- W3086007799 hasConcept C11413529 @default.
- W3086007799 hasConcept C133162039 @default.
- W3086007799 hasConcept C151578736 @default.
- W3086007799 hasConcept C154945302 @default.
- W3086007799 hasConcept C162319229 @default.
- W3086007799 hasConcept C163797641 @default.
- W3086007799 hasConcept C170858558 @default.
- W3086007799 hasConcept C177264268 @default.
- W3086007799 hasConcept C186644900 @default.
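The triples above are the bindings for the `?p ?o` pattern shown at the top of this dump. As a minimal sketch of how such a dump could be reproduced programmatically, the snippet below rebuilds that query for a given work ID and shows how it might be POSTed to a SPARQL endpoint. The endpoint URL (`https://semopenalex.org/sparql`) is an assumption and should be verified; the query-building itself only mirrors the pattern `{ <https://semopenalex.org/work/W3086007799> ?p ?o . }` from the first line.

```python
# Sketch: rebuilding the SPARQL query behind this dump and (optionally)
# running it. Uses only the Python standard library.
import json
import urllib.parse
import urllib.request

# Assumed public SemOpenAlex SPARQL endpoint -- verify before relying on it.
ENDPOINT = "https://semopenalex.org/sparql"

def build_query(work_id: str) -> str:
    """Build the triple-pattern query for one semopenalex.org work."""
    return (
        "SELECT ?p ?o WHERE { "
        f"<https://semopenalex.org/work/{work_id}> ?p ?o . "
        "}"
    )

def run_query(query: str) -> list:
    """POST the query and return parsed JSON bindings (requires network)."""
    data = urllib.parse.urlencode({"query": query}).encode()
    req = urllib.request.Request(
        ENDPOINT,
        data=data,
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["results"]["bindings"]

# Build the query for the work in this dump; run_query(query) would
# return one binding per triple listed above.
query = build_query("W3086007799")
```

Each binding in the JSON result carries the predicate (`p`, e.g. `cites`, `hasAuthorship`) and object (`o`) that appear as the second and third columns of the listing above.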