Matches in SemOpenAlex for { <https://semopenalex.org/work/W4308152639> ?p ?o ?g. }
Showing items 1 to 76 of 76 with 100 items per page.
- W4308152639 endingPage "109135" @default.
- W4308152639 startingPage "109135" @default.
- W4308152639 abstract "• Obtaining representations with different properties for the two sub-tasks (action classification and action localization) of temporal action detection leads to better performance. • Attention maps based on feature similarity can enhance intra-class homogeneity and inter-class heterogeneity of video representation for action classification sub-task. • Adaptive receptive fields based on actions facilitate precise action localization. Video representation is of significant importance for temporal action detection. The two sub-tasks of temporal action detection, i.e., action classification and action localization, have different requirements for video representation. Specifically, action classification requires video representations to be highly discriminative, so that action features and background features are as dissimilar as possible. For action localization, it is crucial to obtain information about the action itself and the surrounding context for accurate prediction of action boundaries. However, the previous methods failed to extract the optimal representations for the two sub-tasks, whose representations for both sub-tasks are obtained in a similar way. In this paper, a Global-Local Attention (GLA) mechanism is proposed to produce a more powerful video representation for temporal action detection without introducing additional parameters. The global attention mechanism predicts each action category by integrating features in the entire video that are similar to the action while suppressing other features, thus enhancing the discriminability of video representation during the training process. The local attention mechanism uses a Gaussian weighting function to integrate each action and its surrounding contextual information, thereby enabling precise localization of the action. The effectiveness of GLA is demonstrated on THUMOS’14 and ActivityNet-1.3 with a simple one-stage action detection network, achieving state-of-the-art performance among the methods using only RGB images as input. The inference speed of the proposed model reaches 1373 FPS on a single Nvidia Titan Xp GPU. The generalizability of GLA to other detection architectures is verified using R-C3D and Decouple-SSAD, both of which achieve consistent improvements. The experimental results demonstrate that designing representations with different properties for the two sub-tasks leads to better performance for temporal action detection compared to the representations obtained in a similar way." @default.
- W4308152639 created "2022-11-08" @default.
- W4308152639 creator A5010189817 @default.
- W4308152639 creator A5032808299 @default.
- W4308152639 creator A5037855900 @default.
- W4308152639 creator A5075078021 @default.
- W4308152639 creator A5076462869 @default.
- W4308152639 creator A5077260423 @default.
- W4308152639 date "2023-02-01" @default.
- W4308152639 modified "2023-09-23" @default.
- W4308152639 title "Video representation learning for temporal action detection using global-local attention" @default.
- W4308152639 cites W2963400312 @default.
- W4308152639 cites W2971915722 @default.
- W4308152639 cites W2985134635 @default.
- W4308152639 cites W3007900690 @default.
- W4308152639 cites W3037147829 @default.
- W4308152639 cites W3196414989 @default.
- W4308152639 cites W4223556858 @default.
- W4308152639 cites W4224300071 @default.
- W4308152639 doi "https://doi.org/10.1016/j.patcog.2022.109135" @default.
- W4308152639 hasPublicationYear "2023" @default.
- W4308152639 type Work @default.
- W4308152639 citedByCount "2" @default.
- W4308152639 countsByYear W43081526392023 @default.
- W4308152639 crossrefType "journal-article" @default.
- W4308152639 hasAuthorship W4308152639A5010189817 @default.
- W4308152639 hasAuthorship W4308152639A5032808299 @default.
- W4308152639 hasAuthorship W4308152639A5037855900 @default.
- W4308152639 hasAuthorship W4308152639A5075078021 @default.
- W4308152639 hasAuthorship W4308152639A5076462869 @default.
- W4308152639 hasAuthorship W4308152639A5077260423 @default.
- W4308152639 hasConcept C121332964 @default.
- W4308152639 hasConcept C153180895 @default.
- W4308152639 hasConcept C154945302 @default.
- W4308152639 hasConcept C17744445 @default.
- W4308152639 hasConcept C199539241 @default.
- W4308152639 hasConcept C2776359362 @default.
- W4308152639 hasConcept C2777212361 @default.
- W4308152639 hasConcept C2780791683 @default.
- W4308152639 hasConcept C2987834672 @default.
- W4308152639 hasConcept C31972630 @default.
- W4308152639 hasConcept C41008148 @default.
- W4308152639 hasConcept C62520636 @default.
- W4308152639 hasConcept C94625758 @default.
- W4308152639 hasConceptScore W4308152639C121332964 @default.
- W4308152639 hasConceptScore W4308152639C153180895 @default.
- W4308152639 hasConceptScore W4308152639C154945302 @default.
- W4308152639 hasConceptScore W4308152639C17744445 @default.
- W4308152639 hasConceptScore W4308152639C199539241 @default.
- W4308152639 hasConceptScore W4308152639C2776359362 @default.
- W4308152639 hasConceptScore W4308152639C2777212361 @default.
- W4308152639 hasConceptScore W4308152639C2780791683 @default.
- W4308152639 hasConceptScore W4308152639C2987834672 @default.
- W4308152639 hasConceptScore W4308152639C31972630 @default.
- W4308152639 hasConceptScore W4308152639C41008148 @default.
- W4308152639 hasConceptScore W4308152639C62520636 @default.
- W4308152639 hasConceptScore W4308152639C94625758 @default.
- W4308152639 hasLocation W43081526391 @default.
- W4308152639 hasOpenAccess W4308152639 @default.
- W4308152639 hasPrimaryLocation W43081526391 @default.
- W4308152639 hasRelatedWork W1981202246 @default.
- W4308152639 hasRelatedWork W2007815619 @default.
- W4308152639 hasRelatedWork W2013076218 @default.
- W4308152639 hasRelatedWork W2100518354 @default.
- W4308152639 hasRelatedWork W2726222394 @default.
- W4308152639 hasRelatedWork W2752217129 @default.
- W4308152639 hasRelatedWork W2941155331 @default.
- W4308152639 hasRelatedWork W3106494386 @default.
- W4308152639 hasRelatedWork W3131297908 @default.
- W4308152639 hasRelatedWork W66091190 @default.
- W4308152639 hasVolume "134" @default.
- W4308152639 isParatext "false" @default.
- W4308152639 isRetracted "false" @default.
- W4308152639 workType "article" @default.
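
The abstract above describes two parameter-free pooling ideas: a global attention that weights features by similarity, and a local attention that weights an action's neighbourhood with a Gaussian. The following is a minimal Python sketch of those two ideas only, not the paper's actual formulation. The function names, the use of cosine similarity with a softmax, and the `center`/`sigma` parameters are all assumptions made for illustration.

```python
import math


def gaussian_weights(length, center, sigma):
    """Normalized Gaussian weights over temporal positions 0..length-1.

    `center` and `sigma` are hypothetical stand-ins for an action's
    temporal midpoint and extent; the paper may define them differently.
    """
    raw = [math.exp(-((t - center) ** 2) / (2.0 * sigma ** 2))
           for t in range(length)]
    total = sum(raw)
    return [w / total for w in raw]


def weighted_average(features, weights):
    """Weighted average of per-snippet feature vectors."""
    dim = len(features[0])
    return [sum(w * f[d] for w, f in zip(weights, features))
            for d in range(dim)]


def local_attention(features, center, sigma):
    """Pool an action and its surrounding context with Gaussian weights."""
    weights = gaussian_weights(len(features), center, sigma)
    return weighted_average(features, weights)


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def global_attention(features, query):
    """Pool the whole video, favouring snippets similar to `query`.

    Assumption: similarity is cosine and weights come from a softmax.
    """
    sims = [cosine(f, query) for f in features]
    peak = max(sims)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in sims]
    total = sum(exps)
    weights = [e / total for e in exps]
    return weighted_average(features, weights)
```

For example, `local_attention(features, center=2, sigma=1.0)` pools a sequence of per-snippet feature vectors with most weight on position 2 and smoothly decreasing weight on its neighbours, while `global_attention(features, query)` pools the full sequence with more weight on snippets that resemble `query`. Neither function has learnable parameters, which is consistent with the abstract's claim but says nothing about how the authors implement it.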