Matches in SemOpenAlex for { <https://semopenalex.org/work/W4282943820> ?p ?o ?g. }
- W4282943820 endingPage "1" @default.
- W4282943820 startingPage "1" @default.
- W4282943820 abstract "Human action recognition (HAR) in RGB-D videos has been widely investigated since the release of affordable depth sensors. Currently, unimodal approaches (e.g., skeleton-based and RGB video-based) have realized substantial improvements with increasingly large datasets. However, multimodal methods, specifically those with model-level fusion, have seldom been investigated. In this article, we propose a model-based multimodal network (MMNet) that fuses skeleton and RGB modalities via a model-based approach. The objective of our method is to improve ensemble recognition accuracy by effectively applying mutually complementary information from different data modalities. For the model-based fusion scheme, we use a spatiotemporal graph convolution network for the skeleton modality to learn attention weights that will be transferred to the network of the RGB modality. Extensive experiments are conducted on five benchmark datasets: NTU RGB+D 60, NTU RGB+D 120, PKU-MMD, Northwestern-UCLA Multiview, and Toyota Smarthome. Upon aggregating the results of multiple modalities, our method is found to outperform state-of-the-art approaches on six evaluation protocols of the five datasets; thus, the proposed MMNet can effectively capture mutually complementary features in different RGB-D video modalities and provide more discriminative features for HAR. We also tested MMNet on Kinetics 400, an RGB video dataset that contains more outdoor actions, and the results are consistent with those on the RGB-D video datasets." @default.
- W4282943820 created "2022-06-16" @default.
- W4282943820 creator A5021293751 @default.
- W4282943820 creator A5027453740 @default.
- W4282943820 creator A5031804038 @default.
- W4282943820 creator A5086801574 @default.
- W4282943820 creator A5091608695 @default.
- W4282943820 date "2022-01-01" @default.
- W4282943820 modified "2023-09-28" @default.
- W4282943820 title "MMNet: A Model-based Multimodal Network for Human Action Recognition in RGB-D Videos" @default.
- W4282943820 cites W1522734439 @default.
- W4282943820 cites W1758575531 @default.
- W4282943820 cites W1926974744 @default.
- W4282943820 cites W1983705368 @default.
- W4282943820 cites W1986847617 @default.
- W4282943820 cites W1989665047 @default.
- W4282943820 cites W2014601643 @default.
- W4282943820 cites W2048821851 @default.
- W4282943820 cites W2054041160 @default.
- W4282943820 cites W2054780155 @default.
- W4282943820 cites W2056339039 @default.
- W4282943820 cites W2057232399 @default.
- W4282943820 cites W2085735683 @default.
- W4282943820 cites W2106996050 @default.
- W4282943820 cites W2126579184 @default.
- W4282943820 cites W2143267104 @default.
- W4282943820 cites W2183341477 @default.
- W4282943820 cites W2194775991 @default.
- W4282943820 cites W2270470215 @default.
- W4282943820 cites W2295038166 @default.
- W4282943820 cites W2296311849 @default.
- W4282943820 cites W2309561466 @default.
- W4282943820 cites W2341313195 @default.
- W4282943820 cites W2416798379 @default.
- W4282943820 cites W2510185399 @default.
- W4282943820 cites W2559085405 @default.
- W4282943820 cites W2593146028 @default.
- W4282943820 cites W2603861860 @default.
- W4282943820 cites W2619383789 @default.
- W4282943820 cites W2716916105 @default.
- W4282943820 cites W2736191430 @default.
- W4282943820 cites W2736334449 @default.
- W4282943820 cites W2746726611 @default.
- W4282943820 cites W2765433083 @default.
- W4282943820 cites W2778523960 @default.
- W4282943820 cites W2792140610 @default.
- W4282943820 cites W2793547936 @default.
- W4282943820 cites W2798644314 @default.
- W4282943820 cites W2799211965 @default.
- W4282943820 cites W2809440904 @default.
- W4282943820 cites W2883630736 @default.
- W4282943820 cites W2914177721 @default.
- W4282943820 cites W2940457086 @default.
- W4282943820 cites W2944006115 @default.
- W4282943820 cites W2948058585 @default.
- W4282943820 cites W2948246283 @default.
- W4282943820 cites W2962688385 @default.
- W4282943820 cites W2963076818 @default.
- W4282943820 cites W2963273301 @default.
- W4282943820 cites W2963369114 @default.
- W4282943820 cites W2963465695 @default.
- W4282943820 cites W2963524571 @default.
- W4282943820 cites W2963901033 @default.
- W4282943820 cites W2964134613 @default.
- W4282943820 cites W2969776042 @default.
- W4282943820 cites W2972327058 @default.
- W4282943820 cites W2981923053 @default.
- W4282943820 cites W2984192355 @default.
- W4282943820 cites W3002271958 @default.
- W4282943820 cites W3035225512 @default.
- W4282943820 cites W3098538019 @default.
- W4282943820 cites W3106932221 @default.
- W4282943820 cites W3109392981 @default.
- W4282943820 cites W3110266941 @default.
- W4282943820 cites W3120967848 @default.
- W4282943820 cites W3123784868 @default.
- W4282943820 cites W3138516171 @default.
- W4282943820 cites W3203634062 @default.
- W4282943820 cites W4226079235 @default.
- W4282943820 cites W4230005465 @default.
- W4282943820 doi "https://doi.org/10.1109/tpami.2022.3177813" @default.
- W4282943820 hasPubMedId "https://pubmed.ncbi.nlm.nih.gov/35617191" @default.
- W4282943820 hasPublicationYear "2022" @default.
- W4282943820 type Work @default.
- W4282943820 citedByCount "2" @default.
- W4282943820 countsByYear W42829438202023 @default.
- W4282943820 crossrefType "journal-article" @default.
- W4282943820 hasAuthorship W4282943820A5021293751 @default.
- W4282943820 hasAuthorship W4282943820A5027453740 @default.
- W4282943820 hasAuthorship W4282943820A5031804038 @default.
- W4282943820 hasAuthorship W4282943820A5086801574 @default.
- W4282943820 hasAuthorship W4282943820A5091608695 @default.
- W4282943820 hasConcept C108583219 @default.
- W4282943820 hasConcept C13280743 @default.
- W4282943820 hasConcept C144024400 @default.
- W4282943820 hasConcept C153180895 @default.
- W4282943820 hasConcept C154945302 @default.
- W4282943820 hasConcept C185798385 @default.