Matches in SemOpenAlex for { <https://semopenalex.org/work/W4317940408> ?p ?o ?g. }
Showing items 1 to 73 of 73 with 100 items per page.
- W4317940408 abstract "We study object interaction anticipation in egocentric videos. This task requires an understanding of the spatiotemporal context formed by past actions on objects, coined action context. We propose TransFusion, a multimodal transformer-based architecture. It exploits the representational power of language by summarising the action context. TransFusion leverages pre-trained image captioning and vision-language models to extract the action context from past video frames. This action context together with the next video frame is processed by the multimodal fusion module to forecast the next object interaction. Our model enables more efficient end-to-end learning. The large pre-trained language models add common sense and a generalisation capability. Experiments on Ego4D and EPIC-KITCHENS-100 show the effectiveness of our multimodal fusion model. They also highlight the benefits of using language-based context summaries in a task where vision seems to suffice. Our method outperforms state-of-the-art approaches by 40.4% in relative terms in overall mAP on the Ego4D test set. We validate the effectiveness of TransFusion via experiments on EPIC-KITCHENS-100. Video and code are available at https://eth-ait.github.io/transfusion-proj/." @default.
- W4317940408 created "2023-01-25" @default.
- W4317940408 creator A5011706284 @default.
- W4317940408 creator A5017751534 @default.
- W4317940408 creator A5025000908 @default.
- W4317940408 creator A5077068949 @default.
- W4317940408 creator A5087296750 @default.
- W4317940408 date "2023-01-22" @default.
- W4317940408 modified "2023-10-17" @default.
- W4317940408 title "Summarize the Past to Predict the Future: Natural Language Descriptions of Context Boost Multimodal Object Interaction" @default.
- W4317940408 doi "https://doi.org/10.48550/arxiv.2301.09209" @default.
- W4317940408 hasPublicationYear "2023" @default.
- W4317940408 type Work @default.
- W4317940408 citedByCount "0" @default.
- W4317940408 crossrefType "posted-content" @default.
- W4317940408 hasAuthorship W4317940408A5011706284 @default.
- W4317940408 hasAuthorship W4317940408A5017751534 @default.
- W4317940408 hasAuthorship W4317940408A5025000908 @default.
- W4317940408 hasAuthorship W4317940408A5077068949 @default.
- W4317940408 hasAuthorship W4317940408A5087296750 @default.
- W4317940408 hasBestOaLocation W43179404081 @default.
- W4317940408 hasConcept C107457646 @default.
- W4317940408 hasConcept C115961682 @default.
- W4317940408 hasConcept C121332964 @default.
- W4317940408 hasConcept C151730666 @default.
- W4317940408 hasConcept C154945302 @default.
- W4317940408 hasConcept C157657479 @default.
- W4317940408 hasConcept C162324750 @default.
- W4317940408 hasConcept C183322885 @default.
- W4317940408 hasConcept C187736073 @default.
- W4317940408 hasConcept C195324797 @default.
- W4317940408 hasConcept C204321447 @default.
- W4317940408 hasConcept C2779343474 @default.
- W4317940408 hasConcept C2780451532 @default.
- W4317940408 hasConcept C2780791683 @default.
- W4317940408 hasConcept C2781238097 @default.
- W4317940408 hasConcept C41008148 @default.
- W4317940408 hasConcept C62520636 @default.
- W4317940408 hasConcept C86803240 @default.
- W4317940408 hasConceptScore W4317940408C107457646 @default.
- W4317940408 hasConceptScore W4317940408C115961682 @default.
- W4317940408 hasConceptScore W4317940408C121332964 @default.
- W4317940408 hasConceptScore W4317940408C151730666 @default.
- W4317940408 hasConceptScore W4317940408C154945302 @default.
- W4317940408 hasConceptScore W4317940408C157657479 @default.
- W4317940408 hasConceptScore W4317940408C162324750 @default.
- W4317940408 hasConceptScore W4317940408C183322885 @default.
- W4317940408 hasConceptScore W4317940408C187736073 @default.
- W4317940408 hasConceptScore W4317940408C195324797 @default.
- W4317940408 hasConceptScore W4317940408C204321447 @default.
- W4317940408 hasConceptScore W4317940408C2779343474 @default.
- W4317940408 hasConceptScore W4317940408C2780451532 @default.
- W4317940408 hasConceptScore W4317940408C2780791683 @default.
- W4317940408 hasConceptScore W4317940408C2781238097 @default.
- W4317940408 hasConceptScore W4317940408C41008148 @default.
- W4317940408 hasConceptScore W4317940408C62520636 @default.
- W4317940408 hasConceptScore W4317940408C86803240 @default.
- W4317940408 hasLocation W43179404081 @default.
- W4317940408 hasOpenAccess W4317940408 @default.
- W4317940408 hasPrimaryLocation W43179404081 @default.
- W4317940408 hasRelatedWork W1504101963 @default.
- W4317940408 hasRelatedWork W2081647779 @default.
- W4317940408 hasRelatedWork W2335758940 @default.
- W4317940408 hasRelatedWork W2735824434 @default.
- W4317940408 hasRelatedWork W2783312365 @default.
- W4317940408 hasRelatedWork W3090988983 @default.
- W4317940408 hasRelatedWork W3185852197 @default.
- W4317940408 hasRelatedWork W4200486724 @default.
- W4317940408 hasRelatedWork W4289293977 @default.
- W4317940408 hasRelatedWork W800264538 @default.
- W4317940408 isParatext "false" @default.
- W4317940408 isRetracted "false" @default.
- W4317940408 workType "article" @default.