Matches in SemOpenAlex for { <https://semopenalex.org/work/W4367299629> ?p ?o ?g. } (see the query sketch after the listing)
- W4367299629 endingPage "2297" @default.
- W4367299629 startingPage "2267" @default.
- W4367299629 abstract "Abstract Advances in visual perceptual tasks have been mainly driven by the amount, and types, of annotations of large-scale datasets. Researchers have focused on fully-supervised settings to train models using offline epoch-based schemes. Despite the evident advancements, limitations and cost of manually annotated datasets have hindered further development for event perceptual tasks, such as detection and localization of objects and events in videos. The problem is more apparent in zoological applications due to the scarcity of annotations and length of videos-most videos are at most ten minutes long. Inspired by cognitive theories, we present a self-supervised perceptual prediction framework to tackle the problem of temporal event segmentation by building a stable representation of event-related objects. The approach is simple but effective. We rely on LSTM predictions of high-level features computed by a standard deep learning backbone. For spatial segmentation, the stable representation of the object is used by an attention mechanism to filter the input features before the prediction step. The self-learned attention maps effectively localize the object as a side effect of perceptual prediction. We demonstrate our approach on long videos from continuous wildlife video monitoring, spanning multiple days at 25 FPS. We aim to facilitate automated ethogramming by detecting and localizing events without the need for labels. Our approach is trained in an online manner on streaming input and requires only a single pass through the video, with no separate training set. Given the lack of long and realistic (includes real-world challenges) datasets, we introduce a new wildlife video dataset–nest monitoring of the Kagu (a flightless bird from New Caledonia)–to benchmark our approach. Our dataset features a video from 10 days (over 23 million frames) of continuous monitoring of the Kagu in its natural habitat. We annotate every frame with bounding boxes and event labels. Additionally, each frame is annotated with time-of-day and illumination conditions. We will make the dataset, which is the first of its kind, and the code available to the research community. We find that the approach significantly outperforms other self-supervised, traditional (e.g., Optical Flow, Background Subtraction) and NN-based (e.g., PA-DPC, DINO, iBOT), baselines and performs on par with supervised boundary detection approaches (i.e., PC). At a recall rate of 80%, our best performing model detects one false positive activity every 50 min of training. On average, we at least double the performance of self-supervised approaches for spatial segmentation. Additionally, we show that our approach is robust to various environmental conditions (e.g., moving shadows). We also benchmark the framework on other datasets (i.e., Kinetics-GEBD, TAPOS) from different domains to demonstrate its generalizability. The data and code are available on our project page: https://aix.eng.usf.edu/research_automated_ethogramming.html" @default.
- W4367299629 created "2023-04-29" @default.
- W4367299629 creator A5003595886 @default.
- W4367299629 creator A5020674314 @default.
- W4367299629 creator A5031973877 @default.
- W4367299629 creator A5054893915 @default.
- W4367299629 creator A5076176076 @default.
- W4367299629 date "2023-04-28" @default.
- W4367299629 modified "2023-10-06" @default.
- W4367299629 title "Towards Automated Ethogramming: Cognitively-Inspired Event Segmentation for Streaming Wildlife Video Monitoring" @default.
- W4367299629 cites W1902237438 @default.
- W4367299629 cites W2011605810 @default.
- W4367299629 cites W2022302343 @default.
- W4367299629 cites W2024244622 @default.
- W4367299629 cites W2024621423 @default.
- W4367299629 cites W2027972120 @default.
- W4367299629 cites W2031340449 @default.
- W4367299629 cites W2031489346 @default.
- W4367299629 cites W2035866593 @default.
- W4367299629 cites W2058176332 @default.
- W4367299629 cites W2064675550 @default.
- W4367299629 cites W2066859619 @default.
- W4367299629 cites W2106069155 @default.
- W4367299629 cites W2108710284 @default.
- W4367299629 cites W2141086994 @default.
- W4367299629 cites W2183341477 @default.
- W4367299629 cites W2395937778 @default.
- W4367299629 cites W2413367505 @default.
- W4367299629 cites W2477205648 @default.
- W4367299629 cites W2491875666 @default.
- W4367299629 cites W2550143307 @default.
- W4367299629 cites W2555859288 @default.
- W4367299629 cites W2581974582 @default.
- W4367299629 cites W2600081845 @default.
- W4367299629 cites W2607296803 @default.
- W4367299629 cites W2612694663 @default.
- W4367299629 cites W2740268464 @default.
- W4367299629 cites W2765148382 @default.
- W4367299629 cites W2795187948 @default.
- W4367299629 cites W2798560564 @default.
- W4367299629 cites W2902616437 @default.
- W4367299629 cites W2905502390 @default.
- W4367299629 cites W2913368959 @default.
- W4367299629 cites W2919306532 @default.
- W4367299629 cites W2951866553 @default.
- W4367299629 cites W2962677524 @default.
- W4367299629 cites W2962795934 @default.
- W4367299629 cites W2963499153 @default.
- W4367299629 cites W2963563276 @default.
- W4367299629 cites W2963739929 @default.
- W4367299629 cites W2964040984 @default.
- W4367299629 cites W2964167369 @default.
- W4367299629 cites W2964218552 @default.
- W4367299629 cites W2971364617 @default.
- W4367299629 cites W2983918066 @default.
- W4367299629 cites W2985871763 @default.
- W4367299629 cites W2985990843 @default.
- W4367299629 cites W2990943996 @default.
- W4367299629 cites W2991653934 @default.
- W4367299629 cites W3006150989 @default.
- W4367299629 cites W3009419647 @default.
- W4367299629 cites W3009442072 @default.
- W4367299629 cites W3009693647 @default.
- W4367299629 cites W3010874390 @default.
- W4367299629 cites W3034209137 @default.
- W4367299629 cites W3034309634 @default.
- W4367299629 cites W3034403061 @default.
- W4367299629 cites W3035096461 @default.
- W4367299629 cites W3035305184 @default.
- W4367299629 cites W3035551320 @default.
- W4367299629 cites W3038699438 @default.
- W4367299629 cites W3042434376 @default.
- W4367299629 cites W3082752470 @default.
- W4367299629 cites W3088916779 @default.
- W4367299629 cites W3093326439 @default.
- W4367299629 cites W3096609285 @default.
- W4367299629 cites W3097582741 @default.
- W4367299629 cites W3107128832 @default.
- W4367299629 cites W3109187938 @default.
- W4367299629 cites W3116651890 @default.
- W4367299629 cites W3118451559 @default.
- W4367299629 cites W3120214499 @default.
- W4367299629 cites W3120948247 @default.
- W4367299629 cites W3124314487 @default.
- W4367299629 cites W3128401539 @default.
- W4367299629 cites W3128849750 @default.
- W4367299629 cites W3139790935 @default.
- W4367299629 cites W3153832461 @default.
- W4367299629 cites W3159481202 @default.
- W4367299629 cites W3159619744 @default.
- W4367299629 cites W3159695738 @default.
- W4367299629 cites W3171007011 @default.
- W4367299629 cites W3173986289 @default.
- W4367299629 cites W3174852108 @default.
- W4367299629 cites W3176780013 @default.
- W4367299629 cites W3177295391 @default.
- W4367299629 cites W3199400162 @default.
- W4367299629 cites W3203071852 @default.
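The listing above can be reproduced programmatically. The following is a minimal sketch, assuming the public SemOpenAlex SPARQL endpoint at https://semopenalex.org/sparql and dropping the named-graph variable ?g for brevity; the endpoint URL and the result handling here are assumptions, not part of the listing itself.

```python
import json
import urllib.parse
import urllib.request

# The same triple pattern as in the header line, with the named-graph
# variable ?g omitted for brevity.
QUERY = """
SELECT ?p ?o WHERE {
  <https://semopenalex.org/work/W4367299629> ?p ?o .
}
"""

def fetch_matches(endpoint="https://semopenalex.org/sparql"):
    """Run the SELECT query and print one line per (predicate, object) match."""
    url = endpoint + "?" + urllib.parse.urlencode({"query": QUERY})
    req = urllib.request.Request(
        url, headers={"Accept": "application/sparql-results+json"}
    )
    with urllib.request.urlopen(req) as resp:
        results = json.load(resp)
    for binding in results["results"]["bindings"]:
        print(binding["p"]["value"], binding["o"]["value"])

if __name__ == "__main__":
    fetch_matches()
```

Each printed (predicate, object) pair corresponds to one `- W4367299629 ...` line in the listing above.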
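The abstract recorded above describes the method only at a high level (backbone features, attention-based filtering, LSTM prediction, and prediction error as the segmentation signal). The sketch below is a hypothetical illustration of that idea, not the authors' implementation; the toy backbone, module sizes, and training loop are all assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class PerceptualPredictor(nn.Module):
    """Toy perceptual-prediction model: attention-pooled backbone features are
    fed to an LSTM that predicts the next frame's pooled feature; large
    prediction errors are treated as candidate event boundaries."""

    def __init__(self, feat_dim=64, hidden_dim=128):
        super().__init__()
        # Stand-in backbone (the abstract mentions a standard deep learning backbone).
        self.backbone = nn.Sequential(
            nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, feat_dim, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.attn = nn.Conv2d(feat_dim, 1, 1)          # spatial attention logits
        self.lstm = nn.LSTMCell(feat_dim, hidden_dim)  # temporal predictor
        self.head = nn.Linear(hidden_dim, feat_dim)    # predicts the next feature

    def forward(self, frames):
        # frames: (T, 3, H, W) chunk of the stream.
        h = frames.new_zeros(1, self.lstm.hidden_size)
        c = frames.new_zeros(1, self.lstm.hidden_size)
        prev_pred, errors, attn_maps = None, [], []
        for t in range(frames.size(0)):
            fmap = self.backbone(frames[t:t + 1])                    # (1, C, h, w)
            logits = self.attn(fmap)                                 # (1, 1, h, w)
            attn = torch.softmax(logits.flatten(2), -1).view_as(logits)
            feat = (fmap * attn).flatten(2).sum(-1)                  # (1, C) pooled
            if prev_pred is not None:
                # Prediction error: self-supervised loss and boundary score;
                # the observed feature is treated as the target.
                errors.append(F.mse_loss(prev_pred, feat.detach()))
            h, c = self.lstm(feat, (h, c))
            prev_pred = self.head(h)
            attn_maps.append(attn.detach()[0, 0])                    # localization map
        return torch.stack(errors), attn_maps


# Online, single-pass usage on a streaming clip (random frames as a stand-in).
model = PerceptualPredictor()
opt = torch.optim.Adam(model.parameters(), lr=1e-4)
clip = torch.rand(8, 3, 64, 64)        # 8 consecutive frames
errors, attn_maps = model(clip)
loss = errors.mean()                   # minimize prediction error
opt.zero_grad()
loss.backward()
opt.step()
# Peaks in `errors` over time would be read out as event boundaries,
# and `attn_maps` give the self-learned spatial localization.
```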