Matches in SemOpenAlex for { <https://semopenalex.org/work/W2973005564> ?p ?o ?g. }
- W2973005564 abstract "Computer vision aims to provide computers with a conceptual understanding of images or video by learning a high-level representation. This representation is typically derived from the pixel domain (i.e., RGB channels) for tasks such as image classification or action recognition. In this thesis, we explore how RGB inputs can either be pre-processed or supplemented with other compressed visual modalities, in order to improve the accuracy-complexity tradeoff for various computer vision tasks. Beginning with RGB-domain data only, we propose a multi-level, Voronoi-based spatial partitioning of images, whose cells are individually processed by a convolutional neural network (CNN), to improve the scale invariance of the embedding. We combine this with a novel and efficient approach for optimal bit allocation within the quantized cell representations. We evaluate this proposal on the content-based image retrieval task, which consists of finding images in a dataset that are similar to a given query. We then move to the more challenging domain of action recognition, where a video sequence is classified according to its constituent action. In this case, we demonstrate how the RGB modality can be supplemented with a flow modality, comprising motion vectors extracted directly from the video codec. The motion vectors (MVs) are used both as input to a CNN and as an activity sensor for providing selective macroblock (MB) decoding of RGB frames instead of full-frame decoding. We independently train two CNNs on RGB and MV correspondences and then fuse their scores during inference, demonstrating faster end-to-end processing and competitive classification accuracy to recent work. In order to explore the use of more efficient sensing modalities, we replace the MV stream with a neuromorphic vision sensing (NVS) stream for action recognition. NVS hardware mimics the biological retina and operates with substantially lower power and at significantly higher sampling rates than conventional active pixel sensing (APS) cameras. Due to the lack of training data in this domain, we generate emulated NVS frames directly from consecutive RGB frames and use these to train a teacher-student framework that additionally leverages the abundance of optical flow training data. In the final part of this thesis, we introduce a novel unsupervised domain adaptation method for further minimizing the domain shift between emulated (source) and real (target) NVS data domains." @default.
- W2973005564 created "2019-09-19" @default.
- W2973005564 creator A5000944047 @default.
- W2973005564 date "2019-01-27" @default.
- W2973005564 modified "2023-09-23" @default.
- W2973005564 title "From pixels to spikes: efficient multimodal learning in the presence of domain shift" @default.
- W2973005564 cites W1532362218 @default.
- W2973005564 cites W1556531089 @default.
- W2973005564 cites W1663973292 @default.
- W2973005564 cites W1676552347 @default.
- W2973005564 cites W1677182931 @default.
- W2973005564 cites W1679894842 @default.
- W2973005564 cites W1686810756 @default.
- W2973005564 cites W1722318740 @default.
- W2973005564 cites W1731081199 @default.
- W2973005564 cites W1849277567 @default.
- W2973005564 cites W1867429401 @default.
- W2973005564 cites W1922773808 @default.
- W2973005564 cites W1947481528 @default.
- W2973005564 cites W1979931042 @default.
- W2973005564 cites W1980911747 @default.
- W2973005564 cites W1988511239 @default.
- W2973005564 cites W1993304626 @default.
- W2973005564 cites W1993928670 @default.
- W2973005564 cites W2012592962 @default.
- W2973005564 cites W2012833704 @default.
- W2973005564 cites W2020096355 @default.
- W2973005564 cites W2024066070 @default.
- W2973005564 cites W204268067 @default.
- W2973005564 cites W2053154970 @default.
- W2973005564 cites W2062118960 @default.
- W2973005564 cites W2071027807 @default.
- W2973005564 cites W2103924867 @default.
- W2973005564 cites W2105101328 @default.
- W2973005564 cites W2108598243 @default.
- W2973005564 cites W2111362445 @default.
- W2973005564 cites W2119044429 @default.
- W2973005564 cites W2122115292 @default.
- W2973005564 cites W2124386111 @default.
- W2973005564 cites W2124509324 @default.
- W2973005564 cites W2126574503 @default.
- W2973005564 cites W2126579184 @default.
- W2973005564 cites W2131675349 @default.
- W2973005564 cites W2131747574 @default.
- W2973005564 cites W2131846894 @default.
- W2973005564 cites W2133665775 @default.
- W2973005564 cites W2133728753 @default.
- W2973005564 cites W2138011018 @default.
- W2973005564 cites W2141362318 @default.
- W2973005564 cites W2143668817 @default.
- W2973005564 cites W2144187604 @default.
- W2973005564 cites W2144420897 @default.
- W2973005564 cites W2145406111 @default.
- W2973005564 cites W2146395539 @default.
- W2973005564 cites W2147238549 @default.
- W2973005564 cites W2148809531 @default.
- W2973005564 cites W2154422044 @default.
- W2973005564 cites W2154956324 @default.
- W2973005564 cites W2156303437 @default.
- W2973005564 cites W2162915993 @default.
- W2973005564 cites W2163605009 @default.
- W2973005564 cites W2174103991 @default.
- W2973005564 cites W2179042386 @default.
- W2973005564 cites W2187089797 @default.
- W2973005564 cites W2203224402 @default.
- W2973005564 cites W2204750386 @default.
- W2973005564 cites W2204975001 @default.
- W2973005564 cites W2252058457 @default.
- W2973005564 cites W2293597654 @default.
- W2973005564 cites W2295537791 @default.
- W2973005564 cites W2335728318 @default.
- W2973005564 cites W2342662179 @default.
- W2973005564 cites W2342776425 @default.
- W2973005564 cites W2478454054 @default.
- W2973005564 cites W2504108613 @default.
- W2973005564 cites W2507540959 @default.
- W2973005564 cites W2553594924 @default.
- W2973005564 cites W2560474170 @default.
- W2973005564 cites W2586564970 @default.
- W2973005564 cites W2587802086 @default.
- W2973005564 cites W2593768305 @default.
- W2973005564 cites W2608988379 @default.
- W2973005564 cites W2613967272 @default.
- W2973005564 cites W2744915377 @default.
- W2973005564 cites W2755006634 @default.
- W2973005564 cites W2757430550 @default.
- W2973005564 cites W2913932916 @default.
- W2973005564 cites W2949117887 @default.
- W2973005564 cites W2950292946 @default.
- W2973005564 cites W2962793481 @default.
- W2973005564 cites W2962947361 @default.
- W2973005564 cites W2963166708 @default.
- W2973005564 cites W2963173190 @default.
- W2973005564 cites W2963373786 @default.
- W2973005564 cites W2963524571 @default.
- W2973005564 cites W2964273528 @default.
- W2973005564 cites W2964288524 @default.
- W2973005564 cites W3122190729 @default.
- W2973005564 cites W2605010847 @default.
- W2973005564 hasPublicationYear "2019" @default.
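The triples above can be retrieved programmatically by sending the same `{ <work-IRI> ?p ?o . }` pattern to the public SemOpenAlex SPARQL endpoint. A minimal sketch, assuming the endpoint lives at `https://semopenalex.org/sparql` and accepts standard `query`/`Accept` parameters; the helper names (`build_work_query`, `build_request`) are illustrative, not part of any SemOpenAlex client:

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint for the SemOpenAlex public SPARQL service.
ENDPOINT = "https://semopenalex.org/sparql"


def build_work_query(work_id: str) -> str:
    """Build a SELECT query mirroring the { <work> ?p ?o ?g . } pattern above."""
    return (
        "SELECT ?p ?o WHERE { "
        f"<https://semopenalex.org/work/{work_id}> ?p ?o . "
        "}"
    )


def build_request(work_id: str) -> Request:
    """Package the query as a GET request asking for SPARQL JSON results."""
    params = urlencode({"query": build_work_query(work_id)})
    return Request(
        f"{ENDPOINT}?{params}",
        headers={"Accept": "application/sparql-results+json"},
    )


req = build_request("W2973005564")
print(req.full_url.split("?")[0])  # → https://semopenalex.org/sparql
```

Executing the request (e.g. with `urllib.request.urlopen(req)`) would return one JSON binding per predicate/object pair, i.e. the abstract, dates, title, and each `cites` triple listed above.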