Matches in SemOpenAlex for { <https://semopenalex.org/work/W746806679> ?p ?o ?g. }
- W746806679 abstract "Over the past two decades, industry has developed several commercial products with audio-visual sensing capabilities. Most of them consist of a video camera with an embedded microphone (mobile phones, tablets, etc.). Others, such as Kinect, include depth sensors and/or small microphone arrays. There are also some mobile phones equipped with a stereo camera pair. At the same time, many research-oriented systems became available (e.g., humanoid robots such as NAO). Since all these systems are small in volume, their sensors are close to each other. Therefore, they cannot capture the global scene, but only one point of view of the ongoing social interplay. We refer to this as 'Egocentric Audio-Visual Scene Analysis'. This thesis contributes to this field in several respects. First, it provides a publicly available data set targeting applications such as action/gesture recognition, speaker localization, tracking and diarisation, sound source localization, dialogue modelling, etc. This work was later used both within and outside the thesis. We also investigated the problem of AV event detection. We showed how the trust in one of the modalities (the visual one, to be precise) can be modeled and used to bias the method, leading to a visually-supervised EM algorithm (ViSEM). Afterwards, we modified the approach to target audio-visual speaker detection, yielding an online method that runs on the humanoid robot NAO. In parallel to the work on audio-visual speaker detection, we developed a new approach for audio-visual command recognition. We explored different features and classifiers and confirmed that the use of audio-visual data improves performance compared to auditory-only and video-only classifiers. Later, we sought the best method for tiny training sets (5-10 samples per class). This is interesting because real systems need to adapt and learn new commands from the user.
For use by the general public, such systems need to be operational after only a few examples. Finally, we contributed to the field of sound source localization, in the particular case of non-coplanar microphone arrays. This is interesting because the geometry of the microphone array can be arbitrary. Consequently, this opens the door to dynamic microphone arrays that would adapt their geometry to fit particular tasks. It is also relevant because the design of commercial systems may be subject to constraints for which circular or linear arrays are not suited." @default.
- W746806679 created "2016-06-24" @default.
- W746806679 creator A5066621495 @default.
- W746806679 date "2013-10-15" @default.
- W746806679 modified "2023-09-23" @default.
- W746806679 title "Egocentric Audio-Visual Scene Analysis : a machine learning and signal processing approach" @default.
- W746806679 cites W1484093391 @default.
- W746806679 cites W1506806321 @default.
- W746806679 cites W1527240141 @default.
- W746806679 cites W1540647394 @default.
- W746806679 cites W1570419640 @default.
- W746806679 cites W1571024744 @default.
- W746806679 cites W1603096025 @default.
- W746806679 cites W1604034532 @default.
- W746806679 cites W1694389219 @default.
- W746806679 cites W1965342088 @default.
- W746806679 cites W1969299255 @default.
- W746806679 cites W1971791733 @default.
- W746806679 cites W1975143133 @default.
- W746806679 cites W1975350514 @default.
- W746806679 cites W1977646036 @default.
- W746806679 cites W1978511849 @default.
- W746806679 cites W1980370676 @default.
- W746806679 cites W1988935075 @default.
- W746806679 cites W1991139021 @default.
- W746806679 cites W1994630425 @default.
- W746806679 cites W1998891211 @default.
- W746806679 cites W2000221426 @default.
- W746806679 cites W2000466020 @default.
- W746806679 cites W2004909598 @default.
- W746806679 cites W2010219725 @default.
- W746806679 cites W2011699475 @default.
- W746806679 cites W2013076218 @default.
- W746806679 cites W2014914041 @default.
- W746806679 cites W2015143272 @default.
- W746806679 cites W2018832332 @default.
- W746806679 cites W2019743084 @default.
- W746806679 cites W2020163092 @default.
- W746806679 cites W2023004272 @default.
- W746806679 cites W2025954386 @default.
- W746806679 cites W2026257980 @default.
- W746806679 cites W2026665343 @default.
- W746806679 cites W2033819227 @default.
- W746806679 cites W2034328688 @default.
- W746806679 cites W2049633694 @default.
- W746806679 cites W2051224630 @default.
- W746806679 cites W2052846226 @default.
- W746806679 cites W2054199123 @default.
- W746806679 cites W2054852564 @default.
- W746806679 cites W2056197638 @default.
- W746806679 cites W2058549804 @default.
- W746806679 cites W2064543148 @default.
- W746806679 cites W2067584370 @default.
- W746806679 cites W2071828724 @default.
- W746806679 cites W2073227393 @default.
- W746806679 cites W2074142487 @default.
- W746806679 cites W2074392550 @default.
- W746806679 cites W2078046413 @default.
- W746806679 cites W2083270451 @default.
- W746806679 cites W2085331248 @default.
- W746806679 cites W2087586617 @default.
- W746806679 cites W2089337412 @default.
- W746806679 cites W2092524451 @default.
- W746806679 cites W2096241474 @default.
- W746806679 cites W2096301647 @default.
- W746806679 cites W2096763807 @default.
- W746806679 cites W2097089247 @default.
- W746806679 cites W2097128017 @default.
- W746806679 cites W2098469993 @default.
- W746806679 cites W2098923380 @default.
- W746806679 cites W2099151709 @default.
- W746806679 cites W2099867504 @default.
- W746806679 cites W2100335648 @default.
- W746806679 cites W2100668143 @default.
- W746806679 cites W2100916003 @default.
- W746806679 cites W2101414012 @default.
- W746806679 cites W2103032457 @default.
- W746806679 cites W2105582566 @default.
- W746806679 cites W2105710328 @default.
- W746806679 cites W2106995601 @default.
- W746806679 cites W2107376953 @default.
- W746806679 cites W2107456423 @default.
- W746806679 cites W2109243860 @default.
- W746806679 cites W2110097273 @default.
- W746806679 cites W2111308925 @default.
- W746806679 cites W2113744809 @default.
- W746806679 cites W2114404361 @default.
- W746806679 cites W2116635162 @default.
- W746806679 cites W2118440833 @default.
- W746806679 cites W2119090229 @default.
- W746806679 cites W2120350100 @default.
- W746806679 cites W2123066336 @default.
- W746806679 cites W2127025755 @default.
- W746806679 cites W2128529557 @default.
- W746806679 cites W2129671742 @default.
- W746806679 cites W2129821199 @default.
- W746806679 cites W2130239535 @default.
- W746806679 cites W2130423769 @default.
- W746806679 cites W2130992468 @default.
- W746806679 cites W2130994896 @default.