Matches in SemOpenAlex for { <https://semopenalex.org/work/W2955492712> ?p ?o ?g. }
Showing items 1 to 65 of 65, with 100 items per page.
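The pattern above is a quad pattern over the work's outgoing statements, so the listing below can be reproduced with a SPARQL SELECT query. The following is a minimal sketch, assuming the public SemOpenAlex SPARQL endpoint at https://semopenalex.org/sparql (verify the exact endpoint URL in the SemOpenAlex documentation) and Python's requests library; the helper name fetch_triples is introduced here for illustration only.

```python
import requests

# Assumed endpoint URL; check the SemOpenAlex documentation before relying on it.
ENDPOINT = "https://semopenalex.org/sparql"

# SELECT counterpart of the quad pattern { <...W2955492712> ?p ?o ?g. }
QUERY = """
SELECT ?p ?o ?g WHERE {
  GRAPH ?g {
    <https://semopenalex.org/work/W2955492712> ?p ?o .
  }
}
"""

def fetch_triples(endpoint: str = ENDPOINT, query: str = QUERY) -> list[dict]:
    """Run the SELECT query and return one dict per result row."""
    response = requests.get(
        endpoint,
        params={"query": query},
        headers={"Accept": "application/sparql-results+json"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()["results"]["bindings"]

if __name__ == "__main__":
    for row in fetch_triples():
        print(row["p"]["value"], row["o"]["value"], row.get("g", {}).get("value", ""))
```

Each printed row corresponds to one predicate/object/graph line in the listing that follows.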
- W2955492712 abstract "Robot perception plays a crucial role in human-robot interaction (HRI). The perception system provides the robot with information of the surroundings and enables it to interact with people. In a conversational scenario, a group of people may chat in front of the robot and move freely. In such situations, robots are expected to understand where the people are, who is speaking, or what they are talking about. This thesis concentrates on answering the first two questions, namely speaker tracking and diarization. To that end, we use different modalities of the robot’s perception system. Similar to seeing and hearing for humans, audio and visual information are critical cues for robots in a conversational scenario. Advancements in computer vision and audio processing in the last decade revolutionized robot perception abilities and enabled us to build joint audio-visual applications. In this thesis, we present the following contributions: we first develop a variational Bayesian framework for tracking multiple objects. The variational Bayesian framework provides closed-form tractable problem solutions, enabling an efficient tracking process. The framework is first applied to visual multiple-person tracking. The birth and death processes are built jointly to deal with the varying number of people in the scene. We then augment the framework by exploiting the complementarity of vision and robot motor information. On the one hand, the robot’s active motion can be integrated into the visual tracking system to stabilize the tracking. On the other hand, visual information can be used to perform motor servoing. As a next step we combine audio and visual information in the framework and exploit the association between the acoustic feature frequency bins with tracked people, to estimate the smooth trajectories of people, and to infer their acoustic status (i.e. speaking or silent). To adapt the framework to applications with no vision information, we employ it to acoustic-only speaker localization and tracking. Online dereverberation techniques are first applied then followed by the tracking system. Finally, we propose a variant of the acoustic-only tracking model based on the von-Mises distribution, which is specifically adapted to directional data. All proposed methods are validated on datasets both qualitatively and quantitatively." @default.
- W2955492712 created "2019-07-12" @default.
- W2955492712 creator A5011762462 @default.
- W2955492712 date "2019-05-10" @default.
- W2955492712 modified "2023-09-23" @default.
- W2955492712 title "Audio-Visual Multiple-Speaker Tracking for Robot Perception" @default.
- W2955492712 hasPublicationYear "2019" @default.
- W2955492712 type Work @default.
- W2955492712 sameAs 2955492712 @default.
- W2955492712 citedByCount "0" @default.
- W2955492712 crossrefType "dissertation" @default.
- W2955492712 hasAuthorship W2955492712A5011762462 @default.
- W2955492712 hasConcept C107457646 @default.
- W2955492712 hasConcept C111919701 @default.
- W2955492712 hasConcept C154945302 @default.
- W2955492712 hasConcept C15744967 @default.
- W2955492712 hasConcept C162947575 @default.
- W2955492712 hasConcept C169760540 @default.
- W2955492712 hasConcept C19966478 @default.
- W2955492712 hasConcept C26760741 @default.
- W2955492712 hasConcept C31972630 @default.
- W2955492712 hasConcept C41008148 @default.
- W2955492712 hasConcept C56461940 @default.
- W2955492712 hasConcept C65401140 @default.
- W2955492712 hasConcept C90509273 @default.
- W2955492712 hasConcept C98045186 @default.
- W2955492712 hasConceptScore W2955492712C107457646 @default.
- W2955492712 hasConceptScore W2955492712C111919701 @default.
- W2955492712 hasConceptScore W2955492712C154945302 @default.
- W2955492712 hasConceptScore W2955492712C15744967 @default.
- W2955492712 hasConceptScore W2955492712C162947575 @default.
- W2955492712 hasConceptScore W2955492712C169760540 @default.
- W2955492712 hasConceptScore W2955492712C19966478 @default.
- W2955492712 hasConceptScore W2955492712C26760741 @default.
- W2955492712 hasConceptScore W2955492712C31972630 @default.
- W2955492712 hasConceptScore W2955492712C41008148 @default.
- W2955492712 hasConceptScore W2955492712C56461940 @default.
- W2955492712 hasConceptScore W2955492712C65401140 @default.
- W2955492712 hasConceptScore W2955492712C90509273 @default.
- W2955492712 hasConceptScore W2955492712C98045186 @default.
- W2955492712 hasOpenAccess W2955492712 @default.
- W2955492712 hasRelatedWork W1984310582 @default.
- W2955492712 hasRelatedWork W1994367471 @default.
- W2955492712 hasRelatedWork W1999980707 @default.
- W2955492712 hasRelatedWork W2052633826 @default.
- W2955492712 hasRelatedWork W2053160158 @default.
- W2955492712 hasRelatedWork W2076593519 @default.
- W2955492712 hasRelatedWork W2078155142 @default.
- W2955492712 hasRelatedWork W2101451914 @default.
- W2955492712 hasRelatedWork W2108089588 @default.
- W2955492712 hasRelatedWork W2128847674 @default.
- W2955492712 hasRelatedWork W2137561966 @default.
- W2955492712 hasRelatedWork W2138153334 @default.
- W2955492712 hasRelatedWork W2154549428 @default.
- W2955492712 hasRelatedWork W2165067341 @default.
- W2955492712 hasRelatedWork W2219322027 @default.
- W2955492712 hasRelatedWork W2321703195 @default.
- W2955492712 hasRelatedWork W2767521608 @default.
- W2955492712 hasRelatedWork W2902684407 @default.
- W2955492712 hasRelatedWork W3119206490 @default.
- W2955492712 hasRelatedWork W2085753829 @default.
- W2955492712 isParatext "false" @default.
- W2955492712 isRetracted "false" @default.
- W2955492712 magId "2955492712" @default.
- W2955492712 workType "dissertation" @default.
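The abstract above mentions that the final acoustic-only tracking variant is based on the von Mises distribution, which is suited to directional data such as a sound's direction of arrival. As a minimal illustrative sketch, not the thesis's actual model, the snippet below scores a set of candidate speaker directions under a von Mises observation likelihood using SciPy; the observed angle, candidate angles, and concentration parameter kappa are made-up example values.

```python
import numpy as np
from scipy.stats import vonmises

# Example values only: a hypothetical observed direction of arrival (radians)
# and a concentration parameter controlling how peaked the likelihood is.
observed_doa = np.deg2rad(30.0)   # observed sound direction
kappa = 4.0                       # concentration (larger = more peaked)

# Candidate speaker directions on the unit circle.
candidate_angles = np.deg2rad(np.array([0.0, 30.0, 90.0, 180.0, -90.0]))

# von Mises likelihood of the observation given each candidate mean direction.
# The density wraps around the circle, so -180 deg and 180 deg are treated as the same direction.
likelihoods = vonmises.pdf(observed_doa, kappa, loc=candidate_angles)

for angle, lik in zip(np.rad2deg(candidate_angles), likelihoods):
    print(f"candidate {angle:6.1f} deg -> likelihood {lik:.4f}")
```

Unlike a Gaussian on raw angle values, this likelihood is periodic, which is what makes the distribution appropriate for directional observations.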