
Talk by Prof. Dr.-Ing. Gernot A. Fink, TU Dortmund, Robotics Research Institute, Intelligent Systems Group: "Multimodal Perception in Smart Environments", on 21.04.2010 at 17:00 in room M5

Wednesday, 21.04.2010, 17:00, room M5

Mathematics and Computer Science

Abstract: Smart environments achieve their seemingly intelligent behavior by interpreting data obtained from a wide variety of sensors on embedded information-processing devices. Interaction with potential users, however, still mostly relies on traditional interfaces such as touch-sensitive displays or smartphones. To make human-machine interaction in smart environments more intuitive, it is therefore necessary to develop more natural interfaces, based on humans' natural perception capabilities - most prominently acoustic and visual perception.

For natural interaction with untrained users, a smart environment equipped with acoustic and visual perception capabilities needs to solve several challenging problems. First, the potential interaction partners need to be localized, which can be achieved using both visual and acoustic cues. Second, the communicative intent of persons, which in natural human-human interaction is mostly expressed by speech and gesture, needs to be recognized. To solve this complex problem, the individual perception modalities need to be combined so that they complement each other. Furthermore, the perception system needs to be able to distinguish between interesting and irrelevant percepts in order to focus its active sensor elements and its internal processing power.

In this talk I will report on recent research at TU Dortmund on acoustic and visual perception and their multimodal combination in smart environments. I will exemplify the challenges associated with speech perception in reverberant rooms, describe our approach to robust 3D multi-view gesture recognition, and present our multimodal attention system. Building on these foundations, I will present recent results on combining different perceptual modalities via a coherent attention model to detect the multimodal referent specified by a combination of speech and gesture.
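To illustrate the kind of modality combination the abstract alludes to, the following is a minimal, hypothetical sketch of score-level ("late") fusion for multimodal referent detection. All object names, scores, and weights are illustrative assumptions, not details of the speaker's actual system.

```python
# Hypothetical late-fusion sketch: each modality assigns a confidence
# score to every candidate referent; the fused score is a weighted sum,
# and the referent with the highest fused score wins. Weights are
# illustrative assumptions, not tuned values from the talk.

def fuse_referent_scores(speech_scores, gesture_scores,
                         w_speech=0.6, w_gesture=0.4):
    """Combine per-object confidences from two modalities and return
    (best_object, fused_scores)."""
    fused = {}
    for obj in speech_scores:
        fused[obj] = (w_speech * speech_scores[obj]
                      + w_gesture * gesture_scores.get(obj, 0.0))
    return max(fused, key=fused.get), fused

# Speech alone is ambiguous between two cups; the pointing gesture
# disambiguates in favor of the right-hand one.
speech = {"cup_left": 0.5, "cup_right": 0.5, "book": 0.0}
gesture = {"cup_left": 0.1, "cup_right": 0.8, "book": 0.1}
best, fused = fuse_referent_scores(speech, gesture)
```

Here the two modalities complement each other exactly as described above: neither the ambiguous utterance nor the noisy gesture identifies the referent alone, but their combination does.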



Created on 08.04.2010 by N. N
Last modified on 15.04.2010 by N. N

Computer Science News
Computer Science Colloquium