Talk by Prof. Dr.-Ing. Gernot A. Fink, TU Dortmund, Robotics Research Institute, Intelligent Systems Group: Multimodal Perception in Smart Environments
Wednesday, 21.04.2010, 17:00, room M5
Abstract:
Smart Environments achieve their seemingly intelligent behavior by
interpreting data obtained from a wide variety of sensors on embedded
information-processing devices. For interaction with potential users,
however, mostly traditional interfaces are still used, such as
touch-sensitive displays or smartphones. To make human-machine interaction
in smart environments more intuitive, it is therefore necessary to develop
more natural human-machine interfaces. These should be based on humans'
natural perception capabilities, most prominently acoustic and visual
perception.
For natural interaction with untrained users, a smart environment equipped
with acoustic and visual perception capabilities needs to solve several
challenging problems. First, potential interaction partners need to be
localized, which can be achieved using both visual and acoustic cues.
Second, the communicative intent of persons, which in natural human-human
interaction is mostly expressed by speech and gesture, needs to be
recognized. To solve this complex problem, the individual perception
modalities need to be combined so that they complement each other.
Furthermore, the perception system needs to be able to distinguish between
interesting and irrelevant percepts in order to focus its active sensor
elements and its internal processing power.
In this talk I will report on recent research at TU Dortmund on acoustic
and visual perception and their multimodal combination in smart environments.
I will exemplify the challenges associated with speech perception in
reverberant rooms, describe our approach to robust 3D multi-view gesture
recognition, and present our multimodal attention system. Building on
these foundations, I will present recent results on combining different
perceptual modalities in a coherent attention model to detect the
multimodal referent specified jointly by speech and gesture.