Institut für Informatik

Multimodal Interfaces

Multimodal interaction uses several modalities simultaneously, such as speech, gesture, touch, or gaze, to communicate with computers and machines. It comprises both the analysis and the synthesis of multimodal utterances. This course concentrates on analysis, i.e. input processing. The goal of input processing is to derive meaning from the signal: to produce a machine-readable description and understanding of the input, and to execute the desired interaction. In multimodal systems, this process is interleaved across modalities, and simultaneous utterances have multiple interdependencies that must be taken into account for successful machine interpretation.

In this course, students will learn about the steps involved in processing unimodal as well as multimodal input. The course will highlight typical stages in multimodal processing. Using speech processing as a primary example, students learn about:

1. A/D conversion
2. Segmentation
3. Syntactic analysis
4. Semantic analysis
5. Pragmatic analysis
6. Discourse analysis

Particular emphasis is placed on stages such as morphology and semantic analysis. Typical aspects of multimodal interdependencies, i.e. temporal and semantic interrelations, are highlighted, and their consequences for algorithmic processing are derived. Prominent approaches to multimodal integration (also known as multimodal fusion) are described, including transducers, state machines, and unification.
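As a rough illustration of how temporal and semantic interrelations can feed into unification-based fusion, the following minimal Python sketch combines a spoken command with a pointing gesture. It is not taken from the course material: the names `Utterance`, `unify`, `temporally_related`, and `fuse`, the nested-dictionary feature structures, the use of `None` for an unfilled slot, and the 1.0 second gap threshold are all assumptions chosen for the example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Utterance:
    """One unimodal input with a time span (seconds) and a feature structure."""
    modality: str
    start: float
    end: float
    features: dict


def unify(a: dict, b: dict) -> Optional[dict]:
    """Unify two nested feature structures; return None if they conflict.

    A value of None marks an unfilled slot (an assumption of this sketch).
    """
    result = dict(a)
    for key, b_val in b.items():
        a_val = result.get(key)
        if a_val is None:
            # Slot missing or unfilled on one side: take the other side's value.
            result[key] = b_val
        elif b_val is None:
            continue
        elif isinstance(a_val, dict) and isinstance(b_val, dict):
            sub = unify(a_val, b_val)
            if sub is None:
                return None
            result[key] = sub
        elif a_val != b_val:
            # Semantic conflict: the two inputs cannot describe the same act.
            return None
    return result


def temporally_related(a: Utterance, b: Utterance, max_gap: float = 1.0) -> bool:
    """True if the spans overlap or lie at most max_gap seconds apart."""
    gap = max(a.start, b.start) - min(a.end, b.end)
    return gap <= max_gap


def fuse(a: Utterance, b: Utterance, max_gap: float = 1.0) -> Optional[dict]:
    """Fuse two utterances if they are temporally related and unifiable."""
    if not temporally_related(a, b, max_gap):
        return None
    return unify(a.features, b.features)


# Hypothetical input: "move the lamp there" plus a pointing gesture.
speech = Utterance("speech", 0.0, 1.2,
                   {"action": "move", "object": {"type": "lamp"}, "target": None})
gesture = Utterance("gesture", 0.8, 1.0, {"target": {"x": 3, "y": 5}})
fused = fuse(speech, gesture)
```

In this sketch the temporal check acts as a cheap filter before the semantic check, so a gesture made long after the spoken command is never considered as a filler for its open slot. Transducer-based or state-machine-based fusion would structure the same decision differently.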