From Synchronous to Asynchronous Event-driven Fusion Approaches in Multi-modal Affect Recognition
The cues that describe emotional conditions are encoded within multiple modalities, and the fusion of multi-modal information is a natural way to improve the automated recognition of emotions. Throughout many studies, we see traditional fusion approaches in which decisions are synchronously forced for fixed time segments across all considered modalities and generic combination rules are applied. Varying success is reported; sometimes performance is even worse than uni-modal classification. Starting from these premises, this thesis investigates and compares the performance of various synchronous fusion techniques. We enrich the traditional set with custom and emotion-adapted fusion algorithms that are tailored towards the affect recognition domain they are used in. These developments enhance recognition quality to a certain degree, but do not solve the occasionally occurring performance problems. To isolate the issue, we conduct a systematic investigation of synchronous fusion techniques on acted and natural data and conclude that the synchronous fusion approach shows a crucial weakness, especially on non-acted emotions: the implicit assumption that relevant affective cues happen at the same time across all modalities only holds if emotions are depicted very coherently and clearly - which we cannot expect in a natural setting.

This implies a switch to asynchronous fusion approaches. This change can be realized by the application of classification models with memory capabilities (e.g. recurrent neural networks), but these are often data-hungry and non-transparent. We consequently present an alternative approach to asynchronous modality treatment: the event-driven fusion strategy, in which modalities decide when to contribute information to the fusion process in the form of affective events. These events can be used to introduce an additional abstraction layer to the recognition process, as provided events do not necessarily need to match the sought target class but can be cues that indicate the final assessment. Furthermore, we will see that the architecture of an event-driven fusion system is well suited for real-time usage and very tolerant to temporarily missing input from single modalities, and is therefore a good choice for affect recognition in the wild. We demonstrate these capabilities in various comparison and prototype studies and present the application of event-driven fusion strategies in multiple European research projects.
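To illustrate the event-driven idea sketched in the abstract, the following is a minimal, hypothetical example, not the thesis's actual implementation: names such as AffectiveEvent and EventDrivenFusion, the valence/confidence fields, and the exponential-decay weighting are assumptions made for illustration. It shows modality-specific detectors asynchronously pushing affective events, while a fusion component aggregates whatever is currently available and tolerates a temporarily silent modality.

```python
import time
from dataclasses import dataclass, field

# Hypothetical sketch of an event-driven fusion loop: detectors emit
# "affective events" whenever they are confident, with no fixed time
# segmentation forcing every modality to decide at once. Names and
# parameters are illustrative assumptions, not the thesis's implementation.

@dataclass
class AffectiveEvent:
    modality: str        # e.g. "voice", "face"
    cue: str             # detected cue, e.g. "laughter", "smile"
    valence: float       # contribution to the fused estimate, in [-1, 1]
    confidence: float    # detector confidence, in [0, 1]
    timestamp: float = field(default_factory=time.time)

class EventDrivenFusion:
    """Aggregates asynchronously arriving events into one running estimate."""

    def __init__(self, half_life: float = 5.0):
        self.half_life = half_life   # seconds until an event's weight halves
        self.events: list[AffectiveEvent] = []

    def push(self, event: AffectiveEvent) -> None:
        # Modalities call this whenever they have something to contribute.
        self.events.append(event)

    def estimate(self, now: float | None = None) -> float:
        # Confidence-weighted average with exponential time decay, so stale
        # events fade out and temporarily missing modalities are tolerated.
        now = time.time() if now is None else now
        num, den = 0.0, 0.0
        for ev in self.events:
            decay = 0.5 ** ((now - ev.timestamp) / self.half_life)
            weight = ev.confidence * decay
            num += weight * ev.valence
            den += weight
        return num / den if den > 0 else 0.0

if __name__ == "__main__":
    fusion = EventDrivenFusion(half_life=5.0)
    t0 = time.time()
    # The voice detector fires first; the face detector contributes later.
    fusion.push(AffectiveEvent("voice", "laughter", valence=0.8, confidence=0.9, timestamp=t0))
    fusion.push(AffectiveEvent("face", "smile", valence=0.6, confidence=0.7, timestamp=t0 + 2.0))
    print(f"fused valence estimate: {fusion.estimate(now=t0 + 3.0):.2f}")
```

The decay-weighted average stands in for whatever combination rule a concrete system would use; the essential point is that the fused estimate is computed on demand from asynchronously arriving events rather than from decisions forced per fixed segment.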
Author: | Florian Lingenfelser |
---|---|
URN: | urn:nbn:de:bvb:384-opus4-388295 |
Frontdoor URL: | https://opus.bibliothek.uni-augsburg.de/opus4/38829 |
Advisor: | Elisabeth André |
Type: | Doctoral Thesis |
Language: | English |
Year of first Publication: | 2018 |
Publishing Institution: | Universität Augsburg |
Granting Institution: | Universität Augsburg, Fakultät für Angewandte Informatik |
Date of final exam: | 2018/02/22 |
Release Date: | 2018/09/24 |
Tag: | Multi-modal Fusion; Affective Events; Real-time Systems |
GND-Keyword: | Echtzeitsystem; Mensch-Maschine-Kommunikation; Mensch-Maschine-Schnittstelle; Emotionales Verhalten; Ausdrucksverhalten |
Institutes: | Fakultät für Angewandte Informatik |
 | Fakultät für Angewandte Informatik / Institut für Informatik |
 | Fakultät für Angewandte Informatik / Institut für Informatik / Lehrstuhl für Menschzentrierte Künstliche Intelligenz |
Dewey Decimal Classification: | 0 Informatik, Informationswissenschaft, allgemeine Werke / 00 Informatik, Wissen, Systeme / 004 Datenverarbeitung; Informatik |
Licence (German): | Deutsches Urheberrecht mit Print on Demand |