Late Fusion Model for Emotion Recognition from Facial Expressions and Biosignals in a Dataset of Children with Autism Spectrum Disorder.
Kiejdo Dominika, Depka Prądzinska Monika, Zawadzka Teresa
What this study means for families
Researchers created a computer system that recognizes emotions in autistic children by analyzing facial expressions and body signals like heart rate, skin conductance, and temperature together. The system was moderately successful: it was correct about 68% of the time when picking a single emotion and about 78% of the time when estimating how likely each emotion was. While this is promising for developing better support tools, the accuracy was lower than that reported in other studies, highlighting the unique challenges of emotion recognition in autism.
Summary by AutismInsights from published abstract. This is not a substitute for reading the original paper.
Research summary
This study developed a multimodal machine learning system to recognize emotions in children with autism spectrum disorder by combining facial expressions with physiological measures (heart rate, skin conductance, temperature). Using the EMBOA dataset, researchers tested a late fusion approach that integrates outputs from separate models trained on each data type. The system achieved 68% accuracy for categorical emotion classification and 78% accuracy using likelihood-based emotion estimation. While these results are lower than some studies with neurotypical populations, they demonstrate the feasibility of multimodal emotion recognition in autistic children, though the researchers acknowledge challenges including missing data and limited sample representation for certain emotions.
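To illustrate what "late fusion" means here, the following is a minimal sketch of decision-level fusion, not the authors' implementation. The abstract does not describe the fusion model itself, so this example swaps in a simple weighted average of per-modality class probabilities and skips any modality whose prediction is missing. The function name `late_fusion`, the modality keys, and the weights are all hypothetical.

```python
def late_fusion(probs_by_modality, weights=None):
    """Fuse per-modality class probabilities by weighted averaging.

    probs_by_modality: dict mapping a modality name to a list of class
        probabilities, or to None when that modality is missing.
    weights: optional dict of per-modality weights (assumed, not from the paper).
    Returns the fused probability list.
    """
    # Keep only modalities that actually produced a prediction.
    available = {m: p for m, p in probs_by_modality.items() if p is not None}
    if not available:
        raise ValueError("no modality produced a prediction")

    if weights is None:
        weights = {m: 1.0 for m in available}

    # Renormalize weights over the modalities that are present.
    total_weight = sum(weights[m] for m in available)
    n_classes = len(next(iter(available.values())))

    fused = [0.0] * n_classes
    for modality, probs in available.items():
        if len(probs) != n_classes:
            raise ValueError(f"{modality} has a different number of classes")
        for i, p in enumerate(probs):
            fused[i] += weights[modality] * p / total_weight
    return fused


# Hypothetical two-class example with one missing modality.
fused = late_fusion({"face": [0.6, 0.4], "eda": [0.2, 0.8], "hr": None})
predicted_class = max(range(len(fused)), key=fused.__getitem__)
```

Renormalizing over the available modalities is one simple way to cope with missing values; a trained fusion model, as the abstract suggests the authors used, could learn a different combination.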
Key findings
- 1. Late fusion multimodal approach achieved 68% accuracy for categorical emotion classification and 78% for likelihood-based emotion estimation in autistic children.
  Confidence: moderate
  Relevance: Demonstrates potential for technology-assisted emotion recognition tools in autism interventions
- 2. Multimodal approach outperformed individual modality baselines, showing benefit of combining facial and physiological data.
  Confidence: moderate
  Relevance: Supports use of multiple data sources for more robust emotion assessment in clinical settings
- 3. Performance was lower than reported in other studies, indicating unique challenges in emotion recognition for autistic populations.
  Confidence: high
  Relevance: Highlights need for autism-specific approaches rather than adapting neurotypical-based systems
Clinical implications
Results suggest potential for developing assistive technologies for emotion recognition in autism interventions, but highlight need for larger datasets and autism-specific approaches. The moderate accuracy indicates current limitations for direct clinical application without further development.
Limitations
The study acknowledges a significant number of missing values in the dataset and low sample representation for certain emotions. Sample size is not reported in the abstract, and the accuracy results are lower than those reported in other emotion recognition studies, limiting generalizability.
Original abstract
Children with autism spectrum disorder (ASD) often display atypical emotional expressions and physiological responses, making emotion recognition challenging. This study proposes a multimodal recognition model employing a late fusion framework combining facial expression with physiological measures: electrodermal activity (EDA), temperature (TEMP), and heart rate (HR). Emotional states are annotated using two complementary schemes derived from a shared set of labels. Three annotators provide one categorical Ekman emotion for each timestamp.
From these annotations, a majority-vote label identifies the dominant emotion, while a proportional distribution reflects the likelihood of each emotion based on the relative frequency of the annotators' selections. Separate machine learning models are trained for each modality and for each annotation scheme, and their outputs are integrated through decision-level fusion. A distinct decision-level fusion model is constructed for each annotation scheme, ensuring that both the categorical and likelihood-based representations are optimally combined. The experiments on the EMBOA dataset, collected within the project "Affective loop in Socially Assistive Robotics as an intervention tool for children with autism", show that the late fusion model achieves higher accuracy and robustness than unimodal baselines.
The system attains an accuracy of 68% for categorical emotion classification and 78% under the likelihood-estimation scheme. The results obtained, although lower than those reported in other studies, suggest that further research into emotion recognition in autistic children using other fusions is warranted, even in the case of datasets with a significant number of missing values and low sample representation for certain emotions.
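The two annotation schemes described in the abstract can be sketched as follows. This is an illustrative sketch only: the label set `EKMAN`, the function names, and the handling of ties (returning `None` when no emotion has a strict majority) are assumptions, since the abstract does not specify them.

```python
from collections import Counter

# Assumed label set: Ekman's six basic emotions. The dataset's exact
# label inventory is not given in the abstract.
EKMAN = ["anger", "disgust", "fear", "happiness", "sadness", "surprise"]


def majority_label(votes):
    """Return the emotion chosen by a strict majority of annotators.

    Returns None when no emotion has more than half of the votes
    (tie handling is an assumption, not taken from the paper).
    """
    top, count = Counter(votes).most_common(1)[0]
    return top if count > len(votes) / 2 else None


def label_distribution(votes, classes=EKMAN):
    """Return the relative frequency of each emotion among the votes."""
    counts = Counter(votes)
    total = len(votes)
    return {c: counts[c] / total for c in classes}


# Hypothetical votes from three annotators for one timestamp.
votes = ["happiness", "happiness", "surprise"]
dominant = majority_label(votes)
likelihood = label_distribution(votes)
```

With three annotators, the proportional scheme can only take values of 0, 1/3, 2/3, or 1 per emotion, which retains annotator disagreement that the majority vote discards.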
Evidence Grade
emerging
Grade assigned by AutismInsights based on study type and published abstract.
Study Details
- Journal: Sensors (Basel, Switzerland)
- Year: 2025
- PMID: 41471479
- DOI: 10.3390/s25247485
MeSH Terms