Emerging

Clinical correlates of errors in machine-learning diagnostic model of autism spectrum disorder: Impact of sample cohorts.

Autism : the international journal of research and practice, 2025

Wang Yen-Chin, Cheng Chung-Yuan, Wu Chi-Shin, Lee Chi-Chun, Gau Susan Shur-Fen

What this study means for families

Researchers tested computer models that help identify autism using questionnaire data from Taiwan. The models worked well within the same type of group they were trained on, but struggled when applied to different groups. When mistakes happened, non-autistic people wrongly identified as autistic often had behavioral challenges or ADHD. Autistic people wrongly identified as non-autistic typically had fewer difficulties and sometimes higher IQ scores.

Summary by AutismInsights from published abstract. This is not a substitute for reading the original paper.

Research summary

This study examined machine-learning models for autism diagnosis using Social Responsiveness Scale data from two Taiwanese cohorts: clinical (1,203 autistic, 1,182 non-autistic) and community (35 autistic, 3,297 non-autistic). Models showed high within-cohort accuracy (sensitivity 0.91-1.00, specificity 0.89-0.96) but poor cross-cohort generalizability. Community-trained models applied to clinical data showed reduced performance (sensitivity 0.65, specificity 0.95). Misclassification patterns revealed that non-autistic individuals incorrectly identified as autistic had elevated behavioral symptoms and ADHD prevalence.

Autistic individuals misclassified as non-autistic showed fewer behavioral symptoms. In the community-trained model, they also had higher IQ and more aggressive behavior, but fewer social and attention problems.
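For readers less familiar with the two metrics quoted above: sensitivity is the share of autistic participants a model correctly identifies, and specificity is the share of non-autistic participants it correctly identifies. The following is a minimal Python sketch of that calculation. The function name and the toy labels are hypothetical illustrations and are not taken from the study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels.

    Convention assumed here: 1 = autistic, 0 = non-autistic.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # autistic, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # autistic, missed
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # non-autistic, cleared
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # non-autistic, flagged
    sensitivity = tp / (tp + fn) if (tp + fn) else float("nan")
    specificity = tn / (tn + fp) if (tn + fp) else float("nan")
    return sensitivity, specificity


# Hypothetical toy labels for illustration only (not study data).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 0, 1]
sens, spec = sensitivity_specificity(y_true, y_pred)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy example one of four autistic cases is missed and one of six non-autistic cases is wrongly flagged, which is the same kind of error the study went on to characterize.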


Key findings

1. Machine-learning models showed high within-cohort diagnostic accuracy (sensitivity 0.91-1.00, specificity 0.89-0.96) but poor generalizability across cohorts.
   Confidence: moderate
   Relevance: Highlights the importance of training data source and population characteristics in diagnostic tool development.
2. Non-autistic individuals misclassified as autistic showed elevated behavioral symptoms and higher ADHD prevalence.
   Confidence: moderate
   Relevance: Important for understanding potential false positives and differential diagnosis considerations.
3. Autistic individuals misclassified as non-autistic had fewer behavioral symptoms and, in the community-trained model, higher IQ.
   Confidence: moderate
   Relevance: May indicate masking or compensatory behaviors in higher-functioning individuals leading to missed diagnoses.


Clinical implications

Machine-learning diagnostic tools require careful validation across diverse populations. Training data characteristics significantly impact model performance and generalizability. Clinicians should be aware that automated tools may miss higher-functioning autistic individuals and incorrectly flag those with ADHD or behavioral challenges.
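To make "validation across diverse populations" concrete, the sketch below fits a simple score cutoff on one cohort and then evaluates it on both cohorts. Everything in it is an assumption made for illustration: the scores are simulated, the cohort sizes and means are invented, and the cutoff rule (maximizing Youden's J on an integer grid) is a stand-in, not the classification models used in the study.

```python
import random


def simulate_cohort(rng, n_autistic, n_non_autistic,
                    mean_autistic, mean_non_autistic, sd=12.0):
    """Simulate questionnaire total scores; labels: 1 = autistic, 0 = non-autistic."""
    scores = [rng.gauss(mean_autistic, sd) for _ in range(n_autistic)]
    scores += [rng.gauss(mean_non_autistic, sd) for _ in range(n_non_autistic)]
    labels = [1] * n_autistic + [0] * n_non_autistic
    return scores, labels


def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when scores >= cutoff are classified as autistic."""
    tp = fn = tn = fp = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= cutoff else 0
        if label == 1:
            tp += predicted
            fn += 1 - predicted
        else:
            fp += predicted
            tn += 1 - predicted
    return tp / (tp + fn), tn / (tn + fp)


def fit_cutoff(scores, labels, grid=range(0, 201)):
    """Pick the grid cutoff that maximizes Youden's J = sensitivity + specificity - 1."""
    return max(grid, key=lambda c: sum(sens_spec(scores, labels, c)))


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical cohorts: sizes and score distributions are invented.
cohorts = {
    "clinical": simulate_cohort(rng, 300, 300, mean_autistic=80, mean_non_autistic=55),
    "community": simulate_cohort(rng, 30, 900, mean_autistic=100, mean_non_autistic=35),
}

results = {}
for train_name, (train_scores, train_labels) in cohorts.items():
    cutoff = fit_cutoff(train_scores, train_labels)
    for test_name, (test_scores, test_labels) in cohorts.items():
        sens, spec = sens_spec(test_scores, test_labels, cutoff)
        results[(train_name, test_name)] = (sens, spec)
        print(f"trained on {train_name:9s} tested on {test_name:9s} "
              f"cutoff={cutoff} sensitivity={sens:.2f} specificity={spec:.2f}")
```

The point of the design is the two-by-two grid of train and test cohorts: reporting only the diagonal (same cohort for fitting and testing) would hide any drop in performance when a tool is moved to a different population.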


Limitations

Small community cohort autism sample (n=35). Single assessment tool used. Cross-cultural generalizability unclear beyond Taiwan. Limited information on participant demographics and clinical characteristics beyond basic measures reported.
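One way to see why a sample of 35 autistic participants is a limitation: each single missed case shifts sensitivity by about 0.03, so the estimate carries wide uncertainty. The sketch below computes a Wilson 95% confidence interval for a proportion. The count of 32 correct out of 35 is a hypothetical figure chosen to land near the lower reported sensitivity of 0.91; the abstract does not report the underlying counts or how the models were evaluated.

```python
import math


def wilson_interval(successes, n, z=1.96):
    """Wilson score interval for a binomial proportion (z=1.96 gives about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = successes / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half_width, center + half_width


# Hypothetical counts for illustration: 32 of 35 autistic participants detected.
low, high = wilson_interval(32, 35)
print(f"sensitivity = {32 / 35:.2f}, 95% CI roughly {low:.2f} to {high:.2f}")
```

Under these assumed counts the interval spans roughly 0.78 to 0.97, which is why community-cohort sensitivity figures should be read with caution.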


Original abstract

Machine-learning models can assist in diagnosing autism but have biases. We examine the correlates of misclassifications and how training data affect model generalizability. Social Responsiveness Scale data were collected from two cohorts in Taiwan: the clinical cohort comprised 1203 autistic participants and 1182 non-autistic comparisons, and the community cohort consisted of 35 autistic participants and 3297 non-autistic comparisons. Classification models were trained, and the misclassified cases were investigated regarding their associations with sex, age, intelligence quotient (IQ), symptoms from the Child Behavior Checklist (CBCL), and co-occurring psychiatric diagnoses.

Models showed high within-cohort accuracy (clinical: sensitivity 0.91-0.95, specificity 0.93-0.94; community: sensitivity 0.91-1.00, specificity 0.89-0.96), but generalizability across cohorts was limited. When the community-trained model was applied to the clinical cohort, performance declined (sensitivity 0.65, specificity 0.95). In both models, non-autistic individuals misclassified as autistic showed elevated behavioral symptoms and attention-deficit hyperactivity disorder (ADHD) prevalence. Conversely, autistic individuals who were misclassified tended to show fewer behavioral symptoms and, in the community model, higher IQ and more aggressive behavior but fewer social and attention problems.

Error patterns of machine-learning models and the impact of training data warrant careful consideration in future research.

Lay Abstract

Machine learning is a type of computer modeling that can help identify patterns in data and make predictions. In autism research, these models may support earlier or more accurate identification of autistic individuals. But to be useful, they need to make reliable predictions across different groups of people. In this study, we explored when and why these models might make mistakes, and how the kind of data used to train them affects their accuracy.

Training models means using information to teach the computer model how to tell the difference between autistic and non-autistic individuals. We used information from the Social Responsiveness Scale (SRS), a questionnaire that measures autistic features. We tested these models on two different groups: one from clinical settings and one from the general community. The models worked well when tested within the same type of group they were trained on.

However, a model trained on the community group did not perform as accurately when tested on the clinical group. Sometimes, the model got it wrong. For example, in the clinical group, some autistic individuals were mistakenly identified as non-autistic. These individuals tended to have fewer emotional or behavioral difficulties.

In the community group, autistic individuals who were mistakenly identified as non-autistic had higher IQs and showed more aggressive behaviors but fewer attention or social problems. Conversely, some non-autistic people were incorrectly identified as autistic. These people had more emotional or behavioral challenges and were more likely to have attention-deficit hyperactivity disorder (ADHD). These findings highlight that machine-learning models are sensitive to the type of data they are trained on.

To build fair and accurate models for predicting autism, it is essential to consider where the training data come from and whether they represent the full diversity of individuals. Understanding these patterns of error can help improve future tools used in both research and clinical care.


Evidence Grade

Emerging

moderate

Grade assigned by AutismInsights based on study type and published abstract.

Study Details

Journal: Autism : the international journal of research and practice
Year: 2025
PMID: 40762098
DOI: 10.1177/13623613251360271

MeSH Terms

Humans, Autism Spectrum Disorder, Male, Machine Learning, Female, Child, Taiwan, Cohort Studies, Adolescent, Sensitivity and Specificity, Diagnostic Errors, Attention Deficit Disorder with Hyperactivity