Abstract
Author List & Affiliations:
Joe Butler; joe.butler@sunderland.ac.uk; School of Psychology, University of Sunderland, Sunderland, UK & Helen McArdle Nursing and Care Research Institute, University of Sunderland, Sunderland, UK
Adewale Samuel Owobowale; bi26ae@student.sunderland.ac.uk; School of Computer Science, University of Sunderland, Sunderland, UK
Tamlyn J. Watermeyer; tamlyn.watermeyer@northumbria.ac.uk; Edinburgh Dementia Prevention, Centre for Clinical Brain Sciences, College of Medicine & Veterinary Sciences, University of Edinburgh, Edinburgh, UK & Faculty of Health & Life Sciences, Northumbria University, Newcastle-Upon-Tyne, UK
Sam Danso*; sam.danso@sunderland.ac.uk; School of Computer Science, University of Sunderland, Sunderland, UK & Edinburgh Dementia Prevention, Centre for Clinical Brain Sciences, College of Medicine & Veterinary Sciences, University of Edinburgh, Edinburgh, UK
Mario Parra-Rodrigues*; mario.parra-rodriguez@strath.ac.uk; School of Psychology, University of Strathclyde, Glasgow, UK & Edinburgh Dementia Prevention, Centre for Clinical Brain Sciences, College of Medicine & Veterinary Sciences, University of Edinburgh, Edinburgh, UK
*Co-Supervising authors
Background:
The Visual Short-Term Memory Binding Task (VSTMBT) is a gold-standard cognitive assessment for identifying Alzheimer's Disease (AD) and associated risk factors, including during the preclinical stage. Previous work from our group (Butler, Watermeyer,...& Parra 2024) demonstrated, in a small sample (n=37) of healthy older adults, that a web-based, self-administered version of the task yields data comparable to those collected under laboratory conditions. Here we used a machine learning (ML) approach to explore the impact of risk factors on task performance in a larger digital dataset.
Methods:
Using data (n=359) from an online study that combined the VSTMBT with lifestyle, psychological, and health measures, we created a Binding Cost score, which has been shown to approximate AD-related neuropathology (Parra et al., 2024). This score categorised participants as either strong binders (SB; indicative of no pathology; 85.9% of the sample) or weak binders (WB; indicative of pathology; 14.1%).
We trained three ML algorithms (Random Forest (RF), K-Nearest Neighbour (KNN), and Decision Tree (DT)), using the SMOTE technique to address the imbalance in group distribution. We applied 10-fold cross-validation with hyper-parameter tuning to optimise the models on the selected variables (including age, sex, education, BMI, loneliness, and existing morbidities) to predict an individual's risk of cognitive impairment based on the groupings (SB vs WB). Model performance was evaluated on an unseen test set comprising 20% of the data.
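The oversampling step can be illustrated with a minimal, standard-library sketch of SMOTE-style interpolation. This is not the authors' implementation: the function name `smote_oversample`, its parameters, and the toy (age, BMI) rows are hypothetical, and the abstract does not state which software or settings were used. In practice a maintained library implementation would normally be preferred, and oversampling is usually applied to the training data only so that the held-out test set stays untouched.

```python
import math
import random


def smote_oversample(minority, n_new, k=5, seed=0):
    """Create n_new synthetic minority-class rows by SMOTE-style interpolation.

    Each synthetic row lies on the line segment between a randomly chosen
    minority row and one of its k nearest minority neighbours.
    """
    if len(minority) < 2:
        raise ValueError("need at least two minority rows to interpolate")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority rows by Euclidean distance to the base row.
        others = [row for j, row in enumerate(minority) if j != i]
        others.sort(key=lambda row: math.dist(base, row))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy training rows: hypothetical (age, BMI) values, not data from the study.
strong_binders = [[66, 24.1], [71, 27.3], [69, 25.0], [74, 26.2], [68, 23.5], [72, 28.0]]
weak_binders = [[70, 29.5], [75, 31.0]]

# Add synthetic weak-binder rows until both classes are the same size.
extra = smote_oversample(weak_binders, n_new=len(strong_binders) - len(weak_binders), k=1)
balanced_weak = weak_binders + extra
print(len(strong_binders), len(balanced_weak))
```

Because every synthetic row is a convex combination of two real minority rows, the new points stay inside the region already occupied by the minority class rather than duplicating existing rows exactly.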
Results:
Aside from existing morbidities, which were higher in weak binders (WB = 0.41 (sd+2=0.79); SB = 0.22 (sd+2=0.49); t=2.21; p=0.03), other measures did not differ between groups. Of the ML models, RF achieved the best performance (accuracy=91%; recall=91%; precision=91%; AUC=97%), compared with KNN (accuracy=81%; recall=81%; precision=84%; AUC=91%) and DT (accuracy=81%; recall=81%; precision=82%; AUC=85%). Feature importance analysis of the RF model suggests that mental health, BMI, and fatigue had the greatest impact on prediction, while sex and multi-morbidity score had the least.
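The abstract does not say how precision and recall were averaged across the two classes. One reading consistent with recall equalling accuracy for all three models is support-weighted averaging, under which weighted recall is mathematically identical to accuracy. The sketch below illustrates that relationship on made-up labels; the function name `weighted_metrics` and the toy predictions are hypothetical and do not reproduce the study's results.

```python
def weighted_metrics(y_true, y_pred):
    """Return (accuracy, weighted precision, weighted recall).

    Per-class precision and recall are averaged with weights equal to each
    class's share of the true labels (its support).
    """
    n = len(y_true)
    accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / n
    precision = 0.0
    recall = 0.0
    for label in sorted(set(y_true)):
        tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
        support = sum(t == label for t in y_true)    # true members of the class
        predicted = sum(p == label for p in y_pred)  # rows assigned to the class
        weight = support / n
        recall += weight * (tp / support)
        precision += weight * (tp / predicted if predicted else 0.0)
    return accuracy, precision, recall


# Toy labels: 8 strong binders (SB) and 2 weak binders (WB), invented for illustration.
y_true = ["SB"] * 8 + ["WB"] * 2
y_pred = ["SB"] * 6 + ["WB"] * 2 + ["WB", "SB"]

acc, prec, rec = weighted_metrics(y_true, y_pred)
print(f"accuracy={acc:.3f} precision={prec:.3f} recall={rec:.3f}")
```

With weighted averaging, recall always matches accuracy while precision can differ, which is the pattern seen in the reported KNN and DT figures. Other averaging schemes (for example macro averaging) would not guarantee this.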
Conclusions:
The study underscores the potential of web-based cognitive assessments and ML for remote monitoring and early identification of AD risk factors, contributing to the advancement of accessible tools for early detection.