Explainable statistical learning in public health for policy development: the case of real-world suicide data

van Schaik, Paul, Peng, Yonghong, Ojelabi, Adedokun and Ling, Jonathan (2019) Explainable statistical learning in public health for policy development: the case of real-world suicide data. BMC Medical Research Methodology, 19 (1). ISSN 1471-2288

Item Type:	Article

Abstract

Background

In recent years, the volume of publicly available data related to public health has increased significantly. These data have substantial potential to inform the development of public health policy; however, realising that potential requires meaningful and insightful analysis. Our aim is to demonstrate how data analysis techniques can be used to address data reduction, prediction and explanation using public health data available online, in order to provide a sound basis for informing public health policy.

Methods

Observational suicide prevention data were analysed from an existing online United Kingdom national public health database. Multi-collinearity analysis and principal component analysis were used to reduce the set of correlated indicators, followed by regression analyses for prediction and explanation of suicide.
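The general shape of such a pipeline (collinearity screening, then principal component analysis, then regression) can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' implementation: the variance inflation factor (VIF) criterion, the threshold of 10, the number of components and all variable and function names are assumptions made for the example.

```python
import numpy as np


def vif(X):
    """Variance inflation factor per column (diagonal of the inverse correlation matrix)."""
    corr = np.corrcoef(X, rowvar=False)
    return np.diag(np.linalg.inv(corr))


def drop_collinear(X, names, threshold=10.0):
    """Repeatedly drop the column with the largest VIF until all fall below threshold.

    The threshold of 10 is a common rule of thumb, assumed here for illustration.
    """
    X = np.asarray(X, dtype=float)
    names = list(names)
    while X.shape[1] > 1:
        v = vif(X)
        worst = int(np.argmax(v))
        if v[worst] < threshold:
            break
        X = np.delete(X, worst, axis=1)
        names.pop(worst)
    return X, names


def pca_scores(X, n_components):
    """Project standardised columns onto the leading principal components."""
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    eigvals, eigvecs = np.linalg.eigh(np.corrcoef(Z, rowvar=False))
    order = np.argsort(eigvals)[::-1][:n_components]
    return Z @ eigvecs[:, order], eigvals[order] / eigvals.sum()


def ols(X, y):
    """Ordinary least-squares coefficients, intercept first."""
    design = np.column_stack([np.ones(len(y)), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return beta


# Synthetic stand-in for an indicator table (hypothetical columns a, a_copy, b, c).
rng = np.random.default_rng(0)
n = 200
a, b, c = rng.normal(size=n), rng.normal(size=n), rng.normal(size=n)
a_copy = a + rng.normal(scale=0.05, size=n)  # nearly duplicates a
X = np.column_stack([a, a_copy, b, c])
y = 2.0 * a - 1.0 * b + rng.normal(scale=0.5, size=n)

X_reduced, kept = drop_collinear(X, ["a", "a_copy", "b", "c"])
scores, explained = pca_scores(X_reduced, n_components=2)
beta = ols(scores, y)  # regression on component scores
```

In this sketch the near-duplicate pair is reduced to a single column before the remaining indicators are summarised by two components; the choice of which correlated column to drop, and how many components to keep, would depend on the data and the analyst's criteria.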

Results

Multi-collinearity analysis was effective in reducing the indicator set of predictors by 30% and principal component analysis further reduced the set by 86%. Regression for prediction identified four significant indicator predictors of suicide behaviour (emergency hospital admissions for intentional self-harm, children leaving care, statutory homelessness and self-reported well-being/low happiness) and two main component predictors (relatedness dysfunction, and behavioural problems and mental illness). Regression for explanation identified significant moderation of a well-being predictor (low happiness) of suicide behaviour by a social factor (living alone), thereby supporting existing theory and providing insight beyond the results of regression for prediction. Two independent predictors capturing relatedness needs in social care service delivery were also identified.
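Moderation of one predictor by another is commonly tested by adding a product (interaction) term to the regression. The sketch below illustrates that idea on synthetic data only; it does not reproduce the study's data or estimates. The variable names echo the abstract, but the coefficients, their signs and the sample size are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 500

# Hypothetical standardised predictor and moderator (synthetic values).
low_happiness = rng.normal(size=n)
living_alone = rng.normal(size=n)

# Synthetic outcome in which the effect of low_happiness depends on living_alone.
# The coefficients 0.5, 0.3 and 0.4 are arbitrary and chosen for illustration.
outcome = (0.5 * low_happiness + 0.3 * living_alone
           + 0.4 * low_happiness * living_alone
           + rng.normal(scale=0.5, size=n))

# Mean-centre both variables so the main effects are interpretable
# at the average level of the other variable.
x = low_happiness - low_happiness.mean()
m = living_alone - living_alone.mean()

# Design matrix: intercept, predictor, moderator, interaction term.
design = np.column_stack([np.ones(n), x, m, x * m])
beta, *_ = np.linalg.lstsq(design, outcome, rcond=None)
b0, b_x, b_m, b_xm = beta


def simple_slope(m_value):
    """Slope of the predictor at a given (centred) value of the moderator."""
    return b_x + b_xm * m_value


sd = m.std(ddof=1)
slope_low = simple_slope(-sd)   # moderator one SD below its mean
slope_high = simple_slope(sd)   # moderator one SD above its mean
```

A non-zero interaction coefficient (`b_xm`) means the slope of the predictor changes with the moderator, which is what the two simple slopes make visible. A full analysis would also report standard errors and significance tests, which this sketch omits.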

Conclusions

We demonstrate the effectiveness of regression techniques in the analysis of online public health data. Regression for prediction and regression for explanation can both be appropriate for analysing public health data and improving the understanding of public health outcomes. It is therefore essential to clarify the aim of the analysis (prediction accuracy or theory development) as a basis for choosing the most appropriate model. We apply these techniques to the analysis of suicide data; however, we argue that the analysis presented in this study should be applied to datasets across public health in order to improve the quality of health policy recommendations.

Schaik2019_BMC_ExplainableStatisticalLearning.pdf - Published Version
Available under License Creative Commons Attribution.

More Information

Depositing User: Jonathan Ling

Identifiers

Item ID: 10987

Identification Number: 10.1186/s12874-019-0796-7

ISSN: 1471-2288

URI: https://sure.sunderland.ac.uk/id/eprint/10987

Official URL: http://dx.doi.org/10.1186/s12874-019-0796-7

Users with ORCIDS

ORCID for Jonathan Ling: orcid.org/0000-0003-2932-4474

Catalogue record

Date Deposited: 05 Aug 2019 13:35

Last Modified: 04 Jun 2025 17:24

Contributors

Author:	Jonathan Ling
Author:	Paul van Schaik
Author:	Yonghong Peng
Author:	Adedokun Ojelabi

University Divisions

Faculty of Business and Technology > School of Computer Science and Engineering
Faculty of Business and Technology
Faculty of Health Sciences and Wellbeing

Subjects

Computing > Data Science
Sciences > Health Sciences
