Aims: Likelihood of alcohol dependence (AD) is increased among people who transition to greater levels of alcohol involvement at a younger age. Indicated interventions delivered early may be effective in reducing risk, but could be costly. One way to increase cost-effectiveness would be to develop a prediction model that targets interventions to the subset of youth with early alcohol use who are at highest risk of subsequent AD.
Design: A prediction model was developed for DSM-IV AD onset by age 25 years using an ensemble machine-learning algorithm known as ‘Super Learner’. Shapley additive explanations (SHAP) were used to assess variable importance.
Setting and Participants: Respondents reporting early onset of regular alcohol use (i.e. by 17 years of age) who were aged 25 years or older at interview, drawn from 14 representative community surveys conducted in 13 countries as part of WHO's World Mental Health Surveys.
Measurements: The primary outcome to be predicted was onset of life-time DSM-IV AD by age 25, as measured using the Composite International Diagnostic Interview, a fully structured diagnostic interview.
Findings: AD prevalence by age 25 was 5.1% among the 10 687 individuals who reported drinking alcohol regularly by age 17. The prediction model achieved an external area under the curve of 0.78 [95% confidence interval (CI) = 0.74–0.81], higher than that of any individual candidate risk model (0.73–0.77), and an area under the precision-recall curve of 0.22. Overall calibration was good [integrated calibration index (ICI) = 1.05%]; however, miscalibration was observed at the extreme ends of the distribution of predicted probabilities. Interventions provided to the 20% of people with highest risk would capture 49% of AD cases and require treating four people without AD to reach one with AD. Important predictors of increased risk included younger onset of alcohol use, male sex, higher cohort alcohol use and more mental disorders.
Conclusions: A risk algorithm can be created using data collected at the onset of regular alcohol use to target youth at highest risk of alcohol dependence by early adulthood. Important considerations remain for advancing the development and practical implementation of such models.
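The targeting quantities reported in the Findings (the share of cases captured when the highest-risk fraction is treated, and the number of people without AD treated per person with AD) can be derived from any set of predicted risks and observed outcomes. The following is a minimal sketch of that arithmetic, not the authors' code: the function name `targeting_summary`, its parameters and the toy data are hypothetical and chosen only for illustration.

```python
def targeting_summary(scores, labels, top_fraction=0.20):
    """Summarise the effect of treating the top `top_fraction` of people by predicted risk.

    scores: predicted probabilities of the outcome (hypothetical model output)
    labels: observed outcomes, 1 = case, 0 = non-case
    """
    if len(scores) != len(labels) or not scores:
        raise ValueError("scores and labels must be non-empty and of equal length")

    # Number of people who would receive the intervention (at least one).
    n_flagged = max(1, round(top_fraction * len(scores)))

    # Rank people from highest to lowest predicted risk and keep the top group.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    flagged = ranked[:n_flagged]

    true_pos = sum(label for _, label in flagged)   # cases reached
    false_pos = n_flagged - true_pos                # non-cases treated
    total_cases = sum(labels)

    return {
        "n_flagged": n_flagged,
        # Share of all cases that fall inside the treated group.
        "sensitivity": true_pos / total_cases if total_cases else float("nan"),
        # Share of the treated group who are cases.
        "ppv": true_pos / n_flagged,
        # Non-cases treated for each case reached.
        "non_cases_per_case": false_pos / true_pos if true_pos else float("inf"),
    }


# Toy data for illustration only; these are not values from the study.
toy_scores = [0.90, 0.80, 0.70, 0.60, 0.50, 0.40, 0.30, 0.20, 0.10, 0.05]
toy_labels = [1, 0, 0, 0, 1, 0, 0, 0, 0, 0]
summary = targeting_summary(toy_scores, toy_labels, top_fraction=0.20)
print(summary)
```

In this sketch the treated group is defined by rank alone, so tied scores at the cut-off are broken by input order; a real analysis would also need to account for survey weights and the held-out evaluation sample, which are not modelled here.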
- alcohol use
- machine learning