TY - JOUR
T1 - AliClu - Temporal sequence alignment for clustering longitudinal clinical data
AU - Rama, Kishan
AU - Canhão, Helena
AU - Carvalho, Alexandra M.
AU - Vinga, Susana
N1 - The authors acknowledge funding the Portuguese Foundation for Science
and Technology (Fundação para a Ciência e a Tecnologia - FCT) under
contracts INESC-ID (UID/CEC/50021/2019) and IT (UID/EEA/50008/2019),
projects PREDICT (PTDC/CCI-CIF/29877/2017), PERSEIDS
(PTDC/EMS-SIS/0642/2014) and NEUROCLINOMICS2 (PTDC/EEI-SII/1937/2014).
The funders had no role in the design of the study, collection, analysis and
interpretation of data, or writing the manuscript.
PY - 2019/12/30
Y1 - 2019/12/30
N2 - BACKGROUND: Patient stratification is a critical task in clinical decision making since it can allow physicians to choose treatments in a personalized way. Given the increasing availability of electronic medical records (EMRs) with longitudinal data, one crucial problem is how to efficiently cluster the patients based on the temporal information from medical appointments. In this work, we propose applying the Temporal Needleman-Wunsch (TNW) algorithm to align discrete sequences with the transition time information between symbols. These symbols may correspond to a patient's current therapy, their overall health status, or any other discrete state. The transition time information represents the duration of each of those states. The obtained TNW pairwise scores are then used to perform hierarchical clustering. To find the best number of clusters and assess their stability, a resampling technique is applied. RESULTS: We propose the AliClu, a novel tool for clustering temporal clinical data based on the TNW algorithm coupled with clustering validity assessments through bootstrapping. The AliClu was applied for the analysis of the rheumatoid arthritis EMRs obtained from the Portuguese database of rheumatologic patient visits (Reuma.pt). In particular, the AliClu was used for the analysis of therapy switches, which were coded as letters corresponding to biologic drugs and included their durations before each change occurred. The obtained optimized clusters allow one to stratify the patients based on their temporal therapy profiles and to support the identification of common features for those groups. CONCLUSIONS: The AliClu is a promising computational strategy to analyse longitudinal patient data by providing validated clusters and by unravelling the patterns that exist in clinical outcomes. Patient stratification is performed in an automatic or semi-automatic way, allowing one to tune the alignment, clustering, and validation parameters. The AliClu is freely available at https://github.com/sysbiomed/AliClu.
AB - BACKGROUND: Patient stratification is a critical task in clinical decision making since it can allow physicians to choose treatments in a personalized way. Given the increasing availability of electronic medical records (EMRs) with longitudinal data, one crucial problem is how to efficiently cluster the patients based on the temporal information from medical appointments. In this work, we propose applying the Temporal Needleman-Wunsch (TNW) algorithm to align discrete sequences with the transition time information between symbols. These symbols may correspond to a patient's current therapy, their overall health status, or any other discrete state. The transition time information represents the duration of each of those states. The obtained TNW pairwise scores are then used to perform hierarchical clustering. To find the best number of clusters and assess their stability, a resampling technique is applied. RESULTS: We propose the AliClu, a novel tool for clustering temporal clinical data based on the TNW algorithm coupled with clustering validity assessments through bootstrapping. The AliClu was applied for the analysis of the rheumatoid arthritis EMRs obtained from the Portuguese database of rheumatologic patient visits (Reuma.pt). In particular, the AliClu was used for the analysis of therapy switches, which were coded as letters corresponding to biologic drugs and included their durations before each change occurred. The obtained optimized clusters allow one to stratify the patients based on their temporal therapy profiles and to support the identification of common features for those groups. CONCLUSIONS: The AliClu is a promising computational strategy to analyse longitudinal patient data by providing validated clusters and by unravelling the patterns that exist in clinical outcomes. Patient stratification is performed in an automatic or semi-automatic way, allowing one to tune the alignment, clustering, and validation parameters. The AliClu is freely available at https://github.com/sysbiomed/AliClu.
KW - Bootstrap
KW - Clustering
KW - clustering indices
KW - Temporal sequence alignment
UR - http://www.scopus.com/inward/record.url?scp=85077319579&partnerID=8YFLogxK
U2 - 10.1186/s12911-019-1013-7
DO - 10.1186/s12911-019-1013-7
M3 - Article
C2 - 31888660
AN - SCOPUS:85077319579
VL - 19
JO - BMC Medical Informatics and Decision Making
JF - BMC Medical Informatics and Decision Making
SN - 1472-6947
IS - 1
M1 - 289
ER -