TY - GEN
T1 - Network-Based Variable Selection for Survival Outcomes in Oncological Data
AU - Carrasquinha, Eunice
AU - Veríssimo, André
AU - Lopes, Marta B.
AU - Vinga, Susana
N1 - info:eu-repo/grantAgreement/EC/H2020/633974/EU#
info:eu-repo/grantAgreement/FCT/SFRH/SFRH%2FBD%2F97415%2F2013/PT#
Portuguese Foundation for Science & Technology FCT (UIDB/00297/2020, UIDB/04516/2020, UIDB/50021/2020, UIDB/50022/2020, PTDC/CCI-CIF/29877/2017, PTDC/CCI-INF/29168/2017).
PY - 2020
Y1 - 2020
N2 - The accessibility to “big data” sets down an ambitious challenge in the medical field, especially in personalized medicine, where gene expression data are increasingly being used to establish a diagnosis and optimize treatment of oncological patients. However, the high-dimensionality nature of the data brings many constraints, for which several approaches have been considered, with regularization techniques in the cutting-edge research front. Additionally, the network structure of gene expression data has fostered the development of network-based regularization techniques to convey data into a low-dimensional and interpretable level. In this work, classical elastic net and two recently proposed network-based methods, HubCox and OrphanCox, are applied to high-dimensional gene expression data, to model survival data. An oncological transcriptomic dataset obtained from The Cancer Genome Atlas (TCGA) is used, with patients’ RNA-seq measurements as covariates. The application of sparsity-inducing techniques to the dataset enabled the selection of relevant genes over a range of parameters evaluated. Comparable results were obtained for the elastic net and the network-based OrphanCox regarding model performance and genes selected.
KW - Gene expression data
KW - High-dimensional data
KW - Network-based regularization
KW - Regularized optimization
UR - http://www.scopus.com/inward/record.url?scp=85085171455&partnerID=8YFLogxK
U2 - 10.1007/978-3-030-45385-5_49
DO - 10.1007/978-3-030-45385-5_49
M3 - Conference contribution
AN - SCOPUS:85085171455
SN - 978-3-030-45384-8
T3 - Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
SP - 550
EP - 561
BT - Bioinformatics and Biomedical Engineering - 8th International Work-Conference, IWBBIO 2020, Proceedings
A2 - Rojas, Ignacio
A2 - Valenzuela, Olga
A2 - Rojas, Fernando
A2 - Herrera, Luis Javier
A2 - Ortuño, Francisco
PB - Springer
CY - Cham
T2 - 8th International Work-Conference on Bioinformatics and Biomedical Engineering, IWBBIO 2020
Y2 - 6 May 2020 through 8 May 2020
ER -