TY - JOUR
T1 - A toolbox of machine learning software to support microbiome analysis
AU - Marcos-Zambrano, Laura Judith
AU - López-Molina, Víctor Manuel
AU - Bakir-Gungor, Burcu
AU - Frohme, Marcus
AU - Karaduzovic-Hadziabdic, Kanita
AU - Klammsteiner, Thomas
AU - Ibrahimi, Eliana
AU - Lahti, Leo
AU - Loncar-Turukalo, Tatjana
AU - Dhamo, Xhilda
AU - Simeon, Andrea
AU - Nechyporenko, Alina
AU - Pio, Gianvito
AU - Przymus, Piotr
AU - Sampri, Alexia
AU - Trajkovik, Vladimir
AU - Lacruz-Pleguezuelos, Blanca
AU - Aasmets, Oliver
AU - Araújo, Ricardo
AU - Anagnostopoulos, Ioannis
AU - Aydemir, Önder
AU - Berland, Magali
AU - Calle, M. Luz
AU - Ceci, Michelangelo
AU - Duman, Hatice
AU - Gündoğdu, Aycan
AU - Havulinna, Aki S.
AU - Kaka Bra, Kardokh Hama Najib
AU - Kalluci, Eglantina
AU - Karav, Sercan
AU - Lode, Daniel
AU - Lopes, Marta B.
AU - May, Patrick
AU - Nap, Bram
AU - Nedyalkova, Miroslava
AU - Paciência, Inês
AU - Pasic, Lejla
AU - Pujolassos, Meritxell
AU - Shigdel, Rajesh
AU - Susín, Antonio
AU - Thiele, Ines
AU - Truică, Ciprian Octavian
AU - Wilmes, Paul
AU - Yilmaz, Ercument
AU - Yousef, Malik
AU - Claesson, Marcus Joakim
AU - Truu, Jaak
AU - Carrillo de Santa Pau, Enrique
N1 - Funding Information:
This study was supported by COST Action CA18131 “Statistical and machine learning techniques in human microbiome studies.” LM-Z is supported by the Spanish State Research Agency Juan de la Cierva Grant IJC2019-042188-I. MB is supported by Metagenopolis grant ANR-11-DPBS-0001. MLC was partially supported by the Spanish Ministry of Economy, Industry and Competitiveness, Reference PID2019-104830RB-I00.
Funding Information:
This article is based upon work from COST Action ML4Microbiome “Statistical and machine learning techniques in human microbiome studies,” CA18131, supported by COST (European Cooperation in Science and Technology), www.cost.eu.
Publisher Copyright:
Copyright © 2023 Marcos-Zambrano, López-Molina, Bakir-Gungor, Frohme, Karaduzovic-Hadziabdic, Klammsteiner, Ibrahimi, Lahti, Loncar Turukalo, Dhamo, Simeon, Nechyporenko, Pio, Przymus, Sampri, Trajkovik, Lacruz-Pleguezuelos, Aasmets, Araujo, Anagnostopoulos, Aydemir, Berland, Calle, Ceci, Duman, Gündoğdu, Havulinna, Kaka Bra, Kalluci, Karav, Lode, Lopes, May, Nap, Nedyalkova, Paciência, Pasic, Pujolassos, Shigdel, Susín, Thiele, Truică, Wilmes, Yilmaz, Yousef, Claesson, Truu, Carrillo de Santa Pau.
PY - 2023/11/22
Y1 - 2023/11/22
N2 - The human microbiome has become an area of intense research due to its potential impact on human health. However, the analysis and interpretation of microbiome data have proven challenging due to the data's complexity and high dimensionality. Machine learning (ML) algorithms can process vast amounts of data to uncover informative patterns and relationships, even with limited prior knowledge. As a result, there has been rapid growth in the development of software specifically designed for the analysis and interpretation of microbiome data using ML techniques. These software tools incorporate a wide range of ML algorithms for clustering, classification, regression, or feature selection to identify microbial patterns and relationships within the data and to generate predictive models. This rapid development, with its constant need for new tools and the integration of new features, requires efforts to compile, catalog, and classify these tools in order to create infrastructures and services with easy, transparent, and trustworthy standards. Here we review the state of the art of ML tools applied in human microbiome studies, a review performed as part of the COST Action ML4Microbiome activities. This scoping review focuses on ML-based software and framework resources currently available for the analysis of microbiome data in humans. The aim is to help microbiologists and biomedical scientists explore specialized resources that integrate ML techniques, and to facilitate future benchmarking to create standards for the analysis of microbiome data. The software resources are organized by the type of analysis they were developed for and the ML techniques they implement. Each software resource is described with examples of usage, along with comments on the pitfalls and gaps in the use of ML-based software for microbiome data that developers and users need to consider. This review represents an extensive compilation to date, offering valuable insights and guidance for researchers interested in leveraging ML approaches for microbiome analysis.
KW - data integration
KW - feature analysis
KW - feature generation
KW - machine learning
KW - microbial gene prediction
KW - microbial metabolic modeling
KW - microbiome
KW - software
UR - http://www.scopus.com/inward/record.url?scp=85178948402&partnerID=8YFLogxK
U2 - 10.3389/fmicb.2023.1250806
DO - 10.3389/fmicb.2023.1250806
M3 - Review article
C2 - 38075858
AN - SCOPUS:85178948402
SN - 1664-302X
VL - 14
JO - Frontiers in Microbiology
JF - Frontiers in Microbiology
M1 - 1250806
ER -