TY - JOUR
T1 - The influence of population size in geometric semantic GP
AU - Castelli, Mauro
AU - Manzoni, Luca
AU - Silva, Sara
AU - Vanneschi, Leonardo
AU - Popovič, Aleš
N1 - Castelli, M., Manzoni, L., Silva, S., Vanneschi, L., & Popovič, A. (2017). The influence of population size in geometric semantic GP. Swarm and Evolutionary Computation, 32, 110-120. https://doi.org/10.1016/j.swevo.2016.05.004
PY - 2017/2/1
Y1 - 2017/2/1
AB - In this work, we study the influence of population size on the learning ability of Geometric Semantic Genetic Programming for the task of symbolic regression. A large set of experiments, considering different population sizes on different regression problems, has been performed. Results show that, on real-life problems, small populations yield better training fitness than large populations after the same number of fitness evaluations. However, performance on the test instances varies among the problems: in datasets with a high number of features, models obtained with large populations perform better on unseen data, while in datasets characterized by a relatively small number of variables, better generalization is achieved with small populations. When synthetic problems are taken into account, large populations are the best option for achieving good-quality solutions on both training and test instances.
KW - Genetic programming
KW - Population size
KW - Semantics
UR - http://www.scopus.com/inward/record.url?scp=84973549912&partnerID=8YFLogxK
U2 - 10.1016/j.swevo.2016.05.004
DO - 10.1016/j.swevo.2016.05.004
M3 - Article
AN - SCOPUS:84973549912
SN - 2210-6502
VL - 32
SP - 110
EP - 120
JO - Swarm and Evolutionary Computation
JF - Swarm and Evolutionary Computation
ER -