Prediction and simulation of the risk of traffic accidents using neural networks and gradient boosting with a hybrid classification/regression modelling approach in an urban context

Activity: Talk or presentation (Oral presentation)


Traffic accidents cause considerable losses in both property and human lives: they can bring economic hardship to the people involved and to society, and can result in injury, incapacity and even death. To minimize these effects, emergency services need to be able to plan and define strategies that reduce the time taken to provide a first aid response to affected individuals. Traffic accident risk prediction can play a crucial role in defining these strategies, as it makes it possible both to understand the factors that influence the occurrence of traffic accidents and to anticipate, in space and time, where traffic accidents are more likely to occur. Several approaches to traffic accident prediction have been studied, such as Poisson and negative binomial models (Fancello, Soddu, & Fadda, 2018), ARIMA models (Ihueze & Onwurah, 2018), and machine learning techniques like regression models (Chang & Chen, 2005), K-Nearest Neighbour (KNN), Bayesian networks (Hossain & Muromachi, 2012) and decision trees (Lin, Wang, & Sadek, 2015). Moreover, some deep learning approaches (Chen, Song, Yamada, & Shibasaki, 2016; Ren, Song, Wang, Hu, & Lei, 2018) have been developed to estimate the risk of traffic accidents, but on coarse regular spatial grids that fail to provide the spatial detail needed for emergency operations. In addition, most studies on traffic accident prediction address non-urban contexts, and not enough attention has been paid to the prediction of traffic accident risk in urban environments (Yu et al., 2021). In this paper we have developed and tested two traffic accident probability prediction models, based on neural network architectures and on a gradient boosting framework that uses tree-based learning algorithms.
For this purpose, we used information on traffic accident occurrences that required firefighters’ intervention in the city of Lisbon from 2013 to 2020. Traffic accident occurrences were aggregated at the road level by period of day and combined with road characteristics data, available on the Lisbon Open Data Portal, and weather information. Naturally, there are far more periods without accidents than with them. To deal with this imbalanced data, the modelling strategy was divided into two main steps. The first step was a classification that identified the periods in which the probability of a traffic accident was different from zero. On the sample resulting from the first step, a regression was then used to compute the probability of traffic accidents by period of day at street level. Both the neural network and the tree-based learning algorithms provided good estimates. From the results, a traffic accident risk simulator was developed, which allows the risk of traffic accidents to be re-assessed when street characteristics and weather conditions are changed for a specific street and period of day. This simulator provides emergency services with an essential tool for the planning and management of emergency operations.
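The two-step strategy described above (a classification to screen out periods with no accident risk, then a regression on the remaining sample) can be sketched as follows. This is only an illustration under stated assumptions, not the models used in this work: a logistic and a linear model fitted by gradient descent stand in for the neural network and gradient boosting models, the data are synthetic, and the feature names (`rain`, `traffic`), the target risk formula and the 0.5 threshold are all hypothetical.

```python
import math
import random


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def identity(z):
    return z


def score(w, x):
    # Linear score; the last entry of w is the bias term.
    return sum(wj * xj for wj, xj in zip(w, x)) + w[-1]


def fit_gd(X, y, activation, lr, epochs=1000):
    """Batch gradient descent. With `sigmoid` this fits a logistic classifier
    (log-loss); with `identity` it fits a linear regressor (squared error).
    Both losses share the gradient form (prediction - target) * x."""
    n, d = len(X), len(X[0])
    w = [0.0] * (d + 1)
    for _ in range(epochs):
        grad = [0.0] * (d + 1)
        for xi, yi in zip(X, y):
            err = activation(score(w, xi)) - yi
            for j in range(d):
                grad[j] += err * xi[j]
            grad[d] += err
        w = [wj - lr * gj / n for wj, gj in zip(w, grad)]
    return w


# Synthetic, imbalanced data (assumption): one row per street and period of day,
# with two hypothetical features scaled to [0, 1].
random.seed(0)
rows = [(random.random(), random.random()) for _ in range(300)]
has_accident = [1 if rain + traffic > 1.5 else 0 for rain, traffic in rows]

# Step 1: classify periods as zero-risk or non-zero-risk.
clf_w = fit_gd(rows, has_accident, sigmoid, lr=1.0)

# Step 2: regress a risk value using only the rows that had accidents.
# The target formula is invented for this sketch.
pos = [(x, 0.1 + 0.4 * x[0] + 0.4 * x[1])
       for x, flag in zip(rows, has_accident) if flag]
reg_w = fit_gd([x for x, _ in pos], [t for _, t in pos], identity, lr=0.3)


def predict_risk(x, threshold=0.5):
    """Return 0 when step 1 screens the row out, else the clamped step 2 estimate."""
    if sigmoid(score(clf_w, x)) < threshold:
        return 0.0
    return min(1.0, max(0.0, score(reg_w, x)))
```

In the same spirit as the simulator, a what-if query amounts to calling `predict_risk` again with altered features for the same street and period, for example `predict_risk((0.9, 0.8))` against `predict_risk((0.2, 0.8))`.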
This work was supported by the Connecting Europe Facility (CEF) – Telecommunications sector in the framework of project Urban Co-Creation Data Lab [INEA/CEF/ICT/A2018/1837945].
Period: 25 Nov 2021
Event title: Planning in the context of the rapid transformations: data and decision making - VI Conference on Regional and Urban Planning/ I Conference on Data Science for the Social Sciences/ Conference of the Research Project DRIVIT-UP
Event type: Conference
Conference number: VI
Location: Aveiro, Portugal
Degree of Recognition: International


  • Traffic accidents
  • Urban Planning
  • Neural Networks
  • Gradient Boosting Framework
  • Simulation