TY - JOUR
T1 - Decision Support Models for Predicting and Explaining Airport Passenger Connectivity From Data
AU - Guimarães, Marta
AU - Soares, Cláudia
AU - Ventura, Rodrigo
N1 - info:eu-repo/grantAgreement/FCT/6817 - DCRRNI ID/UIDB%2F50009%2F2020/PT#
N1 - Publisher Copyright: IEEE
PY - 2022/9
Y1 - 2022/9
N2 - Predicting if passengers in a connecting flight will lose their connection is paramount for airline profitability. We present novel machine learning-based decision support models for the different stages of connection flight management, namely for strategic, pre-tactical, tactical and post-operations. We predict missed flight connections in an airline's hub airport using historical data on flights and passengers, and analyse the factors that contribute additively to the predicted outcome for each decision horizon. Our data is high-dimensional, heterogeneous, imbalanced and noisy, and does not inform about passenger arrival/departure transit time. We employ probabilistic encoding of categorical classes, data balancing with Gaussian Mixture Models, and boosting. For all planning horizons, our models attain an area under the curve (AUC) of the receiver operating characteristic (ROC) higher than 0.93. SHAP value explanations of our models indicate that scheduled/perceived connection times contribute the most to the prediction, followed by passenger age and whether border controls are required.
AB - Predicting if passengers in a connecting flight will lose their connection is paramount for airline profitability. We present novel machine learning-based decision support models for the different stages of connection flight management, namely for strategic, pre-tactical, tactical and post-operations. We predict missed flight connections in an airline's hub airport using historical data on flights and passengers, and analyse the factors that contribute additively to the predicted outcome for each decision horizon. Our data is high-dimensional, heterogeneous, imbalanced and noisy, and does not inform about passenger arrival/departure transit time. We employ probabilistic encoding of categorical classes, data balancing with Gaussian Mixture Models, and boosting. For all planning horizons, our models attain an area under the curve (AUC) of the receiver operating characteristic (ROC) higher than 0.93. SHAP value explanations of our models indicate that scheduled/perceived connection times contribute the most to the prediction, followed by passenger age and whether border controls are required.
KW - Airline schedule planning
KW - data-driven operations
KW - decision support models
KW - imbalanced classification
KW - model explanations
UR - http://www.scopus.com/inward/record.url?scp=85124736728&partnerID=8YFLogxK
U2 - 10.1109/TITS.2022.3147155
DO - 10.1109/TITS.2022.3147155
M3 - Article
AN - SCOPUS:85124736728
SN - 1524-9050
VL - 23
SP - 16005
EP - 16015
JO - IEEE Transactions on Intelligent Transportation Systems
JF - IEEE Transactions on Intelligent Transportation Systems
IS - 9
ER -