Leveraging national tourist offices through data analytics

Sérgio Moro, Paulo Rita, Cristina Oliveira, Fernando Batista, Ricardo Ribeiro

Research output: Contribution to journal (Article)

2 Citations (Scopus)

Abstract

Purpose: This study aims to propose a data-driven approach, based on open-source tools, that makes it possible to understand customer satisfaction of the accommodation offer of a whole country.

Design/methodology/approach: The method starts by extracting information from all hotels of Portugal available at TripAdvisor through Web scraping. Then, a support vector machine is adopted for modeling the TripAdvisor score, which is considered a proxy of customer satisfaction. Finally, knowledge extraction from the model is achieved using sensitivity analysis to unveil the influence of features on the score.

Findings: The model of the TripAdvisor score achieved a mean absolute percentage error of around 5 per cent, proving the value of modeling the extracted data. The number of rooms of the unit and the minimum price are the two most relevant features, showing that customers appreciate smaller and more expensive units, whereas the location of the hotel does not hold significant relevance.

Originality/value: National tourist offices can use the proposed approach to understand what drives tourists’ satisfaction, helping to shape a country’s strategy. For example, licensing new hotels may take into account the unit size and other characteristics that make it more attractive to tourists. Furthermore, the procedure can be replicated at any time and in any country, making it a valuable tool for data-driven decision support on a national scale.
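
The two quantitative ideas named in the abstract, mean absolute percentage error and sensitivity analysis, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: `toy_model` is a hypothetical stand-in for a fitted regressor (it is not the paper's support vector machine), the coefficients and data are synthetic, and the one-at-a-time range measure is one simple form of sensitivity analysis, not necessarily the variant the authors used.

```python
from statistics import mean


def mape(actual, predicted):
    """Mean absolute percentage error, expressed in per cent."""
    return 100.0 * mean(abs((a - p) / a) for a, p in zip(actual, predicted))


def toy_model(rooms, min_price):
    """Hypothetical stand-in for a fitted model (NOT the paper's SVM).

    The coefficients are invented so that smaller, more expensive units
    score higher; the output is clipped to a 1-5 score scale.
    """
    score = 4.0 - 0.002 * rooms + 0.003 * min_price
    return max(1.0, min(5.0, score))


def sensitivity_range(model, baseline, feature, levels):
    """Vary one feature over `levels`, hold the others at `baseline`,
    and return the spread (max - min) of the model output."""
    outputs = [model(**dict(baseline, **{feature: value})) for value in levels]
    return max(outputs) - min(outputs)


# Synthetic scores, used only to exercise the MAPE function.
actual = [4.0, 4.5, 3.5, 5.0]
predicted = [3.8, 4.5, 3.85, 4.75]
error = mape(actual, predicted)

# One-at-a-time sensitivity around an assumed baseline hotel.
baseline = {"rooms": 100, "min_price": 80}
ranges = {
    "rooms": sensitivity_range(toy_model, baseline, "rooms", [10, 100, 200, 400]),
    "min_price": sensitivity_range(toy_model, baseline, "min_price", [40, 80, 120, 160]),
}

# Normalise the spreads so they sum to 1 and can be read as relative relevance.
total = sum(ranges.values())
relevance = {name: spread / total for name, spread in ranges.items()}

print(f"MAPE: {error:.2f}%")
for name, share in relevance.items():
    print(f"{name}: {share:.1%} of total output spread")
```

A feature whose variation moves the predicted score more receives a larger share of the total spread, which is the sense in which a feature can be called more relevant to the modeled score.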

Original language: English
Pages (from-to): 420-426
Number of pages: 7
Journal: International Journal of Culture, Tourism, and Hospitality Research
Volume: 12
Issue number: 4
DOI: 10.1108/IJCTHR-04-2018-0051
Publication status: Published - 1 Oct 2018

Fingerprint

tourist
customer
Portugal
accommodation
modeling
sensitivity analysis
methodology
office
TripAdvisor
Hotels
Tourists
Values
Customer satisfaction
Modeling
time
method
support vector machine
price
decision
licensing

Keywords

  • Data analytics
  • Data mining
  • National tourist offices
  • Online reviews
  • Sensitivity analysis
  • Web scraping

Cite this

@article{f6e3ad569823459da32e50423ce811eb,
title = "Leveraging national tourist offices through data analytics",
abstract = "Purpose: This study aims to propose a data-driven approach, based on open-source tools, that makes it possible to understand customer satisfaction of the accommodation offer of a whole country. Design/methodology/approach: The method starts by extracting information from all hotels of Portugal available at TripAdvisor through Web scraping. Then, a support vector machine is adopted for modeling the TripAdvisor score, which is considered a proxy of customer satisfaction. Finally, knowledge extraction from the model is achieved using sensitivity analysis to unveil the influence of features on the score. Findings: The model of the TripAdvisor score achieved a mean absolute percentage error of around 5 per cent, proving the value of modeling the extracted data. The number of rooms of the unit and the minimum price are the two most relevant features, showing that customers appreciate smaller and more expensive units, whereas the location of the hotel does not hold significant relevance. Originality/value: National tourist offices can use the proposed approach to understand what drives tourists’ satisfaction, helping to shape a country’s strategy. For example, licensing new hotels may take into account the unit size and other characteristics that make it more attractive to tourists. Furthermore, the procedure can be replicated at any time and in any country, making it a valuable tool for data-driven decision support on a national scale.",
keywords = "Data analytics, Data mining, National tourist offices, Online reviews, Sensitivity analysis, Web scraping",
author = "S{\'e}rgio Moro and Paulo Rita and Cristina Oliveira and Fernando Batista and Ricardo Ribeiro",
note = "Moro, S., Rita, P., Oliveira, C., Batista, F., \& Ribeiro, R. (2018). Leveraging national tourist offices through data analytics. International Journal of Culture, Tourism, and Hospitality Research, 12(4), 420-426. DOI: 10.1108/IJCTHR-04-2018-0051",
year = "2018",
month = "10",
day = "1",
doi = "10.1108/IJCTHR-04-2018-0051",
language = "English",
volume = "12",
pages = "420--426",
journal = "International Journal of Culture, Tourism, and Hospitality Research",
issn = "1750-6182",
publisher = "Emerald Group Publishing Ltd.",
number = "4",

}

Leveraging national tourist offices through data analytics. / Moro, Sérgio; Rita, Paulo; Oliveira, Cristina; Batista, Fernando; Ribeiro, Ricardo.

In: International Journal of Culture, Tourism, and Hospitality Research, Vol. 12, No. 4, 01.10.2018, p. 420-426.

TY - JOUR

T1 - Leveraging national tourist offices through data analytics

AU - Moro, Sérgio

AU - Rita, Paulo

AU - Oliveira, Cristina

AU - Batista, Fernando

AU - Ribeiro, Ricardo

N1 - Moro, S., Rita, P., Oliveira, C., Batista, F., & Ribeiro, R. (2018). Leveraging national tourist offices through data analytics. International Journal of Culture, Tourism, and Hospitality Research, 12(4), 420-426. DOI: 10.1108/IJCTHR-04-2018-0051

PY - 2018/10/1

Y1 - 2018/10/1

N2 - Purpose: This study aims to propose a data-driven approach, based on open-source tools, that makes it possible to understand customer satisfaction of the accommodation offer of a whole country. Design/methodology/approach: The method starts by extracting information from all hotels of Portugal available at TripAdvisor through Web scraping. Then, a support vector machine is adopted for modeling the TripAdvisor score, which is considered a proxy of customer satisfaction. Finally, knowledge extraction from the model is achieved using sensitivity analysis to unveil the influence of features on the score. Findings: The model of the TripAdvisor score achieved a mean absolute percentage error of around 5 per cent, proving the value of modeling the extracted data. The number of rooms of the unit and the minimum price are the two most relevant features, showing that customers appreciate smaller and more expensive units, whereas the location of the hotel does not hold significant relevance. Originality/value: National tourist offices can use the proposed approach to understand what drives tourists’ satisfaction, helping to shape a country’s strategy. For example, licensing new hotels may take into account the unit size and other characteristics that make it more attractive to tourists. Furthermore, the procedure can be replicated at any time and in any country, making it a valuable tool for data-driven decision support on a national scale.

AB - Purpose: This study aims to propose a data-driven approach, based on open-source tools, that makes it possible to understand customer satisfaction of the accommodation offer of a whole country. Design/methodology/approach: The method starts by extracting information from all hotels of Portugal available at TripAdvisor through Web scraping. Then, a support vector machine is adopted for modeling the TripAdvisor score, which is considered a proxy of customer satisfaction. Finally, knowledge extraction from the model is achieved using sensitivity analysis to unveil the influence of features on the score. Findings: The model of the TripAdvisor score achieved a mean absolute percentage error of around 5 per cent, proving the value of modeling the extracted data. The number of rooms of the unit and the minimum price are the two most relevant features, showing that customers appreciate smaller and more expensive units, whereas the location of the hotel does not hold significant relevance. Originality/value: National tourist offices can use the proposed approach to understand what drives tourists’ satisfaction, helping to shape a country’s strategy. For example, licensing new hotels may take into account the unit size and other characteristics that make it more attractive to tourists. Furthermore, the procedure can be replicated at any time and in any country, making it a valuable tool for data-driven decision support on a national scale.

KW - Data analytics

KW - Data mining

KW - National tourist offices

KW - Online reviews

KW - Sensitivity analysis

KW - Web scraping

UR - http://www.scopus.com/inward/record.url?scp=85055082213&partnerID=8YFLogxK

UR - http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=Alerting&SrcApp=Alerting&DestApp=WOS_CPL&DestLinkType=FullRecord&UT=WOS:000447558300003

U2 - 10.1108/IJCTHR-04-2018-0051

DO - 10.1108/IJCTHR-04-2018-0051

M3 - Article

VL - 12

SP - 420

EP - 426

JO - International Journal of Culture, Tourism, and Hospitality Research

JF - International Journal of Culture, Tourism, and Hospitality Research

SN - 1750-6182

IS - 4

ER -