A statistical approach for studying the spatio-temporal distribution of geolocated tweets in urban environments

Fernando Santa, Roberto Henriques, Joaquín Torres-Sospedra, Edzer Pebesma

Research output: Contribution to journal (Article)

Abstract

An in-depth descriptive approach to the dynamics of the urban population is fundamental as a first step towards promoting effective planning and designing processes in cities. Understanding the behavioral aspects of human activities can contribute to their effective management and control. We present a framework, based on statistical methods, for studying the spatio-temporal distribution of geolocated tweets as a proxy for where and when people carry out their activities. We have evaluated our proposal by analyzing the distribution of collected geolocated tweets over a two-week period in the summer of 2017 in Lisbon, London, and Manhattan. Our proposal considers a negative binomial regression analysis for the time series of counts of tweets as a first step. We further estimate a functional principal component analysis of second-order summary statistics of the hourly spatial point patterns formed by the locations of the tweets. Finally, we find groups of hours with a similar spatial arrangement of places where humans develop their activities through hierarchical clustering over the principal scores. Social media events are found to show strong temporal trends such as seasonal variation due to the hour of the day and the day of the week in addition to autoregressive schemas. We have also identified spatio-temporal patterns of clustering, i.e., groups of hours of the day that present a similar spatial distribution of human activities.
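The last step described in the abstract, grouping hours by hierarchical clustering over the principal scores, can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the linkage criterion or the number of components, so average linkage, two score dimensions, and three clusters are assumptions, and the names `average_linkage` and `toy_scores` as well as the score values are invented purely for illustration.

```python
from itertools import combinations
from math import dist


def average_linkage(scores, n_clusters):
    """Agglomerative clustering with average linkage (assumed criterion).

    scores: dict mapping an hour label to its tuple of principal scores.
    Returns a list of clusters, each a list of hour labels.
    """
    # Start with every hour in its own cluster.
    clusters = [[hour] for hour in scores]

    def linkage(a, b):
        # Mean Euclidean distance over all cross-cluster pairs.
        total = sum(dist(scores[i], scores[j]) for i in a for j in b)
        return total / (len(a) * len(b))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them (i < j always).
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: linkage(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i] = clusters[i] + clusters[j]
        del clusters[j]
    return clusters


# Hypothetical scores on the first two components for seven hours of the day.
# These values are made up; they are not results from the paper.
toy_scores = {
    3: (-2.1, 0.2), 4: (-2.0, 0.1), 5: (-1.9, 0.3),
    12: (1.0, 1.1), 13: (1.1, 1.0),
    20: (1.5, -1.2), 21: (1.6, -1.1),
}

groups = average_linkage(toy_scores, n_clusters=3)
print(groups)
```

Hours that end up in the same group are those whose score vectors lie close together, which is the sense in which the abstract speaks of hours with a similar spatial arrangement.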

Original language: English
Article number: 595
Journal: Sustainability (Switzerland)
Volume: 11
Issue number: 3
DOIs: 10.3390/su11030595
Publication status: Published - 23 Jan 2019

Fingerprint

Regression analysis
temporal distribution
Principal component analysis
Spatial distribution
Time series
Statistical methods
human activity
Statistics
Planning
urban population
seasonal variation
media event
summer
social media
Group

Keywords

  • Functional principal component analysis
  • Human activity
  • Multitype spatial point patterns
  • Negative binomial regression
  • Spatio-temporal statistics

Cite this

@article{263ecbeb701a44959da7f0e93da3c150,
title = "A statistical approach for studying the spatio-temporal distribution of geolocated tweets in urban environments",
abstract = "An in-depth descriptive approach to the dynamics of the urban population is fundamental as a first step towards promoting effective planning and designing processes in cities. Understanding the behavioral aspects of human activities can contribute to their effective management and control. We present a framework, based on statistical methods, for studying the spatio-temporal distribution of geolocated tweets as a proxy for where and when people carry out their activities. We have evaluated our proposal by analyzing the distribution of collected geolocated tweets over a two-week period in the summer of 2017 in Lisbon, London, and Manhattan. Our proposal considers a negative binomial regression analysis for the time series of counts of tweets as a first step. We further estimate a functional principal component analysis of second-order summary statistics of the hourly spatial point patterns formed by the locations of the tweets. Finally, we find groups of hours with a similar spatial arrangement of places where humans develop their activities through hierarchical clustering over the principal scores. Social media events are found to show strong temporal trends such as seasonal variation due to the hour of the day and the day of the week in addition to autoregressive schemas. We have also identified spatio-temporal patterns of clustering, i.e., groups of hours of the day that present a similar spatial distribution of human activities.",
keywords = "Functional principal component analysis, Human activity, Multitype spatial point patterns, Negative binomial regression, Spatio-temporal statistics",
author = "Fernando Santa and Roberto Henriques and Joaqu{\'i}n Torres-Sospedra and Edzer Pebesma",
note = "Santa, F., Henriques, R., Torres-Sospedra, J., \& Pebesma, E. (2019). A statistical approach for studying the spatio-temporal distribution of geolocated tweets in urban environments. Sustainability (Switzerland), 11(3), [595]. DOI: 10.3390/su11030595",
year = "2019",
month = "1",
day = "23",
doi = "10.3390/su11030595",
language = "English",
volume = "11",
journal = "Sustainability (Switzerland)",
issn = "2071-1050",
publisher = "MDPI AG",
number = "3",
}

A statistical approach for studying the spatio-temporal distribution of geolocated tweets in urban environments. / Santa, Fernando; Henriques, Roberto; Torres-Sospedra, Joaquín; Pebesma, Edzer.

In: Sustainability (Switzerland), Vol. 11, No. 3, 595, 23.01.2019.

TY  - JOUR
T1  - A statistical approach for studying the spatio-temporal distribution of geolocated tweets in urban environments
AU  - Santa, Fernando
AU  - Henriques, Roberto
AU  - Torres-Sospedra, Joaquín
AU  - Pebesma, Edzer
N1  - Santa, F., Henriques, R., Torres-Sospedra, J., & Pebesma, E. (2019). A statistical approach for studying the spatio-temporal distribution of geolocated tweets in urban environments. Sustainability (Switzerland), 11(3), [595]. DOI: 10.3390/su11030595
PY  - 2019/1/23
Y1  - 2019/1/23
AB - An in-depth descriptive approach to the dynamics of the urban population is fundamental as a first step towards promoting effective planning and designing processes in cities. Understanding the behavioral aspects of human activities can contribute to their effective management and control. We present a framework, based on statistical methods, for studying the spatio-temporal distribution of geolocated tweets as a proxy for where and when people carry out their activities. We have evaluated our proposal by analyzing the distribution of collected geolocated tweets over a two-week period in the summer of 2017 in Lisbon, London, and Manhattan. Our proposal considers a negative binomial regression analysis for the time series of counts of tweets as a first step. We further estimate a functional principal component analysis of second-order summary statistics of the hourly spatial point patterns formed by the locations of the tweets. Finally, we find groups of hours with a similar spatial arrangement of places where humans develop their activities through hierarchical clustering over the principal scores. Social media events are found to show strong temporal trends such as seasonal variation due to the hour of the day and the day of the week in addition to autoregressive schemas. We have also identified spatio-temporal patterns of clustering, i.e., groups of hours of the day that present a similar spatial distribution of human activities.

KW  - Functional principal component analysis
KW  - Human activity
KW  - Multitype spatial point patterns
KW  - Negative binomial regression
KW  - Spatio-temporal statistics
UR  - http://www.scopus.com/inward/record.url?scp=85060500301&partnerID=8YFLogxK
UR  - http://gateway.webofknowledge.com/gateway/Gateway.cgi?GWVersion=2&SrcAuth=Alerting&SrcApp=Alerting&DestApp=WOS_CPL&DestLinkType=FullRecord&UT=WOS:000458929500040
U2  - 10.3390/su11030595
DO  - 10.3390/su11030595
M3  - Article
VL  - 11
JO  - Sustainability (Switzerland)
JF  - Sustainability (Switzerland)
SN  - 2071-1050
IS  - 3
M1  - 595
ER  -