TY - GEN
T1 - kd-SNN: A Metric Data Structure Seconding the Clustering of Spatial Data
AU - Faustino, Bruno Filipe
AU - Moura-Pires, Joao
AU - Santos, Maribel Yasmina
AU - Moreira, Guilherme
PY - 2014
Y1 - 2014
N2 - Large amounts of spatio-temporal data are continuously collected through the use of location devices or sensor technologies. One of the techniques usually used to obtain a first insight on data is clustering. The Shared Nearest Neighbour (SNN) is a clustering algorithm that finds clusters with different densities, shapes and sizes, and also identifies noise in data, making it a good candidate to deal with spatial data. However, its time complexity is, in the worst case, O(n(2)), compromising its scalability. This paper presents the use of a metric data structure, the kd-Tree, to index spatial data and support the SNN in querying for the k-nearest neighbours, improving the time complexity in the average case of the algorithm, when dealing with low dimensional data, to at most O(n x log n). The proposed algorithm, the kd-SNN, was evaluated in terms of performance, showing huge improvements over existing approaches, allowing the identification of the main traffic routes by completely clustering a maritime data set.
AB - Large amounts of spatio-temporal data are continuously collected through the use of location devices or sensor technologies. One of the techniques usually used to obtain a first insight on data is clustering. The Shared Nearest Neighbour (SNN) is a clustering algorithm that finds clusters with different densities, shapes and sizes, and also identifies noise in data, making it a good candidate to deal with spatial data. However, its time complexity is, in the worst case, O(n(2)), compromising its scalability. This paper presents the use of a metric data structure, the kd-Tree, to index spatial data and support the SNN in querying for the k-nearest neighbours, improving the time complexity in the average case of the algorithm, when dealing with low dimensional data, to at most O(n x log n). The proposed algorithm, the kd-SNN, was evaluated in terms of performance, showing huge improvements over existing approaches, allowing the identification of the main traffic routes by completely clustering a maritime data set.
KW - kd-tree
KW - SNN
KW - spatial data
KW - movement data
KW - route identification
M3 - Conference contribution
SN - 978-3-319-09143-3
T3 - Lecture Notes in Computer Science
SP - 312
EP - 327
BT - COMPUTATIONAL SCIENCE AND ITS APPLICATIONS - ICCSA 2014, PT 1
A2 - Murgante, B
A2 - Misra, S
A2 - Rocha, AMAC
A2 - Torre, C
A2 - Rocha, JG
A2 - Falcao, MI
A2 - Taniar, D
A2 - Apduhan, BO
A2 - Gervasi, O
PB - SPRINGER-VERLAG BERLIN
T2 - 14th International Conference on Computational Science and Its Applications (ICCSA)
Y2 - 30 June 2014 through 3 July 2014
ER -