Dynamic Analytics for Spatial Data with an Incremental Clustering Approach

Fernando Mendes, Maribel Yasmina Santos, Joao Moura-Pires

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

4 Citations (Scopus)

Abstract

Several clustering algorithms have been extensively used to analyze vast amounts of spatial data. One of them is SNN (Shared Nearest Neighbor), a density-based algorithm that is well suited to this type of data because it can identify clusters of different shapes, sizes and densities and can deal with noise. Because data are usually collected progressively over time, incremental clustering approaches are required when clustering results must be updated as new data become available. This paper proposes SNN++, an incremental clustering algorithm based on SNN. Its performance and the quality of the resulting clusters are compared with those of SNN. The results show that SNN++ yields the same result as SNN and that the incremental feature was added to SNN without any computational penalty. Moreover, the experimental results show that processing huge amounts of data in increments considerably decreases the number of distances that need to be computed to identify the points' nearest neighbors.
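The two ideas the abstract relies on, shared-nearest-neighbor similarity and updating neighbor lists as points arrive, can be illustrated with a minimal sketch. This is not the paper's SNN++ procedure: the class `IncrementalKnn`, its method names, the brute-force distance scan, and the sample points are all assumptions made for illustration. It only shows that inserting a point requires distances from that point to the existing ones, rather than recomputing every pair.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


class IncrementalKnn:
    """Hypothetical helper: keeps each point's k nearest neighbours
    up to date as points arrive one at a time (illustration only)."""

    def __init__(self, k: int) -> None:
        self.k = k
        self.points: List[Point] = []
        # neighbours[i] is a sorted list of (distance, index), length <= k
        self.neighbours: Dict[int, List[Tuple[float, int]]] = {}
        self.distance_count = 0  # distance computations performed so far

    def add(self, p: Point) -> int:
        """Insert p, computing only distances from p to existing points."""
        new_id = len(self.points)
        mine: List[Tuple[float, int]] = []
        for other_id, q in enumerate(self.points):
            d = math.dist(p, q)
            self.distance_count += 1
            mine.append((d, other_id))
            # The new point may displace a neighbour of an existing point.
            theirs = self.neighbours[other_id]
            theirs.append((d, new_id))
            theirs.sort()
            del theirs[self.k:]
        mine.sort()
        self.points.append(p)
        self.neighbours[new_id] = mine[: self.k]
        return new_id

    def shared_neighbours(self, a: int, b: int) -> int:
        """Number of points that appear in both k-nearest-neighbour lists."""
        ids_a = {i for _, i in self.neighbours[a]}
        ids_b = {i for _, i in self.neighbours[b]}
        return len(ids_a & ids_b)


index = IncrementalKnn(k=2)
for point in [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]:
    index.add(point)
print(index.distance_count, index.shared_neighbours(1, 2))
```

In this sketch each insertion costs one distance per existing point. A density-based SNN clustering would go on to derive point densities and clusters from the shared-neighbor counts, which is outside the scope of this illustration.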

Original language: English
Title of host publication: 2013 IEEE 13TH INTERNATIONAL CONFERENCE ON DATA MINING WORKSHOPS (ICDMW)
Editors: W Ding, T Washio, H Xiong, G Karypis, B Thuraisingham, D Cook, Wu
Publisher: IEEE
Pages: 552-559
Number of pages: 8
Publication status: Published - 2013
Event: IEEE 13th International Conference on Data Mining (ICDM) - Dallas
Duration: 7 Dec 2013 to 10 Dec 2013

Publication series

Name: IEEE International Conference on Data Mining
Publisher: IEEE
ISSN (Print): 1550-4786

Conference

Conference: IEEE 13th International Conference on Data Mining (ICDM)
City: Dallas
Period: 7/12/13 to 10/12/13

Keywords

  • clustering
  • incremental clustering
  • shared nearest neighbor
  • spatial data
