A distributed shared nearest neighbors clustering algorithm

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

1 Scopus citation


Current data processing tasks require efficient approaches capable of dealing with large databases. A promising strategy consists in distributing the data across several computers, each of which partially solves the problem at hand. These partial answers are then integrated to obtain a final solution. We introduce the Distributed Shared Nearest Neighbor based clustering algorithm (D-SNN), which works with disjoint partitions of the data and produces a global clustering solution whose performance is competitive with centralized approaches. Our algorithm is suited to large-scale problems (e.g., text clustering) where the data cannot be handled by a single machine due to memory size constraints. Experimental results over five data sets show that our proposal is competitive in terms of standard clustering quality measures.
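
The abstract does not describe the internals of D-SNN, so the following is not the authors' algorithm. It is only a minimal sketch of two ingredients the abstract names: a shared nearest neighbor similarity (here taken, as an assumption, to be the number of k-nearest neighbors two points have in common) and a split of the data into disjoint partitions. The function names, the round-robin split, and the toy points are all hypothetical, and the step that integrates partial answers into a global clustering is deliberately left out because the abstract gives no detail on it.

```python
import math


def knn_lists(points, k):
    """Brute-force k-nearest-neighbor sets, one per point (indices into `points`)."""
    neighbors = []
    for i, p in enumerate(points):
        others = [j for j in range(len(points)) if j != i]
        # Sort by Euclidean distance; break ties by index for determinism.
        others.sort(key=lambda j: (math.dist(p, points[j]), j))
        neighbors.append(set(others[:k]))
    return neighbors


def snn_similarity(neighbors, i, j):
    """Assumed SNN similarity: how many nearest neighbors i and j share."""
    return len(neighbors[i] & neighbors[j])


def split_round_robin(points, n_parts):
    """Hypothetical disjoint partitioning: point i goes to partition i % n_parts."""
    return [points[w::n_parts] for w in range(n_parts)]


# Toy data (illustrative only): two well-separated groups of three points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
neighbors = knn_lists(points, k=2)
partitions = split_round_robin(points, n_parts=2)

same_group = snn_similarity(neighbors, 0, 1)   # points in the same group
cross_group = snn_similarity(neighbors, 0, 3)  # points in different groups
```

In this toy setting, points from the same group share a neighbor while points from different groups share none, which is the intuition behind clustering on shared neighbors rather than on raw distances. Each partition could then be handed to a separate machine for local processing.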

Original language: English
Title of host publication: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - 22nd Iberoamerican Congress, CIARP 2017, Proceedings
Editors: Sergio Velastin, Marcelo Mendoza
Publisher: Springer Verlag
Number of pages: 9
ISBN (Print): 9783319751924
State: Published - 2018
Event: 22nd Iberoamerican Congress on Pattern Recognition, CIARP 2017 - Valparaiso, Chile
Duration: 7 Nov 2017 to 10 Nov 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10657 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349


Conference: 22nd Iberoamerican Congress on Pattern Recognition, CIARP 2017


  • Clustering
  • Distributed algorithm
  • Shared nearest neighbors

