A distributed shared nearest neighbors clustering algorithm

Juan Francisco Zamora Osorio, Héctor Allende-Cid, Marcelo Mendoza

Research output: Chapter in Book/Report/Conference proceeding, Conference contribution, peer-review

Abstract

Current data processing tasks require efficient approaches capable of dealing with large databases. A promising strategy consists of distributing the data across several computers, each of which partially solves the problem at hand. These partial answers are then integrated to obtain a final solution. We introduce the Distributed Shared Nearest Neighbor based clustering algorithm (D-SNN), which works with disjoint partitions of the data and produces a global clustering solution whose performance is competitive with that of centralized approaches. Our algorithm is suited to large-scale problems (e.g., text clustering) where the data cannot be handled by a single machine due to memory constraints. Experimental results on five data sets show that our proposal is competitive in terms of standard clustering quality measures.
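
The general idea described in the abstract (cluster disjoint partitions locally, then integrate the partial answers) can be illustrated with a minimal sketch. This is not the authors' D-SNN algorithm, whose details are not given here. It assumes a simple shared-nearest-neighbor rule (two points are linked when each is among the other's k nearest neighbors and they share at least min_shared neighbors) and an assumed integration step that re-clusters the centroids of the local clusters. All function names and parameters (knn_sets, snn_cluster, distributed_snn, k, min_shared) are hypothetical.

```python
from math import dist


def knn_sets(points, k):
    """For each point, return the index set of its k nearest neighbors (self excluded)."""
    result = []
    for i, p in enumerate(points):
        order = sorted((j for j in range(len(points)) if j != i),
                       key=lambda j: dist(p, points[j]))
        result.append(set(order[:k]))
    return result


def snn_cluster(points, k, min_shared):
    """Link mutual k-NN pairs that share at least min_shared neighbors,
    then label each point by the root of its connected component."""
    nbrs = knn_sets(points, k)
    parent = list(range(len(points)))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for i in range(len(points)):
        for j in nbrs[i]:
            if i in nbrs[j] and len(nbrs[i] & nbrs[j]) >= min_shared:
                parent[find(i)] = find(j)
    return [find(i) for i in range(len(points))]


def centroid(group):
    """Coordinate-wise mean of a non-empty list of equal-length tuples."""
    return tuple(sum(c) / len(group) for c in zip(*group))


def distributed_snn(partitions, k, min_shared):
    """Assumed integration scheme (not taken from the paper): cluster each
    partition on its own, summarize every local cluster by its centroid,
    cluster the centroids, and map the result back to the original points.
    Returns one list of global labels per partition."""
    reps, owners, local_labels = [], [], []
    for p_idx, part in enumerate(partitions):
        labels = snn_cluster(part, k, min_shared)
        local_labels.append(labels)
        for lab in sorted(set(labels)):
            members = [pt for pt, l in zip(part, labels) if l == lab]
            reps.append(centroid(members))
            owners.append((p_idx, lab))  # which local cluster this centroid stands for
    global_of = dict(zip(owners, snn_cluster(reps, k, min_shared)))
    return [[global_of[(p_idx, lab)] for lab in labels]
            for p_idx, labels in enumerate(local_labels)]
```

In this sketch each partition would live on a separate machine and only the centroids would be sent to the coordinating node, so no single machine needs to hold the full data set. Calling distributed_snn(partitions, k=2, min_shared=1) on a list of point lists returns one global label per point, grouped by partition.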

Original language: English
Title of host publication: Progress in Pattern Recognition, Image Analysis, Computer Vision, and Applications - 22nd Iberoamerican Congress, CIARP 2017, Proceedings
Editors: Sergio Velastin, Marcelo Mendoza
Publisher: Springer Verlag
Pages: 710-718
Number of pages: 9
ISBN (Print): 9783319751924
State: Published - 1 Jan 2018
Event: 22nd Iberoamerican Congress on Pattern Recognition, CIARP 2017 - Valparaiso, Chile
Duration: 7 Nov 2017 - 10 Nov 2017

Publication series

Name: Lecture Notes in Computer Science (including subseries Lecture Notes in Artificial Intelligence and Lecture Notes in Bioinformatics)
Volume: 10657 LNCS
ISSN (Print): 0302-9743
ISSN (Electronic): 1611-3349

Conference

Conference: 22nd Iberoamerican Congress on Pattern Recognition, CIARP 2017
Country: Chile
City: Valparaiso
Period: 7/11/17 - 10/11/17

Keywords

  • Clustering
  • Distributed algorithm
  • Shared nearest neighbors
