DHTJoin: Processing continuous join queries using DHT networks

Wenceslao Enrique Palma Muñoz, Reza Akbarinia, Esther Pacitti, Patrick Valduriez

Research output: Contribution to journal › Article › peer-review

8 Scopus citations

Abstract

Continuous query processing in data stream management systems (DSMS) has received considerable attention recently. Many applications share the same need for processing data streams in a continuous fashion. For most distributed streaming applications, the centralized processing of continuous queries over distributed data is simply not viable. This paper addresses the problem of computing approximate answers to continuous join queries over distributed data streams. We present a new method, called DHTJoin, which combines hash-based placement of tuples in a Distributed Hash Table (DHT) and dissemination of queries by exploiting the embedded trees in the underlying DHT, thereby incurring little overhead. DHTJoin also deals with join attribute value skew which may hurt load balancing and result completeness. We provide a performance evaluation of DHTJoin which shows that it can achieve significant performance gains in terms of network traffic.
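The hash-based placement idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the DHTJoin algorithm itself: it assumes a fixed set of nodes and uses a simple modulo mapping in place of a real DHT's key space and routing, and it ignores query dissemination, windows, and skew handling. The names `responsible_node`, `Node`, and `route_and_join` are hypothetical. The point it shows is that when tuples from two streams are placed by hashing their join attribute value, matching tuples meet at the same node and can be joined locally.

```python
import hashlib
from collections import defaultdict


def responsible_node(join_value, num_nodes):
    """Map a join attribute value to a node index.

    Assumption: modulo placement over a fixed node count stands in for
    the key-to-node mapping that a real DHT would provide.
    """
    digest = hashlib.sha1(str(join_value).encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % num_nodes


class Node:
    """A node that stores tuples of streams R and S, indexed by join value."""

    def __init__(self):
        self.stored = {"R": defaultdict(list), "S": defaultdict(list)}

    def insert(self, stream, join_value, payload):
        """Probe the opposite stream's tuples, then store the new tuple.

        Returns the join results as (R payload, S payload) pairs.
        """
        other = "S" if stream == "R" else "R"
        matches = []
        for stored_payload in self.stored[other][join_value]:
            if stream == "R":
                matches.append((payload, stored_payload))
            else:
                matches.append((stored_payload, payload))
        self.stored[stream][join_value].append(payload)
        return matches


def route_and_join(nodes, stream, join_value, payload):
    """Send a tuple to the node responsible for its join value and join there."""
    node = nodes[responsible_node(join_value, len(nodes))]
    return node.insert(stream, join_value, payload)
```

With four nodes, for example, an `R` tuple and a later `S` tuple that share the join value 7 are routed to the same node, so the second insertion returns the joined pair without any other node being contacted.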

Original language: English
Pages (from-to): 291-317
Number of pages: 27
Journal: Distributed and Parallel Databases
Volume: 26
Issue number: 2-3
State: Published - 1 Jan 2009

Keywords

  • Continuous join queries
  • Data stream management
  • DHT networks
  • Distributed query execution
  • Load balancing
  • Result completeness
