DHTJoin: Processing continuous join queries using DHT networks

Wenceslao Palma, Reza Akbarinia, Esther Pacitti, Patrick Valduriez

Research output: Contribution to journal › Article › peer-review

10 Scopus citations

Abstract

Continuous query processing in data stream management systems (DSMS) has received considerable attention recently. Many applications share the same need for processing data streams in a continuous fashion. For most distributed streaming applications, the centralized processing of continuous queries over distributed data is simply not viable. This paper addresses the problem of computing approximate answers to continuous join queries over distributed data streams. We present a new method, called DHTJoin, which combines hash-based placement of tuples in a Distributed Hash Table (DHT) and dissemination of queries by exploiting the embedded trees in the underlying DHT, thereby incurring little overhead. DHTJoin also deals with join attribute value skew which may hurt load balancing and result completeness. We provide a performance evaluation of DHTJoin which shows that it can achieve significant performance gains in terms of network traffic.
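The hash-based placement idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the DHTJoin algorithm itself: the abstract gives no implementation details, so the ring layout, the `Node`, `Ring`, and `key_id` names, and the streams `R` and `S` are all assumptions made for illustration. The sketch only shows why hashing tuples on the join attribute lets a single peer compute the matches for a given value. Sliding windows, query dissemination over the DHT's embedded trees, and skew handling are left out.

```python
import hashlib
from bisect import bisect_left
from collections import defaultdict


def key_id(value, bits=32):
    """Map a value to an identifier on the ring (SHA-1 based, hypothetical choice)."""
    digest = hashlib.sha1(str(value).encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % (1 << bits)


class Node:
    """Hypothetical peer that stores tuples per stream, indexed by join value."""

    def __init__(self, name):
        self.name = name
        self.node_id = key_id(name)
        self.store = {"R": defaultdict(list), "S": defaultdict(list)}

    def insert(self, stream, join_value, tup):
        """Store the tuple and return its matches from the opposite stream."""
        other = "S" if stream == "R" else "R"
        self.store[stream][join_value].append(tup)
        matches = self.store[other][join_value]
        if stream == "R":
            return [(tup, m) for m in matches]
        return [(m, tup) for m in matches]


class Ring:
    """Toy identifier ring: a key is owned by the first node at or after its id."""

    def __init__(self, names):
        self.nodes = sorted((Node(n) for n in names), key=lambda n: n.node_id)
        self.ids = [n.node_id for n in self.nodes]

    def lookup(self, join_value):
        """Return the node responsible for this join attribute value."""
        i = bisect_left(self.ids, key_id(join_value)) % len(self.nodes)
        return self.nodes[i]

    def publish(self, stream, join_value, tup):
        """Route a tuple to its responsible node and join it there."""
        return self.lookup(join_value).insert(stream, join_value, tup)


ring = Ring(["peer-a", "peer-b", "peer-c", "peer-d"])
ring.publish("R", 42, ("r1", 42))
results = ring.publish("S", 42, ("s1", 42))
```

Because both streams are hashed on the same attribute, tuples with equal join values always reach the same peer, so no peer has to see the whole stream. The same property explains why a heavily skewed join value can overload one peer, which is the load-balancing concern the abstract raises.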

Original language: English
Pages (from-to): 291-317
Number of pages: 27
Journal: Distributed and Parallel Databases
Volume: 26
Issue number: 2-3
State: Published - Dec 2009
Externally published: Yes

Keywords

  • Continuous join queries
  • DHT networks
  • Data stream management
  • Distributed query execution
  • Load balancing
  • Result completeness
