Abstract
Distributed regression becomes considerably harder when the distributed data sources have different contexts, where the context of a source is the underlying probability law that generates its data. A new Distributed Regression System is presented, which uses a discrete representation of the probability density functions (pdfs) of the sources. Neighborhoods of similar datasets are detected by comparing their approximated pdfs. This information supports an ensemble-based approach and improves a second-level unit, as in stacked generalization. Two synthetic and six real datasets are used to compare the proposed method with other state-of-the-art models. The results are positive for most datasets.
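The idea of detecting neighborhoods by comparing approximated pdfs can be illustrated with a minimal sketch. It is not the method of the paper: the abstract does not specify the discretization or the similarity measure, so equal-width histograms, the Hellinger distance, the threshold value, and all names below (`histogram_pdf`, `hellinger`, `neighborhoods`, the sources `A`, `B`, `C`) are assumptions made for illustration only.

```python
import math
import random


def histogram_pdf(values, lo, hi, bins):
    """Approximate a pdf by normalized counts over equal-width bins on [lo, hi].

    Assumption: an equal-width histogram stands in for the paper's
    (unspecified) discrete pdf representation.
    """
    counts = [0] * bins
    width = (hi - lo) / bins
    for v in values:
        if lo <= v <= hi:
            # Clamp so that v == hi falls into the last bin.
            idx = min(int((v - lo) / width), bins - 1)
            counts[idx] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no values fall inside [lo, hi]")
    return [c / total for c in counts]


def hellinger(p, q):
    """Hellinger distance between two discrete distributions.

    Returns 0 for identical distributions and 1 for disjoint ones.
    Assumption: chosen here only as one common pdf dissimilarity.
    """
    s = sum((math.sqrt(a) - math.sqrt(b)) ** 2 for a, b in zip(p, q))
    return math.sqrt(s / 2.0)


def neighborhoods(pdfs, threshold):
    """For each source, list the other sources whose pdf lies within `threshold`."""
    return {
        i: [j for j in pdfs if j != i and hellinger(pdfs[i], pdfs[j]) <= threshold]
        for i in pdfs
    }


# Three hypothetical sources: A and B share nearly the same context, C does not.
rng = random.Random(0)
sources = {
    "A": [rng.gauss(0.0, 1.0) for _ in range(2000)],
    "B": [rng.gauss(0.1, 1.0) for _ in range(2000)],
    "C": [rng.gauss(4.0, 1.0) for _ in range(2000)],
}

# All sources are discretized on the same grid so their pdfs are comparable.
pdfs = {name: histogram_pdf(x, -4.0, 8.0, 24) for name, x in sources.items()}
result = neighborhoods(pdfs, threshold=0.3)
print(result)
```

In a setting like the one the abstract describes, such neighborhoods could then decide which local models are combined in an ensemble for a given source. How the paper actually uses them, including the second-level unit, is not recoverable from the abstract alone.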
Original language | English |
---|---|
Pages (from-to) | 842-855 |
Number of pages | 14 |
Journal | Journal of Universal Computer Science |
Volume | 21 |
Issue number | 6 |
State | Published - 25 Jul 2015 |
Keywords
- Context-aware regression
- Distributed machine learning
- Similarity representation