Leveraging Non-Uniform Resources for Parallel Query Processing
Research output: Chapter in Book/Report/Conference proceeding › Article in proceedings › Research › peer-review
Standard
Leveraging Non-Uniform Resources for Parallel Query Processing. / Mayr, Tobias; Bonnet, Philippe; Gehrke, Johannes; Seshadri, Praveen.
Third IEEE International Symposium on Cluster Computing and the Grid. IEEE, 2003. p. 120-127.
RIS
TY - GEN
T1 - Leveraging Non-Uniform Resources for Parallel Query Processing
AU - Mayr, Tobias
AU - Bonnet, Philippe
AU - Gehrke, Johannes
AU - Seshadri, Praveen
PY - 2003
Y1 - 2003
N2 - Modular clusters are now composed of non-uniform nodes with different CPUs, disks or network cards so that customers can adapt the cluster configuration to changing technologies and to their changing needs. This challenges dataflow parallelism as the primary load balancing technique of existing parallel database systems. We show in this paper that dataflow parallelism alone is ill-suited for modular clusters because running the same operation on different subsets of the data cannot fully utilize non-uniform hardware resources. We propose and evaluate new load balancing techniques that blend pipeline parallelism with data parallelism. We consider relational operators as pipelines of fine-grained operations that can be located on different cluster nodes and executed in parallel on different data subsets to best exploit non-uniform resources. We present an experimental study that confirms the feasibility and effectiveness of the new techniques in a parallel execution engine prototype based on the open-source DBMS Predator.
AB - Modular clusters are now composed of non-uniform nodes with different CPUs, disks or network cards so that customers can adapt the cluster configuration to changing technologies and to their changing needs. This challenges dataflow parallelism as the primary load balancing technique of existing parallel database systems. We show in this paper that dataflow parallelism alone is ill-suited for modular clusters because running the same operation on different subsets of the data cannot fully utilize non-uniform hardware resources. We propose and evaluate new load balancing techniques that blend pipeline parallelism with data parallelism. We consider relational operators as pipelines of fine-grained operations that can be located on different cluster nodes and executed in parallel on different data subsets to best exploit non-uniform resources. We present an experimental study that confirms the feasibility and effectiveness of the new techniques in a parallel execution engine prototype based on the open-source DBMS Predator.
KW - Faculty of Science
KW - query processing
KW - parallel databases
U2 - 10.1109/CCGRID.2003.1199360
DO - 10.1109/CCGRID.2003.1199360
M3 - Article in proceedings
SP - 120
EP - 127
BT - Third IEEE International Symposium on Cluster Computing and the Grid
PB - IEEE
Y2 - 29 November 2010
ER -
ID: 3185413