TY - GEN
T1 - A job dispatcher for large and heterogeneous HPC systems running modern applications
AU - Galleguillos, Cristian
AU - Kiziltan, Zeynep
AU - Soto, Ricardo
N1 - Publisher Copyright: © Cristian Galleguillos, Zeynep Kiziltan, and Ricardo Soto.
PY - 2021/10/1
Y1 - 2021/10/1
N2 - High-performance Computing (HPC) systems have become essential instruments in our modern society. As they get closer to exascale performance, HPC systems become larger in size and more heterogeneous in their computing resources. With recent advances in AI, HPC systems are also increasingly being used for applications that employ many short jobs with strict timing requirements. HPC job dispatchers need to therefore adopt techniques to go beyond the capabilities of those developed for small or homogeneous systems, or for traditional compute-intensive applications. In this paper, we present a job dispatcher suitable for today's large and heterogeneous systems running modern applications. Unlike its predecessors, our dispatcher solves the entire dispatching problem using Constraint Programming (CP) with a model size independent of the system size. Experimental results based on a simulation study show that our approach can bring about significant performance gains over the existing CP-based dispatchers in a large or heterogeneous system.
KW - Constraint programming
KW - HPC systems
KW - Heterogeneous systems
KW - Large systems
KW - On-line job dispatching
KW - Resource allocation
UR - http://www.scopus.com/inward/record.url?scp=85118179048&partnerID=8YFLogxK
U2 - 10.4230/LIPIcs.CP.2021.26
DO - 10.4230/LIPIcs.CP.2021.26
M3 - Conference contribution
AN - SCOPUS:85118179048
T3 - Leibniz International Proceedings in Informatics, LIPIcs
BT - 27th International Conference on Principles and Practice of Constraint Programming, CP 2021
A2 - Michel, Laurent D.
PB - Schloss Dagstuhl - Leibniz-Zentrum für Informatik GmbH, Dagstuhl Publishing
T2 - 27th International Conference on Principles and Practice of Constraint Programming, CP 2021
Y2 - 25 October 2021 through 29 October 2021
ER -