Application of Deep Reinforcement Learning in Sample Selection for Execution Time Estimation in the PaScal Suite
Scalability Analysis, Reinforcement Learning, Performance Estimates, PaScal Suite
Efficient resource allocation in high-performance computing (HPC) depends on accurately predicting application behavior. However, generating the data needed to train performance models is computationally costly, because every training sample requires executing the application. This dissertation proposes a strategy based on Deep Reinforcement Learning to optimize sample collection for performance modeling. The goal is to minimize the number of executions required to train a high-accuracy regression model of execution time. The methodology employs a Deep Q-Learning agent that iteratively selects configurations (number of cores and problem size) to maximize information gain. Experiments will compare the approach against random and heuristic selection methods, with the aim of reaching model convergence with fewer samples. Integrated into the PaScal Suite, the solution is intended to validate the use of intelligent agents for reducing the time cost of scalability analyses.
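The selection loop described in the abstract can be illustrated with a minimal sketch. It is not the dissertation's implementation: the Deep Q-Network is replaced by a tabular, single-state Q-value per configuration (effectively a bandit), the application is replaced by a synthetic Amdahl-like runtime function, and information gain is approximated by the drop in held-out prediction error. The grid values, the regression form t = a*size + b*size/cores, and all function names are assumptions made for illustration.

```python
import random

CORES = [1, 2, 4, 8, 16, 32]   # hypothetical core counts
SIZES = [1, 2, 4, 8]           # hypothetical problem sizes
ACTIONS = [(c, n) for c in CORES for n in SIZES]

def run_app(cores, size, rng):
    """Stand-in for a real execution: Amdahl-like runtime with 2% noise."""
    serial = 0.1  # assumed serial fraction
    return size * (serial + (1 - serial) / cores) * (1 + rng.gauss(0, 0.02))

def fit(samples):
    """Least-squares fit of t = a*size + b*size/cores via 2x2 normal equations."""
    s11 = s12 = s22 = y1 = y2 = 0.0
    for (c, n), t in samples:
        x1, x2 = n, n / c
        s11 += x1 * x1
        s12 += x1 * x2
        s22 += x2 * x2
        y1 += x1 * t
        y2 += x2 * t
    det = s11 * s22 - s12 * s12
    if abs(det) <= 1e-9 * s11 * s22:   # too few or collinear samples
        return (0.0, 0.0)
    return ((y1 * s22 - y2 * s12) / det, (s11 * y2 - s12 * y1) / det)

def error(model, holdout):
    """Mean absolute relative error of the model on held-out measurements."""
    a, b = model
    return sum(abs(a * n + b * n / c - t) / t for (c, n), t in holdout) / len(holdout)

def select_samples(budget=8, episodes=200, eps=0.2, alpha=0.1, seed=0):
    rng = random.Random(seed)
    # Assumed: a small held-out set is measured once, up front.
    holdout = [(cfg, run_app(*cfg, rng)) for cfg in rng.sample(ACTIONS, 6)]
    q = {a: 0.0 for a in ACTIONS}
    for _ in range(episodes):
        samples, err = [], error(fit([]), holdout)
        for _ in range(budget):
            taken = {cfg for cfg, _ in samples}
            free = [a for a in ACTIONS if a not in taken]
            # Epsilon-greedy choice among configurations not yet sampled.
            act = rng.choice(free) if rng.random() < eps else max(free, key=q.get)
            samples.append((act, run_app(*act, rng)))
            new_err = error(fit(samples), holdout)
            # Reward: reduction in held-out error caused by the new sample.
            q[act] += alpha * ((err - new_err) - q[act])
            err = new_err
    # Greedy result: the configurations with the highest learned value.
    chosen = sorted(ACTIONS, key=q.get, reverse=True)[:budget]
    return chosen, q

if __name__ == "__main__":
    chosen, _ = select_samples()
    print(chosen)
```

Because this simplification keeps no state, it cannot account for which samples have already been collected; encoding the current training set in the state is what a Deep Q-Learning agent would add. The training episodes here also replay the synthetic runtime function freely, which would not be possible with real executions.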