Accelerating high-accuracy runtime estimation in adaptive sample selection for parallel scalability analysis with Active Learning
High-Performance Computing, Active Learning, Scalability Analysis, Regression Models.
High-Performance Computing (HPC) scalability analysis is conventionally bottlenecked by the prohibitive computational cost of exhaustive empirical profiling. This work proposes accelerating the process by integrating Active Learning strategies (Deep Reinforcement Learning, Query-By-Committee, and improved Greedy Sampling) with machine learning regression models (Extra Trees, Gradient Boosting, and Multi-Layer Perceptrons). The framework is evaluated on the Swaptions benchmark from PARSEC and on Matrix Inversion via the PaScal Suite, with a focus on minimizing execution time. The experimental results show that the Extra Trees ensemble paired with Query-By-Committee achieves the best trade-off: it reaches high-fidelity model convergence in only 161.98 seconds, a reduction in execution time of up to 92%, while preserving the visual efficiency trend of the original.
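To make the Query-By-Committee idea concrete, the following is a minimal sketch and not the implementation evaluated in this work. It assumes a hypothetical Amdahl-like `runtime` function in place of real profiling runs, and it uses a committee of toy nearest-neighbour regressors trained on bootstrap resamples as a stand-in for the Extra Trees ensemble. All function names and parameter values are illustrative assumptions.

```python
import random
import statistics


def runtime(cores):
    """Hypothetical runtime model (Amdahl-like); stands in for a real profiling run."""
    serial, parallel = 2.0, 98.0
    return serial + parallel / cores


def knn_predict(train, x, k=2):
    """Toy regressor: average the targets of the k labeled points nearest to x."""
    nearest = sorted(train, key=lambda p: abs(p[0] - x))[:k]
    return sum(y for _, y in nearest) / len(nearest)


def qbc_select(labeled, pool, rng, committee_size=5):
    """Pick the pool point on which the committee members disagree most."""
    # Each committee member is trained on its own bootstrap resample.
    committee = [
        [rng.choice(labeled) for _ in labeled] for _ in range(committee_size)
    ]

    def disagreement(x):
        # Disagreement is measured as the variance of the members' predictions.
        return statistics.pvariance([knn_predict(m, x) for m in committee])

    return max(pool, key=disagreement)


def active_learning(candidates, initial, budget, seed=0):
    """Label `budget` extra configurations chosen by Query-By-Committee."""
    rng = random.Random(seed)
    labeled = [(c, runtime(c)) for c in initial]
    pool = [c for c in candidates if c not in initial]
    for _ in range(budget):
        x = qbc_select(labeled, pool, rng)
        labeled.append((x, runtime(x)))  # "execute" only the selected configuration
        pool.remove(x)
    return labeled, pool


# Usage: start from the two extreme core counts and query six more.
labeled, pool = active_learning(list(range(1, 33)), initial=[1, 32], budget=6)
print(sorted(c for c, _ in labeled))
```

In this sketch only 8 of the 32 candidate configurations are ever executed; the rest remain in the pool and would be predicted by the fitted regressor rather than measured.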