A Process for Performance Evaluation and Change Management of Machine Learning Models in Healthcare Applications
ML Evaluation; MLOps; Continuous Monitoring; Continual Learning; Feedback loop
Fostered by hardware and software advances, Machine Learning (ML) has grown exponentially over the last few decades and has become instrumental in advancing work across the most varied areas of knowledge. Although generally restricted to controlled experiments over previously obtained and curated data samples, results have been outstanding, raising ML applications to such levels of popularity that it is hard to find an area of human knowledge left untouched by Machine Learning. In this context, establishing minimum performance guarantees over unknown, real-world data becomes paramount, especially in Healthcare applications, where errors can lead to life-threatening situations. Machine Learning Operations (MLOps) is the ML discipline concerned with the lifecycle management of ML models, from conception to deployment in production (real-world) environments, including the monitoring of their real-world behavior. Once deployed, models are subject to performance decay issues, such as drift, which has motivated recent studies on continual learning and continuous monitoring of ML models. The present work focuses on identifying state-of-the-art techniques for evaluating model fitness in real-world usage scenarios, and on how to establish a feedback loop that incorporates continuous monitoring and change management into the models' lifecycle. Finally, this work aims to apply these evaluation techniques in a case study of ML models applied to Healthcare, and to establish a process for evaluating such models. The target models were developed as part of the Remote Assistance Platform (PAR) and are currently in effective use in an oncologic ICU.