Uncovering the Relationship Between Continuous Integration and Machine Learning Projects
Continuous Integration; Machine Learning; Build Duration; Test Coverage.
Continuous Integration (CI) is a cornerstone of modern software development. However, while CI is widely adopted in traditional software projects, applying it to Machine Learning (ML) projects presents distinctive challenges that extend beyond testing and validating code to encompass data validation and model testing. This thesis therefore aims to deepen the understanding of the differences, challenges, and strategies associated with CI adoption in ML projects. Through a mixed-methods approach, we conduct four studies that combine quantitative analyses of large-scale open-source repositories with qualitative practitioner surveys. Study 1, based on an analysis of 93 ML and 92 non-ML projects from GitHub, reveals that ML projects tend to exhibit longer build durations and lower test coverage rates. Study 2, surveying 155 practitioners from 47 ML projects, identifies eight key differences in CI adoption, including test complexity, infrastructure requirements, and build stability. Challenges such as high project complexity, extensive data handling, increased computational demands, and dependency management contribute to these differences. Furthermore, the non-deterministic nature of ML systems, their data dependencies, and computational constraints emerge as significant barriers to effective testing. Study 3 shifts the focus to general open-source projects to establish a baseline for how CI influences pull request (PR) delivery time. Our findings indicate that CI does not necessarily accelerate PR delivery but streamlines contribution processing, facilitates PR review decisions, and helps maintain project quality without overburdening maintainers. As the next step in this thesis, Study 4 extends Study 3 by comparing PR delivery times in ML and non-ML projects to determine whether ML-related delays reflect broader trends or ML-specific constraints. The key takeaway from this thesis is that while foundational CI principles remain valuable, ML projects require tailored CI approaches. To bridge this gap, we propose a set of ML-specific CI practices, including tracking model performance metrics and prioritizing test execution. Our findings emphasize the need for standardized guidelines to address critical CI challenges in ML workflows, such as dependency management. By integrating quantitative data and practitioner insights, this thesis advances the understanding of CI in ML and paves the way for more effective and robust CI strategies in the ML domain.