On the Use of Machine Learning to Identify Null Pointer Exceptions in Static Java Code Analysis
Java, Null Pointer Exception, static analysis, Machine Learning
Mainstream object-oriented programming languages admit null values for references for the sake of flexibility. In the Java programming language, attempting to use an object reference with a null value throws a Null Pointer Exception (NPE), one of the most frequent causes of crashes in Java applications. Static analysis has been used to locate the origin of such exceptions by inspecting source or binary code, without requiring debugging-oriented program executions. Despite its effectiveness, static analysis relies on a fixed set of rules describing violation patterns, and it is known to produce a significant number of false positives. This study investigates how Machine Learning (ML) techniques can improve the precision of detecting NPE-related faults through static analysis, a direction still unexplored in the literature and the software industry. The main goal is to propose, implement, and evaluate a classification-based approach to detecting NPE-related faults in Java code. The expected contributions of this work are: (i) an understanding of how ML techniques can be used to detect those faults via static analysis; (ii) an ML-powered static analysis tool to detect NPE-related faults; and (iii) an assessment of the performance of ML techniques in comparison to traditional static analysis tools.
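As a minimal illustration of the kind of fault discussed above, the following Java sketch contrasts an unguarded dereference, which throws an NPE when passed null, with a guarded variant. The class name `NpeDemo` and its methods are hypothetical and are not taken from the tool or dataset of this study; the sketch only shows the language behavior that a detector would need to flag.

```java
public class NpeDemo {
    // Hypothetical helper: dereferences its argument without a null check.
    public static int unsafeLength(String s) {
        return s.length(); // throws NullPointerException when s is null
    }

    // Guarded variant: the explicit null check makes the dereference safe.
    // Returning -1 for null is an arbitrary choice made for this sketch.
    public static int safeLength(String s) {
        return (s == null) ? -1 : s.length();
    }

    // Reports whether calling unsafeLength(s) raises an NPE.
    public static boolean throwsNpe(String s) {
        try {
            unsafeLength(s);
            return false;
        } catch (NullPointerException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println(throwsNpe(null));   // prints "true"
        System.out.println(throwsNpe("abc"));  // prints "false"
        System.out.println(safeLength(null));  // prints "-1"
    }
}
```

A rule-based analyzer would typically warn about `unsafeLength` whenever it cannot prove the argument is non-null, which is one source of the false positives mentioned above when every caller in fact passes a non-null value.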