Improving Bug Localization by Mining Crash Reports: an Industrial Case Study
Crashing fault localization, stack trace, software crash
In this dissertation, we investigate whether buggy files in large-scale industrial Java web information systems can be identified from the stack traces automatically reported for those systems. To this end, we select and adapt approaches from the literature on grouping error logs and on ranking the files suspected of crashing the system. Next, we design and implement a software tool that identifies and ranks buggy files using stack traces from crash reports. Lastly, we apply the approaches and the tool in an empirical study of large-scale web information systems: we analyze resolved issues and error logs of three industrial systems, group the logs based on their stack traces, and suggest a list of suspected buggy files.
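As a rough illustration of the two steps named above (grouping error logs by stack trace, then ranking suspicious files), the following minimal Python sketch may help. It is not the tool developed in this work. The grouping key (exception type plus the top `top_n` frames, ignoring line numbers) and the ranking criterion (how often a file appears in the stack traces of a group) are assumptions made only for the example, and the names `parse_frames`, `group_key`, `group_traces`, and `rank_files` are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Matches Java frames such as:
#   at com.shop.OrderService.total(OrderService.java:42)
# Frames without a source file (e.g. "Native Method") are skipped.
FRAME_RE = re.compile(r"^\s*at\s+([\w.$]+)\.([\w$<>]+)\(([\w$]+\.java):\d+\)")


def parse_frames(trace):
    """Return (class, method, file) tuples for each frame, top frame first."""
    frames = []
    for line in trace.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((match.group(1), match.group(2), match.group(3)))
    return frames


def group_key(trace, top_n=3):
    """Assumed grouping key: exception type plus the top_n frames."""
    lines = trace.strip().splitlines()
    exception = lines[0].split(":", 1)[0].strip() if lines else ""
    return (exception, tuple(parse_frames(trace)[:top_n]))


def group_traces(traces, top_n=3):
    """Group stack traces that share the same key."""
    groups = defaultdict(list)
    for trace in traces:
        groups[group_key(trace, top_n)].append(trace)
    return dict(groups)


def rank_files(group):
    """Assumed ranking: files ordered by how often they occur in the group."""
    counts = Counter()
    for trace in group:
        counts.update(file for _, _, file in parse_frames(trace))
    return [file for file, _ in counts.most_common()]
```

Ranking by file name alone treats identically named files in different packages as the same file; a real implementation would likely need to resolve each frame to a path in the repository.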
Our approach achieves recall values of 57.08%, 64.1%, 68.06%, and 72.22%, and mean average precision of 51.32%, 65.43%, 67.66%, and 68.02%, considering the top 1, 3, 5, and 10 suspicious files it suggests, respectively. We also find that the methods changed to fix bugs are present in the stack traces of the error log groups approximately 80% of the time. In addition, in 66.7% to 72.28% of the cases, the methods modified to fix a bug are the ones most often found in the stack traces of the error log group related to that bug. We conclude that stack traces from crash reports can help identify the files responsible for crashing Java web information systems.
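For readers unfamiliar with the metrics reported above, the sketch below shows one common way to compute recall and mean average precision over the top k suggested files. The exact definitions used in the study may differ. Here recall at k is assumed to be the fraction of bugs with at least one truly buggy file among the top k suggestions, and average precision is assumed to be normalized by the number of buggy files found within the top k. The function names and the sample data are hypothetical.

```python
def recall_at_k(cases, k):
    """Fraction of bugs with at least one buggy file in the top k.

    `cases` is a list of (ranked_files, buggy_files) pairs.
    """
    hits = sum(
        1 for ranked, buggy in cases
        if any(f in buggy for f in ranked[:k])
    )
    return hits / len(cases)


def average_precision_at_k(ranked, buggy, k):
    """Average of precision values at each rank where a buggy file appears."""
    found = 0
    total = 0.0
    for rank, file in enumerate(ranked[:k], start=1):
        if file in buggy:
            found += 1
            total += found / rank  # precision at this rank
    return total / found if found else 0.0


def mean_average_precision_at_k(cases, k):
    """Mean of the per-bug average precision over all bugs."""
    return sum(
        average_precision_at_k(ranked, buggy, k) for ranked, buggy in cases
    ) / len(cases)


# Hypothetical data: three bugs, each with a ranked list and its true buggy files.
cases = [
    (["A.java", "B.java", "C.java"], {"B.java"}),
    (["X.java", "Y.java"], {"X.java", "Y.java"}),
    (["P.java", "Q.java"], {"Z.java"}),
]
```

With this sample data, only the second bug has a buggy file ranked first, so recall at k = 1 is 1/3, while two of the three bugs are covered at k = 3.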