Data Mining for Defects in Multicore Applications: An Entropy-Based Call-Graph Technique
Concurrency and Computation: Practice and Experience 26(1), 1-20, 2014
Multicore computers are ubiquitous. Expert developers as well as developers with little experience in parallelism are now asked to create multithreaded software in order to exploit parallelism in mainstream shared-memory hardware. However, finding and fixing parallel programming errors is a complex and arduous task. Programmers thus rely on tools such as race detectors that typically focus on reporting errors due to incorrect usage of synchronization constructs or due to missing synchronization. This arsenal of debugging techniques, however, is incomplete. This article presents a new perspective and addresses a largely unexplored direction of defect localization where a wrong usage of non-parallel programming constructs might cause wrong parallel application behavior. In particular, we make a contribution by showing how to use data-mining techniques to locate defects in multithreaded shared-memory programs. Our technique analyzes execution anomalies in a condensed representation of the dynamic call graphs of a multithreaded object-oriented application and identifies methods that contain a defect. Compared to race detectors that concentrate on finding incorrect synchronization, our method is able to reveal a wider range of defects that affect the control flow of a parallel program. Results from controlled experiments show that our data-mining approach not only finds race conditions in different types of multicore applications, but also other errors that cause incorrect parallel program behavior. Data-mining techniques offer a fruitful new ground for parallel program debugging, and we also discuss long-term directions for this interesting field.