Predicting Accurate and Actionable Static Analysis Warnings: An Experimental Approach J. Ruthruff et al., University of Nebraska-Lincoln, NE U.S.A, Google

Predicting Accurate and Actionable Static Analysis Warnings: An Experimental Approach

J. Ruthruff et al., University of Nebraska-Lincoln, NE, U.S.A.; Google Inc. ICSE 2008.

2015. 06. 02

Jonghwa Park

Computer Security & OS Lab.

Index

• Introduction
• Background
• Logistic regression models
• Case study
• Conclusions


Introduction

Static analysis tools detect software defects by analyzing a system without actually executing it.

There are two well-known challenges.
• One challenge involves the accuracy of reported warnings.
• A second challenge, receiving less attention, is that warnings are not always acted on by developers even if they reveal true defects.

The core elements of our approach are statistical models.
• They are built using screening, an incremental statistical process to quickly discard factors with low predictive power.


Background

FindBugs at Google
• FindBugs is an open-source static analysis tool for Java programs.
• The tool analyzes Java bytecode to issue reports for 487 bug patterns.
• These patterns are organized into seven categories:

• Bad Practice, Correctness, Internationalization, Malicious Code Vulnerability, Multi-threaded Correctness, Performance, and Dodgy

• At Google, we have deployed FindBugs using an enterprise-wide service model.

• We performed a cost/benefits analysis identifying this as a cost-effective approach for determining sufficiently interesting defects to report to developers.


Background

Logistic Regression Analysis
• Logistic regression analysis is a type of categorical data analysis for predicting dependent variable values that follow binomial distributions.

• Logistic regression measures the relationship between the categorical dependent variable and one or more independent variables, which are usually (but not necessarily) continuous, by estimating probabilities.

- Wikipedia
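As a minimal sketch of how a fitted logistic regression model turns factor values into a probability, the Python below applies the logistic (sigmoid) function to a weighted sum of the factors. The function name `predict_probability`, the two example factors, and all coefficient values are hypothetical and chosen only for illustration; none of them come from the paper.

```python
import math

def predict_probability(intercept, coefficients, factors):
    """Return P(y = 1) under a logistic regression model."""
    # Linear predictor (log-odds): b0 + b1*x1 + ... + bk*xk
    log_odds = intercept + sum(b * x for b, x in zip(coefficients, factors))
    # The logistic function maps log-odds to a probability in (0, 1).
    return 1.0 / (1.0 + math.exp(-log_odds))

# Hypothetical model with two made-up factors (not from the paper).
p = predict_probability(intercept=-1.0, coefficients=[0.8, 0.5], factors=[2.0, 1.0])
print(f"{p:.3f}")  # prints 0.750
```

Here the log-odds are -1.0 + 0.8*2.0 + 0.5*1.0 = 1.1, which the logistic function maps to a probability of about 0.75.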


Logistic regression models

We aim to build statistical models that classify incoming static analysis warnings to reduce the cost of this process.

Logistic Regression Model Factors
• We selected 33 factors to incorporate into the experimental screening methodology for generating our required models.


Experimental Screening Process

Screening experiments are designed to quickly yet systematically narrow down large groups of factors.
• They help focus the direction of research.
• They are used to discover the most significant factors.

We consider a screening methodology with up to four stages that attempts to identify at least six predictive factors for a predictive model.
• The four stages examine cumulative ranges of 5%, 25%, 50%, and 100% of the total warnings.


Experimental Screening Process

The first stage of the screening methodology, which examines 5% of the warnings, eliminates factors that appear to have little of the predictive power needed to build accurate models.

The second stage considers an additional 20% of the static analysis warnings, bringing the total number of considered warnings to 25%.

The third stage of our screening methodology considers the next 25% of warnings, for a total of half of all warnings.

The final stage considers the last 50% of the data.
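The cumulative stage sizes can be sketched in Python as follows. This is only an illustration of the 5% / 25% / 50% / 100% split described on the slides: the function name `stage_ranges`, the round-down rule for cutoffs, and the example total of 1,000 warnings are assumptions, not details from the paper.

```python
def stage_ranges(total_warnings, cumulative_percents=(5, 25, 50, 100)):
    """Return (start, end) index ranges of the warnings newly examined in each stage."""
    ranges = []
    start = 0
    for pct in cumulative_percents:
        # Assumption: cumulative cutoffs are rounded down to whole warnings.
        end = total_warnings * pct // 100
        ranges.append((start, end))
        start = end
    return ranges

# Hypothetical data set of 1,000 warnings.
for stage, (start, end) in enumerate(stage_ranges(1000), start=1):
    print(f"stage {stage}: {end - start} new warnings, {end} examined in total")
```

Each stage only adds the warnings between the previous cutoff and the new one, so the data collected in earlier stages is reused rather than gathered again.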


Building Models From Screening Factors

Model for Predicting False Positives
• First stage, examining just 5% of the warnings: the screening experiment eliminated 15 of the 33 factors.
• Second stage, examining 25% of the warnings: 5 of the 18 remaining factors were eliminated.
• Third stage, examining 50% of the warnings: 2 of the 13 remaining factors were eliminated.
• Fourth stage, examining all of the warnings (adding the last 50% of the data): 2 of the 11 remaining factors were eliminated.

Predicted values close to 0.0 correspond to false positive predictions, while values close to 1.0 correspond to true defects.
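The factor counts for the false-positive model can be checked with a short tally. The elimination counts are the ones listed on this slide; the final figure of 9 remaining factors is not stated on the slide and simply follows from that arithmetic. The variable names are illustrative.

```python
initial_factors = 33
eliminated_per_stage = [15, 5, 2, 2]  # stages 1 to 4, as listed on the slide

remaining = initial_factors
remaining_after_stage = []
for eliminated in eliminated_per_stage:
    remaining -= eliminated
    remaining_after_stage.append(remaining)

print(remaining_after_stage)  # [18, 13, 11, 9]
```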


Building Models From Screening Factors

Models for Actionable Warnings
• Our first model is built using only those warnings identified as true defects (13 factors).
• Our second model is designed to predict actionable defects from all warnings, i.e., both false positives and legitimate warnings (15 factors).


Case study

The data set consists of 1,652 unique warnings selected from a population of tens of thousands of warnings seen over a nine-month period.

The warnings in the data set were manually examined and classified as either false positives or true defects.

Screening models
• Models for classifying warnings that were built from our screening methodology.

All-Data model
• Built by collecting data for every factor, for every sampled warning.

BOW model
• Based on the work of Bell et al. and Ostrand et al.
• The BOW+ model adds the ‘bug pattern’ and ‘priority’ factors.


Results and Discussion


Conclusions

The proposed screening approach builds predictive models cost-effectively by quickly discarding metrics with low predictive power.

The screening-based models were able to accurately predict false positive warnings over 85% of the time on average, and actionable warnings over 70% of the time.

This work also indicates that regression models may be effective in settings involving static analysis warnings, and shows promise for future work in this area.


References

• FindBugs. http://findbugs.sourceforge.net/.
• N. Ayewah, W. Pugh, J. D. Morgenthaler, J. Penix, and Y. Zhou. Evaluating static analysis defect warnings on production software. In Proc. 7th ACM Workshop on Prog. Analysis for Softw. Tools and Eng., pages 168–179, 2007.
• en.wikipedia.org/wiki/ .


Thank You!
