SympAB


  • 8/3/2019 SympAB

    1/33

    Effort Estimation in Iterative Development:

    Learning using Bayesian Networks

    Abou Bakar Nauman

    PhD Scholar, CIIT, Islamabad

    Assistant Professor

    Sarhad University of Science and IT, Peshawar, Pakistan


    Outline

      Introduction to Bayesian Networks
      Existing models in software engineering
      Research designs
      Software effort estimation
      Proposed model


    Bayesian Network

    Bayesian Inference
      Inference of a probability distribution based on likelihood.
      Inference is the process of learning the posterior probability of a variable.

    Bayesian Network
      A network containing the probability distributions of its nodes; the posterior probability is calculated by Bayesian inference.

    [Figure: example Bayesian network with nodes X1, X2, X3, X4, X5, ..., Xq, Xn]
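The posterior calculation described on this slide can be sketched for the smallest possible network, a single arc from a cause to an effect. This is only an illustration: the variable names (`complexity`, `effort`) and all probabilities are hypothetical and do not come from the presentation.

```python
def posterior(prior, likelihood, evidence):
    """Return P(cause | evidence) for a two-node network Cause -> Effect.

    prior:      dict mapping each cause state to P(cause)
    likelihood: dict mapping each cause state to a dict of P(effect | cause)
    evidence:   the observed effect state
    """
    # Unnormalized posterior: P(cause) * P(evidence | cause)
    joint = {c: prior[c] * likelihood[c][evidence] for c in prior}
    # P(evidence), obtained by summing the cause out of the joint
    total = sum(joint.values())
    return {c: p / total for c, p in joint.items()}


# Hypothetical numbers, for illustration only.
prior = {"high": 0.3, "low": 0.7}            # P(complexity)
likelihood = {                               # P(effort | complexity)
    "high": {"over": 0.8, "on_target": 0.2},
    "low": {"over": 0.25, "on_target": 0.75},
}

result = posterior(prior, likelihood, "over")
print(result)
```

Observing an overrun raises the belief in high complexity above its prior of 0.3, which is the kind of belief revision a larger network performs node by node.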


    Inference Mechanism

    Bucket Elimination
      We start with a set V of tables. Whenever we wish to marginalize a variable X, we take from V all tables with X in their domains, calculate their product, marginalize X out of it, and place the resulting table in V.

    [Figure: bucket elimination example with nodes A, B and X and the table Y|X]
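A minimal sketch of one bucket-elimination step, following the description above: collect the tables that mention X, multiply them, sum X out, and put the result back. The table representation (a tuple of variable names plus a dict keyed by assignments) and the example numbers are assumptions made for illustration.

```python
from itertools import product


def multiply(f, g, domains):
    """Pointwise product of two tables, each a (variables, values) pair."""
    f_vars, f_vals = f
    g_vars, g_vals = g
    out_vars = tuple(dict.fromkeys(f_vars + g_vars))  # ordered union
    out_vals = {}
    for assignment in product(*(domains[v] for v in out_vars)):
        row = dict(zip(out_vars, assignment))
        out_vals[assignment] = (f_vals[tuple(row[v] for v in f_vars)]
                                * g_vals[tuple(row[v] for v in g_vars)])
    return out_vars, out_vals


def marginalize(f, var):
    """Sum a variable out of a table."""
    f_vars, f_vals = f
    idx = f_vars.index(var)
    out_vars = f_vars[:idx] + f_vars[idx + 1:]
    out_vals = {}
    for assignment, value in f_vals.items():
        key = assignment[:idx] + assignment[idx + 1:]
        out_vals[key] = out_vals.get(key, 0.0) + value
    return out_vars, out_vals


def eliminate(tables, var, domains):
    """One bucket-elimination step: remove `var` from the set of tables."""
    bucket = [t for t in tables if var in t[0]]
    if not bucket:
        return list(tables)
    rest = [t for t in tables if var not in t[0]]
    combined = bucket[0]
    for t in bucket[1:]:
        combined = multiply(combined, t, domains)
    return rest + [marginalize(combined, var)]


# Hypothetical network X -> Y with binary variables.
domains = {"X": (0, 1), "Y": (0, 1)}
p_x = (("X",), {(0,): 0.6, (1,): 0.4})
p_y_given_x = (("X", "Y"), {(0, 0): 0.9, (0, 1): 0.1,
                            (1, 0): 0.2, (1, 1): 0.8})

tables = eliminate([p_x, p_y_given_x], "X", domains)
p_y = tables[0]
print(p_y)
```

Eliminating X leaves a single table over Y, which is the marginal P(Y).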


    Inference algorithms

    Junction Tree (Join Tree) Algorithm
      A large tree is developed
      The joint distribution is calculated based on bucket elimination

    Frontier Algorithm
      JPD (joint probability distribution) of the first time slice t0
      Then the children in t1 are added
      The parents in t0 are marginalized out
      Forward-Backward Pass

    Boyen-Koller Algorithm
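The slice-by-slice idea listed above (start from the distribution over t0, add the next slice, marginalize out the previous one) can be sketched as the forward half of a forward-backward pass on a two-state hidden Markov model. This is a simplified stand-in, not the frontier or Boyen-Koller algorithm itself, and the states, observations and probabilities are hypothetical.

```python
def forward_filter(initial, transition, emission, observations):
    """Filtered state distributions P(state_t | observations up to t)."""
    states = list(initial)
    belief = dict(initial)  # distribution over the t0 slice
    history = []
    for t, obs in enumerate(observations):
        if t > 0:
            # Add the next slice and marginalize out the previous one.
            belief = {s: sum(belief[p] * transition[p][s] for p in states)
                      for s in states}
        # Weight by the evidence seen in the current slice, then normalize.
        weighted = {s: belief[s] * emission[s][obs] for s in states}
        norm = sum(weighted.values())
        belief = {s: w / norm for s, w in weighted.items()}
        history.append(belief)
    return history


# Hypothetical model: hidden team pace, observed iteration outcome.
initial = {"fast": 0.5, "slow": 0.5}
transition = {"fast": {"fast": 0.8, "slow": 0.2},
              "slow": {"fast": 0.3, "slow": 0.7}}
emission = {"fast": {"early": 0.7, "late": 0.3},
            "slow": {"early": 0.2, "late": 0.8}}

history = forward_filter(initial, transition, emission, ["late", "late"])
print(history)
```

A full forward-backward pass would add a backward sweep that revises earlier slices in light of later evidence.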


    Learning Categories and Techniques

    Known structure, observable data:
      Maximum likelihood, log likelihood, Gaussian approximation, MAP

    Known structure, partial data:
      EM (expectation maximization), gradient ascent

    Unknown structure, observable data:
      Heuristic search, greedy search, Monte Carlo methods, best-first search, scoring

    Unknown structure, partial data:
      Search + EM, hidden nodes

    Scoring criteria:
      Marginal likelihood, local criteria, relative posterior probability
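For the simplest case in the categories above (known structure, fully observable data), parameter learning reduces to counting. The sketch below estimates one conditional probability table; with `pseudo_count=0` it returns the maximum likelihood estimate, and a positive `pseudo_count` c corresponds to the MAP estimate under a symmetric Dirichlet prior with parameter c + 1. The records and variable names are hypothetical.

```python
from collections import Counter


def estimate_cpt(records, child, parent, child_states, pseudo_count=0.0):
    """Estimate P(child | parent) from fully observed records by counting."""
    counts = Counter((r[parent], r[child]) for r in records)
    parents = sorted({r[parent] for r in records})
    cpt = {}
    for p in parents:
        # Add the pseudo-count to every cell before normalizing the row.
        total = sum(counts[(p, c)] + pseudo_count for c in child_states)
        cpt[p] = {c: (counts[(p, c)] + pseudo_count) / total
                  for c in child_states}
    return cpt


# Hypothetical observations of (team, effort) pairs.
records = [
    {"team": "experienced", "effort": "low"},
    {"team": "experienced", "effort": "low"},
    {"team": "experienced", "effort": "low"},
    {"team": "experienced", "effort": "high"},
    {"team": "new", "effort": "high"},
    {"team": "new", "effort": "high"},
]
states = ["low", "high"]

mle = estimate_cpt(records, "effort", "team", states)
smoothed = estimate_cpt(records, "effort", "team", states, pseudo_count=1.0)
print(mle)
print(smoothed)
```

The maximum likelihood estimate assigns probability zero to combinations never seen in the data; the prior in the second call avoids that, which matters when only a few completed iterations are available.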


    Parameter Learning

    Multiple Evidences
      Batch learning
      Adaptation

    Temporal Data
      Kalman Filter
      Markov Models
      Hidden Markov Models
      Dynamic Bayesian Networks


    Software Project Estimation and Plan

      Intangible factors
      Virtual product
      Complexity

      Estimate >>> Plan
      Assessment >>> Tracking
      Re-estimate >>> New Plan
      Uncertainty >>> Wrong Plan


    Iterative and Incremental development

      Mini Projects
      Interdependence
      Multiple work flows
      Project planning after each iteration
      Productivity


    Curve of Uncertainty

      The Ideal situation


    Problem Identification


    Software Effort Estimation

      Multiple Process Cycles
      Uncertainty
      Historical data
      Organizational Data
      Expert Opinion
      Multiple Evidences
      Project Related Data


    Research Question

    How can we successfully apply the BBN approach to learn and estimate development effort in software projects that follow an Iterative and Incremental approach?


    Research Objectives

      Learning team productivity from completed iterations, and using the learned information to predict future iterations, can improve effort estimation in Iterative and Incremental software development projects.

      Bayesian networks provide a practical technique to estimate effort that incorporates both uncertainty and learning from past experience.

      Bayesian networks can be designed for effort estimation in iterative software development using knowledge of the causal relationships between attributes.
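The first objective, learning productivity from completed iterations and using it to predict the next one, can be illustrated with a very small Bayesian update. The sketch below uses a conjugate normal model rather than a full Bayesian network, so it is a simplification and not the proposed model; the prior, the noise variance, the unit (size units per person-day) and the observed values are all hypothetical.

```python
def update_productivity(mean, variance, observed, noise_variance):
    """One conjugate normal update of the belief about team productivity.

    A smaller noise_variance makes each new iteration count for more,
    which is one simple way to control the degree of learning.
    """
    precision = 1.0 / variance + 1.0 / noise_variance
    new_mean = (mean / variance + observed / noise_variance) / precision
    return new_mean, 1.0 / precision


# Hypothetical prior belief before the project starts.
mean, variance = 10.0, 16.0

# Hypothetical productivity measured in three completed iterations.
for value in [6.0, 7.0, 6.5]:
    mean, variance = update_productivity(mean, variance, value,
                                         noise_variance=4.0)

# Predict effort for the next iteration from the learned productivity.
remaining_size = 130.0
predicted_effort = remaining_size / mean
print(mean, variance, predicted_effort)
```

After three iterations the belief has moved from the optimistic prior toward the observed values and its variance has shrunk, so the effort prediction for the next iteration is both higher and less uncertain than the initial estimate.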


    Research Method

    Structural Development
      Identify the factors and develop a structural graph showing the dependencies among them.

    Parameter Estimation
      This step builds the quantitative component of a BN: the conditional probabilities, obtained via expert elicitation or automatically, which quantify the relationships between variables.

    Model Validation
      Test with real project data
      Implement in real-time projects


    Model Evolution

      Not one model, but a series of models
      Each model is an enhancement of the earlier one
      Different features of BN explored
      Variety of data
      M1, M2, M3, M4, M5, M6, MMMI


    Proposed Modeling Approach

    Sr #  Title                    Objectives
    1     Composition              Identify the units of the proposed model to be used in developing the IID effort estimation model.
    2     Structuring              How different units of the model can be joined together.
    3     Project Level Estimate   How project-level estimates can be collected before the start of the project.
    4     Evidence Recording       How evidence of the latest actual effort can be entered into the model.
    5     Results Interpretation   How results from the model can be interpreted.
    6     Results Forwarding       How results from one time slice can be forwarded to the next slices.
    7     Unrolling                The procedure for using multiple instances of the basic model.
    8     Learning                 Options to control the degree of learning from the latest evidence.
    9     Project Scenarios        Explore how the model behaves in different scenarios.
    10    Validation               Analyze the capability of the model to work in an industrial project.


    M1: A Fuzzy Logic and Classification Model


    Proposed Model: M2


    ISBSG data set, Productivity

    [Figure: Gamma Q-Q plot of Productivity. X-axis: Observed Value (0 to 120); y-axis: Expected Gamma Value (0 to 120).]


    Continuous / Gamma Distribution
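As a rough illustration of treating productivity as a continuous, gamma-distributed variable, the sketch below fits a gamma distribution by the method of moments (shape = mean^2 / variance, scale = variance / mean). The presentation does not say which fitting method was used, so this choice is an assumption, and the sample values are hypothetical rather than taken from the ISBSG data set.

```python
from statistics import mean, variance


def fit_gamma_moments(samples):
    """Method-of-moments estimate of gamma shape and scale parameters."""
    m = mean(samples)
    v = variance(samples)  # sample variance
    shape = m * m / v
    scale = v / m
    return shape, scale


# Hypothetical right-skewed productivity values.
productivity = [4.0, 7.5, 9.0, 12.0, 15.5, 21.0, 30.0, 48.0]

shape, scale = fit_gamma_moments(productivity)
print(shape, scale)
```

By construction, shape times scale reproduces the sample mean; a Q-Q plot of observed against expected gamma quantiles is then the usual visual check of the fit.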


    Proposed Model: M3

    [Equation garbled in extraction; recoverable symbols: y, x, coefficients b1 to b4 and weights w applied to the terms D, L, A and O, and a term C.]


    Results


    MRE results

    [Figure: Sum MRE Bias by model (Estimate Checker, M3-Median, M3-Mean, M2-Median, M2-Mean); bias axis from -100 to 0.]
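For readers unfamiliar with the metric, the magnitude of relative error is MRE = |actual - estimate| / actual. The sketch below computes it together with a summed signed error as a bias indicator. The sign convention (negative means underestimation) is an assumption, since the slides do not define how "Sum MRE Bias" was calculated, and the effort values are hypothetical.

```python
def mre(actual, estimate):
    """Magnitude of relative error for one project or iteration."""
    return abs(actual - estimate) / actual


def signed_relative_error(actual, estimate):
    """Relative error with sign; negative means the estimate was too low."""
    return (estimate - actual) / actual


# Hypothetical actual and estimated efforts.
actuals = [100.0, 200.0, 50.0]
estimates = [80.0, 150.0, 60.0]

mres = [mre(a, e) for a, e in zip(actuals, estimates)]
mmre = sum(mres) / len(mres)  # mean MRE
bias = sum(signed_relative_error(a, e) for a, e in zip(actuals, estimates))
print(mres, mmre, bias)
```

MRE ignores the direction of the error, so a separate signed sum is needed to see whether a model tends to underestimate or overestimate.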


    [Figure: plot comparing M3-Median, M3-Mean, Estimate Checker, M2-Mean and M2-Median; vertical axis from 0.0 to 12.5; numbered point labels not recoverable.]


    Proposed Model: M4


    Proposed Model: M5


    M6: One Iteration


    MMMI: Project level


    Publications

      Abou Bakar Nauman, "Open Source Based Generic Solution for E-Health and m-health in Pakistan: Concept Models and Research Roadmap," accepted at ICOSST 2011 (23-25 Dec), UET Lahore, Pakistan.

      Abou Bakar Nauman, "Software Effort Estimation Framework in Iterative and Incremental Development; Issues and Proposed Solution," accepted at 3rd SAICON: International Conference on Management, Business Ethics and Economics (ICMBEE), December 28-29, 2011, Pearl-Continental Hotel, Lahore, Pakistan.


    Summary

      Benefits of Bayesian Networks
      Increased application in software engineering
      Need for a multi-iteration estimation model
      Proposed model