  • The 9th International Scientific Conference eLearning and software for Education

    Bucharest, April 25-26, 2013 10.12753/2066-026X-13-063

    AUTOMATED UML MODEL COMPARISON FOR QUALITY ASSURANCE IN

    SOFTWARE ENGINEERING EDUCATION

    Anca Daniela IONITA, Alexandra CERNIAN, Stefan FLOREA Faculty of Automation Control and Computer Science, University POLITEHNICA of Bucharest, Spl. Independentei 313,

    060042, Bucharest, Romania

    Anca.Ionita @ aii.pub.ro, [email protected]

    Abstract: Quality assurance in higher education includes methods and tools for preventing students

    from copying from each other. Apart from general systems for detecting plagiarism, each discipline

    may require specific electronic aids for identifying similarities between assessment results. The

    paper considers the case of Software Engineering and, more specifically, Object-Oriented Modeling. It

    proposes solutions for the automatic comparison of models conforming to the Unified Modeling

    Language. The presented case study was based on developing tools for: identifying modeling elements

    from standard representations of models; assessing the degree of similarity; eliminating false positive

    results; checking for synonyms within a Romanian language dictionary.

    Keywords: Unified Modeling Language, computer aided education, quality assurance, model

    comparison

    I. INTRODUCTION

    Quality management generally consists of three important activities: assurance, planning and

    control [1]. Quality assurance deals with the creation of an appropriate framework for managing

    quality, including the definition of standards, procedures, criteria and tools [2]. For integrating into the

    network of higher education at European level, it is essential to gain recognition and to implement

    mechanisms for assuring the quality of curriculum, students' and teachers' evaluation, learning

    content, and academic research. To allow mobility between institutions, one of the important

    elements is to have well defined qualifications, which may be mapped for assessing the equivalence

    between educational programs. Romania has a national registry of qualifications [3], which describes

    their competences, learning results and the specific study offers. This is a high level approach that

    needs to be supported by processes that give guarantees that those competences are really attained.

    A big challenge comes with the current trend to mix traditional learning methods with

    e-learning. This is due to the flexibility offered by up-to-date Learning Management Systems, but

    also due to a shift in the young generation's values and habits, highly oriented towards communication

    by means of the Internet. An important criterion in evaluating e-learning degree programmes concerns the

    assessment system [4]. Quality assurance in educational institutions should include mechanisms for

    detecting any attempt by students to cheat on their written assessments. Due to an increasing number

    of incidents, it is necessary to define holistic approaches at institutional levels, where management,

    culture and infrastructure should be taken into account [5]. For this purpose, universities generally use

    tools for detecting plagiarism, based on comparing text and grammatical style, as well as searching for

    similar materials available on the Internet. Apart from that, Carroll and Appleton, from Oxford Brookes

    University, state that discipline-specific aids are also necessary [6].

    Object-oriented modeling is included in the curriculum of all computer science faculty

    programs; it is important for preparing students to analyze, design and implement software in a

    rigorous way. Generally, the subjects that treat this concern include the study of a standard

    semi-formal language, called Unified Modeling Language (UML), which is used not only in academic

    environments, but also in industry. As part of their practical work, students have to develop projects

    that use all kinds of UML diagrams, for modeling complex software systems. The challenge is to

    check the originality of the proposed models, taking into account that project themes recur and

    that there are many ways to share them. Therefore, it is necessary to check diagrams for

    plagiarism automatically, in the same way as one verifies plain text. For this

    purpose, one needs to:

    - Define methods for comparing all types of diagrams;

    - Develop a system that stores students' models;

    - Create tools that check a new project against all previous submissions, in order to detect similarities;

    - Introduce the facility to detect synonyms at the level of modeling element names;

    - Define metrics for the similarity degree;

    - Validate the methods and tools in the classroom.

    Chapter II presents the theoretical background regarding object-oriented modeling, necessary

    for understanding why and how we built our tools. Chapter III describes the system for comparing

    UML models and for estimating the similarity degree - from a software engineering point of view.

    Chapter IV presents a case study inspired by the evaluation of two students' class diagrams elaborated

    for their homework.

    II. THEORETICAL BACKGROUND

    Models are extensively used in engineering for descriptive purposes. In the case of software,

    models are essential for many stakeholders, such as clients, project managers, system integrators, software

    architects and analysts. They all need specific representations of the system, hence the existence of a

    large variety of diagrams, representing specific points of view and including the necessary abstractions

    for simplifying the system and facilitating the understanding of a certain stakeholder [7].

    The Unified Modeling Language [8] is a standard adopted by the Object Management Group

    (OMG) for representing real or conceptual systems, i.e. their structure, behavior, implementation and

    interaction with the outer world. The language supports the definition of multiple models, like:

    - use case diagrams, showing the actors interacting with the system;

    - class diagrams, composed of classes characterized by attributes and operations, as well as relationships between them;

    - object diagrams, showing objects and links used in particular scenarios or test cases;

    - statechart diagrams, representing the states of dynamic classes and the transitions between them, triggered by external events;

    - activity diagrams, generally presenting processes, with the possibility to assign the responsibilities of activities towards particular classes of business entities;

    - sequence diagrams, illustrating the sequencing of messages between objects, within certain interactions;

    - communication diagrams, having a structural insight over an interaction, and outlining which objects exchange messages between them;

    - component diagrams, describing the system architecture, its decomposition into parts and their connecting modes;

    - deployment diagrams, visualizing the way a distributed system is assigned to different processing nodes and the communication lines between them.

    Each diagram is represented separately, within its own drawing area, and has specific

    notations. However, there are multiple correlations between them, for instance objects that appear in

    sequence diagrams have to be instances of existing classes, and their interaction may also be illustrated

    with communication diagrams. A valid and consistent model takes into account these issues, and also

    adds constraints that cannot be expressed in a graphical way, but need a specific functional language

    called OCL (Object Constraint Language).
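
    The cross-diagram correlation mentioned above (objects in sequence diagrams must be instances of existing classes) can be illustrated with a minimal Python sketch. The `Model` structure, the function name and the sample names (`Departament`, `Factura`) are hypothetical and only serve the illustration; they are not taken from the tools described in this paper.

```python
from dataclasses import dataclass, field


@dataclass
class Model:
    # Names of the classes declared in the class diagram.
    class_names: set[str] = field(default_factory=set)
    # Objects used in sequence diagrams, as (object name, class name) pairs.
    sequence_objects: list[tuple[str, str]] = field(default_factory=list)


def undeclared_objects(model: Model) -> list[tuple[str, str]]:
    """Return the sequence-diagram objects whose class is not declared."""
    return [(obj, cls) for obj, cls in model.sequence_objects
            if cls not in model.class_names]


# Hypothetical model: "f1" refers to a class missing from the class diagram.
model = Model(
    class_names={"Angajat", "Departament"},
    sequence_objects=[("a1", "Angajat"), ("f1", "Factura")],
)
print(undeclared_objects(model))  # [('f1', 'Factura')]
```

    An empty result would indicate that, for this particular rule, the model is consistent.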

    A modeling attempt is relative to its specific purpose; there is no absolute and generally

    valid model for a certain system; it always depends on the way the model is going to be used, and on

    the vision of its author.

    Besides the descriptive power, models have also been used for executive purposes. The

    capability to automatically extract information from them has been proven important for multiple

    goals: rapidly configuring new applications, discovering software services at run-time, generating

    code, performing automated deployment for the maintenance of widely distributed systems, expressing

    processes in an explicit way. This large spectrum of applications stands under the umbrella of Model

    Driven Engineering (MDE), concerned with model interpretation, transformation and composition [9].

    III. SYSTEM FOR MODEL STORAGE AND COMPARISON

    The system presented here is intended to store the projects developed for the Software

    Engineering subject, at the Faculty of Automatic Control and Computers, University POLITEHNICA

    of Bucharest. The actors are:

    - Students, who submit their UML diagrams and obtain a preliminary verification of their originality degree;

    - Teachers, who access the submitted diagrams and perform supplementary verifications and statistics.

    The UML diagrams are expected to be created with the StarUML open source platform

    [10]; then, they are exported to XML format [11] to be parsed and compared

    with the models in the central repository.
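
    As a rough illustration of how modeling elements could be identified in an XML export, the sketch below parses a small document with Python's standard `xml.etree.ElementTree` module. The tag and attribute names in `SAMPLE` are assumptions made for this example; they do not reproduce the actual StarUML export format, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical export: the element and attribute names below are illustrative
# and do not reproduce the actual StarUML file format.
SAMPLE = """
<model>
  <class name="Angajat">
    <attribute name="nume"/>
    <operation name="calculeazaSalariu"/>
  </class>
  <class name="AngajatTemporar">
    <generalization parent="Angajat"/>
  </class>
</model>
"""


def extract_elements(xml_text: str) -> dict[str, set[str]]:
    """Collect modeling element names, grouped by element kind."""
    root = ET.fromstring(xml_text)
    elements = {"classes": set(), "attributes": set(),
                "operations": set(), "generalizations": set()}
    for cls in root.iter("class"):
        owner = cls.get("name")
        elements["classes"].add(owner)
        for attr in cls.findall("attribute"):
            elements["attributes"].add(f"{owner}.{attr.get('name')}")
        for op in cls.findall("operation"):
            elements["operations"].add(f"{owner}.{op.get('name')}")
        for gen in cls.findall("generalization"):
            elements["generalizations"].add(f"{owner}->{gen.get('parent')}")
    return elements


elements = extract_elements(SAMPLE)
print(sorted(elements["classes"]))  # ['Angajat', 'AngajatTemporar']
```

    Grouping the names by element kind keeps the later comparison per category (classes, attributes, operations, generalizations) straightforward.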

    An important aspect in detecting plagiarism in UML diagrams is semantics. For example, two

    students can share a diagram, but use synonyms for describing elements such as class name, attributes,

    operations, actors, parameters, and so on. Therefore, overlooking this semantic issue could lead to

    inaccurate originality scores. The application presented in this paper incorporates a synonym

    dictionary for the Romanian language.
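
    A minimal sketch of synonym-aware name matching is given below, assuming that the dictionary can be reduced to groups of interchangeable words. The single group listed here is hard-coded for illustration, and the helper names are hypothetical; a real deployment would load the groups from a Romanian dictionary.

```python
# Hypothetical synonym groups; a real deployment would load them from a
# Romanian dictionary rather than hard-code them.
SYNONYM_GROUPS = [
    {"angajat", "salariat"},
]

# Map every word to one canonical representative of its group.
CANONICAL = {word: min(group) for group in SYNONYM_GROUPS for word in group}


def normalize(name: str) -> tuple[str, ...]:
    """Lower-case a name, split it into words, and canonicalize each word."""
    return tuple(CANONICAL.get(w, w) for w in name.lower().split())


def same_concept(name_a: str, name_b: str) -> bool:
    """True when two element names are equal up to synonymy."""
    return normalize(name_a) == normalize(name_b)


print(same_concept("Angajat Temporar", "Salariat Temporar"))  # True
print(same_concept("Angajat", "Departament"))                 # False
```

    Normalizing names before comparison means that any similarity metric applied afterwards treats synonyms as identical, without changing the metric itself.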

    The plagiarism detection algorithm uses a combination of detection methods based on tokens

    and similarity metrics, storing in the database the significant results obtained by comparing each pair

    of files. In addition, a semantics module has been integrated, based on the Romanian language

    dictionary of synonyms provided by the http://www.dexonline.ro website, which provides support for

    comparing words and detecting synonymy in the names of entities in UML diagrams. These

    components are integrated into a Web application that has the primary role of providing different types

    of plagiarism reports for teachers and of supporting them in finding similarities between assignments

    and projects uploaded by students.
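
    The idea of comparing each pair of files and storing only the significant results can be sketched as follows. This is an illustration under stated assumptions, not the application's implementation: the Jaccard index stands in for the similarity metrics, the threshold value, the table layout and the sample submissions are hypothetical, and an in-memory SQLite database replaces the real one.

```python
import sqlite3
from itertools import combinations

THRESHOLD = 0.5  # hypothetical cut-off for a "significant" result


def jaccard(a: set[str], b: set[str]) -> float:
    """Shared elements divided by all distinct elements."""
    return len(a & b) / len(a | b) if a | b else 0.0


# Hypothetical submissions: file name -> set of element names.
submissions = {
    "s1.xml": {"angajat", "departament", "nume"},
    "s2.xml": {"angajat", "departament", "varsta"},
    "s3.xml": {"factura", "client"},
}

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE comparison (file_a TEXT, file_b TEXT, score REAL)")

# Compare every pair of files once and keep only the significant scores.
for (fa, ea), (fb, eb) in combinations(sorted(submissions.items()), 2):
    score = jaccard(ea, eb)
    if score >= THRESHOLD:
        db.execute("INSERT INTO comparison VALUES (?, ?, ?)", (fa, fb, score))

rows = db.execute("SELECT file_a, file_b, score FROM comparison").fetchall()
print(rows)  # [('s1.xml', 's2.xml', 0.5)]
```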

    The central idea of the application is the combined use of multiple techniques to detect

    plagiarism, previously adapted to be used with XML diagrams. The proposed detection

    method uses both techniques based on comparing dictionary words and classic techniques for software

    clone detection based on metrics and tokens. In token-based detection, the entire system or source

    is parsed into a sequence of tokens, which is then scanned to identify duplicated subsequences of tokens, and

    thus to separate the copied portions of code from the original ones. Unlike the text comparison approach, the technique based on tokens

    is generally more robust to changes in the code (such as formatting and spacing).
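
    The robustness of token-based comparison to formatting changes can be shown with a short sketch. It uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for a dedicated clone detector; the tokenizer pattern and the sample fragments are assumptions made for this example.

```python
import re
from difflib import SequenceMatcher

# Identifiers and a few punctuation marks; whitespace and quotes are skipped.
TOKEN = re.compile(r"[A-Za-z_][A-Za-z0-9_]*|[<>/=]")


def tokenize(text: str) -> list[str]:
    """Split a document into identifier and punctuation tokens."""
    return TOKEN.findall(text)


def token_similarity(a: str, b: str) -> float:
    """Ratio in [0, 1] of matching tokens between two documents."""
    return SequenceMatcher(None, tokenize(a), tokenize(b), autojunk=False).ratio()


# Same content, different spacing and line breaks.
first = '<class name="Angajat"><attribute name="nume"/></class>'
second = '<class   name = "Angajat">\n  <attribute name="nume" />\n</class>'
print(token_similarity(first, second))  # 1.0
```

    A plain character-by-character comparison of `first` and `second` would report differences, whereas the token sequences are identical.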

    The architecture of the application is based on one of the currently recommended Web

    programming models, namely the MVC concept (Model View Controller) [12], which separates the

    application workflow (Controller) from information processing (Model) and from the user interface

    (View). The chosen programming language (Perl) and the programming model led to the use of the

    Catalyst framework [13] to support the development of the application.
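
    The separation of concerns behind MVC can be sketched in a few lines. The example below is written in Python rather than Perl and does not use Catalyst; all class and function names are hypothetical, and the overlap measure is only a placeholder for the real comparison logic.

```python
class SubmissionModel:
    """Model: stores element names per submission and computes overlap."""

    def __init__(self) -> None:
        self._elements: dict[str, set[str]] = {}

    def add(self, name: str, elements: set[str]) -> None:
        self._elements[name] = set(elements)

    def similarity(self, a: str, b: str) -> float:
        first, second = self._elements[a], self._elements[b]
        union = first | second
        return len(first & second) / len(union) if union else 0.0


def render_report(a: str, b: str, score: float) -> str:
    """View: format one comparison result for display."""
    return f"{a} vs {b}: {score:.0%}"


class CompareController:
    """Controller: receives a request, queries the model, picks the view."""

    def __init__(self, model: SubmissionModel) -> None:
        self.model = model

    def compare(self, a: str, b: str) -> str:
        return render_report(a, b, self.model.similarity(a, b))


model = SubmissionModel()
model.add("s1.xml", {"angajat", "departament"})
model.add("s2.xml", {"angajat", "departament", "nume", "varsta"})
print(CompareController(model).compare("s1.xml", "s2.xml"))  # s1.xml vs s2.xml: 50%
```

    Because the view only formats a score, the report layout can change without touching the comparison logic held in the model.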

    In addition to a semantically enriched algorithm for detecting plagiarism within UML

    diagrams, this application provides specific features and functionalities of e-learning tools. It allows

    free registration for students, facilitates loading of assignments and projects for the Software

    Engineering course, and offers a range of tools for teaching activities management.

    The user interface provides teachers the opportunity to study the similarity scores between

    UML model assignments uploaded by students. For each comparison performed, the results are

    graphically displayed to allow rapid assessment of the situation. The graphical interface is

    complemented by comprehensive reports that allow storing of statistics based on various criteria.

    IV. CASE STUDY SCENARIO AND EXPERIMENTAL VALIDATION

    The previous section presented an overview of the automated UML model comparison and

    plagiarism detection application proposed in this paper. This section presents a case study

    scenario that illustrates how the application computes similarity scores for UML diagrams, while also taking

    into account synonymy aspects of the Romanian language.

    Let us consider the following two UML class diagrams, depicted in Figure 1.

    Figure 1. Case study UML class diagrams

    Each of the two diagrams contains three classes, one generalisation relationship and one association

    relationship, illustrating a class representing an employee (Angajat and Salariat classes) with a

    subclass for a temporary employee (Angajat Temporar and Salariat Temporar classes), and a class for

    the department to which the employee belongs. In Romanian, Angajat and Salariat are synonyms.

    Moreover, the names of several attributes and methods make use of synonyms. A human expert can

    easily detect the similarity between the two diagrams based on his or her natural language knowledge.

    However, for a software application, these aspects are not so obvious without a thorough and robust

    semantics based comparison algorithm.

    Figure 2 depicts the comparison report generated by the application for the two diagrams

    above. The computed similarity scores are the following: Operations (Methods): 38%, Associations:

    60%, Elements: 46%, Generalizations: 80%, Attributes: 80%. A more detailed analysis of the

    similarity scores is presented in Figure 3, assessing the synonymy based approach.

    The results obtained experimentally had a particularly important role in providing the optimal

    configuration of the parameters taken into account by the algorithm, in order to produce relevant

    results. The experimental results showed that the proposed algorithm can be successfully applied

    to verifying the students' assignments for the Software Engineering course. The UML model

    comparison and plagiarism detection application is able to compute similarity scores between UML

    diagrams, thus proving to be a trustworthy tool for quality assurance in software engineering

    education.

    Figure 2. Automated UML Model Comparison Report

    Figure 3. Detailed similarity report for 2 UML diagrams

    V. CONCLUSIONS

    The verification of the authenticity of students' work has become an important issue, and the large

    amount of information to be analyzed requires the development of electronic tools for the automated

    detection of plagiarism. The paper presented an application for checking UML models, based on

    correspondences between modeling elements characteristic of each diagram, and on the analysis of

    synonyms. Further work should define more precise similarity metrics, and should integrate these

    tools with classical anti-plagiarism systems, which check against similar text fragments available on

    the Internet.

    References

    [1] Ian Sommerville, (2006). Software Engineering, 8th Edition, Pearson Education.

    [2] S. Mishra, (2006). Quality Assurance in Higher Education. An Introduction, National Printing Press.

    [3] S. Zaharia, M. Mocanu, Al. Enescu, (2011). National Register Of Qualifications In Higher Education

    (Rncis) The Facebook Of Romanian Universities, European Journal of Qualifications, No. 5, August,

    2011, pp. 45-53.

    [4] E. Huertas, A. Prades and S. Rodríguez, (2009). How to assess an e-learning institution: Methodology,

    design and implementation, In J. Grifoll et al. Eds., Quality Assurance of E-learning, ENQA Workshop

    Report, European Association for Quality Assurance in Higher Education 2009, Helsinki, pp. 11-17.

    [5] N.W. Heap, I. Martin, J.P. Williams, (2006). Issues of quality assurance in the management of plagiarism in

    blended learning environments, In: EADTU (European Association of Distance Teaching Universities)

    Annual Conference 2006, 23-24 Nov 2006, Tallinn, Estonia.

    [6] J. Carroll and J. Appleton, (2001). Plagiarism. A Good Practice Guide, Joint Information Systems

    Committee Report, Oxford Brookes University, May 2001.

    [7] A. D. Ionita, M. Florea, (2010). Applying View Models in SOA: A Case Study, Lecture Notes in Electrical

    Engineering, 1, Volume 60, Electronic Engineering and Computing Technology, Springer, pp. 483-493

    [8] OMG Object Management Group (2005). UML 2.0 Superstructure Specification,

    http://www.omg.org.

    [9] A.D. Ionita, A. Olteanu, T. Ionescu, L. Dobrica, (2011). Automatic Transformations for Integrating

    Instrument Models across Technological Spaces, Romanian Journal of Information Science and

    Technology, Volume 14, Number 1, ISSN: 1453-8245, The Publishing House of the Romanian Academy,

    pp. 51-66.

    [10] StarUML Official Web Page: http://staruml.sourceforge.net/en/, last accessed on 04.03.2013.

    [11] Elliotte Rusty Harold, W. Scott Means, (2001). XML in a Nutshell, O'Reilly.

    [12] Antano Solar John, (2009). Catalyst 5.8 - The Perl MVC Framework, Packt Publishing.

    [13] Kieren Diment, Matt S. Trout, (2009). The Definitive Guide to Catalyst: Writing Extensible, Scalable, and

    Maintainable Perl-Based Web Applications. Apress Media, 2009.
