The 9th International Scientific Conference eLearning and software for Education
Bucharest, April 25-26, 2013 10.12753/2066-026X-13-063
AUTOMATED UML MODEL COMPARISON FOR QUALITY ASSURANCE IN
SOFTWARE ENGINEERING EDUCATION
Anca Daniela IONITA, Alexandra CERNIAN, Stefan FLOREA
Faculty of Automation Control and Computer Science, University POLITEHNICA of Bucharest, Spl. Independentei 313,
060042, Bucharest, Romania
Anca.Ionita@aii.pub.ro, [email protected]
Abstract: Quality assurance in higher education includes methods and tools for preventing students
from copying from each other. Apart from general systems for detecting plagiarism, each discipline
may require specific electronic aids for identifying similarities between assessment results. The
paper considers the case of Software Engineering and, more specifically, Object-Oriented Modeling. It
proposes solutions for the automatic comparison of models conforming to the Unified Modeling
Language. The presented case study was based on developing tools for: identifying modeling elements
from standard representations of models; assessing the degree of similarity; eliminating false positive
results; checking for synonyms within a Romanian language dictionary.
Keywords: Unified Modeling Language, computer aided education, quality assurance, model
comparison
I. INTRODUCTION
Quality management generally consists of three important activities: assurance, planning and
control [1]. Quality assurance deals with the creation of an appropriate framework for managing
quality, including the definition of standards, procedures, criteria and tools [2]. To integrate into the
European network of higher education, an institution must gain recognition and implement
mechanisms for assuring the quality of the curriculum, of student and teacher evaluation, of learning
content, and of academic research. To allow mobility between institutions, it is important
to have well-defined qualifications, which can be mapped to assess the equivalence
between educational programs. Romania has a national registry of qualifications [3], which describes
their competences, learning results and the specific study offers. This is a high-level approach that
needs to be supported by processes guaranteeing that those competences are really attained.
A big challenge comes with the current trend to mix traditional learning methods with
e-learning. This trend is due to the flexibility offered by up-to-date Learning Management Systems, but
also to a shift in the younger generation's values and habits, which are highly oriented towards
communication over the Internet. An important criterion in evaluating e-learning degree programmes concerns the
assessment system [4]. Quality assurance in educational institutions should include mechanisms for
detecting any attempt by students to cheat in their written assessments. Due to an increasing number
of incidents, it is necessary to define holistic approaches at institutional level, where management,
culture and infrastructure should be taken into account [5]. For this purpose, universities generally use
tools for detecting plagiarism, based on comparing text and grammatical style, as well as searching for
similar materials available on the Internet. Apart from that, Carroll and Appleton, from Oxford Brookes
University, state that discipline-specific aids are also necessary [6].
Object-oriented modeling is included in the curriculum of all the computer science faculty
programs; it is important for preparing students to analyze, design and implement software in a
rigorous way. Generally, the subjects that address this concern include the study of a standard
semi-formal language, called Unified Modeling Language (UML), which is used not only in academic
environments, but also in industry. As part of their practical work, students have to develop projects
that use all kinds of UML diagrams for modeling complex software systems. The problem is to be
able to check the originality of the proposed models, taking into account that project themes recur
and that there are many ways to share them. Therefore, it is necessary to
check diagrams for plagiarism automatically, in a similar way to how one verifies plain text. For this
purpose, one needs to:
- Define methods for comparing all types of diagrams;
- Develop a system that stores students' models;
- Create tools that check a new project against all the previous submissions, in order to detect similarities;
- Introduce the facility to detect synonyms at the level of modeling element names;
- Define metrics for the similarity degree;
- Validate the methods and tools in the classroom.
Section II presents the theoretical background regarding object-oriented modeling, necessary
for understanding why and how we built our tools. Section III describes the system for comparing
UML models and for estimating the similarity degree, from a software engineering point of view.
Section IV presents a case study inspired by the evaluation of two students' class diagrams, prepared
for their homework.
II. THEORETICAL BACKGROUND
Models are extensively used in engineering for descriptive purposes. In the case of software,
models are essential for many stakeholders, such as clients, project managers, system integrators, software
architects and analysts. They all need specific representations of the system, hence the existence of a
wide variety of diagrams, each representing a specific point of view and including the abstractions needed
to simplify the system and make it understandable to a certain stakeholder [7].
The Unified Modeling Language [8] is a standard adopted by the Object Management Group
(OMG) for representing real or conceptual systems, i.e. their structure, behavior, implementation and
interaction with the outer world. The language supports the definition of multiple models, such as:
- use case diagrams showing the actors interacting with the system;
- class diagrams composed of classes characterized by attributes and operations, as well as relationships between them;
- object diagrams showing objects and links used in particular scenarios or test cases;
- statechart diagrams representing the states of dynamic classes and the transitions between them, triggered by external events;
- activity diagrams generally presenting processes, with the possibility to assign the responsibilities of activities towards particular classes of business entities;
- sequence diagrams illustrating the sequencing of messages between objects, within certain interactions;
- communication diagrams having a structural insight over an interaction, and outlining which objects exchange messages between them;
- component diagrams describing the system architecture, its decomposition into parts and their connecting modes;
- deployment diagrams visualizing the way a distributed system is assigned to different processing nodes and the communication lines between them.
Each diagram is represented separately, within its own drawing area, and has specific
notations. However, there are multiple correlations between them; for instance, objects that appear in
sequence diagrams have to be instances of existing classes, and their interaction may also be illustrated
with communication diagrams. A valid and consistent model takes these issues into account, and also
adds constraints that cannot be expressed in a graphical way, but need a specific functional language
called OCL (Object Constraint Language).
A modeling attempt is relative to its specific purpose; there is no absolute and generally
valid model for a certain system; it always depends on the way the model is going to be used, and on
the vision of its author.
Besides their descriptive power, models have also been used for executive purposes. The
capability to automatically extract information from them has proven important for multiple
goals: rapidly configuring new applications, discovering software services at run-time, generating
code, performing automated deployment for the maintenance of widely distributed systems, and expressing
processes in an explicit way. This large spectrum of applications stands under the umbrella of Model
Driven Engineering (MDE), concerned with model interpretation, transformation and composition [9].
III. SYSTEM FOR MODEL STORAGE AND COMPARISON
The system presented here is intended to store the projects realized for the Software
Engineering subject, at the Faculty of Automatic Control and Computers, University POLITEHNICA
of Bucharest. The actors are:
- Students, who submit their UML diagrams and obtain a preliminary verification of their originality degree, and
- Teachers, who access them and perform supplementary verifications and statistics.
The UML diagrams are expected to be created with the StarUML open source platform
[10]; they are then exported to XML format [11], so that they can be interpreted and compared
with the models in the central repository.
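To illustrate the first step, identifying modeling elements in an exported file, the following minimal Python sketch reads a simplified, XMI-like document. The tag and attribute names (Class, Attribute, Operation, name) and the sample content are assumptions made for illustration only; a real StarUML export is namespaced and structured differently, and the paper does not describe the parser actually used.

```python
import xml.etree.ElementTree as ET

# Hypothetical, simplified export; real StarUML files are more verbose.
SAMPLE_XML = """
<Model name="HR">
  <Class name="Angajat">
    <Attribute name="nume"/>
    <Attribute name="salariu"/>
    <Operation name="calculeazaSalariu"/>
  </Class>
  <Class name="Departament">
    <Attribute name="denumire"/>
  </Class>
</Model>
"""

def extract_elements(xml_text):
    """Return {class name: {"attributes": [...], "operations": [...]}}."""
    root = ET.fromstring(xml_text)
    classes = {}
    # iter() walks the whole tree, so nested classes would be found as well.
    for cls in root.iter("Class"):
        classes[cls.get("name")] = {
            "attributes": [a.get("name") for a in cls.findall("Attribute")],
            "operations": [o.get("name") for o in cls.findall("Operation")],
        }
    return classes

model = extract_elements(SAMPLE_XML)
```

Reducing each diagram to plain lists of names in this way makes the later comparison independent of layout details such as element positions in the drawing area.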
An important aspect in detecting plagiarism in UML diagrams is semantics. For example, two
students can share a diagram, but use synonyms for describing elements such as class name, attributes,
operations, actors, parameters, and so on. Therefore, overlooking this semantics issue could lead to
inaccurate originality scores. The application presented in this paper incorporates a synonyms
dictionary for the Romanian language.
The plagiarism detection algorithm combines detection methods based on tokens
with similarity metrics, and stores in the database the significant results obtained by comparing each pair
of files. In addition, a semantics module has been integrated, based on the Romanian language
dictionary of synonyms available on the http://www.dexonline.ro website; it supports
comparing words and detecting synonymy in the names of entities in UML diagrams. These
components are integrated into a Web application whose primary role is to provide different types
of plagiarism reports for teachers and to support them in finding similarities between assignments
and projects uploaded by students.
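The synonym-aware comparison of element names can be sketched as follows. This is only an illustration under stated assumptions: the synonym groups are a tiny hand-written stand-in for the dictionary data, and the helper names (canonical, same_name) are hypothetical rather than taken from the application.

```python
# Hypothetical synonym groups; the application described here relies on a
# full Romanian synonym dictionary instead of a hard-coded list.
SYNONYM_GROUPS = [
    {"angajat", "salariat"},
    {"salariu", "leafa"},
]

def canonical(word):
    """Map a word to a fixed representative of its synonym group, or to itself."""
    w = word.lower()
    for group in SYNONYM_GROUPS:
        if w in group:
            return min(group)  # alphabetically first member, so the choice is stable
    return w

def same_name(a, b):
    """True if two element names are equal or synonymous, compared word by word."""
    words_a = [canonical(w) for w in a.split()]
    words_b = [canonical(w) for w in b.split()]
    return words_a == words_b
```

With these assumptions, same_name("Angajat Temporar", "Salariat Temporar") holds, while same_name("Angajat", "Departament") does not. Normalizing names to a representative before comparison keeps the rest of the algorithm unchanged, since it can then treat synonyms as identical tokens.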
The central idea of the application is the combined use of multiple techniques to detect
plagiarism, adapted beforehand to work with diagrams stored as XML. The proposed detection
method uses both techniques based on comparing dictionary words and classic techniques for software
clone detection based on metrics and tokens. In token-based detection, the entire system or source
is parsed into a sequence of tokens, which is then scanned for duplicated subsequences of tokens;
these are mapped back to the corresponding portions of the original code. Unlike the text comparison approach, the technique based on tokens
is generally more robust to superficial changes in the code (such as formatting and spacing).
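The robustness of token-based comparison to formatting changes can be seen in a small Python sketch. It is not the algorithm of the application: the tokenizer is a simple regular expression, and the search for a shared run of tokens uses the standard difflib module as a stand-in for a dedicated clone detector.

```python
import re
from difflib import SequenceMatcher

def tokenize(text):
    """Split text into word tokens and single punctuation tokens, dropping whitespace."""
    return re.findall(r"\w+|[^\w\s]", text)

def longest_shared_run(src_a, src_b):
    """Return the longest contiguous token sequence that occurs in both sources."""
    a, b = tokenize(src_a), tokenize(src_b)
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    m = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[m.a:m.a + m.size]

# Same code, different spacing and line breaks.
first = "int total(int a,int b){return a+b;}"
second = "int   total( int a, int b )\n{\n  return a + b ;\n}"
```

Here tokenize(first) and tokenize(second) yield the same token list, so the two fragments are reported as fully duplicated, whereas a character-by-character text comparison would see many differences.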
The architecture of the application is based on one of the currently recommended Web
programming models, namely the MVC concept (Model View Controller) [12], which separates the
application workflow (Controller) from information processing (Model) and from the user interface
(View). The chosen programming language (Perl) and the programming model led to the use of the
Catalyst framework [13] to support the development of the application.
In addition to a semantically enriched algorithm for detecting plagiarism within UML
diagrams, the application provides features and functionalities specific to e-learning tools. It allows
free registration for students, facilitates the upload of assignments and projects for the Software
Engineering course, and offers a range of tools for managing teaching activities.
The user interface provides teachers the opportunity to study the similarity scores between
UML model assignments uploaded by students. For each comparison performed, the results are
graphically displayed to allow rapid assessment of the situation. The graphical interface is
complemented by comprehensive reports that allow storing of statistics based on various criteria.
IV. CASE STUDY SCENARIO AND EXPERIMENTAL VALIDATION
The previous section presented an overview of the proposed application for automated UML model comparison and
plagiarism detection. This section presents a case study scenario that illustrates
the robustness of the application in computing similarity scores for UML diagrams, while also taking
into account synonymy aspects of the Romanian language.
Let us consider the following two UML class diagrams, depicted in Figure 1.
Figure 1. Case study UML class diagrams
Each of the two diagrams contains three classes, one generalisation relationship and one association
relationship, illustrating a class representing an employee (Angajat and Salariat classes) with a
subclass for a temporary employee (Angajat Temporar and Salariat Temporar classes), and a class for
the department to which the employee belongs. In Romanian, Angajat and Salariat are synonyms.
Moreover, the names of several attributes and methods make use of synonyms. A human expert can
easily detect the similarity between the two diagrams based on his or her natural language knowledge.
However, for a software application, these aspects are not so obvious without a thorough and robust
semantics-based comparison algorithm.
Figure 2 depicts the comparison report generated by the application for the two diagrams
above. The computed similarity scores are the following: Operations (Methods): 38%, Associations:
60%, Elements: 46%, Generalizations: 80%, Attributes: 80%. A more detailed analysis of the
similarity scores is presented in Figure 3, assessing the synonymy based approach.
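The paper does not give the formula behind the per-category percentages. As one plausible illustration, the sketch below scores each category with the Dice coefficient over sets of element names; the metric, the function names and the sample data are all assumptions, and the numbers it produces are unrelated to those reported in Figure 2.

```python
def dice_percent(names_a, names_b):
    """Dice coefficient between two collections of names, as a percentage."""
    a, b = set(names_a), set(names_b)
    if not a and not b:
        return 100.0  # two empty categories are treated as identical
    return 100.0 * 2 * len(a & b) / (len(a) + len(b))

def compare(model_a, model_b):
    """Score every category that appears in at least one of the two models."""
    categories = sorted(set(model_a) | set(model_b))
    return {c: dice_percent(model_a.get(c, []), model_b.get(c, []))
            for c in categories}

# Invented data for illustration; not the diagrams of Figure 1.
diagram_a = {
    "attributes": ["nume", "varsta", "salariu"],
    "operations": ["angajeaza", "concediaza"],
}
diagram_b = {
    "attributes": ["nume", "salariu", "adresa"],
    "operations": ["angajeaza", "concediaza", "promoveaza"],
}
report = compare(diagram_a, diagram_b)
```

Reporting one score per category, rather than a single overall figure, lets a teacher see at a glance whether two submissions share mainly structure (associations, generalizations) or mainly naming (attributes, operations).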
The experimental results played an important role in tuning the
parameters used by the algorithm, so that it produces relevant
results. They also showed that the proposed algorithm can be successfully applied
to verifying students' assignments for the Software Engineering course. The UML model
comparison and plagiarism detection application is able to compute similarity scores between UML
diagrams, which makes it a trustworthy tool for quality assurance in software engineering
education.
Figure 2. Automated UML Model Comparison Report
Figure 3. Detailed similarity report for 2 UML diagrams
V. CONCLUSIONS
Verifying the authenticity of students' work has become an important issue, and the large
amount of information to be analyzed requires the development of electronic tools for the automated
detection of plagiarism. The paper presented an application for checking UML models, based on
correspondences between the modeling elements characteristic of each diagram, and on the analysis of
synonyms. Further work should define more precise similarity metrics, and should integrate these
tools with classical anti-plagiarism systems, which check against similar text fragments available on
the Internet.
References
[1] I. Sommerville, (2006). Software Engineering, 8th Edition, Pearson Education.
[2] S. Mishra, (2006). Quality Assurance in Higher Education. An Introduction, National Printing Press.
[3] S. Zaharia, M. Mocanu, Al. Enescu, (2011). National Register of Qualifications in Higher Education
(RNCIS) The Facebook of Romanian Universities, European Journal of Qualifications, No. 5, August
2011, pp. 45-53.
[4] E. Huertas, A. Prades and S. Rodríguez, (2009). How to assess an e-learning institution: Methodology,
design and implementation, In J. Grifoll et al. Eds., Quality Assurance of E-learning, ENQA Workshop
Report, European Association for Quality Assurance in Higher Education, Helsinki, pp. 11-17.
[5] N.W. Heap, I. Martin, J.P. Williams, (2006). Issues of quality assurance in the management of plagiarism in
blended learning environments, In: EADTU (European Association of Distance Teaching Universities)
Annual Conference 2006, 23-24 Nov 2006, Tallinn, Estonia.
[6] J. Carroll and J. Appleton, (2001). Plagiarism. A Good Practice Guide, Joint Information Systems
Committee Report, Oxford Brookes University, May 2001.
[7] A. D. Ionita, M. Florea, (2010). Applying View Models in SOA: A Case Study, Lecture Notes in Electrical
Engineering, 1, Volume 60, Electronic Engineering and Computing Technology, Springer, pp. 483-493.
[8] OMG Object Management Group, (2005). UML 2.0 Superstructure Specification,
http://www.omg.org.
[9] A.D. Ionita, A. Olteanu, T. Ionescu, L. Dobrica, (2011). Automatic Transformations for Integrating
Instrument Models across Technological Spaces, Romanian Journal of Information Science and
Technology, Volume 14, Number 1, ISSN: 1453-8245, The Publishing House of the Romanian Academy,
pp. 51-66.
[10] StarUML Official Web Page: http://staruml.sourceforge.net/en/, last accessed on 04.03.2013.
[11] E. R. Harold, W. S. Means, (2001). XML in a Nutshell, O'Reilly.
[12] A. Solar John, (2009). Catalyst 5.8 - The Perl MVC Framework, Packt Publishing.
[13] K. Diment, M. S. Trout, (2009). The Definitive Guide to Catalyst: Writing Extensible, Scalable, and
Maintainable Perl-Based Web Applications, Apress Media.