
    ETHIOPIAN INFORMATION COMMUNICATION TECHNOLOGY DEVELOPMENT AGENCY

    (EICTDA)

    Information Exchange Standard: Final Document

    By

    Standardization and Regulatory Project

    October 2007 Addis Ababa

    Contents

    1 Introduction
    2 Objective
    3 Scope
    4 Methodology
    5 Benefits of the Project
    6 Information Exchange Standard World Experience
    7 Information Exchange Practices in Ethiopia
    8 Partner Organizations
    9 Information Exchange Standard Components
    9.1 Data and Metadata Standard
    9.1.1 Extensible Markup Language (XML)
    9.1.2 XML schema
    9.1.3 Resource description framework (RDF)
    9.1.4 The metadata element
    9.1.4.1 ETGLS metadata elements
    9.2 Metadata Registries (MDR)
    9.2.1 Fundamental model of data elements
    9.2.2 Data elements in data management and interchange
    9.2.3 Fundamental model of value domains
    9.2.4 Fundamentals of classification schemes
    9.2.5 Attributes of a classification scheme
    9.2.6 Structure of a metadata registry
    9.2.6.1 Metamodel for a metadata registry
    9.2.6.2 Application of the metamodel
    9.2.6.3 Specification of the metamodel
    9.2.6.4 Types, instances and values
    9.2.6.5 Date references
    9.2.6.6 Data definition requirements and recommendations
    9.2.6.6.1 Requirements
    9.2.6.6.2 Recommendations
    9.2.6.6.3 Provisions
    9.2.6.6.3.1 Premises
    9.2.6.6.3.2 Requirements
    9.2.6.6.3.3 Recommendations
    9.3 Interoperability Standards
    9.3.1 Background Information
    9.3.2 Layer Models
    9.3.2.1 Network Layer
    9.3.2.1.1 Network Protocols
    9.3.2.1.2 Transmission Control Protocol (TCP)
    9.3.2.1.3 User Data Protocol (UDP)
    9.3.2.1.4 File Transfer Protocols (FTP)
    9.3.2.1.5 Mail Transfer Protocol (MTP)
    9.3.2.1.6 Registry Protocols
    9.3.2.1.7 Directory Protocols
    9.3.2.1.8 Messaging Protocols
    9.3.2.2 Data Integration Layer
    9.3.2.2.1 Primary Character Set
    9.3.2.2.2 Universal Character Code (UTC)
    9.3.2.2.3 Structured Web Document Language
    9.3.2.2.3.1 Hyper Text Interchange Format (HTML v4.01)
    9.3.2.2.3.2 Hypertext Interchange Formats XML v1.0
    9.3.2.3 Web Services Layer
    9.3.2.3.1 Universal Description, Discovery, and Integration (UDDI v3)
    9.3.2.3.2 Web Services Description Language (WSDL)
    9.3.2.3.3 Simple Object Access Protocol (SOAP)
    9.3.2.4 Access and Presentation Layer
    9.3.2.4.1 Presentations
    9.3.2.4.2 Data Modeling
    9.3.2.5 Security Layer
    9.3.2.5.1 Internet Protocol Security Protocol (IPSEC)
    9.3.2.5.2 Transport Layer Security (TLS)
    Use of Open Standards
    Updating the Document
    Bibliography

    Abbreviations

    AES      Advanced Encryption Standard
    AGLS     Australian Government Locator Service
    ASCII    American Standard Code for Information Interchange
    DES      Data Encryption Standard
    DNS      Domain Name System
    DSA      Digital Signature Algorithm
    DTD      Document Type Definition
    ETGLS    Ethiopian Government Locator Service
    FTP      File Transfer Protocol
    HTTP     Hyper Text Transfer Protocol
    IEC      International Electrotechnical Commission
    IP       Internet Protocol
    IPSEC    Internet Protocol Security Protocol
    IP v4    Internet Protocol version 4
    IP v6    Internet Protocol version 6
    ISO      International Organization for Standardization
    MDR      Metadata Registries
    MIME     Multipurpose Internet Mail Extensions
    NZGLS    New Zealand Government Locator Service
    PDF      Portable Document Format
    RDF      Resource Description Framework
    RFC      Request for Comments
    SMTP     Simple Mail Transfer Protocol
    S-MIME   Secure MIME
    SOAP     Simple Object Access Protocol
    TCP      Transmission Control Protocol
    TLS      Transport Layer Security
    UDDI     Universal Description Discovery and Integration
    WSDL     Web Services Description Language
    W3C      World Wide Web Consortium
    XML      Extensible Markup Language


    1. INTRODUCTION

    Information is a resource that activates various sectors of the economy, making it

    possible for producers and consumers to be linked to markets. Availability of information

    allows for the public to participate meaningfully in governance, through engaging in

    public discussions and contributing to decision-making.

    If national development programmes such as poverty eradication, decentralization and the Plan for Modernization of Agriculture (PMA), and above all the plan to leap-frog to the information society, are to succeed, information has to be made available at all levels of society, from the national level through districts and sub-counties down to the grass roots. Only open communication channels that allow information to flow in all directions make it possible to identify and fulfil the information needs of the various interest groups.

    Interoperability, the ability of government organizations to share and exchange information and to integrate information and business processes by use of common standards, is critical to achieving e-government goals. It enables any agency to connect electronically with others using known and agreed approaches.

    It is known that organizations cannot operate in isolation. They are dependent on each

    other and resource sharing is a must for survival. However, this co-operation needs some

    sort of conformity or standard, i.e., the success of information exchange, whether it is

    practiced nationally, regionally or internationally, depends on the availability of

    commonly agreed standards.

    It is with this understanding that the Ethiopian government has given high priority to the preparation of an information exchange standard that will bind government agencies together. The country's Draft ICT Policy states this clearly, noting the need for standardized data collection, processing and data exchange procedures.

    Electronic document exchange, which is one form of information exchange, provides companies with the ability to:

    - connect once to a central hub, allowing quick and easy integration with multiple trading partners [1];
    - focus on building their business by connecting quickly and cost-effectively with their trading partners;
    - extend the return on investment of their existing systems;
    - drive down the operating and maintenance costs of multiple trading partner integration; and
    - use data from external sources once it has been transformed into formats that internal applications can understand.

    2. OBJECTIVE

    To formulate/develop an information exchange standard that government organizations will use to exchange information.

    3. SCOPE

    The scope of the project is limited to the preparation of an information exchange standard that will bind government organizations and agencies when they exchange information; that is, a standard for creating and managing information resources and services that are locatable via the Internet.

    4. METHODOLOGY

    A survey was made to assess the existing information exchange situation in Ethiopia. Literature on the subject was reviewed, drawing mainly on Internet sources and ISO/IEC standards. A questionnaire was used to gather primary data on information exchange and transfer among selected focal government organizations. Local and foreign experts in the area were also consulted and interviewed in the course of the process.

    [1] Agencies that share or exchange information among themselves are said to be trading partners.

    5. BENEFITS OF THE PROJECT

    This project has at least the following benefits:

    - government organizations avoid the problem of rebuilding or significantly altering their systems in order to share information;
    - information exchange modules across government organizations will have a common interface;
    - information exchange module development becomes faster and more efficient as more organizations engage in it;
    - software developed to communicate with the interface of one organization can be reused for the next;
    - information exchange between ICT applications and users will be developed; and
    - different enterprises can share and exchange information irrespective of the particular technology in use.

    6. INFORMATION EXCHANGE STANDARD WORLD EXPERIENCE

    Standards enable different information systems to share and exchange information irrespective of the particular technologies in use. Moreover, by creating and adopting information exchange standards, local, state, tribal, and federal organizations avoid the problem of rebuilding or significantly altering their systems to share information.

    One good example of an information exchange standard is the National Information Exchange Model (NIEM), a partnership of the U.S. Department of Justice (DOJ) and the U.S. Department of Homeland Security (DHS) (www.NIEM.org). It is designed to develop, disseminate, and support enterprise-wide Information Exchange Package Documentation (IEPD) and processes that will enable jurisdictions and agencies throughout the nation to effectively share critical information in both emergency and routine situations.

    NIEM focuses on discrete information exchanges between agency information systems.

    NIEM will provide the information sharing structure necessary for first responders and

    decision makers to have the right information to prepare for, prevent, and respond to

    major terrorist events and natural disasters.

    It will also enhance the day-to-day capabilities of practitioners at all levels of government in making crucial decisions about border enforcement, passenger screening, port security, intelligence analysis, local law enforcement and judicial processing, correctional supervision and release, and a variety of other governmental functions.

    NIEM's primary objectives are to:

    - bring stakeholders together to identify information sharing requirements for operational and emergency situations;
    - maintain a national model containing universal, common, and domain-specific data components that pertain to agency information needs, in order to facilitate development of an IEPD;
    - develop standards, a common vocabulary, and an online repository of IEPDs to support information sharing;
    - provide technical tools to support development, discovery, dissemination, and reuse of IEPDs; and
    - provide training, technical assistance, and implementation support services, as appropriate.

    Developing and implementing NIEM exchange standards means that the major investments local, state, tribal, and federal governments have made in existing information systems can be leveraged, and that these governments can efficiently participate in a truly national information sharing environment. NIEM standards enable different information systems to share and exchange information, irrespective of the particular technologies in use. Moreover, by creating and adopting NIEM standards, local, state, tribal, and federal organizations avoid the problem of rebuilding or significantly altering their systems to share information.

    The New Zealand e-Government initiative and the Australian experience can also be cited as having comprehensive information exchange and sharing standards and guidelines (http://www.e.govt.nz and http://www.agls.gov.au) which can be adapted to our situation.

    7. INFORMATION EXCHANGE PRACTICES IN ETHIOPIA

    Information exchange in Ethiopia can be seen in three eras: the traditional era, before the Menilik II regime; the semi-modern era, after Menilik II; and the modern era, which mainly follows the introduction of Internet technology in the country.

    During the traditional era, information was transmitted and exchanged mainly among people and from monasteries to their followers. Mosques also had a share in this traditional dissemination of information.

    During this period, methods such as stories, Nagarits and messengers were used to pass cultural histories and heritage from generation to generation, and to report that something urgent had happened at a specific place. Such methods are time-consuming and prone to error, and they significantly delay decision making, which in turn contributed to the underdevelopment of the society.

    The second era, the semi-modern period, saw the arrival of other methods of information dissemination such as the telegram, postal mail and fax. These technologies brought rapid change in information dissemination and exchange, at least among urban residents and sometimes with people in the diaspora. Their coverage was so limited, however, that the majority of citizens did not benefit from them. Postal mail was somewhat more widespread than the other methods during this period.

    The third period began with the introduction of computer technology, and specifically the arrival of the Internet in the country, which revolutionized the way people communicate and exchange at least personal messages, if not corporate data. This is facilitated by the sole Internet service provider in the country, the Ethiopian Telecommunications Corporation. The expansion of fixed telephone coverage, and the introduction of mobile and wireless telephony and its penetration into rural villages, have totally changed the way we communicate and exchange information. Coverage is still so low, however, that the majority of the population does not have access to such technology.

    Questionnaires were distributed to different government organizations in order to assess the existing information exchange situation. Among the specific objectives of the questionnaires were to:

    - determine the level and status of information exchange in different organizations;
    - identify the problems encountered by users due to the lack of an information exchange standard; and
    - identify what information exchange methods are used, etc.

    Questionnaires were distributed to forty (40) organizations, of which 27 (67.5%) responded. This is sufficient to analyze the data and draw conclusions about the information exchange situation in government organizations in Ethiopia.

    Communication Methods Used

    Organizations were asked what communication methods they employ to exchange data within and outside their organizations. According to their responses, 81.48% use office boys and the same share use LAN/WAN, 85.18% use postal mail and the same share use fax, while 88.88% make use of the telephone. The telephone is therefore the most widely used means of communication both within and outside the organizations. The result is shown in the following table.

    Communication Method    Number    Percentage
    Office boy              22        81.48
    LAN/WAN                 22        81.48
    Postal mail             23        85.18
    Telephone               24        88.88
    Fax                     23        85.18
    Others                  0         0
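    The percentages in the table can be re-derived from the counts, taking the 27 responding organizations as the base. The following is a minimal sketch rather than part of the survey itself; the function name `share` is illustrative, and the last digit may differ from the table depending on whether values are rounded or truncated. Because an organization could report several methods, the counts add up to more than 27.

```python
# Counts as reported in the survey table; 27 organizations responded.
RESPONDENTS = 27

counts = {
    "Office boy": 22,
    "LAN/WAN": 22,
    "Postal mail": 23,
    "Telephone": 24,
    "Fax": 23,
}


def share(count: int, total: int = RESPONDENTS) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


for method, count in counts.items():
    # Print one aligned row per communication method.
    print(f"{method:<12} {count:>3} {share(count):6.2f}")
```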

    Organizations that use a LAN to communicate with other similar organizations were asked what information exchange technologies they use. A small number of respondents mentioned dial-up Internet, digital data communication technologies such as DDN with a frame relay configuration, broadband wireless and ADSL, while the majority did not answer the question.

    Existence of Rules, Regulations and Procedures for Data Exchange

    Organizations were asked whether they have rules, regulations or standards, defined by their own organization, that they are obliged to follow when exchanging information with other organizations. The majority (63%) replied that they do not have any such rules, while 37% do have rules concerning personnel data, financial information exchange and other means of controlling access to sensitive information and resources of the organization. The result is shown in the following table:

    Existence of Rules and Regulations    Number    Percentage
    Yes                                   10        37
    No                                    17        63
    Total                                 27        100

    Use of the Internet

    The majority of the organizations (70.37%) use the Internet heavily in their day-to-day activities, while 29.63% use it but not much. None of the respondent organizations is without an Internet connection. This should ease the plan to interconnect organizations so that they can exchange information.

    Level of Internet Usage    Number    Percentage
    Heavily                    19        70.37
    Not much                   8         29.63
    None                       0         0
    Total                      27        100

    E-mail Usage

    Almost all the respondent organizations use e-mail to communicate within and outside their organization, for purposes that vary with the area of work in which the organization is engaged.

    Link with other organizations

    Organizations were asked with which types of organizations they communicate. They communicate and exchange information mainly with organizations to which they are linked directly or indirectly: organizations doing similar work, regional bureaus, higher learning institutions, their own branch organizations, government media, agencies, international organizations, etc.

    Software Usage

    Most of the organizations use Microsoft Word when exchanging documents through e-mail. Adobe Acrobat, HTML and other packages such as Adobe PageMaker, Microsoft Excel and Microsoft Outlook are also used for some purposes.

    Type of Software    Number    Percentage
    Microsoft Word      23        85.18
    Adobe Acrobat       15        55.56
    HTML                12        44.44
    Others              4         14.81

    Graphic and Other Types of File Formats

    Organizations use different file formats for exchanging graphic files. The majority (77.78% of the respondents) use the JPEG file format. About half also use the BMP and GIF file formats. Other formats such as TIFF are also used.

    File Format    Number    Percentage
    JPEG           21        77.78
    BMP            15        55.56
    GIF            13        48.15
    Others         1         3.7

    For accounting and database file exchange, MS-SQL, Oracle, ADA, Peachtree Accounting and MS-Excel are the software packages most widely used by the respondent organizations; other database management software is used depending on their specifications. This shows that software products are varied and complicated, which makes it difficult to share and exchange information and other necessary documents within and outside the organization.

    Organizational Website

    The majority of the respondents (88.89%) have their own website; only 11.11% do not.

    Data Storage Mechanism

    Organizations were asked about their data storage mechanisms. The majority of the respondents (68.4%) keep their organizational data using both semi-computerized and computerized systems.

    Method of Data Storage    Number    Percentage
    Manual                    12        63%
    Semi-computerized         13        68%
    Computerized              13        68%

    This shows a strong trend towards computer-based systems for storing organizational data. These systems only need to be consistently defined and designed to become interoperable and hence able to exchange information with other organizations.

    Data Security (Protection)

    92.5% of the respondents said that they use different mechanisms to protect and secure sensitive information and other organizational resources. Physical security, passwords, access privileges, firewalls, backup systems and antivirus software are the mechanisms most commonly mentioned by the respondents.

    Request for Information Exchange Standard

    A very high number of respondents said that standardization of information exchange is

    essential to simplify electronic information exchange and requested the development of

    the standard as urgently as possible.

    8. PARTNER ORGANIZATIONS

    Organizations that are to share or exchange information among themselves are partners. In the Ethiopian context, these are mainly the selected government organizations for which there is a project to develop content and applications. They include, among others, the Central Statistics Authority, Ministry of Agriculture, Ministry of Education, Ministry of Trade and Industry, Ministry of Finance and Economic Development, National Bank of Ethiopia, National Library & Archives, Federal Inland Revenue, Ministry of Health, Customs Authority and Federal Civil Service Commission. Other government and non-government organizations can join the partners and use the information exchange standard as a guide.

    9. INFORMATION EXCHANGE STANDARD COMPONENTS

    The information exchange standards suggested in this document comprise components such as data and metadata standards, interoperability standards, security standards, and other standards.

    9.1 DATA AND METADATA STANDARD

    9.1.1 EXTENSIBLE MARKUP LANGUAGE (XML)

    The Extensible Markup Language (XML) standard describes the basic format for data transport. XML documents are both platform and hardware independent and can be used as "glue" holding relatively independent network applications together. XML is a subset of SGML (Standard Generalized Markup Language). It is a markup metalanguage, designed to simplify data interchange on the Web.

    XML (in contrast to HTML) is not really a markup language, but a metalanguage for creating one's own markup languages (data dictionaries). Thus there is no fixed set of XML tags; instead, XML enables the creation of sets of arbitrary tags with arbitrary semantics.

    The primary goal of XML is strict separation of data from data processing. The XML

    document carries only pure data, without information about data processing (or

    presentation).
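    As an illustration of arbitrary tags carrying pure data, the following minimal sketch builds and re-parses a small XML document with Python's standard library. The tag names (`statisticalReport`, `agency`, `year`) are invented for this example and are not defined by this standard.

```python
import xml.etree.ElementTree as ET

# Build a document with tags of our own choosing (hypothetical vocabulary).
report = ET.Element("statisticalReport")
ET.SubElement(report, "agency").text = "Central Statistics Authority"
ET.SubElement(report, "year").text = "2007"

# Serialize to text: this is what would travel between organizations.
xml_text = ET.tostring(report, encoding="unicode")

# A receiving application parses the same text and reads the data back.
# Nothing in the document says how the data should be processed or displayed.
parsed = ET.fromstring(xml_text)
agency = parsed.findtext("agency")
year = int(parsed.findtext("year"))
```

    The sender and the receiver need to agree only on the vocabulary, not on the platform or the application that handles it.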

    9.1.2 XML SCHEMA

    The purpose of an XML Schema: Structures schema is to define and describe a class of XML documents by using schema components to constrain and document the meaning, usage and relationships of their constituent parts: datatypes, elements and their content, and attributes and their values. Schemas may also provide for the specification of additional document information, such as normalization and defaulting of attribute and element values. Schemas have facilities for self-documentation. Thus, XML Schema: Structures can be used to define, describe and catalogue XML vocabularies for classes of XML documents.

    Any application that consumes well-formed XML can use the XML Schema: Structures

    formalism to express syntactic, structural and value constraints applicable to its document

    instances. The XML Schema: Structures formalism allows a useful level of constraint

    checking to be described and implemented for a wide spectrum of XML applications.

    However, the language defined by this specification does not attempt to provide all the

    facilities that might be needed by any application. Some applications may require

    constraint capabilities not expressible in this language, and so may need to perform their

    own additional validations.
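    The sketch below shows that a schema is itself an XML document, and how an application might layer its own checks on top of it. It is not a schema validator: Python's standard library has none, so the helper `missing_elements` only performs a hand-rolled presence check. The `record` vocabulary and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET

XSD_NS = "http://www.w3.org/2001/XMLSchema"

# A small schema for a hypothetical <record> document class.
SCHEMA = f"""
<xs:schema xmlns:xs="{XSD_NS}">
  <xs:element name="record">
    <xs:complexType>
      <xs:sequence>
        <xs:element name="title" type="xs:string"/>
        <xs:element name="date" type="xs:date"/>
      </xs:sequence>
    </xs:complexType>
  </xs:element>
</xs:schema>
"""


def declared_children(schema_text: str) -> dict[str, str]:
    """Map each typed element declaration in the schema to its datatype."""
    root = ET.fromstring(schema_text)
    return {
        el.get("name"): el.get("type")
        for el in root.iter(f"{{{XSD_NS}}}element")
        if el.get("type") is not None
    }


def missing_elements(document_text: str, schema_text: str) -> list[str]:
    """Hand-rolled presence check; NOT a full XML Schema validation."""
    doc = ET.fromstring(document_text)
    return [
        name
        for name in declared_children(schema_text)
        if doc.find(name) is None
    ]
```

    In practice a dedicated XML Schema processor would perform the full validation; a check like this one corresponds to the additional, application-specific validations mentioned above.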

    9.1.3 RESOURCE DESCRIPTION FRAMEWORK (RDF)

    The RDF standard defines a common language for representing information on the Web. RDF is designed especially for describing metadata about Web resources. It focuses on automatic (machine-to-machine) exchange of information about resources, without loss or modification of the meaning of the information during data exchange.

    RDF provides a common framework for data exchange and supports extensibility: developers can add their own data dictionaries to RDF.
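    RDF describes a resource as a set of statements, each made of a subject, a predicate and an object. The following sketch models such statements as plain Python tuples to show how a local vocabulary can sit alongside a shared one. It is an illustration of the data model only, not an RDF implementation; the `example.org` URIs and the `region` term are hypothetical, while the Dublin Core namespace is the published one.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    subject: str    # the resource being described (a URI)
    predicate: str  # the property, drawn from a vocabulary
    obj: str        # the value of the property


DC = "http://purl.org/dc/elements/1.1/"  # Dublin Core element namespace
EX = "http://example.org/terms/"         # hypothetical local vocabulary

doc = "http://example.org/reports/2007"  # hypothetical resource URI

graph = [
    Triple(doc, DC + "title", "Annual Report"),
    Triple(doc, DC + "creator", "Example Agency"),
    Triple(doc, EX + "region", "Addis Ababa"),  # term from the local extension
]


def values(graph: list, subject: str, predicate: str) -> list:
    """Return every object stated for the given subject and predicate."""
    return [
        t.obj for t in graph
        if t.subject == subject and t.predicate == predicate
    ]
```

    A consumer that knows only Dublin Core can still read the title and creator and simply skip the statement it does not recognise.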

    9.1.4 THE METADATA ELEMENT

    The Dublin Core metadata element set is a standard (NISO Standard Z39.85-2001, http://www.niso.org/standards/standard_detail.cfm?std_id=725) for cross-domain information resource description. In other words, it provides a simple and standardized set of conventions for describing things online in ways that make them easier to find. Dublin Core is widely used to describe digital materials such as video, sound, image, text, and composite media like web pages. Implementations of Dublin Core are typically based on XML and the Resource Description Framework.

    A common set of metadata elements improves the visibility, accessibility, and interoperability of government information and services by providing standardized Web-based resource descriptions that enable users to locate the information or service they require.

    The Ethiopian Government Locator Service (ETGLS) element set described later in this document is a set of 19 descriptors that resulted from a review of international best practices such as the New Zealand, Australian and Dublin Core data and metadata standards. The Australian Government Locator Service (AGLS) and New Zealand Government Locator Service (NZGLS) standards are more complex element sets than the Dublin Core standard, in the sense that they contain a number of element qualifiers which enable them to describe more categories of resources and allow richer description of resources; ETGLS is adapted mainly from AGLS and NZGLS.

    Despite this, ETGLS is entirely compatible and interoperable with the Dublin Core

    element set. It is envisaged that ETGLS can coexist with other metadata standards based

    on different semantics.

    The ETGLS Metadata Standard will be an Ethiopian standard for cross-domain resource

    description.

    The ETGLS metadata set is intended for use by any organisation or individual creating or managing information sources or services that are locatable via the Internet. In particular, it is intended for information about resources and services on the World Wide Web. For the purposes of ETGLS metadata, a resource will typically be an online information or service resource, but the term may be applied more broadly to people and organisations, and to information or services that are not available online.

    This standard describes the ETGLS element set and qualifiers. It does not define or

    describe the detailed criteria by which the element set and qualifiers will be implemented

    in specific projects and applications by individuals and organizations.

    The ETGLS 19-element set is described below with:

    - a unique, machine-understandable, single-word element name, intended for syntactic use in computer programming rules and to make the specification of elements simpler for encoding schemes;
    - a label, which is intended to convey a common understanding of the element;
    - a definition, the semantics or meaning of the element;
    - obligation, an indication of whether the element must be used to comply with the ETGLS standard; and
    - a comment, which further expands or refines the meaning of the element and how it may be used, and may include examples.

    Where qualifiers are used, they are described in much the same pattern below the element

    description with the addition of qualifier type.

    Five metadata elements must be present for compliance with this standard. The mandatory elements are: Creator, Title, Date, Subject OR Function, and Identifier OR Availability.

    In the case of Subject or Function, this standard specifies that at least one of those two

    elements must appear in a metadata description.

    The obligation applicable to the last element changes depending on whether the resource

    described is available online or offline. If the resource is available online, then Identifier

    is mandatory. If it is a resource only available offline, then Availability is mandatory.

    In addition, this standard requires the use of the Publisher element for descriptions of

    information resources, but not for resources which are transactional services.

    All other elements are optional, and all elements are repeatable. Metadata elements may

    appear in any order. It is assumed that metadata instances based on this standard will

    specify the encoding scheme used for any element where this is appropriate. This

    standard cannot specify the use of any particular schemes with specific elements.
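    The obligation rules above can be expressed as a short compliance check. The following is a sketch under stated assumptions, not a normative implementation: a metadata description is represented as a dictionary mapping element names to lists of values (since elements are repeatable), and the function name and the two flags `online` and `information_resource` are hypothetical inputs that the caller must supply.

```python
def missing_mandatory(record: dict, online: bool = True,
                      information_resource: bool = True) -> list:
    """Return the mandatory ETGLS elements that are absent from record.

    record maps element names to lists of values (elements are repeatable).
    """
    def present(name: str) -> bool:
        # An element counts as present if it has at least one non-blank value.
        return any(v.strip() for v in record.get(name, []))

    missing = [n for n in ("Creator", "Title", "Date") if not present(n)]

    # At least one of Subject and Function must appear.
    if not (present("Subject") or present("Function")):
        missing.append("Subject or Function")

    # Identifier for online resources, Availability for offline-only ones.
    locator = "Identifier" if online else "Availability"
    if not present(locator):
        missing.append(locator)

    # Publisher is required for information resources, not for services.
    if information_resource and not present("Publisher"):
        missing.append("Publisher")

    return missing


record = {
    "Creator": ["Example Agency"],
    "Title": ["Information Exchange Standard"],
    "Date": ["2007-10-01"],
    "Subject": ["Interoperability"],
    "Identifier": ["http://example.org/standard"],
    "Publisher": ["Example Agency"],
}
```

    With this record, `missing_mandatory(record)` returns an empty list, while `missing_mandatory(record, online=False)` reports that Availability is missing. Element order plays no part in the check, which matches the rule that elements may appear in any order.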

    Although some environments, such as HTML, are not case-sensitive, it is recommended as a best practice to always adhere to the case conventions in the element and qualifier names given in the standard, to avoid conflicts if the metadata is subsequently extracted and converted to a case-sensitive syntax, such as XML (eXtensible Markup Language).

    Qualifiers

    Qualifiers are additions and extensions to the metadata elements that provide information

    about how the semantics (meaning) of an element have been refined, or about how the

    value (specific content) of an element should be interpreted.

    The guiding principle for using qualifiers with ETGLS elements is that a client (eg a

    person or software) should be able to ignore any qualifier and use the description

    (element content) as if it were unqualified. The remaining element value without the

    qualifier should continue to be generally correct and useful for discovery and other

    management purposes.
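    The guiding principle can be illustrated with a small sketch in which a client discards qualifiers and keeps the values. It assumes a dotted `Element.refinement` notation for qualified names (for example `Coverage.spatial`); that notation and the function name `drop_qualifiers` are assumptions made for this example, not something this standard prescribes.

```python
def drop_qualifiers(qualified: dict) -> dict:
    """Collapse qualified names such as 'Coverage.spatial' to 'Coverage'.

    qualified maps (possibly qualified) element names to lists of values.
    Values are kept unchanged; only the qualifier part of the name is dropped.
    """
    plain = {}
    for name, values in qualified.items():
        element = name.split(".", 1)[0]  # text before the first dot
        plain.setdefault(element, []).extend(values)
    return plain


description = {
    "Title": ["Crop Production Survey"],
    "Coverage.spatial": ["Ethiopia"],
    "Coverage.temporal": ["2007"],
}
unqualified = drop_qualifiers(description)
```

    After the call, both coverage values sit under the plain `Coverage` element, so a client that knows nothing about refinements can still use them for discovery.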

    ETGLS uses two types of qualifiers: element refinements and encoding schemes.

    Element refinements

    Element refinements refine the semantics (meaning) of the element by further specifying

    the relationship of the element value to the resource itself. A refined element shares the

    meaning of the unqualified element, but with a more restricted scope. For example, the

    element Coverage can refer to legal or administrative scope (jurisdiction), to the

    geographical scope (spatial), or to the period of time covered by the resource (temporal) [2].

    The element refinements that may be used in ETGLS are listed in the description of
    each element. It is expected that the element refinements will continue to change over
    time. The ETGLS metadata set will be modified from time to time to specify the element
    refinements that may be used for each element.

    Encoding schemes

    Encoding schemes indicate how the value of an element is to be interpreted when it has been
    chosen from a controlled vocabulary or encoded according to an externally defined standard.
    A value expressed using an encoding scheme will be either a term selected from a
    controlled vocabulary (eg a term from a classification system or set of subject headings)
    or a string formatted in accordance with a formal notation (eg "2000-01-01" as the
    standard expression of a date). This standard is not prescriptive about available encoding
    schemes for particular elements and does not attempt to specify available schemes for
    each element. Most elements in the ETGLS element set may be qualified with an
    encoding scheme.

    [2] Note that the qualifier, coverage, refers to the resource content, not to the date(s) for which the resource is valid or useable or available.

    9.1.4.1 ETGLS METADATA ELEMENTS

    Obligation: Mandatory

    1. Element Name: Creator

    Label: Creator

    Definition: An entity primarily responsible for making the content of

    the resource.

    Obligation: Mandatory if known

    Comment: Examples of a Creator include a person, or an organisation.

    2. Element Name: Date

    Label: Date

    Definition: A date of an event in the lifecycle of the resource.

    Obligation: Mandatory

    Comment: Typically, Date will be associated with the creation or
    availability of the resource. Recommended best practice
    for encoding the date value is defined in a profile of ISO
    8601 and follows the YYYY-MM-DD format for materials
    written in a foreign language. For local languages, the
    format follows the Localization standard. For example, the
    DD-MM-YY format is used for the Amharic, Afan Oromo, and
    Tigrinya languages.
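As a minimal sketch of the two layouts named in the comment above, the following Python formats one date both ways. It only illustrates field order and separators; it assumes a Gregorian date and does not reproduce any other rule of the Localization standard. The function names are hypothetical.

```python
from datetime import date


def iso_date(d: date) -> str:
    """YYYY-MM-DD, the ISO 8601 profile layout."""
    return d.isoformat()


def local_date(d: date) -> str:
    """DD-MM-YY, the layout the comment gives for local languages."""
    return d.strftime("%d-%m-%y")


d = date(2007, 10, 15)
print(iso_date(d))    # 2007-10-15
print(local_date(d))  # 15-10-07
```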

    Qualifiers

    Qualifier Name: created

    Label: Created

    Qualifier Type: element refinement

    Definition: Creation date of the resource.

    Qualifier Name: modified

    Label: Modified

    Qualifier Type: element refinement

    Definition: Modification date of the resource.

    Qualifier Name: valid

    Label: Valid

    Qualifier Type: element refinement

    Definition: A date (often a range) of validity of a resource.

    Comment: Typically, a date the resource becomes valid or ceases to be

    valid, or the date range for which the resource is valid.

    Qualifier Name: issued

    Label: Issued

    Qualifier Type: element refinement

    Definition: A date on which the resource was made formally available

    in its current form.

    3. Element Name: Title

    Label: Title

    Definition: A name given to the resource.

    Obligation: Mandatory

    Comment: Typically, the name by which the resource is formally

    known.

    Qualifiers

    Qualifier Name: alternative

    Label: Alternative

    Qualifier Type: element refinement

    Definition: Any form of the title used as a substitute or alternative to

    the formal title of the resource

    Comment: This qualifier could include abbreviations and acronyms by

    which a resource may be known.

    Obligation: Conditional

    4. Element Name: Availability

    Label: Availability

    Definition: How the resource can be obtained or contact information

    for obtaining the resource.

    Obligation: Mandatory for offline resources.

    Comment: The Availability element is primarily used for non-

    electronic resources to provide information on how to

    obtain physical access to the resource.

    5. Element Name: Function

    Label: Function

    Definition: The business function of the organisation to which the

    resource relates.

    Obligation: Mandatory if no Subject element specified.

    Comment: Used to indicate the business role of the resource in terms

    of business functions and activities. Functions are the major

    units of activity which organisations pursue in order to

    meet the mission and goals of the organisation.

    Recommended best practice is to select a value from a

    controlled vocabulary or formal classification scheme.

    6. Element Name: Identifier

    Label: Resource Identifier

    Definition: An unambiguous reference to the resource within a given

    context.

    Obligation: Mandatory for online resources.

    Comment: Recommended best practice is to identify the resource by

    means of a string or number conforming to a formal

    identification system. Example formal identification

    systems include the Uniform Resource Identifier (URI)

    (including the Uniform Resource Locator (URL)), the

    Digital Object Identifier (DOI) and the International

    Standard Book Number (ISBN).

    7. Element Name: Publisher

    Label: Publisher

    Definition: An entity responsible for making the resource available.

    Obligation: Mandatory for information resources.

    Comment: This field is often the name of the organisation that owns or

    controls or publishes the resource. It is not recommended

    that this element be used for the name of the entity which

    merely acts as the host for a website.

    8. Element Name: Subject

    Label: Subject and Keywords

    Definition: A subject and/or topic of the content of the resource.

    Obligation: Mandatory if no Function element specified.

    Comment: Typically, a Subject will be expressed as keywords, key

    phrases or classification codes that describe a topic of the

    resource content. Recommended best practice is to select a

    value from a controlled vocabulary or formal classification

    scheme.
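The obligation rules for the elements listed so far can be sketched as a simple check. This is an illustrative reading of the rules, not a normative validator: the record layout (a dictionary keyed by unqualified element name) and the `online` and `is_service` flags are assumptions made for the example. Creator is "mandatory if known", which cannot be checked mechanically, so it is not enforced here.

```python
def missing_mandatory(record: dict, online: bool, is_service: bool) -> list:
    """Return the names of mandatory elements absent from a description."""
    missing = []
    # Date and Title are always mandatory.
    for name in ("Date", "Title"):
        if not record.get(name):
            missing.append(name)
    # Identifier for online resources, Availability for offline ones.
    locator = "Identifier" if online else "Availability"
    if not record.get(locator):
        missing.append(locator)
    # Publisher is required for information resources, not for services.
    if not is_service and not record.get("Publisher"):
        missing.append("Publisher")
    # At least one of Function or Subject must be present.
    if not record.get("Function") and not record.get("Subject"):
        missing.append("Function or Subject")
    return missing


# Illustrative values only.
record = {
    "Title": "Annual Report",
    "Date": "2007-10-15",
    "Identifier": "http://example.org/report",
    "Subject": "standards",
}
print(missing_mandatory(record, online=True, is_service=False))  # ['Publisher']
```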

    Obligation: Optional

    9. Element Name: Audience

    Label: Audience

    Definition: A target audience of the resource.

    Obligation: Optional

    Comment: Types of audiences commonly used in this element include

    particular industry sectors, education levels, skill levels,

    occupations, and EEO categories. Recommended best

    practice is to select a value from a controlled vocabulary or

    formal classification scheme.

    10. Element Name: Contributor

    Label: Contributor

    Definition: An entity responsible for making a contribution to the

    content of the resource.

    Obligation: Optional

    Comment: Typically, a contributor will be an entity that has played an

    important but secondary role in creating the content of the

    resource and is not specified in the creator element.

    11. Element Name: Coverage

    Label: Coverage

    Definition: The extent or scope of the content of the resource.

    Obligation: Optional

    Comment: Coverage will typically include spatial location (a place

    name or geographic coordinates), temporal period (a period

    label, date, or date range) or jurisdiction (such as a named

    administrative entity). Recommended best practice is to

    select a value from a controlled vocabulary (for example,

    the Thesaurus of Geographic Names [TGN]) and that,

    where appropriate, named places or time periods be used in

    preference to numeric identifiers such as sets of coordinates

    or date ranges.

    Qualifiers

    Qualifier Name: jurisdiction

    Label: Jurisdiction

    Qualifier Type: element refinement

    Definition: The name of the political/administrative entity covered by

    the content of the resource.

    Comment: Jurisdiction is a description of the territory over which a

    particular government exercises its authority or a particular

    business transacts its operations, to which the resource

    content is applicable.

    Qualifier Name: spatial

    Label: Spatial

    Qualifier Type: element refinement

    Definition: Spatial (geographic) characteristics of the intellectual

    content of the resource.

    Qualifier Name: temporal

    Label: Temporal

    Qualifier Type: element refinement

    Definition: Temporal characteristics of the intellectual content of the

    resource.

    Qualifier Name: postcode

    Label: Postcode

    Qualifier Type: element refinement

    Definition: Ethiopian postcode(s) applicable to the spatial coverage of

    the resource content.

    Comment: Postcode refers to the actual Ethiopian postcode(s) which is

    relevant to the spatial coverage of the resource content.

    This qualifier will be of particular use in describing

    services.

    12. Element Name: Description

    Label: Description

    Definition: An account of the content of the resource.

    Obligation: Optional

    Comment: Description may include but is not limited to: an abstract,

    table of contents, reference to a graphical representation of

    content (eg a thumbnail of an image), or a free-text account

    of the content.

    13. Element Name: Format

    Label: Format

    Definition: The physical or digital manifestation of the resource.

    Obligation: Optional

    Comment: Typically, Format may include the media-type or

    dimensions of the resource. Format may be used to

    determine the software, hardware or other equipment

    needed to display or operate the resource. Examples of

    dimensions include size and duration. Recommended best

    practice is to select a value from a controlled vocabulary

    (for example, the list of Internet Media Types [MIME]

    defining computer media formats; see

    http://dublincore.org/documents/dces/#mime).

    Qualifiers

    Qualifier Name: extent

    Label: Extent

    Qualifier Type: Element refinement

    Definition: The size or duration of the resource.

    Comment: The extent qualifier allows the description of the physical

    dimensions, file size or duration of the resource.

    Qualifier Name: medium

    Label: Medium

    Qualifier Type: Element refinement

    Definition: The material or physical carrier of the resource.
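For the Format element, a value from the Internet Media Types list can often be derived from a file name. The sketch below uses Python's standard `mimetypes` module as one possible way to do this; the helper name is hypothetical, and the lookup relies only on the file extension, so the result should be treated as a suggestion to be checked.

```python
import mimetypes
from typing import Optional


def format_value(filename: str) -> Optional[str]:
    """Guess an Internet Media Type for the Format element from a file name."""
    media_type, _encoding = mimetypes.guess_type(filename)
    return media_type  # None when the extension is not recognised


print(format_value("report.pdf"))
print(format_value("index.html"))
```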

    14. Element Name: Language

    Label: Language

    Definition: A language of the intellectual content of the resource.

    Obligation: Optional

    Comment: Recommended best practice for the values of the Language

    element is defined by using a two-letter Language Code

    (taken from the ISO 639 standard [ISO 639]), followed

    optionally, by a two-letter Country Code (taken from the

    ISO 3166 standard [ISO 3166]). For example, 'am' for

    Amharic, 'om' for Oromiffa (Afan Oromo), 'en' for English, 'fr' for

    French, or 'en-GB' for English used in the United Kingdom,

    etc.
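A minimal sketch of how such a Language value might be assembled is shown below. It checks only the shape of the codes (two letters each) and does not verify that they appear in ISO 639 or ISO 3166. Writing the language code in lower case and the country code in upper case is a common convention assumed here, not a requirement stated in this standard; the function name is hypothetical.

```python
import re
from typing import Optional

_TWO_LETTERS = re.compile(r"^[A-Za-z]{2}$")


def language_tag(language: str, country: Optional[str] = None) -> str:
    """Build 'll' or 'll-CC' from a two-letter language and optional country code."""
    if not _TWO_LETTERS.match(language):
        raise ValueError(f"not a two-letter language code: {language!r}")
    tag = language.lower()
    if country is not None:
        if not _TWO_LETTERS.match(country):
            raise ValueError(f"not a two-letter country code: {country!r}")
        tag += "-" + country.upper()
    return tag


print(language_tag("am"))        # am
print(language_tag("am", "et"))  # am-ET
```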

    15. Element Name: Mandate

    Label: Mandate

    Definition: A specific warrant which requires the resource to be

    created or provided.

    Obligation: Optional

    Comment: The element is useful to indicate the specific legal mandate

    which requires the resource being described to be created

    or provided to the public. The content of this element will

    usually be a reference to a specific Act, Regulation or Case,

    but may be a URI pointing to the legal instrument in

    question.

    [ISO 639] http://www.oasis-open.org/cover/iso639a.html

    [ISO 3166] http://www.oasis-open.org/cover/country3166.html

    Qualifiers

    Qualifier Name: act

    Label: Act

    Qualifier Type: Element refinement

    Definition: A reference to a specific State or Federal Act which

    requires the creation or provision of the resource.

    Qualifier Name: regulation

    Label: Regulation

    Qualifier Type: Element refinement

    Definition: A reference to a specific regulation which requires the

    creation or provision of the resource.

    Qualifier Name: case

    Label: Case Law

    Qualifier Type: Element refinement

    Definition: A reference to a specific case which requires the creation or

    provision of the resource

    16. Element Name: Relation

    Label: Relation

    Definition: A reference to a related resource.

    Obligation: Optional

    Comment: Recommended best practice is to reference the resource by

    means of a string or number conforming to a formal

    identification system.

    Qualifiers

    Qualifier Name: isVersionOf

    Label: Is Version Of

    Qualifier Type: element refinement

    Definition: The described resource is a version, edition, or adaptation

    of the referenced resource. Changes in version imply

    substantive changes in content rather than differences in

    format.

    Qualifier Name: hasVersion

    Label: Has Version

    Qualifier Type: element refinement

    Definition: The described resource has a version, edition, or

    adaptation, namely, the referenced resource.

    Qualifier Name: isReplacedBy

    Label: Is Replaced By

    Qualifier Type: element refinement

    Definition: The described resource is supplanted, displaced, or

    superseded by the referenced resource.

    Qualifier Name: replaces

    Label: Replaces

    Qualifier Type: element refinement

    Definition: The described resource supplants, displaces, or supersedes

    the referenced resource.

    Qualifier Name: isRequiredBy

    Label: Is Required By

    Qualifier Type: element refinement

    Definition: The described resource is required by the referenced

    resource, either physically or logically.

    Qualifier Name: requires

    Label: Requires

    Qualifier Type: element refinement

    Definition: The described resource requires the referenced resource to

    support its function, delivery, or coherence of content.

    Qualifier Name: isPartOf

    Label: Is Part Of

    Qualifier Type: element refinement

    Definition: The described resource is a physical or logical part of the

    referenced resource.

    Qualifier Name: hasPart

    Label: Has Part

    Qualifier Type: element refinement

    Definition: The described resource includes the referenced resource

    either physically or logically.

    Qualifier Name: isReferencedBy

    Label: Is Referenced By

    Qualifier Type: element refinement

    Definition: The described resource is referenced, cited, or otherwise

    pointed to by the referenced resource.

    Qualifier Name: references

    Label: References

    Qualifier Type: element refinement

    Definition: The described resource references, cites, or otherwise

    points to the referenced resource.

    Qualifier Name: isFormatOf

    Label: Is Format Of

    Qualifier Type: element refinement

    Definition: The described resource is the same intellectual content of

    the referenced resource, but presented in another format.

    Qualifier Name: hasFormat

    Label: Has Format

    Qualifier Type: element refinement

    Definition: The described resource pre-existed the referenced resource,

    which is essentially the same intellectual content presented

    in another format.

    Qualifier Name: isBasisFor

    Label: Is Basis For

    Qualifier Type: element refinement

    Definition: The described resource pre-existed the referenced resource,

    which is a performance, production, derivation, translation,

    or interpretation of the described resource.

    Qualifier Name: isBasedOn

    Label: Is Based On

    Qualifier Type: element refinement

    Definition: The described resource is a performance, production,

    derivation, translation, or interpretation of the referenced

    resource.
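The Relation qualifiers above come in reciprocal pairs: if one resource "is part of" another, the other "has part" the first. The pairing below is inferred from the definitions rather than stated explicitly in this standard, and the helper name and resource labels are hypothetical.

```python
# Reciprocal pairs inferred from the qualifier definitions above.
_PAIRS = [
    ("isVersionOf", "hasVersion"),
    ("isReplacedBy", "replaces"),
    ("isRequiredBy", "requires"),
    ("isPartOf", "hasPart"),
    ("isReferencedBy", "references"),
    ("isFormatOf", "hasFormat"),
    ("isBasedOn", "isBasisFor"),
]

INVERSE = {}
for left, right in _PAIRS:
    INVERSE[left] = right
    INVERSE[right] = left


def reciprocal(described: str, qualifier: str, referenced: str) -> tuple:
    """Return the statement that holds from the referenced resource's side."""
    return (referenced, INVERSE[qualifier], described)


print(reciprocal("chapter-3", "isPartOf", "handbook"))
# ('handbook', 'hasPart', 'chapter-3')
```

Deriving the reciprocal statement this way can help keep the descriptions of two related resources consistent with each other.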

    17. Element Name: Rights

    Label: Rights Management

    Definition: Information about rights held in and over the resource.

    Obligation: Optional

    Comment: Typically, the Rights element will contain a rights

    management statement for the resource, or refer to a service

    providing such information. Rights information often

    encompasses Intellectual Property Rights (IPR), Copyright,

    and various Property Rights. If the Rights element is

    absent, no assumptions can be made about the status of

    these and other rights with respect to the resource.

    18. Element Name: Source

    Label: Source

    Definition: A reference to a resource from which the present resource

    is derived.

    Obligation: Optional

    Comment: The present resource may be derived from the Source

    resource in whole or in part. Recommended best practice is

    to reference the source resource by means of a string or

    number conforming to a formal identification system.

    19. Element Name: Type

    Label: Resource Type

    Definition: The nature or genre of the content of the resource.

    Obligation: Optional

    Comment: Type includes terms describing general categories, genres,

    or aggregation levels for content. Recommended best

    practice is to select a value from a controlled vocabulary

    (for example, the working draft list of Dublin Core Types

    [DCT1], http://dublincore.org/documents/dces/#dct1). To describe the physical or digital manifestation

    of the resource, use the Format element.

    Qualifiers

    Qualifier Name: category

    Label: Category

    Qualifier Type: Element refinement

    Definition: The generic type of the resource being described.

    Comment: The value for this qualifier must be one of either service,

    document, or agency.

    Qualifier Name: aggregationLevel

    Label: Aggregation Level

    Qualifier Type: Element refinement

    Definition: The level of aggregation of the resource being described.

    Comment: There are only two values possible for this qualifier, either

    item or collection.

    Qualifier Name: documentType

    Label: Document Type

    Qualifier Type: Element refinement

    Definition: The form of the resource where category = document.

    Comment: Document is used in its widest sense and includes such

    things as software, sound files, and images.

    Qualifier Name: serviceType

    Label: Service Type

    Qualifier Type: Element refinement

    Definition: The type of service being offered where category = service.
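The constraints on the Type qualifiers can be sketched as a small check. The dictionary layout and the function name are assumptions for illustration; the permitted values are those given in the comments above.

```python
CATEGORIES = {"service", "document", "agency"}
AGGREGATION_LEVELS = {"item", "collection"}


def type_errors(qualifiers: dict) -> list:
    """Return a list of problems found in the qualifiers of a Type element."""
    errors = []
    category = qualifiers.get("category")
    if category is not None and category not in CATEGORIES:
        errors.append(f"category must be one of {sorted(CATEGORIES)}")
    level = qualifiers.get("aggregationLevel")
    if level is not None and level not in AGGREGATION_LEVELS:
        errors.append(f"aggregationLevel must be one of {sorted(AGGREGATION_LEVELS)}")
    # documentType applies only where category = document.
    if "documentType" in qualifiers and category != "document":
        errors.append("documentType requires category = document")
    # serviceType applies only where category = service.
    if "serviceType" in qualifiers and category != "service":
        errors.append("serviceType requires category = service")
    return errors


# Illustrative values only.
print(type_errors({"category": "service", "documentType": "report"}))
```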

    9.2 METADATA REGISTRIES (MDR)

    Metadata registries (MDR) address the semantics of data, the representation of data, and

    the registration of the descriptions of that data. It is through these descriptions that an

    accurate understanding of the semantics and a useful depiction of the data are found.

    An MDR is a database of metadata that supports the functionality of registration.

    Registration accomplishes three main goals: identification, provenance, and monitoring

    quality. Identification is accomplished by assigning a unique identifier (within the

    registry) to each object registered there. Provenance addresses the source of the metadata

    and the object described. Monitoring quality ensures that the metadata does the job it is

    designed to do.

    An MDR manages the semantics of data. Understanding data is fundamental to its design,

    harmonization, standardization, use, re-use, and interchange. The underlying model for

    an MDR is designed to capture all the basic components of the semantics of data,

    independent of any application or subject matter area.

    MDRs are organized so that those designing applications can ascertain whether a suitable

    object described in the MDR already exists. Where it is established that a new object is

    essential, its derivation from an existing description with appropriate modifications is

    encouraged, thus avoiding unnecessary variations in the way similar objects are

    described. Registration will also allow two or more administered items describing

    identical objects to be identified, and more importantly, it will identify situations where

    similar or identical names are in use for administered items that are significantly different

    in one or more respects.
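A toy registry can make the registration goals described above concrete: each item receives an identifier unique within the registry, its source is recorded for provenance, and items sharing a name can be found for review. This is a sketch under simplifying assumptions, not the metamodel of any MDR standard; the class names, fields, and sample entries are hypothetical, and quality monitoring is not modelled.

```python
from dataclasses import dataclass
from itertools import count


@dataclass(frozen=True)
class AdministeredItem:
    identifier: int   # unique within this registry
    name: str
    definition: str
    source: str       # provenance: who supplied the description


class MetadataRegistry:
    def __init__(self) -> None:
        self._next_id = count(1)
        self._items = {}

    def register(self, name: str, definition: str, source: str) -> AdministeredItem:
        item = AdministeredItem(next(self._next_id), name, definition, source)
        self._items[item.identifier] = item
        return item

    def find_by_name(self, name: str) -> list:
        """Items whose names match, ignoring case and surrounding spaces."""
        wanted = name.strip().lower()
        return [i for i in self._items.values() if i.name.strip().lower() == wanted]


registry = MetadataRegistry()
a = registry.register("Income", "Annual household income before tax", "Agency A")
b = registry.register("income", "Monthly salary of an employee", "Agency B")
print([i.identifier for i in registry.find_by_name("Income")])  # [1, 2]
```

The two entries share a name but describe different things, which is the situation the registry is meant to surface.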

    The increased use of data processing and electronic data interchange heavily relies on

    accurate, reliable, controllable, and verifiable data recorded in databases. One of the

    prerequisites for a correct and proper use and interpretation of data is that both users and

    owners of data have a common understanding of the meaning and descriptive

    characteristics (e.g., representation) of that data.

    9.2.1 FUNDAMENTAL MODEL OF DATA ELEMENTS

    A data element is composed of two parts:

    Data element concept (DEC): A DEC is a concept that can be represented in the
    form of a data element, described independently of any particular representation.

    Representation: The representation is composed of a value domain, a datatype,
    units of measure (if necessary), and a representation class (optionally).

    From a data modeling perspective, a DEC should be composed of two parts: Object Class

    and Property.

    The object class is a set of ideas, abstractions, or things in the real world that can

    be identified with explicit boundaries and meaning and whose properties and

    behavior follow the same rules. Object classes are the things for which we wish to

    collect and store data. They are concepts, and they correspond to the notions

    embodied in classes in object-oriented models and entities in entity-relationship

    models. Examples are cars, persons, households, employees, and orders.

    The property is a characteristic common to all members of an object class.

    Properties are what humans use to distinguish or describe objects. They are

    characteristics, not necessarily essential ones, of the object class and form its

    intension. They are also concepts, and they correspond to the notions embodied in

    attributes (without associated datatypes) in object-oriented or entity-relationship

    models. Examples of properties are color, model, sex, age, income, address, or

    price.

    It is important to distinguish an actual object class or property from its name. This is the

    distinction between concepts and their designations. Object classes and properties are

    concepts; their names are designations.

    Complications arise because people convey concepts through words (designations), and it

    is easy to confuse a concept with the designation used to represent it. For example, most

    people will read the word income and be certain they have unambiguously interpreted it.

    But, the designation income may not convey the same concept to all readers, and, more

    importantly, each instance of income may not designate the same concept.

    Not all ideas are named or expressed in simple natural language, either. For example,

    "women between the ages of 15 and 45 who have had at least one live birth in the last 12

    months" is a valid object class not easily named in English. Some ideas may be more

    easily expressed in one language than in another. The German word Götterdämmerung

    has no simple English equivalent.

    A data element is produced when a representation is associated with a data element

    concept. The representation describes the form of the data, including a value domain,

    datatype, representation class (optionally), and, if necessary, a unit of measure. Value

    domains are sets of permissible values for data elements. For example, the data element

    representing annual household income may have the set of nonnegative integers (with

    units of dollars) as a set of valid values. This is its value domain. A data element

    concept may be associated with different value domains as needed to form conceptually

    similar data elements. There are many ways to represent similar facts about the world, but

    the concept for which the facts are examples is the same. An example is the Data Element

    Concept country of person's birth. ISO 3166-1 Country Codes contains seven different

    representations for countries of the world. Each one of these seven representations

    contains a set of values that may be used in the value domain associated with the DEC.

    Each one of the seven associations is a data element. For each representation of the data,

    the permissible values, the datatype, the representation class, and possibly the units of

    measure, are altered.

    See ISO/IEC 20943-1:2003, Information technology - Procedures for achieving

    metadata registry content consistency - Part 1: Data elements, for details about the

    registration and management of descriptions of data elements.
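The model described in this section can be sketched with a few Python data classes: a data element concept pairs an object class with a property, and a data element pairs that concept with a representation. The class and field names are illustrative assumptions, not terms fixed by this standard, and the value domains are named by plain strings for brevity.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataElementConcept:
    object_class: str    # e.g. "Person"
    property_name: str   # e.g. "country of birth"


@dataclass(frozen=True)
class Representation:
    value_domain: str
    datatype: str
    unit_of_measure: Optional[str] = None        # only if necessary
    representation_class: Optional[str] = None   # optional


@dataclass(frozen=True)
class DataElement:
    concept: DataElementConcept
    representation: Representation


birth_country = DataElementConcept("Person", "country of birth")

# The same concept combined with two different representations
# yields two distinct data elements.
by_alpha2 = DataElement(
    birth_country, Representation("ISO 3166-1 two-letter codes", "string"))
by_numeric = DataElement(
    birth_country, Representation("ISO 3166-1 numeric codes", "string"))

print(by_alpha2.concept == by_numeric.concept)  # True
print(by_alpha2 == by_numeric)                  # False
```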

    9.2.2 DATA ELEMENTS IN DATA MANAGEMENT AND INTERCHANGE

    Data elements appear in databases, files, and transaction sets. Data elements are the

    fundamental units of data an organization manages; therefore, they must be part of the

    design of databases and files within the organization and all transaction sets the

    organization builds to communicate data to other organizations.

    Within the organization, databases or files are composed of records, segments, tuples,

    etc., which are composed of data elements. The data elements themselves contain various

    kinds of data that include characters, images, sound, etc. When the organization needs to

    transfer data to another organization, data elements are the fundamental units that make

    up the transaction sets. Transactions occur primarily between databases or files, but the

    structure (i.e. the records or tuples) of the files and databases don't have to be the same

    across organizations.

    So, the common unit for transferring information (data plus understanding) is the data

    element.

    Figure 1: Data elements and other data concepts
    [Diagram not reproduced. Labels: Transaction, Exchange Unit, etc.; Database, file, etc.; Class, Tuple, etc.; Field, column; Data Element (Identifier, Definition, Name, Value Domain, etc.)]

    9.2.3 FUNDAMENTAL MODEL OF VALUE DOMAINS

    A value domain is a set of permissible values. A permissible value is a combination of

    some value and the meaning for that value. The associated meaning is called the value

    meaning. A value domain is the set of valid values for one or more data elements. It is

    used for validation of data in information systems and in data exchange. It is also an

    integral part of the metadata needed to describe a data element. In particular, a value

    domain is a guide to the content, form, and structure of the data represented by a data

    element.

    Value domains come in two (non-exclusive) sub-types: enumerated value domains and

    non-enumerated value domains.

    An Enumerated value domain is a value domain specified as a list of permissible values

    (values & their meanings). It contains a list of all its values and their associated

    meanings. Each value and meaning pair is called a permissible value. The meaning for

    each value is called the value meaning.

    A non-enumerated value domain is specified by a description. The non-enumerated

    value domain description describes precisely which permissible values belong and

    which do not belong to the value domain. An example of such description is the phrase

    "Every real number greater than 0 and less than 1".

    Each value domain is a member of the extension of a concept, called the conceptual

    domain. A conceptual domain is a set of value meanings. The intension of a conceptual

    domain is its value meanings. Many value domains may be in the extension of the same

    conceptual domain, but a value domain is associated with one conceptual domain.

    Conceptual domains may have relationships with other conceptual domains, so it is

    possible to create a concept system of conceptual domains. Value domains may have

    relationships with other value domains, which provide the framework to capture the

    structure of sets of related value domains and their associated concepts.
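The two sub-types of value domain described above can be sketched as follows. An enumerated domain lists each permissible value with its value meaning; a non-enumerated domain carries a description plus, in this sketch, a rule that tests membership. The class names, the `rule` field, and the sample country codes are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class EnumeratedValueDomain:
    permissible_values: dict  # value -> value meaning

    def contains(self, value) -> bool:
        return value in self.permissible_values


@dataclass
class NonEnumeratedValueDomain:
    description: str
    rule: Callable[[float], bool]  # executable form of the description

    def contains(self, value) -> bool:
        return self.rule(value)


country_codes = EnumeratedValueDomain({"ET": "Ethiopia", "KE": "Kenya"})
probabilities = NonEnumeratedValueDomain(
    "Every real number greater than 0 and less than 1",
    lambda x: 0 < x < 1,
)

print(country_codes.contains("ET"))  # True
print(probabilities.contains(0.5))   # True
print(probabilities.contains(1))     # False
```

Either kind of domain can then be used to validate data in an information system or during exchange.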

    Conceptual domains, too, come in two (non-exclusive) sub-types:

    Enumerated conceptual domain: A conceptual domain specified as a list of
    value meanings.

    Non-enumerated conceptual domain: A conceptual domain specified by a
    description.

    The value meanings for an enumerated conceptual domain are listed explicitly. This

    conceptual domain type corresponds to the enumerated type for value domains. The value

    meanings for a non-enumerated conceptual domain are expressed using a rule, called a

    non-enumerated conceptual domain description.

    Thus, the value meanings are listed implicitly. This rule describes the meaning of

    permissible values in a non-enumerated value domain. This conceptual domain type

    corresponds to the non-enumerated type for value domains. See ISO/IEC TR 20943-3,

    Information technology - Procedures for achieving metadata registry content

    consistency - Part 3: Value domains, for detailed examples.

    A unit of measure is sometimes required to describe data. If temperature readings are

    recorded in a database, then the temperature scale (e.g., Fahrenheit or Celsius) is

    necessary to understand the meaning of the values. Another example is the mass of rocks

    found on Mars, measured in grams. However, units of measure are not limited to physical

    quantities, as currencies (e.g., US dollars, Lire, British pounds) and other socio-economic

    measures are units of measure, too.

    Some units of measure are equivalent to each other in the following sense: Any quantity

    in one unit of measure can be transformed to the same quantity in another unit of

    measure. All equivalent units of measure are said to have the same dimensionality. For

    example, currencies all have the same dimensionality. Measures of speed, such as miles

    per hour or meters per second, have the same dimensionality. Two units of measure that

    are often erroneously seen as having the same dimensionality are pounds (as in weight)

    and grams. Pounds is a measure of force, and grams is a measure of mass.

    A unit of measure is associated with a value domain, and the dimensionality is associated

    with the conceptual domain.
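The idea that equivalent units share a dimensionality can be sketched with a small conversion table. Each unit is tagged with its dimensionality and a factor to a chosen base unit, and conversion is refused across dimensionalities. The table layout, unit names, and function name are illustrative assumptions; the factors are the standard definitions (1 mile per hour = 0.44704 meters per second, 1 kilogram = 1000 grams).

```python
# unit -> (dimensionality, factor to the base unit of that dimensionality)
UNITS = {
    "meters per second": ("speed", 1.0),
    "miles per hour": ("speed", 0.44704),
    "grams": ("mass", 1.0),
    "kilograms": ("mass", 1000.0),
}


def convert(value: float, from_unit: str, to_unit: str) -> float:
    """Convert between two units that share the same dimensionality."""
    from_dim, from_factor = UNITS[from_unit]
    to_dim, to_factor = UNITS[to_unit]
    if from_dim != to_dim:
        raise ValueError(f"cannot convert {from_dim} to {to_dim}")
    return value * from_factor / to_factor


print(convert(2, "kilograms", "grams"))  # 2000.0
print(convert(10, "miles per hour", "meters per second"))
```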

    Some value domains contain very similar values from one domain to another. Either the

    values themselves are similar or the meanings of the values are the same. When these

    similarities occur, the value domains may be in the extension of one conceptual domain.

    The following examples illustrate this and the other ideas in this sub-clause:

    EXAMPLE 1: Similar non-enumerated value domains

    Conceptual domain name: Probabilities
    Conceptual domain definition: Real numbers greater than 0 and less than 1

    Value domain name (1): Probabilities - 2 significant digits
    Value domain description: All real numbers greater than 0 and less than 1 represented with 2-digit precision.
    Unit of measure precision: 2 digits to the right of the decimal point

    Value domain name (2): Probabilities - 5 significant digits
    Value domain description: All real numbers greater than 0 and less than 1 represented with 5-digit precision.
    Unit of measure precision: 5 digits to the right of the decimal point

    EXAMPLE 2: Similar enumerated value domains

    Conceptual domain name: Countries of the world
    Conceptual domain definition: Lists of current countries of the world represented as names or codes

    Value domain name (1): Country codes - 2 character alpha
    Permissible values


    Figure 2: Fundamental Model for Value Domains

    9.2.4 FUNDAMENTALS OF CLASSIFICATION SCHEMES

    A classification scheme is a concept system intended to classify objects.

    It is organized in some specified structure, limited in content by a scope, and designed for

    assigning objects to concepts defined within it. Concepts are assigned to an object, and

    this process is called classification. The relationships linking concepts in the concept

    system link objects that the related concepts classify. In general, any concept system is a

    classification scheme if it is used for classifying objects.

    The content scope of the classification scheme circumscribes the subject matter area

    covered by the classification scheme. The scope of the classification scheme is the

    broadest concept contained in the concept system of the scheme. It determines,

    theoretically, whether an object can be classified within that scheme or not.

    Concept systems, and classification schemes in particular, can be structured in many

    ways. The structure defines the types of relationships that may exist between concepts,

    and each classification scheme can be used for the purpose of linking concepts to objects.

    In a particular classification scheme, the linked concepts together with the other concepts


    related to the linked concept in the scheme provide a conceptual framework in which to

    understand the meaning of the object. The framework is limited by the scope of the

    classification scheme.

    A concept system may be represented by a terminological system. The designations are

    used to represent each of the concepts in the system and are used as key words linked to

    objects for searching, indexing, or other purposes.

    A special kind of concept system is a relationship system. There, the concepts are

    relationship types. A relationship type has N arguments, and it is called an n-ary

    relationship type. The statement "a set of N objects is classified by an n-ary relationship

    type" means that the N objects have a relationship among them of the given relationship

    type.

    The classification region permits the registration and administration of all or part of a

    classification scheme. Optionally, a classification scheme may be used to classify

    administered items, the registered artifacts in a metadata registry.

    Classification schemes of varying discriminatory power could be used: keywords,

    thesauri, taxonomies, and ontologies. These classification schemes have potentially

    great utility for documenting objects in the real world, including administered items in an

    MDR.

    There are several purposes for applying classification to real world objects. Classification

    assists users to find a single object from among a large collection of objects, facilitates

    the administration and analysis of a collection of objects, and, through inheritance,

    conveys semantic content that is often only incompletely specified by other attributes,

    such as names and definitions.
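    The role of inheritance in classification can be illustrated with a small Python sketch. The taxa ("Geographic area", "Country", "Currency") and the classified items are hypothetical examples, and the names Taxon, lineage and find_items are assumptions introduced only for this illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class Taxon:
    name: str
    parent: Optional["Taxon"] = None

    def lineage(self) -> List[str]:
        """Names of this taxon and all of its more general ancestors."""
        names = []
        node: Optional[Taxon] = self
        while node is not None:
            names.append(node.name)
            node = node.parent
        return names


GEOGRAPHIC_AREA = Taxon("Geographic area")
COUNTRY = Taxon("Country", parent=GEOGRAPHIC_AREA)
CURRENCY = Taxon("Currency")

# Each object is assigned to one concept of the scheme.
classification: Dict[str, Taxon] = {
    "Country code": COUNTRY,
    "Currency code": CURRENCY,
}


def find_items(taxon_name: str) -> List[str]:
    """Items classified under the named taxon or any of its specializations."""
    return sorted(
        item
        for item, taxon in classification.items()
        if taxon_name in taxon.lineage()
    )
```

    In this sketch an item classified under the specialized taxon "Country" is also found when searching by the generalized taxon "Geographic area", which is the sense in which classification conveys meaning through inheritance.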

    Each type of classification scheme mentioned above has particular strengths and

    weaknesses, and provides the foundation upon which particular capabilities can be built.

    Keywords, for example, are a quick way to provide users some assistance in locating

    potentially useful administered items. A thesaurus provides a more structured approach,

    arranging descriptive terms in a structure of broader, narrower, and related classification

    categories. A taxonomy provides a classification structure that adds the power of


    inheritance of meaning from generalized taxa to specialized taxa. Ontologies, with

    associated epistemologies, can provide rich, rigorously defined structures (e.g. directed

    acyclic graphs with multiple inheritance) that can convey information needed by

    software, such as intelligent agents and mediators that are useful in the provision of

    intelligent information services.

    9.2.5 ATTRIBUTES OF A CLASSIFICATION SCHEME

    Classification schemes shall be registered in an MDR by recording their attributes.

    Minimally, a registered classification scheme shall have an administration record and a

    classification scheme type name. The following table lists the attributes of a classification

    system that shall be recorded in an MDR.

    Attribute                                                   Occurrences
    Designation - name                                          One per Terminology Entry language
    Designation - preferred designation                         Zero or one per Terminological Entry section
    Designation - language identifier                           One per Language section in each Terminological Entry
    Definition - definition text                                One per Terminology Entry language section
    Definition - preferred definition                           Zero or one per Terminological Entry section
    Definition - source reference                               Zero or one per Terminological Entry section
    Definition - language identifier                            One per Language section in each Terminological Entry
    Context - administration record                             One per context
    Context - description                                       One per context
    Context - description language identifier                   Zero or one per context
    Classification scheme - type name                           One per classification scheme
    Classification scheme - value                               One per classification scheme item
    Classification scheme - type name                           One per classification scheme item
    Classification scheme item relationship - type description  One per classification scheme item relationship type description
    Administration Record - item identifier                     One per classification scheme
    Administration Record - registration status                 One per classification scheme
    Administration Record - administrative status               One per classification scheme
    Administration Record - creation date                       One per classification scheme
    Administration Record - last change date                    Zero or one per classification scheme
    Administration Record - effective date                      Zero or one per classification scheme
    Administration Record - until date                          Zero or one per classification scheme
    Administration Record - change description                  Zero or one per classification scheme
    Administration Record - administrative note                 Zero or one per classification scheme
    Administration Record - explanatory comment                 Zero or one per classification scheme
    Administration Record - unresolved issue                    Zero or one per classification scheme
    Administration Record - origin                              Zero or one per classification scheme
    Reference Document - identifier                             One per reference document
    Reference Document - type description                       Zero or one per reference document
    Reference Document - language identifier                    Zero or one per reference document
    Reference Document - title                                  Zero or one per reference document
    Reference Document - organization name                      One or more per reference document
    Reference Document - organization mail address              Zero or one per reference document
    Submission - organization name                              One per classification scheme
    Submission - organization mail address                      Zero or one per classification scheme
    Submission - contact                                        One per classification scheme
    Stewardship - organization name                             One per classification scheme
    Stewardship - organization mail address                     Zero or one per classification scheme
    Stewardship - contact                                       One per classification scheme
    Registration Authority - Organization name                  One per classification scheme
    Registration Authority - Organization mail address          Zero or one per classification scheme
    Registration Authority - Registration Authority identifier  One per classification scheme
    Registration Authority - Documentation language identifier  One or more per classification scheme
    Registrar - identifier                                      One or more per classification scheme
    Registrar - contact                                         One or more per classification scheme

    9.2.6. STRUCTURE OF A METADATA REGISTRY

    9.2.6.1 METAMODEL FOR A METADATA REGISTRY

    A metamodel is a model that describes other models. A metamodel provides a

    mechanism for understanding the precise structure and components of the specified

    models, which are needed for the successful sharing of the models by users and/or

    software facilities.

    The registry metamodel is specified as a conceptual data model, i.e. one that describes

    how relevant information is structured in the natural world. In other words, it is how the

    human mind is accustomed to thinking of the information.

    As a conceptual data model, there need be no one-to-one match between the attributes in

    the model and fields, columns, objects, et cetera in a database. There may be more than

    one field per attribute and some entities and relationships may be implemented as fields.

    The model shows constraints on minimum and maximum occurrences of attributes. The

    constraints on maximum occurrences are to be enforced at all times. The constraints on

    minimum occurrences are to be enforced when the registration status for the metadata


    item is "recorded" or higher. In other words, a registration status of "recorded" indicates

    that all mandatory attributes have been documented.
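    The rule on occurrence constraints can be sketched as a simple check. The attribute names and their occurrence limits below are taken from the Administration Record rows of the table in 9.2.5; everything else is an assumption made for illustration, in particular the function name violations and the ordering of statuses in STATUS_ORDER, of which this document names only "recorded".

```python
from typing import Dict, List, Optional, Tuple

# (minimum, maximum) occurrences per item; None would mean "no upper limit".
OCCURRENCES: Dict[str, Tuple[int, Optional[int]]] = {
    "item identifier": (1, 1),
    "registration status": (1, 1),
    "creation date": (1, 1),
    "last change date": (0, 1),
}

# Assumed ordering of registration statuses, lowest first.
STATUS_ORDER = ["incomplete", "candidate", "recorded", "qualified", "standard"]


def violations(attributes: Dict[str, List[str]], status: str) -> List[str]:
    """List the occurrence constraints that the given attributes break."""
    enforce_minimum = STATUS_ORDER.index(status) >= STATUS_ORDER.index("recorded")
    problems = []
    for name, (minimum, maximum) in OCCURRENCES.items():
        count = len(attributes.get(name, []))
        # Maximum occurrences are enforced at all times.
        if maximum is not None and count > maximum:
            problems.append(f"{name}: at most {maximum} allowed, found {count}")
        # Minimum occurrences apply only from "recorded" upwards.
        if enforce_minimum and count < minimum:
            problems.append(f"{name}: at least {minimum} required, found {count}")
    return problems


draft = {"item identifier": ["CS-001"], "registration status": ["candidate"]}
```

    With this sketch, the draft item passes while its status is "candidate" but is reported as missing its creation date once the status is "recorded", whereas a second last change date is rejected at any status.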

    9.2.6.2 APPLICATION OF THE METAMODEL

    Some of the objectives of the metamodel for a Metadata Registry are to:

    a) provide a unified view of concepts, terms, value domains and value meanings;

    b) promote a common understanding of the data described;

    c) enable the sharing and reuse of the contents of implementations.

    A metamodel is necessary for coordination of data representation between persons and/or

    systems that store, manipulate and exchange data. The metamodel will assist registrars in

    maintaining consistency among different registries. The metamodel enables systems tools

    and information registries to store, manipulate and exchange the metadata for data

    attribution, classification, definition, naming, identification, and registration. In this

    manner, consistency of data content supports interoperability among systems tools and

    information registries.

    9.2.6.3 SPECIFICATION OF THE METAMODEL

    When using a model to specify another model, it is easy for the reader to become

    confused about which model is being referred to at any particular point. To minimize this

    confusion, this document deliberately uses different terms in the model being specified

    from those used to do the specification.

    The registry metamodel is specified using a subset of the Unified Modelling Language

    (UML). This document uses the term "metamodel construct" for the model constructs it

    uses, but "metadata objects" for the model constructs it specifies. The metamodel

    constructs used are: classes, relationships, association classes, attributes, composite

    attributes and composite data types.

    However, there are certain parallels between the two models. For example, the "Object

    Class" specified in the model is equivalent to the metamodel construct "class" used to

    specify the model, and the "Property" specified in the model is equivalent to the

    metamodel construct "attribute" used to specify the model. The different terms are used

    to make it clear which model is being referred to, not because they represent different

    concepts. One term that this document uses at both levels is "data type", but the level to

    which it applies should be apparent from the context in which it is used.

    9.2.6.4 TYPES, INSTANCES AND VALUES

    When considering data and metadata, it is important to distinguish between types of

    data/metadata, and instances of these types and their associated values. The metamodel

    specifies types of classes, attributes and relationships. Any particular instance of one of

    these will be of a specific type and at any point in time, that instance will have a specific

    value.

    A metadata registry will be populated with instances of these metadata objects (metadata

    items), which in turn define types of data, e.g. in an application database. In other words,

    instances of metadata specify types of application level data. In turn, the application

    database will be populated by the real world data as instances of those defined data types.
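    The three levels described above (metamodel types, metadata instances, application data) can be made concrete with a short Python sketch. The class DataElement, its is_valid rule and the sample rows are hypothetical; only the name "Article Number" and its definition are borrowed from the examples later in this document.

```python
from dataclasses import dataclass
from typing import Callable


# Metamodel level: a type of metadata object.
@dataclass(frozen=True)
class DataElement:
    name: str
    definition: str
    is_valid: Callable[[object], bool]  # assumed rule for permitted values


# Registry level: an instance of the metadata object (a metadata item).
# This instance in turn defines a type of application data.
ARTICLE_NUMBER = DataElement(
    name="Article Number",
    definition="A reference number that identifies an article.",
    is_valid=lambda value: isinstance(value, int) and value > 0,
)

# Application level: real world data as instances of the defined data type.
application_rows = [{"Article Number": 1042}, {"Article Number": -7}]

valid_rows = [
    row
    for row in application_rows
    if ARTICLE_NUMBER.is_valid(row[ARTICLE_NUMBER.name])
]
```

    Here ARTICLE_NUMBER is a value at the registry level and, at the same time, the type against which application values such as 1042 are checked.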

    9.2.6.5 DATE REFERENCES

    Dates are important attributes of an Administration Record and of operations of a

    registry. Gregorian calendar dates are used (see ISO 8601:2000), and the

    associated default representation is YYYY-MM-DD (i.e. Year-Month-Day). For

    example, 12 October 2001, if referenced in numeric form, should be 2001-10-12 and not,

    for example, 12-10-2001 (which might be confused with 10 December 2001).

    A standardized Ethiopian date representation system should be used where data and

    metadata expressed in the Ethiopian calendar system are encountered.
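    For Gregorian dates, the YYYY-MM-DD representation can be produced and checked with the Python standard library, as in the following minimal sketch. The function names to_registry_date and parse_registry_date are assumptions made for illustration; the sketch does not cover Ethiopian calendar dates.

```python
from datetime import date


def to_registry_date(value: date) -> str:
    """Format a Gregorian date as YYYY-MM-DD."""
    return value.isoformat()


def parse_registry_date(text: str) -> date:
    """Parse a YYYY-MM-DD string; raises ValueError for other layouts."""
    return date.fromisoformat(text)


stamp = to_registry_date(date(2001, 10, 12))  # "2001-10-12"
```

    A string such as 12-10-2001 is rejected by parse_registry_date rather than being silently read as either 12 October or 10 December.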

    9.2.6.6 DATA DEFINITION REQUIREMENTS AND RECOMMENDATIONS

    A listing of the requirements and recommendations without explanations is provided in

    this clause for convenience of the user. The intent is to facilitate ease of use of this

    document once an understanding of the requirements and recommendations is achieved.

    9.2.6.6.1 REQUIREMENTS

    A data definition shall:

    a) Be stated in the singular

    b) State what the concept is, not only what it is not

    c) Be stated as a descriptive phrase or sentence(s)

    d) Contain only commonly understood abbreviations

    e) Be expressed without embedding definitions of other data or underlying concepts

    9.2.6.6.2 RECOMMENDATIONS

    A data definition should:

    a) State the essential meaning of the concept

    b) Be precise and unambiguous

    c) Be concise

    d) Be able to stand alone

    e) Be expressed without embedding rationale, functional usage, or procedural information

    f) Avoid circular reasoning

    g) Use the same terminology and consistent logical structure for related definitions

    h) Be appropriate for the type of metadata item being defined

    9.2.6.6.3 PROVISIONS

    9.2.6.6.3.1 PREMISES

    Data is used for specific purposes. Differences in use require different operational

    manifestations of some requirements and recommendations. The primary characteristics

    deemed necessary to convey the essential meaning of a particular definition will vary

    according to the level of generalization or specialization of the data. Primary and

    essential characteristics for defining concepts such as "airport" in the commercial air

    transportation industry might be specific, whereas a more general definition may be

    adequate in a different context. Within a metadata registry, multiple equivalent

    definitions may be written in different languages or, within a single language, for

    different audiences such as children, general public, or subject area specialists. For a

    discussion of relationships between concepts in different contexts and how characteristics

    are used to differentiate concepts, see ISO 704, Clause 5. Definitions should be written to

    facilitate understanding by any user and by recipients of shared data.

    9.2.6.6.3.2 REQUIREMENTS


    To facilitate understanding of the requirements for construction of well-formed data

    definitions, explanations and examples are provided below. Each requirement is followed

    by a short explanation of its meaning.

    Examples are given to support the explanations. In all cases, a good example is provided

    to exemplify the explanation. When deemed beneficial, a poor, but commonly used

    example is given to show how a definition should NOT be constructed. To further

    explain the differences between the good and poor examples, examples are followed by a

    statement of rationale behind them. Note that the examples below are definitions for data

    elements and these definitions are illustrative.

    A data definition shall:

    A) BE STATED IN THE SINGULAR

    EXPLANATION - The concept expressed by the data definition shall be expressed in

    the singular. (An exception is made if the concept itself is plural.)

    EXAMPLE - Article Number

    1) Good definition: A reference number that identifies an article.

    2) Poor definition: Reference number identifying articles.

    REASON - The poor definition uses the plural word "articles", which is ambiguous,

    since it could imply that an article number refers to more than one article.

    B) STATE WHAT THE CONCEPT IS, NOT ONLY WHAT IT IS NOT

    EXPLANATION - When constructing definitions, the concept cannot be defined

    exclusively by stating what the concept is not.

    EXAMPLE - Freight Cost Amount

    1) Good definition: Cost amount incurred by a shipper in moving goods from one place

    to another.

    2) Poor definition: Costs which are not related to packing, documentation, loading,

    unloading, and insurance.

    REASON - The poor definition does not specify what is included in the meaning of the

    data.

    C) BE STATED AS A DESCRIPTIVE PHRASE OR SENTENCE(S) (in most

    languages)


    EXPLANATION - A phrase is necessary (in most languages) to form a precise

    definition that includes the essential characteristics of the concept. Simply stating one or

    more synonym(s) is insufficient. Simply restating the words of the name in a different

    order is insufficient. If more than a descriptive phrase is needed, use complete,

    grammatically correct sentences.

    EXAMPLE - Agent Name

    1) Good definition: Name of party authorized to act on behalf of another party.

    2) Poor definition: Representative.

    REASON - "Representative" is a near-synonym of the data element name, which is not

    adequate for a definition.

    D) CONTAIN ONLY COMMONLY UNDERSTOOD ABBREVIATIONS

    EXPLANATION - Understanding the meaning of an abbreviation, including acronyms

    and initialisms, is usually confined to a certain environment. In other environments the

    same abbreviation can cause misinterpretation or confusion. Therefore, to avoid

    ambiguity, full words, not abbreviations, shall be used in the definition.

    Exceptions to this requirement may be made if an abbreviation is commonly understood,

    such as "i.e." and "e.g.", or if an abbreviation is more readily understood than the full

    form of a complex term and has been adopted as a term in its own right, such as "radar",

    standing for "radio detecting and ranging".

    All acronyms must be expanded on the first occurrence.

    EXAMPLE 1 - Tide Height

    1) Good definition: The vertical distance from mean sea level (MSL) to a specific tide

    level.

    2) Poor definition: The vertical distance from MSL to a specific tide level.

    REASON - The poor definition is unclear because the acronym, MSL, is not commonly

    understood and some users may need to refer to other sources to determine what it

    represents. Without the full word, finding the term in a glossary may be difficult or

    impossible.

    EXAMPLE 2 - Unit of Density Measurement


    1) Good definition: The unit employed in measuring the concentration of matter in terms

    of mass per unit (m.p.u.) volume (e.g., pound per cubic foot; kilogram per cubic meter).

    2) Poor definition: The unit employed in measuring the concentration of matter in terms

    of m.p.u. volume (e.g., pound per cubic foot; kilogram per cubic meter).

    REASON - m.p.u. is not a common abbreviation, and its meaning may not be understood

    by some users. The abbreviation should be expanded to full words.
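    The rule that acronyms must be expanded on first occurrence lends itself to a rough automated check. The sketch below is a heuristic only, under two stated assumptions: an acronym is a run of two or more capital letters, and it counts as expanded if its first occurrence is enclosed in parentheses, as in "mean sea level (MSL)". The function name unexpanded_acronyms is hypothetical, and dotted lower-case abbreviations such as "m.p.u." are not detected.

```python
import re
from typing import List

ACRONYM = re.compile(r"\b[A-Z]{2,}\b")


def unexpanded_acronyms(definition: str) -> List[str]:
    """Acronyms whose first occurrence is not written as '(ACRONYM)'."""
    seen = set()
    unexpanded = []
    for match in ACRONYM.finditer(definition):
        token = match.group()
        if token in seen:
            continue  # only the first occurrence is checked
        seen.add(token)
        start, end = match.span()
        in_parentheses = (
            start > 0
            and definition[start - 1] == "("
            and end < len(definition)
            and definition[end] == ")"
        )
        if not in_parentheses:
            unexpanded.append(token)
    return unexpanded


good = "The vertical distance from mean sea level (MSL) to a specific tide level."
poor = "The vertical distance from MSL to a specific tide level."
```

    Applied to the Tide Height example, the good definition yields no findings and the poor definition is flagged for MSL. A check of this kind can only support, not replace, review by a steward.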

    E) BE EXPRESSED WITHOUT EMBEDDING DEFINITIONS OF OTHER DATA

    OR UNDERLYING CONCEPTS

    EXPLANATION - As shown in the following example, the definition of a second data

    element or related concept should not appear in the definition proper of the primary data

    element. Definitions of terms should be provided in an associated glossary. If the second

    definition is necessary, it may be attached by a note at the end of the primary definition's

    main text or as a separate entry in the dictionary. Related definitions can be accessed

    through relational attributes (e.g., cross-reference).

    EXAMPLE 1 - Sample Type Code

    1) Good definition: A code identifying the kind of sample.

    2) Poor definition: A code identifying the kind of sample collected. A sample is a small

    specimen taken for testing. It can be either an actual sample for testing, or a quality

    control surrogate sample. A quality control sample is a surrogate sample taken to verify

    results of actual samples.

    REASON - The poor definition contains two extraneous definitions embedded in it. They

    are definitions of "sample" and of "quality control sample".

    EXAMPLE 2 - "Issuing Bank Documentary Credit Number"

    1) Good definition: Reference number assigned by issuing bank to a documentary credit.

    2) Poor definition: Reference number assigned by issuing bank to a documentary credit.

    A documentary credit is a document in which a bank states that it has issued a

    documentary credit under which the beneficiary is to obtain payment, acceptance, or

    negotiation on compliance with certain terms and conditions and against presentation of

    stipulated documents and such drafts as may be specified.


    REASON - The poor definitio