ETHIOPIAN INFORMATION COMMUNICATION TECHNOLOGY DEVELOPMENT AGENCY
(EICTDA)
Information Exchange Standard: Final Document
By
Standardization and Regulatory Project
October 2007 Addis Ababa
Contents
1 Introduction
2 Objective
3 Scope
4 Methodology
5 Benefits of the Project
6 Information Exchange Standard World Experience
7 Information Exchange Practices in Ethiopia
8 Partner Organizations
9 Information Exchange Standard Components
9.1 Data and Metadata Standard
9.1.1 Extensible Markup Language (XML)
9.1.2 XML schema
9.1.3 Resource description framework (RDF)
9.1.4 The metadata element
9.1.4.1 ETGLS metadata elements
9.2 Metadata Registries (MDR)
9.2.1 Fundamental model of data elements
9.2.2 Data elements in data management and interchange
9.2.3 Fundamental model of value domains
9.2.4 Fundamentals of classification schemes
9.2.5 Attributes of a classification scheme
9.2.6 Structure of a metadata registry
9.2.6.1 Metamodel for a metadata registry
9.2.6.2 Application of the metamodel
9.2.6.3 Specification of the metamodel
9.2.6.4 Types, instances and values
9.2.6.5 Date references
9.2.6.6 Data definition requirements and recommendations
9.2.6.6.1 Requirements
9.2.6.6.2 Recommendations
9.2.6.6.3 Provisions
9.2.6.6.3.1 Premises
9.2.6.6.3.2 Requirements
9.2.6.6.3.3 Recommendations
9.3 Interoperability Standards
9.3.1 Background Information
9.3.2 Layer Models
9.3.2.1 Network Layer
9.3.2.1.1 Network Protocols
9.3.2.1.2 Transmission Control Protocol (TCP)
9.3.2.1.3 User Data Protocol (UDP)
9.3.2.1.4 File Transfer Protocols (FTP)
9.3.2.1.5 Mail Transfer Protocol (MTP)
9.3.2.1.6 Registry Protocols
9.3.2.1.7 Directory Protocols
9.3.2.1.8 Messaging Protocols
9.3.2.2 Data Integration Layer
9.3.2.2.1 Primary Character Set
9.3.2.2.2 Universal Character Code (UTC)
9.3.2.2.3 Structured Web Document Language
9.3.2.2.3.1 Hyper Text Interchange Format (HTML v4.01)
9.3.2.2.3.2 Hypertext Interchange Formats XML v1.0
9.3.2.3 Web Services Layer
9.3.2.3.1 Universal Description, Discovery, and Integration (UDDI v3)
9.3.2.3.2 Web Services Description Language (WSDL)
9.3.2.3.3 Simple Object Access Protocol (SOAP)
9.3.2.4 Access and Presentation Layer
9.3.2.4.1 Presentations
9.3.2.4.2 Data Modeling
9.3.2.5 Security Layer
9.3.2.5.1 Internet Protocol Security Protocol (IPSEC)
9.3.2.5.2 Transport Layer Security (TLS)
Use of Open Standards
Updating the Document
Bibliography
Abbreviations
AES      Advanced Encryption Standard
AGLS     Australian Government Locator Service
ASCII    American Standard Code for Information Interchange
DES      Data Encryption Standard
DNS      Domain Name System
DSA      Digital Signature Algorithm
DTD      Document Type Definition
ETGLS    Ethiopian Government Locator Service
FTP      File Transfer Protocol
HTTP     Hypertext Transfer Protocol
IEC      International Electrotechnical Commission
IP       Internet Protocol
IPSEC    Internet Protocol Security Protocol
IP v4    Internet Protocol version 4
IP v6    Internet Protocol version 6
ISO      International Organization for Standardization
MDR      Metadata Registries
MIME     Multipurpose Internet Mail Extensions
NZGLS    New Zealand Government Locator Service
PDF      Portable Document Format
RDF      Resource Description Framework
RFC      Request for Comments
SMTP     Simple Mail Transfer Protocol
S/MIME   Secure MIME
SOAP     Simple Object Access Protocol
TCP      Transmission Control Protocol
TLS      Transport Layer Security
UDDI     Universal Description, Discovery and Integration
WSDL     Web Services Description Language
W3C      World Wide Web Consortium
XML      Extensible Markup Language
1. INTRODUCTION
Information is a resource that activates various sectors of the economy, making it
possible for producers and consumers to be linked to markets. Availability of information
allows for the public to participate meaningfully in governance, through engaging in
public discussions and contributing to decision-making.
If the national development programmes of poverty eradication, decentralization and the
Plan for Modernization of Agriculture (PMA), and above all the plan to leap-frog to the
information society, are to succeed, information has to be made available at all levels of
society, from the national level through districts and sub-counties down to the grass
roots. Only open communication channels that allow information to flow in all directions
make it possible to identify and fulfil the information needs of the various interest
groups.
The ability of government organizations to share and exchange information, and to
integrate information and business processes by using common standards, is termed
interoperability. It is critical for achieving e-government goals, because it gives any
agency the capability to join with others electronically using known and agreed
approaches.
It is known that organizations cannot operate in isolation. They are dependent on each
other and resource sharing is a must for survival. However, this co-operation needs some
sort of conformity or standard, i.e., the success of information exchange, whether it is
practiced nationally, regionally or internationally, depends on the availability of
commonly agreed standards.
It is with this understanding that the Ethiopian government gave high priority to the
preparation of an information exchange standard that will bind government agencies
together. This is clearly reflected in the country's Draft ICT Policy, which notes the need
for standardized data collection, processing and data exchange procedures.
An electronic document exchange, which is one form of information exchange, enables
companies to:
- integrate quickly and easily with multiple trading partners [1] once connected to a central hub;
- focus on building their business by connecting quickly and cost-effectively with their trading partners;
- extend the return on investment of their existing systems;
- drive down the operating and maintenance costs of multiple trading partner integration; and
- utilize data from external sources that is transformed into formats that can be understood by internal applications.
2. OBJECTIVE
To formulate and develop an information exchange standard that government organizations
will use to exchange information.
3. SCOPE
The scope of the project is limited to the preparation of an information exchange standard
that will bind government organizations and agencies when they exchange information; that
is, a standard for creating and managing information resources and services that are
locatable via the Internet.
4. METHODOLOGY
A survey was conducted to assess the existing information exchange situation in Ethiopia.
The methodology also included a review of the literature on the subject; the sources were
mainly Internet sources and ISO/IEC standards. A questionnaire was used to gather primary
data on the information exchange and transfer situation among some focal government
organizations. Local and foreign experts in the area were also consulted and interviewed in
the course of the process.
[1] Agencies that share or exchange information among themselves are said to be trading partners.
5. BENEFITS OF THE PROJECT
This project has at least the following benefits:
- Organizations avoid having to rebuild or significantly alter their systems in order to share information;
- Information exchange modules across government organizations will have a common interface;
- Information exchange module development becomes faster and more efficient as more organizations engage in it;
- Software developed to communicate with the interface of a previous organization can be reused;
- Information exchange between ICT applications and users will be developed; and
- Different enterprises will be able to share and exchange information irrespective of the particular technology in use.
6. INFORMATION EXCHANGE STANDARD WORLD EXPERIENCE
Standards enable different information systems to share and exchange information
irrespective of the particular technologies in use. Moreover, creating and adopting
information exchange standards means that local, state, tribal, and federal organizations
avoid the problem of rebuilding or significantly altering their systems to share
information.
One good example of an information exchange standard is the National Information
Exchange Model (NIEM), a partnership of the U.S. Department of Justice (DOJ) and the
U.S. Department of Homeland Security (DHS) (www.NIEM.org). It is designed to develop,
disseminate, and support enterprise-wide Information Exchange Package Documentation
(IEPD) and processes that will enable jurisdictions and agencies throughout the nation to
effectively share critical information in both emergency and routine situations.
NIEM focuses on discrete information exchanges between agency information systems.
NIEM will provide the information sharing structure necessary for first responders and
decision makers to have the right information to prepare for, prevent, and respond to
major terrorist events and natural disasters.
It will also enhance the day-to-day capabilities of practitioners at all levels of government
in making crucial decisions about border enforcement, passenger screening, port security,
intelligence analysis, local law enforcement and judicial processing, correctional
supervision and release, and a variety of other governmental functions.
NIEM's primary objectives are to:
- Bring stakeholders together to identify information sharing requirements for operational and emergency situations.
- Maintain a national model containing universal, common, and domain-specific data components that pertain to agency information needs in order to facilitate development of an IEPD.
- Develop standards, a common vocabulary, and an online repository of IEPDs to support information sharing.
- Provide technical tools to support development, discovery, dissemination, and reuse of IEPDs.
- Provide training, technical assistance, and implementation support services, as appropriate.
Developing and implementing NIEM exchange standards means that the major
investments local, state, tribal, and federal governments have made in existing
information systems can be leveraged and that these governments can efficiently
participate in a truly national information sharing environment. NIEM standards enable
different information systems to share and exchange information, irrespective of the
particular technologies in use. Moreover, creating and adopting NIEM standards means
that local, state, tribal, and federal organizations avoid the problem of rebuilding or
significantly altering their systems to share information.
The New Zealand e-Government initiative and the Australian experience can also be cited
as having comprehensive information exchange and sharing standards and guidelines
(http://www.e.govt.nz and http://www.agls.gov.au) which can be adapted to our situation.
7. INFORMATION EXCHANGE PRACTICES IN ETHIOPIA
Information exchange in Ethiopia can be seen from three perspectives: the traditional era,
before the reign of Menilik II; the semi-modern era, after Menilik II; and the modern era,
mainly after the introduction of Internet technology in the country.
During the traditional era, information was transmitted and exchanged mainly among people
and from monasteries to their followers. Mosques also had a share in this traditional
dissemination of information.
During this period, methods such as stories, Nagarits and messengers were used to pass
cultural histories and heritage from generation to generation, and to pass on the news that
something urgent had happened at a specific place. This way of passing on information is
time-consuming, prone to error and causes significant delay in decision making, which in
turn contributed to the underdevelopment of the society.
The second era, the semi-modern period, is the time during which other methods of
information dissemination came into being, such as the telegram, postal mail and fax. These
technologies brought rapid change in information dissemination and exchange, at least among
urban settlers and sometimes with people in the diaspora. Their coverage was so limited,
however, that the majority of citizens did not benefit from them. Of these methods, the
postal mail system was somewhat more widely spread than the others during this period.
The third period began with the introduction of computer technology, specifically the
arrival of the Internet in the country, which revolutionized the way people communicate and
exchange at least personal messages, if not corporate data. This is facilitated by the sole
Internet service provider in the country, the Ethiopian Telecommunications Corporation. The
expansion of fixed telephone coverage, and the introduction of mobile and wireless
telephony and its penetration into rural villages, totally changed the way we communicate
and exchange information. Coverage is still so low, however, that the majority of the
population does not have access to such technology.
Questionnaires were distributed to government organizations in order to assess the existing
information exchange situation. The specific objectives of the questionnaire included:
- determining the level and status of information exchange in different organizations;
- identifying the problems users encounter due to the lack of an information exchange standard; and
- identifying which information exchange methods are used.
Questionnaires were distributed to forty (40) organizations, of which 27 (67.5%) responded.
This suffices to analyze the data and draw conclusions about the information exchange
situation in government organizations in Ethiopia.
Communication Methods Used
Organizations were asked what communication methods they employ to exchange data within
and outside their organizations. According to their responses, 81.48% use office boys and
the same share use LAN/WAN, 85.18% use postal mail and the same share use fax, while
88.88% of the respondent organizations use the telephone. The data therefore show that the
telephone is the most widely used means of communication both within and outside the
organizations. The result is shown in the following table.
Communication Method    Number    Percentage
Office boy              22        81.48
LAN/WAN                 22        81.48
Postal mail             23        85.18
Telephone               24        88.88
Fax                     23        85.18
Others                  0         0
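Each percentage in the table is a method's count divided by the 27 responding organizations; because an organization may use several methods, the column does not sum to 100. The arithmetic can be sketched in Python as below. The variable and function names are illustrative only. Rounding to two decimals gives 85.19 and 88.89 where the report, which appears to truncate, states 85.18 and 88.88.

```python
# Number of organizations that returned the questionnaire (from the survey text).
RESPONDENTS = 27

# Counts per communication method, as reported in the table.
counts = {
    "Office boy": 22,
    "LAN/WAN": 22,
    "Postal mail": 23,
    "Telephone": 24,
    "Fax": 23,
    "Others": 0,
}


def percentage(count, total=RESPONDENTS):
    """Share of respondents using a method, rounded to two decimals."""
    return round(100 * count / total, 2)


# Multiple answers were allowed, so the shares need not add up to 100.
shares = {method: percentage(count) for method, count in counts.items()}

for method, share in shares.items():
    print(f"{method:<12} {counts[method]:>3} {share:>7.2f}")
```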
Organizations that use a LAN to communicate with other similar organizations were asked
which information exchange technologies they use. Dial-up Internet access, digital data
communication technologies such as DDN with a Frame Relay configuration, broadband
wireless and ADSL were each mentioned by very few respondents, while the majority did not
answer the question.
Existence of Rules, Regulations and Procedures for Data Exchange
Organizations were asked whether they have rules, regulations or standards, defined by
their own organization, that they are obliged to follow when exchanging information with
other organizations. The majority (63%) replied that they have no such rules, while 37% do
have rules concerning personnel data, financial information exchange and other means of
controlling access to sensitive information and resources of the organization. The result
is shown in the following table:
Existence of Rules and Regulations    Number    Percentage
Yes                                   10        37
No                                    17        63
Total                                 27        100
Use of the Internet
The majority of the organizations (70.37%) use the Internet heavily in their day-to-day
activities, while 29.63% use it but not much. No respondent organization is without an
Internet connection. This could ease the plan to interconnect organizations so that they
can exchange information.
Level of Internet Usage    Number    Percentage
Heavily                    19        70.37
Not much                   8         29.63
None                       0         0
Total                      27        100
E-mail Usage
Almost all the respondent organizations use e-mail to communicate within and outside their
organization, for purposes that vary with the area of work in which the organization is
engaged.
Link with other organizations
Organizations were asked with which types of organizations they communicate. They
communicate and exchange information mainly with organizations to which they are linked,
directly or indirectly: organizations doing similar work, regional bureaus, higher learning
institutions, branch organizations, government media, agencies and international
organizations.
Software Usage
Most of the organizations use Microsoft Word when exchanging documents through e-mail.
Adobe Acrobat, HTML and other application packages such as Adobe PageMaker, Microsoft
Excel and Microsoft Outlook are also used for some purposes.
Type of Software    Number    Percentage
Microsoft Word      23        85.18
Adobe Acrobat       15        55.56
HTML                12        44.44
Others              4         14.81
Graphic and Other Types of File Formats
Organizations use different file formats for exchanging graphic files. The majority of the
respondents (77.78%) use the JPEG format. BMP (55.56%) and GIF (48.15%) are also widely
used, and other formats such as TIFF are used occasionally.
File Format    Number    Percentage
JPEG           21        77.78
BMP            15        55.56
GIF            13        48.15
Others         1         3.7
For accounting and database file exchange, MS SQL, Oracle, ADA, Peachtree Accounting and
MS Excel are the software most widely used by the respondent organizations; other database
management software is used depending on specific requirements. This shows that the
software products in use are varied and complex, which makes it difficult to share and
exchange information and other necessary documents within and outside the organization.
Organizational Website
Among the respondents, the majority (88.89%) have their own website; only 11.11% do not
have one.
Data Storage Mechanism
Organizations were asked about their data storage mechanisms. The majority of the
respondents (68.4%) keep their organizational data using both semi-computerized and
computerized systems.
Method of Data Storage    Number    Percentage
Manually                  12        63%
Semi-computerized         13        68%
Computerized              13        68%
This shows a strong trend towards computer-based systems for storing organizational data.
These systems only need to be consistently defined and designed to make them interoperable
and hence able to exchange information with other organizations.
Data Security (Protection)
92.5% of the respondents said that they use different mechanisms to protect and secure
sensitive information and other organizational resources. Among the mechanisms used to
protect data and information, physical security, password, access privilege, firewall,
backup systems and using antivirus are the most common mechanisms mentioned by the
respondents.
Request for Information Exchange Standard
A very high number of respondents said that standardization of information exchange is
essential to simplify electronic information exchange and requested the development of
the standard as urgently as possible.
8. PARTNER ORGANIZATIONS
Organizations that are to share or exchange information among themselves are partners. In
the Ethiopian context, it is mainly selected government organizations that are going to
exchange information, as there is a project to develop contents and applications for these
organizations. They include, among others, the Central Statistics Authority, Ministry of
Agriculture, Ministry of Education, Ministry of Trade and Industry, Ministry of Finance
and Economic Development, National Bank of Ethiopia, National Library & Archives,
Federal Inland Revenue, Ministry of Health, Customs Authority and Federal Civil Service
Commission. Other government and non-government organizations can join the partners and
use the information exchange standard as a guide.
9. INFORMATION EXCHANGE STANDARD COMPONENTS
The information exchange standards suggested in this document comprise components such as
data and metadata standards, interoperability standards, security standards and other
standards.
9.1 DATA AND METADATA STANDARD
9.1.1 EXTENSIBLE MARKUP LANGUAGE (XML)
The Extensible Markup Language (XML) standard describes the basic format for data
transport. XML documents are independent of both platform and hardware and can be used as
the "glue" that holds relatively independent network applications together. XML is a
subset of SGML (Standard Generalized Markup Language). It is a markup metalanguage,
designed to simplify data interchange on the Web.
XML (in contrast to HTML) is not really a markup language but a metalanguage for creating
one's own markup languages (data dictionaries). Thus there is no fixed set of XML tags;
rather, XML makes it possible to create sets of arbitrary tags with arbitrary semantics.
The primary goal of XML is the strict separation of data from data processing. An XML
document carries only pure data, without information about how the data is processed (or
presented).
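The separation of data from processing can be illustrated with a small Python sketch that uses the standard library's xml.etree.ElementTree. The record and its tag names (report, title, agency, year) are hypothetical and are not defined by this standard; the point is only that one XML document can feed two unrelated presentations.

```python
import xml.etree.ElementTree as ET

# Hypothetical record: the tag names are illustrative, not part of the standard.
RECORD = """
<report>
  <title>Annual Education Statistics</title>
  <agency>Ministry of Education</agency>
  <year>2007</year>
</report>
"""


def parse_record(xml_text):
    """Read the child elements of the root into a plain dictionary."""
    root = ET.fromstring(xml_text.strip())
    return {child.tag: child.text for child in root}


def as_plain_text(record):
    """One possible presentation: 'name: value' lines."""
    return "\n".join(f"{name}: {value}" for name, value in record.items())


def as_csv_row(record):
    """Another presentation of the same data: a comma-separated row."""
    return ",".join(record.values())


record = parse_record(RECORD)
print(as_plain_text(record))
print(as_csv_row(record))
```

The XML itself says nothing about either output format; each consuming application decides how to process or present the data.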
9.1.2 XML SCHEMA
The purpose of an XML Schema: Structures schema is to define and describe a class of
XML documents by using schema components to constrain and document the meaning,
usage and relationships of their constituent parts: datatypes, elements and their content
and attributes and their values. Schemas may also provide for the specification of
additional document information, such as normalization and defaulting of attribute and
element values. Schemas have facilities for self-documentation. Thus, XML Schema:
Structures can be used to define, describe and catalogue XML vocabularies for classes of
XML documents.
Any application that consumes well-formed XML can use the XML Schema: Structures
formalism to express syntactic, structural and value constraints applicable to its document
instances. The XML Schema: Structures formalism allows a useful level of constraint
checking to be described and implemented for a wide spectrum of XML applications.
However, the language defined by this specification does not attempt to provide all the
facilities that might be needed by any application. Some applications may require
constraint capabilities not expressible in this language, and so may need to perform their
own additional validations.
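The last point, that an application may need its own additional validations, can be sketched in Python. This is not XML Schema validation (the Python standard library does not include a schema validator); it is a hypothetical application-level check that would run in addition to schema validation. The tag names and the cross-field rule are assumptions made for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical elements that this application insists on.
REQUIRED = ("title", "creator", "date")


def extra_checks(xml_text):
    """Return a list of problems found by application-specific rules."""
    root = ET.fromstring(xml_text)
    errors = []

    # Rule 1: each required child element must exist and be non-empty.
    for tag in REQUIRED:
        node = root.find(tag)
        if node is None or not (node.text or "").strip():
            errors.append(f"missing or empty <{tag}>")

    # Rule 2: a cross-field constraint. YYYY-MM-DD strings sort
    # chronologically, so plain string comparison is enough here.
    created = root.findtext("date")
    modified = root.findtext("modified")
    if created and modified and modified < created:
        errors.append("<modified> is earlier than <date>")

    return errors
```

An empty list means the document passed these extra rules; any schema-level validation would still be carried out separately.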
9.1.3 RESOURCE DESCRIPTION FRAMEWORK (RDF)
The RDF standard defines a common language for representing information on the Web. RDF
is designed especially for describing metadata about Web resources. It is focused on the
automatic (machine-to-machine) exchange of information about resources, without loss or
modification of meaning during data exchange.
RDF provides a common framework for data exchange and supports extensibility: developers
can incorporate their own data dictionaries into RDF.
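As a minimal sketch of how RDF describes a Web resource, the Python code below builds an RDF/XML description using Dublin Core properties and then reads it back as subject, property, value statements. It uses only the standard library; the resource URI and the property values are hypothetical, while the two namespace URIs are the published RDF and Dublin Core element namespaces.

```python
import xml.etree.ElementTree as ET

RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
DC_NS = "http://purl.org/dc/elements/1.1/"

# Ask ElementTree to serialize with the conventional prefixes.
ET.register_namespace("rdf", RDF_NS)
ET.register_namespace("dc", DC_NS)


def describe(resource_uri, properties):
    """Build an rdf:RDF element holding one rdf:Description of the resource."""
    root = ET.Element(f"{{{RDF_NS}}}RDF")
    description = ET.SubElement(
        root, f"{{{RDF_NS}}}Description", {f"{{{RDF_NS}}}about": resource_uri}
    )
    for name, value in properties.items():
        ET.SubElement(description, f"{{{DC_NS}}}{name}").text = value
    return root


def statements(root):
    """List (subject, property, value) statements found in the description."""
    found = []
    for description in root.findall(f"{{{RDF_NS}}}Description"):
        subject = description.get(f"{{{RDF_NS}}}about")
        for prop in description:
            found.append((subject, prop.tag, prop.text))
    return found


# Hypothetical resource and values, for illustration only.
doc = describe(
    "http://example.org/reports/2007",
    {"title": "Annual Report", "creator": "Ministry of Health"},
)
print(ET.tostring(doc, encoding="unicode"))
```

Because each statement names its property through a namespace, another data dictionary can be added by introducing a further namespace without disturbing the existing statements.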
9.1.4 THE METADATA ELEMENT
The Dublin Core metadata element set is a standard (NISO Standard Z39.85-2001,
http://www.niso.org/standards/standard_detail.cfm?std_id=725) for cross-domain information
resource description. In other words, it provides a simple and standardized set of
conventions for describing things online in ways that make them easier to find. Dublin
Core is widely used to describe digital materials such as video, sound, image, text, and
composite media like web pages. Implementations of Dublin Core are typically based on XML
and the Resource Description Framework.
Such a set of metadata elements improves the visibility, accessibility and interoperability
of government information and services by providing standardized Web-based resource
descriptions that enable users to locate the information or service they require.
The Ethiopian Government Locator Service (ETGLS) element set described later in this
document is a set of 19 descriptors that resulted from a review of international best
practices such as the New Zealand, Australian and Dublin Core data and metadata standards.
The Australian Government Locator Service (AGLS) and the New Zealand Government Locator
Service (NZGLS) data and metadata standards are more complex element sets than the Dublin
Core standard, in the sense that they contain a number of element qualifiers which enable
them to describe more categories of resources and allow richer description of resources.
ETGLS is adapted mainly from AGLS and NZGLS.
Despite this, ETGLS is entirely compatible and interoperable with the Dublin Core
element set. It is envisaged that ETGLS can coexist with other metadata standards based
on different semantics.
The ETGLS Metadata Standard will be an Ethiopian standard for cross-domain resource
description.
The ETGLS metadata set is intended for use by any organisation or individual creating or
managing information sources or services that are locatable via the Internet. In particular,
it is intended for information about resources and services on the World Wide Web. For
the purposes of ETGLS metadata, a resource will typically be an online information or
service resource, but may be applied more broadly to people and organisations, and
information or services that are not available online.
This standard describes the ETGLS element set and qualifiers. It does not define or
describe the detailed criteria by which the element set and qualifiers will be implemented
in specific projects and applications by individuals and organizations.
The ETGLS 19-element set is described below with:
- a unique, machine-understandable, single-word element name, intended for syntactic use in computer programs, which makes the specification of elements simpler for encoding schemes;
- a label, which is intended to convey a common understanding of the element;
- a definition, the semantics or meaning of the element;
- obligation, an indication of whether the element must be used to comply with the ETGLS standard; and
- a comment, which further expands or refines the meaning of the element and how it may be used, and may include examples.
Where qualifiers are used, they are described in much the same pattern below the element
description with the addition of qualifier type.
Five metadata elements must be present for compliance with this standard.
The mandatory elements are: Creator; Title; Date; Subject OR Function; and Identifier OR
Availability.
In the case of Subject or Function, this standard specifies that at least one of those two
elements must appear in a metadata description.
The obligation applicable to the last element changes depending on whether the resource
described is available online or offline. If the resource is available online, then Identifier
is mandatory. If it is a resource only available offline, then Availability is mandatory.
In addition, this standard requires the use of the Publisher element for descriptions of
information resources, but not for resources which are transactional services.
All other elements are optional, and all elements are repeatable. Metadata elements may
appear in any order. It is assumed that metadata instances based on this standard will
specify the encoding scheme used for any element where this is appropriate. This
standard cannot specify the use of any particular schemes with specific elements.
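The obligation rules above can be sketched as a small Python check. This is only an illustration under stated assumptions: the function name, the flags `online` and `information_resource`, and the representation of a record as a dictionary of value lists (reflecting that all elements are repeatable) are hypothetical and not part of ETGLS.

```python
def missing_mandatory(record, online=True, information_resource=True):
    """Return the obligations that a metadata record does not meet.

    record maps an element name to a list of values, since every
    element is repeatable. An empty result means the record complies.
    """

    def present(name):
        return any(str(value).strip() for value in record.get(name, []))

    # Creator, Title and Date are always required.
    problems = [name for name in ("Creator", "Title", "Date") if not present(name)]

    # At least one of Subject or Function must appear.
    if not (present("Subject") or present("Function")):
        problems.append("Subject or Function")

    # Identifier for online resources, Availability for offline ones.
    if online and not present("Identifier"):
        problems.append("Identifier")
    if not online and not present("Availability"):
        problems.append("Availability")

    # Publisher is required for information resources, not for services.
    if information_resource and not present("Publisher"):
        problems.append("Publisher")

    return problems


# Hypothetical record, for illustration only.
bulletin = {
    "Creator": ["Ministry of Health"],
    "Title": ["Health Bulletin"],
    "Date": ["2007-10-01"],
    "Function": ["Public health"],
    "Identifier": ["http://example.org/bulletin"],
    "Publisher": ["Ministry of Health"],
}
print(missing_mandatory(bulletin))
```

Describing the same record as an offline resource (`online=False`) would report Availability as missing, which mirrors the way the last obligation switches between Identifier and Availability.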
Although some environments, such as HTML, are not case-sensitive, it is recommended
as a best practice to always adhere to the case conventions in the element and qualifier
names given in the standard to avoid conflicts if the metadata is subsequently extracted
and converted to a case-sensitive syntax, such as XML (eXtensible Markup Language).
Qualifiers
Qualifiers are additions and extensions to the metadata elements that provide information
about how the semantics (meaning) of an element have been refined, or about how the
value (specific content) of an element should be interpreted.
The guiding principle for using qualifiers with ETGLS elements is that a client (e.g. a
person or software) should be able to ignore any qualifier and use the description
(element content) as if it were unqualified. The remaining element value without the
qualifier should continue to be generally correct and useful for discovery and other
management purposes.
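This principle can be sketched in Python as follows. The dotted notation for a refined element (for example "Date.created"), the tuple layout and the sample values are assumptions made for this illustration; the standard does not prescribe them.

```python
def drop_qualifiers(statements):
    """Reduce qualified statements to plain (element, value) pairs.

    Each statement is a (name, scheme, value) tuple. A refinement is
    written after a dot in the name and the encoding scheme may be None.
    Both conventions are assumptions of this sketch.
    """
    plain = []
    for name, _scheme, value in statements:
        element = name.split(".", 1)[0]  # "Date.created" becomes "Date"
        plain.append((element, value))
    return plain


# Hypothetical qualified description of a resource.
qualified = [
    ("Date.created", "ISO8601", "2007-10-15"),
    ("Title.alternative", None, "IES"),
    ("Creator", None, "EICTDA"),
]
print(drop_qualifiers(qualified))
```

A client that does not understand the refinement "created" still obtains a correct, if less specific, Date value for the resource.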
ETGLS uses two types of qualifiers: element refinements and encoding schemes.
Element refinements
Element refinements refine the semantics (meaning) of the element by further specifying
the relationship of the element value to the resource itself. A refined element shares the
meaning of the unqualified element, but with a more restricted scope. For example, the
element Coverage can refer to legal or administrative scope (jurisdiction), to the
geographical scope (spatial), or to the period of time covered by the resource (temporal) [2].
The element refinements which may be used in ETGLS are listed in the description of
each element. It is expected that the element refinements will continue to change over
time. The ETGLS metadata set will be modified from time to time to specify the element
refinements which may be used for each element.
[2] Note that Coverage refers to the resource content, not to the date(s) for which the resource is valid, usable or available.
Encoding schemes
Encoding schemes indicate how the value of an element is to be interpreted when it has
been chosen from a controlled vocabulary or encoded according to an externally defined
standard. A value expressed using an encoding scheme will be either selected from a
controlled vocabulary (e.g. a term from a classification system or set of subject headings)
or a string formatted in accordance with a formal notation (e.g. "2000-01-01" as the
standard expression of a date). This standard is not prescriptive about available encoding
schemes for particular elements and does not attempt to specify available schemes for
each element. Most elements in the ETGLS element set may be qualified with an
encoding scheme.
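The two kinds of encoding scheme can be sketched in Python: a formal notation (an ISO 8601 calendar date, checked with the standard library's date.fromisoformat) and a controlled vocabulary. The scheme labels "ISO8601" and "SubjectTerms" and the vocabulary itself are hypothetical, since this standard deliberately does not prescribe particular schemes.

```python
from datetime import date

# Hypothetical controlled vocabulary of subject terms.
SUBJECT_TERMS = {"Agriculture", "Education", "Health", "Trade"}


def conforms(value, scheme):
    """Check whether a value is valid under the named encoding scheme."""
    if scheme == "ISO8601":
        # Formal notation: a calendar date such as "2000-01-01".
        try:
            date.fromisoformat(value)
        except ValueError:
            return False
        return True
    if scheme == "SubjectTerms":
        # Controlled vocabulary: the value must be one of the listed terms.
        return value in SUBJECT_TERMS
    raise ValueError(f"unknown encoding scheme: {scheme}")


print(conforms("2000-01-01", "ISO8601"))
print(conforms("Health", "SubjectTerms"))
```

A metadata instance would record the scheme name alongside the value so that a receiving system knows which check, or which vocabulary, applies.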
9.1.4.1 ETGLS METADATA ELEMENTS
Obligation: Mandatory
1. Element Name: Creator
Label: Creator
Definition: An entity primarily responsible for making the content of
the resource.
Obligation: Mandatory if known
Comment: Examples of a Creator include a person, or an organisation.
2. Element Name: Date
Label: Date
Definition: A date of an event in the lifecycle of the resource.
Obligation: Mandatory
Comment: Typically, Date will be associated with the creation or
availability of the resource. Recommended best practice
is to encode the date value as defined in a profile of ISO
8601, using the YYYY-MM-DD format, for materials
written in foreign languages. For local languages, the
format defined by the localization standard applies. For
example, the DD-MM-YY format is used for Amharic,
Afan Oromo and Tigrinya.
Qualifiers
Qualifier Name: created
Label: Created
Qualifier Type: element refinement
Definition: Creation date of the resource.
Qualifier Name: modified
Label: Modified
Qualifier Type: element refinement
Definition: Modification date of the resource.
Qualifier Name: valid
Label: Valid
Qualifier Type: element refinement
Definition: A date (often a range) of validity of a resource.
Comment: Typically, a date the resource becomes valid or ceases to be
valid, or the date range for which the resource is valid.
Qualifier Name: issued
Label: Issued
Qualifier Type: element refinement
Definition: A date on which the resource was made formally available
in its current form.
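The two date encodings named in the Date element comment can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and it merely reorders Gregorian date fields. Any calendar conversion that the localization standard may require is assumed to happen elsewhere.

```python
from datetime import date


def format_iso(d: date) -> str:
    """Return the ISO 8601 calendar date form, YYYY-MM-DD."""
    return d.isoformat()


def format_local(d: date) -> str:
    """Return the DD-MM-YY form that the text gives for local languages."""
    return d.strftime("%d-%m-%y")


def parse_iso(text: str) -> date:
    """Parse a YYYY-MM-DD string; raises ValueError if it is malformed."""
    return date.fromisoformat(text)


issued = date(2007, 10, 5)
print(format_iso(issued))    # ISO 8601 form
print(format_local(issued))  # day-month-year form
```

Storing the ISO form and producing the local form only for display keeps exchanged values sortable and unambiguous.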
3. Element Name: Title
Label: Title
Definition: A name given to the resource.
Obligation: Mandatory
Comment: Typically, the name by which the resource is formally
known.
Qualifiers
Qualifier Name: alternative
Label: Alternative
Qualifier Type: element refinement
Definition: Any form of the title used as a substitute or alternative to
the formal title of the resource.
Comment: This qualifier could include abbreviations and acronyms by
which a resource may be known.
Obligation: Conditional
4. Element Name: Availability
Label: Availability
Definition: How the resource can be obtained or contact information
for obtaining the resource.
Obligation: Mandatory for offline resources.
Comment: The Availability element is primarily used for non-
electronic resources to provide information on how to
obtain physical access to the resource.
5. Element Name: Function
Label: Function
Definition: The business function of the organisation to which the
resource relates.
Obligation: Mandatory if no Subject element specified.
Comment: Used to indicate the business role of the resource in terms
of business functions and activities. Functions are the major
units of activity which organisations pursue in order to
meet the mission and goals of the organisation.
Recommended best practice is to select a value from a
controlled vocabulary or formal classification scheme.
6. Element Name: Identifier
Label: Resource Identifier
Definition: An unambiguous reference to the resource within a given
context.
Obligation: Mandatory for online resources.
Comment: Recommended best practice is to identify the resource by
means of a string or number conforming to a formal
identification system. Example formal identification
systems include the Uniform Resource Identifier (URI)
(including the Uniform Resource Locator (URL)), the
Digital Object Identifier (DOI) and the International
Standard Book Number (ISBN).
7. Element Name: Publisher
Label: Publisher
Definition: An entity responsible for making the resource available.
Obligation: Mandatory for information resources.
Comment: This field is often the name of the organisation that owns or
controls or publishes the resource. It is not recommended
that this element be used for the name of the entity which
merely acts as the host for a website.
8. Element Name: Subject
Label: Subject and Keywords
Definition: A subject and/or topic of the content of the resource.
Obligation: Mandatory if no Function element specified.
Comment: Typically, a Subject will be expressed as keywords, key
phrases or classification codes that describe a topic of the
resource content. Recommended best practice is to select a
value from a controlled vocabulary or formal classification
scheme.
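The obligation rules stated for elements 1 to 8 can be checked mechanically. The sketch below is one possible reading of those rules, not part of the standard: the function name, the dictionary layout and the two boolean flags are assumptions made for illustration. Creator is left out because "mandatory if known" cannot be tested from the record alone.

```python
from typing import Dict, List


def obligation_problems(record: Dict[str, str], online: bool,
                        information_resource: bool = True) -> List[str]:
    """Return the obligation rules that a metadata record does not satisfy."""

    def present(name: str) -> bool:
        # An element counts as present only if it has a non-blank value.
        return bool(record.get(name, "").strip())

    problems: List[str] = []
    for name in ("Date", "Title"):
        if not present(name):
            problems.append(f"{name} is mandatory")
    if online and not present("Identifier"):
        problems.append("Identifier is mandatory for online resources")
    if not online and not present("Availability"):
        problems.append("Availability is mandatory for offline resources")
    if information_resource and not present("Publisher"):
        problems.append("Publisher is mandatory for information resources")
    if not present("Function") and not present("Subject"):
        problems.append("Function or Subject must be specified")
    return problems


record = {
    "Date": "2007-10-01",
    "Title": "Information Exchange Standard",
    "Identifier": "http://example.org/standard",  # placeholder URL
    "Publisher": "EICTDA",
    "Subject": "standards",
}
print(obligation_problems(record, online=True))
```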
Obligation: Optional
9. Element Name: Audience
Label: Audience
Definition: A target audience of the resource.
Obligation: Optional
Comment: Types of audiences commonly used in this element include
particular industry sectors, education levels, skill levels,
occupations, and EEO categories. Recommended best
practice is to select a value from a controlled vocabulary or
formal classification scheme.
10. Element Name: Contributor
Label: Contributor
Definition: An entity responsible for making a contribution to the
content of the resource.
Obligation: Optional
Comment: Typically, a contributor will be an entity that has played an
important but secondary role in creating the content of the
resource and is not specified in the creator element.
11. Element Name: Coverage
Label: Coverage
Definition: The extent or scope of the content of the resource.
Obligation: Optional
Comment: Coverage will typically include spatial location (a place
name or geographic coordinates), temporal period (a period
label, date, or date range) or jurisdiction (such as a named
administrative entity). Recommended best practice is to
select a value from a controlled vocabulary (for example,
the Thesaurus of Geographic Names [TGN]) and that,
where appropriate, named places or time periods be used in
preference to numeric identifiers such as sets of coordinates
or date ranges.
Qualifiers
Qualifier Name: jurisdiction
Label: Jurisdiction
Qualifier Type: element refinement
Definition: The name of the political/administrative entity covered by
the content of the resource.
Comment: Jurisdiction is a description of the territory over which a
particular government exercises its authority or a particular
business transacts its operations, to which the resource
content is applicable.
Qualifier Name: spatial
Label: Spatial
Qualifier Type: element refinement
Definition: Spatial (geographic) characteristics of the intellectual
content of the resource.
Qualifier Name: temporal
Label: Temporal
Qualifier Type: element refinement
Definition: Temporal characteristics of the intellectual content of the
resource.
Qualifier Name: postcode
Label: Postcode
Qualifier Type: element refinement
Definition: Ethiopian postcode(s) applicable to the spatial coverage of
the resource content.
Comment: Postcode refers to the actual Ethiopian postcode(s) which is
relevant to the spatial coverage of the resource content.
This qualifier will be of particular use in describing
services.
12. Element Name: Description
Label: Description
Definition: An account of the content of the resource.
Obligation: Optional
Comment: Description may include but is not limited to: an abstract,
table of contents, reference to a graphical representation of
content (e.g. a thumbnail of an image), or a free-text account
of the content.
13. Element Name: Format
Label: Format
Definition: The physical or digital manifestation of the resource.
Obligation: Optional
Comment: Typically, Format may include the media-type or
dimensions of the resource. Format may be used to
determine the software, hardware or other equipment
needed to display or operate the resource. Examples of
dimensions include size and duration. Recommended best
practice is to select a value from a controlled vocabulary
(for example, the list of Internet Media Types [MIME],
http://dublincore.org/documents/dces/#mime, defining computer media formats).
Qualifiers
Qualifier Name: extent
Label: Extent
Qualifier Type: Element refinement
Definition: The size or duration of the resource.
Comment: The extent qualifier allows the description of the physical
dimensions, file size or duration of the resource.
Qualifier Name: medium
Label: Medium
Qualifier Type: Element refinement
Definition: The material or physical carrier of the resource.
14. Element Name: Language
Label: Language
Definition: A language of the intellectual content of the resource.
Obligation: Optional
Comment: Recommended best practice for the values of the Language
element is defined by using a two-letter Language Code
(taken from the ISO 639 standard [ISO 639]), followed
optionally, by a two-letter Country Code (taken from the
ISO 3166 standard [ISO 3166]). For example, 'am' for
Amharic, 'om' for Oromiffa, 'en' for English, 'fr' for
French, or 'en-GB' for English used in the United Kingdom,
etc.
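A value for the Language element can be assembled from its two parts as sketched below. The function name and the casing convention (lower-case language code, upper-case country code) are assumptions for illustration. The check covers only the two-letter shape and does not confirm that a code is actually listed in ISO 639 or ISO 3166.

```python
import re
from typing import Optional

_TWO_LETTERS = re.compile(r"[A-Za-z]{2}")


def language_tag(language: str, country: Optional[str] = None) -> str:
    """Join a two-letter language code and an optional two-letter country code."""
    if not _TWO_LETTERS.fullmatch(language):
        raise ValueError(f"not a two-letter language code: {language!r}")
    tag = language.lower()
    if country is not None:
        if not _TWO_LETTERS.fullmatch(country):
            raise ValueError(f"not a two-letter country code: {country!r}")
        tag += "-" + country.upper()
    return tag


print(language_tag("am"))        # language only
print(language_tag("en", "gb"))  # language plus country
```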
15. Element Name: Mandate
Label: Mandate
Definition: A specific warrant which requires the resource to be
created or provided.
Obligation: Optional
Comment: The element is useful to indicate the specific legal mandate
which requires the resource being described to be created
or provided to the public. The content of this element will
usually be a reference to a specific Act, Regulation or Case,
but may be a URI pointing to the legal instrument in
question.
[ISO 639] http://www.oasis-open.org/cover/iso639a.html
[ISO 3166] http://www.oasis-open.org/cover/country3166.html
Qualifiers
Qualifier Name: act
Label: Act
Qualifier Type: Element refinement
Definition: A reference to a specific State or Federal Act which
requires the creation or provision of the resource.
Qualifier Name: regulation
Label: Regulation
Qualifier Type: Element refinement
Definition: A reference to a specific regulation which requires the
creation or provision of the resource.
Qualifier Name: case
Label: Case Law
Qualifier Type: Element refinement
Definition: A reference to a specific case which requires the creation or
provision of the resource.
16. Element Name: Relation
Label: Relation
Definition: A reference to a related resource.
Obligation: Optional
Comment: Recommended best practice is to reference the resource by
means of a string or number conforming to a formal
identification system.
Qualifiers
Qualifier Name: isVersionOf
Label: Is Version Of
Qualifier Type: element refinement
Definition: The described resource is a version, edition, or adaptation
of the referenced resource. Changes in version imply
substantive changes in content rather than differences in
format.
Qualifier Name: hasVersion
Label: Has Version
Qualifier Type: element refinement
Definition: The described resource has a version, edition, or
adaptation, namely, the referenced resource.
Qualifier Name: isReplacedBy
Label: Is Replaced By
Qualifier Type: element refinement
Definition: The described resource is supplanted, displaced, or
superseded by the referenced resource.
Qualifier Name: replaces
Label: Replaces
Qualifier Type: element refinement
Definition: The described resource supplants, displaces, or supersedes
the referenced resource.
Qualifier Name: isRequiredBy
Label: Is Required By
Qualifier Type: element refinement
Definition: The described resource is required by the referenced
resource, either physically or logically.
Qualifier Name: requires
Label: Requires
Qualifier Type: element refinement
Definition: The described resource requires the referenced resource to
support its function, delivery, or coherence of content.
Qualifier Name: isPartOf
Label: Is Part Of
Qualifier Type: element refinement
Definition: The described resource is a physical or logical part of the
referenced resource.
Qualifier Name: hasPart
Label: Has Part
Qualifier Type: element refinement
Definition: The described resource includes the referenced resource
either physically or logically.
Qualifier Name: isReferencedBy
Label: Is Referenced By
Qualifier Type: element refinement
Definition: The described resource is referenced, cited, or otherwise
pointed to by the referenced resource.
Qualifier Name: references
Label: References
Qualifier Type: element refinement
Definition: The described resource references, cites, or otherwise
points to the referenced resource.
Qualifier Name: isFormatOf
Label: Is Format Of
Qualifier Type: element refinement
Definition: The described resource is the same intellectual content of
the referenced resource, but presented in another format.
Qualifier Name: hasFormat
Label: Has Format
Qualifier Type: element refinement
Definition: The described resource pre-existed the referenced resource,
which is essentially the same intellectual content presented
in another format.
Qualifier Name: isBasisFor
Label: Is Basis For
Qualifier Type: element refinement
Definition: The described resource pre-existed the referenced resource,
which is a performance, production, derivation, translation,
or interpretation of the described resource.
Qualifier Name: isBasedOn
Label: Is Based On
Qualifier Type: element refinement
Definition: The described resource is a performance, production,
derivation, translation, or interpretation of the referenced
resource.
17. Element Name: Rights
Label: Rights Management
Definition: Information about rights held in and over the resource.
Obligation: Optional
Comment: Typically, the Rights element will contain a rights
management statement for the resource, or refer to a service
providing such information. Rights information often
encompasses Intellectual Property Rights (IPR), Copyright,
and various Property Rights. If the Rights element is
absent, no assumptions can be made about the status of
these and other rights with respect to the resource.
18. Element Name: Source
Label: Source
Definition: A reference to a resource from which the present resource
is derived.
Obligation: Optional
Comment: The present resource may be derived from the Source
resource in whole or in part. Recommended best practice is
to reference the source resource by means of a string or
number conforming to a formal identification system.
19. Element Name: Type
Label: Resource Type
Definition: The nature or genre of the content of the resource.
Obligation: Optional
Comment: Type includes terms describing general categories, genres,
or aggregation levels for content. Recommended best
practice is to select a value from a controlled vocabulary
(for example, the working draft list of Dublin Core Types
[DCT1], http://dublincore.org/documents/dces/#dct1). To describe
the physical or digital manifestation of the resource, use the
Format element.
Qualifiers
Qualifier Name: category
Label: Category
Qualifier Type: Element refinement
Definition: The generic type of the resource being described.
Comment: The value for this qualifier must be one of either service,
document, or agency.
Qualifier Name: aggregationLevel
Label: Aggregation Level
Qualifier Type: Element refinement
Definition: The level of aggregation of the resource being described.
Comment: There are only two values possible for this qualifier, either
item or collection.
Qualifier Name: documentType
Label: Document Type
Qualifier Type: Element refinement
Definition: The form of the resource where category = document.
Comment: Document is used in its widest sense and includes such
things as software, sound files, and images.
Qualifier Name: serviceType
Label: Service Type
Qualifier Type: Element refinement
Definition: The type of service being offered where category = service.
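The Type qualifiers carry small closed value sets, which makes them easy to validate. The following is a minimal sketch under the assumption that qualifiers arrive as a plain dictionary keyed by qualifier name; the function name and error messages are hypothetical.

```python
from typing import Dict, List

# Permitted values as stated in the qualifier comments above.
CATEGORY_VALUES = {"service", "document", "agency"}
AGGREGATION_LEVEL_VALUES = {"item", "collection"}


def check_type_qualifiers(qualifiers: Dict[str, str]) -> List[str]:
    """Return a list of problems found in the Type qualifiers of one record."""
    errors: List[str] = []
    category = qualifiers.get("category")
    if category is not None and category not in CATEGORY_VALUES:
        errors.append(f"category must be one of {sorted(CATEGORY_VALUES)}")
    level = qualifiers.get("aggregationLevel")
    if level is not None and level not in AGGREGATION_LEVEL_VALUES:
        errors.append(
            f"aggregationLevel must be one of {sorted(AGGREGATION_LEVEL_VALUES)}")
    # documentType and serviceType only apply to the matching category.
    if "documentType" in qualifiers and category != "document":
        errors.append("documentType applies only where category = document")
    if "serviceType" in qualifiers and category != "service":
        errors.append("serviceType applies only where category = service")
    return errors


print(check_type_qualifiers({"category": "document", "documentType": "report"}))
```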
9.2 METADATA REGISTRIES (MDR)
Metadata registries (MDR) address the semantics of data, the representation of data, and
the registration of the descriptions of that data. It is through these descriptions that an
accurate understanding of the semantics and a useful depiction of the data are found.
An MDR is a database of metadata that supports the functionality of registration.
Registration accomplishes three main goals: identification, provenance, and monitoring
quality. Identification is accomplished by assigning a unique identifier (within the
registry) to each object registered there. Provenance addresses the source of the metadata
and the object described. Monitoring quality ensures that the metadata does the job it is
designed to do.
An MDR manages the semantics of data. Understanding data is fundamental to its design,
harmonization, standardization, use, re-use, and interchange. The underlying model for
an MDR is designed to capture all the basic components of the semantics of data,
independent of any application or subject matter area.
MDRs are organized so that those designing applications can ascertain whether a suitable
object described in the MDR already exists. Where it is established that a new object is
essential, its derivation from an existing description with appropriate modifications is
encouraged, thus avoiding unnecessary variations in the way similar objects are
described. Registration will also allow two or more administered items describing
identical objects to be identified, and more importantly, it will identify situations where
similar or identical names are in use for administered items that are significantly different
in one or more respects.
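The registration goals described above (a registry-unique identifier, a recorded source, a basic quality check, and detection of items that share a name) can be sketched with a small in-memory registry. All class, field and method names here are hypothetical, and a real MDR would hold far richer administration records.

```python
from dataclasses import dataclass
from itertools import count
from typing import Dict, List


@dataclass(frozen=True)
class RegisteredItem:
    identifier: int   # unique within this registry (identification)
    name: str
    definition: str
    source: str       # who supplied the metadata (provenance)


class MetadataRegistry:
    def __init__(self) -> None:
        self._next_id = count(1)
        self._items: Dict[int, RegisteredItem] = {}

    def register(self, name: str, definition: str, source: str) -> RegisteredItem:
        # A crude stand-in for quality monitoring: refuse empty definitions.
        if not definition.strip():
            raise ValueError("a definition is required")
        item = RegisteredItem(next(self._next_id), name, definition, source)
        self._items[item.identifier] = item
        return item

    def find_by_name(self, name: str) -> List[RegisteredItem]:
        """Return every item whose name matches, ignoring case."""
        wanted = name.lower()
        return [i for i in self._items.values() if i.name.lower() == wanted]


registry = MetadataRegistry()
registry.register("income", "Annual household income", "Agency A")
registry.register("Income", "Monthly personal income", "Agency B")
for item in registry.find_by_name("income"):
    print(item.identifier, item.definition)
```

Looking up by name before registering makes it visible when the same name is already in use for an item with a different definition.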
The increased use of data processing and electronic data interchange heavily relies on
accurate, reliable, controllable, and verifiable data recorded in databases. One of the
prerequisites for a correct and proper use and interpretation of data is that both users and
owners of data have a common understanding of the meaning and descriptive
characteristics (e.g., representation) of that data.
9.2.1 FUNDAMENTAL MODEL OF DATA ELEMENTS
A data element is composed of two parts:
Data element concept (DEC): A DEC is a concept that can be represented in the
form of a data element, described independently of any particular representation.
Representation: The representation is composed of a value domain, data type,
units of measure (if necessary), and representation class (optionally).
From a data modeling perspective, a DEC should be composed of two parts: Object Class
and Property.
The object class is a set of ideas, abstractions, or things in the real world that can
be identified with explicit boundaries and meaning and whose properties and
behavior follow the same rules. Object classes are the things for which we wish to
collect and store data. They are concepts, and they correspond to the notions
embodied in classes in object-oriented models and entities in entity-relationship
models. Examples are cars, persons, households, employees, and orders.
The property is a characteristic common to all members of an object class.
Properties are what humans use to distinguish or describe objects. They are
characteristics, not necessarily essential ones, of the object class and form its
intension. They are also concepts, and they correspond to the notions embodied in
attributes (without associated datatypes) in object-oriented or entity-relationship
models. Examples of properties are color, model, sex, age, income, address, or
price.
It is important to distinguish an actual object class or property from its name. This is the
distinction between concepts and their designations. Object classes and properties are
concepts; their names are designations.
Complications arise because people convey concepts through words (designations), and it
is easy to confuse a concept with the designation used to represent it. For example, most
people will read the word income and be certain they have unambiguously interpreted it.
But, the designation income may not convey the same concept to all readers, and, more
importantly, each instance of income may not designate the same concept.
Not all ideas are named or expressed in simple natural language, either. For example,
"women between the ages of 15 and 45 who have had at least one live birth in the last 12
months" is a valid object class not easily named in English. Some ideas may be more
easily expressed in one language than in another. The German word Götterdämmerung
has no simple English equivalent.
A data element is produced when a representation is associated with a data element
concept. The representation describes the form of the data, including a value domain,
datatype, representation class (optionally), and, if necessary, a unit of measure. Value
domains are sets of permissible values for data elements. For example, the data element
representing annual household income may have the set of nonnegative integers (with
units of dollars) as a set of valid values. This is its value domain. A data element
concept may be associated with different value domains as needed to form conceptually
similar data elements. There are many ways to represent similar facts about the world, but
the concept for which the facts are examples is the same. An example is the Data Element
Concept country of person's birth. ISO 3166-1 Country Codes contains seven different
representations for countries of the world. Each one of these seven representations
contains a set of values that may be used in the value domain associated with the DEC.
Each one of the seven associations is a data element. For each representation of the data,
the permissible values, the datatype, the representation class, and possibly the units of
measure, are altered.
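The model described in this sub-clause, in which a data element concept (object class plus property) is paired with a representation to form a data element, can be sketched with three small record types. The class and field names are hypothetical, and the two value domain labels are illustrative choices rather than entries taken from the standard.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DataElementConcept:
    object_class: str    # e.g. "person"
    property_name: str   # e.g. "country of birth"


@dataclass(frozen=True)
class Representation:
    value_domain: str
    datatype: str
    unit_of_measure: Optional[str] = None       # only if necessary
    representation_class: Optional[str] = None  # optional


@dataclass(frozen=True)
class DataElement:
    concept: DataElementConcept
    representation: Representation


# One concept, two representations, therefore two distinct data elements.
birth_country = DataElementConcept("person", "country of birth")
by_code = DataElement(
    birth_country,
    Representation("ISO 3166-1 alpha-2 codes", "string", representation_class="code"))
by_name = DataElement(
    birth_country,
    Representation("country names in English", "string", representation_class="name"))

print(by_code.concept == by_name.concept)  # same concept
print(by_code == by_name)                  # different data elements
```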
See ISO/IEC 20943-1:2003, Information technology - Procedures for achieving
metadata registry content consistency - Part 1: Data elements, for details about the
registration and management of descriptions of data elements.
9.2.2 DATA ELEMENTS IN DATA MANAGEMENT AND INTERCHANGE
Data elements appear in databases, files, and transaction sets. Data elements are the
fundamental units of data an organization manages; therefore they must be part of the
design of databases and files within the organization and all transaction sets the
organization builds to communicate data to other organizations.
Within the organization, databases or files are composed of records, segments, tuples,
etc., which are composed of data elements. The data elements themselves contain various
kinds of data that include characters, images, sound, etc. When the organization needs to
transfer data to another organization, data elements are the fundamental units that make
up the transaction sets. Transactions occur primarily between databases or files, but the
structure (i.e. the records or tuples) of the files and databases don't have to be the same
across organizations.
So, the common unit for transferring information (data plus understanding) is the data
element.
Figure 1: Data elements and other data concepts
[Figure not reproduced. Diagram labels: Transaction, exchange unit, etc.; Database, file, etc.; Class, tuple, etc.; Field, column, etc.; Data element (identifier, definition, name, value domain, etc.).]
9.2.3 FUNDAMENTAL MODEL OF VALUE DOMAINS
A value domain is a set of permissible values. A permissible value is a combination of
some value and the meaning for that value. The associated meaning is called the value
meaning. A value domain is the set of valid values for one or more data elements. It is
used for validation of data in information systems and in data exchange. It is also an
integral part of the metadata needed to describe a data element. In particular, a value
domain is a guide to the content, form, and structure of the data represented by a data
element.
Value domains come in two (non-exclusive) sub-types: enumerated value domain and
non-enumerated value domain.
An enumerated value domain is a value domain specified as a list of permissible values
(values & their meanings). It contains a list of all its values and their associated
meanings. Each value and meaning pair is called a permissible value. The meaning for
each value is called the value meaning.
A non-enumerated value domain is specified by a description. The non-enumerated
value domain description describes precisely which permissible values belong and
which do not belong to the value domain. An example of such description is the phrase
"Every real number greater than 0 and less than 1".
Each value domain is a member of the extension of a concept, called the conceptual
domain. A conceptual domain is a set of value meanings. The intension of a conceptual
domain is its value meanings. Many value domains may be in the extension of the same
conceptual domain, but a value domain is associated with one conceptual domain.
Conceptual domains may have relationships with other conceptual domains, so it is
possible to create a concept system of conceptual domains. Value domains may have
relationships with other value domains, which provide the framework to capture the
structure of sets of related value domains and their associated concepts.
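The two sub-types of value domain described above can be sketched as follows: an enumerated domain lists each permissible value with its value meaning, while a non-enumerated domain is defined by a description that acts as a membership rule. The class names and the sample country codes are illustrative assumptions.

```python
from typing import Callable, Dict


class EnumeratedValueDomain:
    """A value domain given as an explicit list of value and meaning pairs."""

    def __init__(self, permissible_values: Dict[str, str]) -> None:
        self.permissible_values = dict(permissible_values)

    def is_valid(self, value: str) -> bool:
        return value in self.permissible_values

    def meaning(self, value: str) -> str:
        return self.permissible_values[value]


class NonEnumeratedValueDomain:
    """A value domain given by a description, applied here as a rule."""

    def __init__(self, description: str, rule: Callable[[float], bool]) -> None:
        self.description = description
        self.rule = rule

    def is_valid(self, value: float) -> bool:
        return self.rule(value)


country_codes = EnumeratedValueDomain({"ET": "Ethiopia", "KE": "Kenya"})
probabilities = NonEnumeratedValueDomain(
    "Every real number greater than 0 and less than 1",
    lambda x: 0 < x < 1,
)

print(country_codes.meaning("ET"))
print(probabilities.is_valid(0.25), probabilities.is_valid(1.0))
```

Either kind of domain can then be used to validate incoming values during data exchange.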
Conceptual domains, too, come in two (non-exclusive) sub-types:
Enumerated conceptual domain: A conceptual domain specified as a list of
value meanings.
Non-enumerated conceptual domain: A conceptual domain specified by a
description.
The value meanings for an enumerated conceptual domain are listed explicitly. This
conceptual domain type corresponds to the enumerated type for value domains. The value
meanings for a non-enumerated conceptual domain are expressed using a rule, called a
non-enumerated conceptual domain description.
Thus, the value meanings are listed implicitly. This rule describes the meaning of
permissible values in a non-enumerated value domain. This conceptual domain type
corresponds to the non-enumerated type for value domains. See ISO/IEC TR 20943-3,
Information technology - Procedures for achieving metadata registry content
consistency - Part 3: Value domains, for detailed examples.
A unit of measure is sometimes required to describe data. If temperature readings are
recorded in a database, then the temperature scale (e.g., Fahrenheit or Celsius) is
necessary to understand the meaning of the values. Another example is the mass of rocks
found on Mars, measured in grams. However, units of measure are not limited to physical
quantities, as currencies (e.g., US dollars, Lire, British pounds) and other socio-economic
measures are units of measure, too.
Some units of measure are equivalent to each other in the following sense: Any quantity
in one unit of measure can be transformed to the same quantity in another unit of
measure. All equivalent units of measure are said to have the same dimensionality. For
example, currencies all have the same dimensionality. Measures of speed, such as miles
per hour or meters per second, have the same dimensionality. Two units of measure that
are often erroneously seen as having the same dimensionality are pounds (as in weight)
and grams. Pounds is a measure of force, and grams is a measure of mass.
A unit of measure is associated with a value domain, and the dimensionality is associated
with the conceptual domain.
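The relationship between units of measure and dimensionality can be illustrated with a small conversion table. This sketch assumes each unit is stored with a dimensionality label and a factor to a base unit; the table contents and function name are hypothetical, and conversion is allowed only between units that share a dimensionality.

```python
from typing import Dict, Tuple

# unit name -> (dimensionality, factor to the base unit of that dimensionality)
UNITS: Dict[str, Tuple[str, float]] = {
    "metre per second": ("speed", 1.0),
    "mile per hour": ("speed", 0.44704),  # 1609.344 m / 3600 s
    "gram": ("mass", 1.0),
    "kilogram": ("mass", 1000.0),
}


def convert(quantity: float, from_unit: str, to_unit: str) -> float:
    """Convert a quantity between two units of the same dimensionality."""
    from_dim, from_factor = UNITS[from_unit]
    to_dim, to_factor = UNITS[to_unit]
    if from_dim != to_dim:
        raise ValueError(f"cannot convert {from_dim} to {to_dim}")
    return quantity * from_factor / to_factor


print(convert(2, "kilogram", "gram"))
print(convert(10, "mile per hour", "metre per second"))
```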
Some value domains contain very similar values from one domain to another. Either the
values themselves are similar or the meanings of the values are the same. When these
similarities occur, the value domains may be in the extension of one conceptual domain.
The following examples illustrate this and the other ideas in this sub-clause:
EXAMPLE 1 Similar non-enumerated value domains
Conceptual domain name: Probabilities
Conceptual domain definition: Real numbers greater than 0 and less than 1
Value domain name (1): Probabilities, 2 significant digits
Value domain description: All real numbers greater than 0 and less than 1
represented with 2-digit precision.
Unit of measure precision: 2 digits to the right of the decimal point
Value domain name (2): Probabilities, 5 significant digits
Value domain description: All real numbers greater than 0 and less than 1
represented with 5-digit precision.
Unit of measure precision: 5 digits to the right of the decimal point
EXAMPLE 2 Similar enumerated value domains
Conceptual domain name: Countries of the world
Conceptual domain definition: Lists of current countries of the world represented
as names or codes
Value domain name (1): Country codes, 2 character alpha
Permissible values
Figure 2: Fundamental Model for Value Domains
[Figure not reproduced. Diagram labels: Conceptual Domain, with non-exclusive sub-types Enumerated Conceptual Domain and Non-enumerated Conceptual Domain; Value Domain, with non-exclusive sub-types Enumerated Value Domain and Non-enumerated Value Domain; Value Meaning; Permissible Values; relationships marked (1:N).]
9.2.4 FUNDAMENTALS OF CLASSIFICATION SCHEMES
A classification scheme is a concept system intended to classify objects.
It is organized in some specified structure, limited in content by a scope, and designed for
assigning objects to concepts defined within it. Concepts are assigned to an object, and
this process is called classification. The relationships linking concepts in the concept
system link objects that the related concepts classify. In general, any concept system is a
classification scheme if it is used for classifying objects.
The content scope of the classification scheme circumscribes the subject matter area
covered by the classification scheme. The scope of the classification scheme is the
broadest concept contained in the concept system of the scheme. It determines,
theoretically, whether an object can be classified within that scheme or not.
Concept systems and classification schemes in particular, can be structured in many
ways. The structure defines the types of relationships that may exist between concepts,
and each classification scheme can be used for the purpose of linking concepts to objects.
In a particular classification scheme, the linked concepts together with the other concepts
related to the linked concept in the scheme provide a conceptual framework in which to
understand the meaning of the object. The framework is limited by the scope of the
classification scheme.
A concept system may be represented by a terminological system. The designations are
used to represent each of the concepts in the system and are used as key words linked to
objects for searching, indexing, or other purposes.
A special kind of concept system is a relationship system. There, the concepts are
relationship types. A relationship type has N arguments, and it is called an n-ary
relationship type. The statement "a set of N objects is classified by an n-ary relationship
type" means that the N objects have a relationship among them of the given relationship
type.
The classification region permits the registration and administration of all or part of a
classification scheme. Optionally, a classification scheme may be used to classify
administered items, the registered artifacts in a metadata registry.
Classification schemes of varying discriminatory power could be used: keywords, thesauri,
taxonomies, and ontologies. These classification schemes have potentially
great utility for documenting objects in the real world, including administered items in an
MDR.
There are several purposes for applying classification to real world objects. Classification
assists users to find a single object from among a large collection of objects, facilitates
the administration and analysis of a collection of objects, and, through inheritance,
conveys semantic content that is often only incompletely specified by other attributes,
such as names and definitions.
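The way inheritance conveys meaning in a hierarchical classification scheme can be sketched with a tiny taxonomy in which each concept points to its broader concept. The concept names and helper functions below are invented for illustration and do not come from any registered scheme.

```python
from typing import Dict, List, Optional

# Each concept maps to its broader (parent) concept; None marks the top.
PARENT: Dict[str, Optional[str]] = {
    "vehicle": None,
    "car": "vehicle",
    "taxi": "car",
}


def ancestors(concept: str) -> List[str]:
    """Return the broader concepts of a concept, nearest first."""
    chain: List[str] = []
    parent = PARENT[concept]
    while parent is not None:
        chain.append(parent)
        parent = PARENT[parent]
    return chain


def classified_under(concept: str, broader: str) -> bool:
    """True if an object classified as `concept` also falls under `broader`."""
    return concept == broader or broader in ancestors(concept)


print(ancestors("taxi"))
print(classified_under("taxi", "vehicle"))
```

An object classified as a taxi is thereby also findable under car and vehicle, without those terms being recorded against it.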
Each type of classification scheme mentioned above has particular strengths and
weaknesses, and provides the foundation upon which particular capabilities can be built.
Keywords, for example, are a quick way to provide users some assistance in locating
potentially useful administered items. A thesaurus provides a more structured approach,
arranging descriptive terms in a structure of broader, narrower, and related classification
categories. A taxonomy provides a classification structure that adds the power of
inheritance of meaning from generalized taxa to specialized taxa. Ontologies, with
associated epistemologies, can provide rich, rigorously defined structures (e.g. directed
acyclic graphs with multiple inheritance) that can convey information needed by
software, such as intelligent agents and mediators that are useful in the provision of
intelligent information services.
9.2.5 ATTRIBUTES OF A CLASSIFICATION SCHEME
Classification schemes shall be registered in an MDR by recording their attributes.
Minimally, a registered classification scheme shall have an administration record and a
classification scheme type name. The following table lists the attributes of a classification
system that shall be recorded in an MDR.
Attribute                                                   Occurrences
Designation - name                                          One per Terminological Entry language
Designation - preferred designation                         Zero or one per Terminological Entry section
Designation - language identifier                           One per Language section in each Terminological Entry
Definition - definition text                                One per Terminological Entry language section
Definition - preferred definition                           Zero or one per Terminological Entry section
Definition - source reference                               Zero or one per Terminological Entry section
Definition - language identifier                            One per Language section in each Terminological Entry
Context - administration record                             One per context
Context - description                                       One per context
Context - description language identifier                   Zero or one per context
Classification scheme - type name                           One per classification scheme
Classification scheme - value                               One per classification scheme item
Classification scheme - type name                           One per classification scheme item
Classification scheme item relationship - type description  One per classification scheme item relationship type description
Administration Record - item identifier                     One per classification scheme
Administration Record - registration status                 One per classification scheme
Administration Record - administrative status               One per classification scheme
Administration Record - creation date                       One per classification scheme
Administration Record - last change date                    Zero or one per classification scheme
Administration Record - effective date                      Zero or one per classification scheme
Administration Record - until date                          Zero or one per classification scheme
Administration Record - change description                  Zero or one per classification scheme
Administration Record - administrative note                 Zero or one per classification scheme
Administration Record - explanatory comment                 Zero or one per classification scheme
Administration Record - unresolved issue                    Zero or one per classification scheme
Administration Record - origin                              Zero or one per classification scheme
Reference Document - identifier                             One per reference document
Reference Document - type description                       Zero or one per reference document
Reference Document - language identifier                    Zero or one per reference document
Reference Document - title                                  Zero or one per reference document
Reference Document - organization name                      One or more per reference document
Reference Document - organization mail address              Zero or one per reference document
Submission - organization name                              One per classification scheme
Submission - organization mail address                      Zero or one per classification scheme
Submission - contact                                        One per classification scheme
Stewardship - organization name                             One per classification scheme
Stewardship - organization mail address                     Zero or one per classification scheme
Stewardship - contact                                       One per classification scheme
Registration Authority - organization name                  One per classification scheme
Registration Authority - organization mail address          Zero or one per classification scheme
Registration Authority - Registration Authority identifier  One per classification scheme
Registration Authority - documentation language identifier  One or more per classification scheme
Registrar - identifier                                      One or more per classification scheme
Registrar - contact                                         One or more per classification scheme
9.2.6 STRUCTURE OF A METADATA REGISTRY
9.2.6.1 METAMODEL FOR A METADATA REGISTRY
A metamodel is a model that describes other models. A metamodel provides a
mechanism for understanding the precise structure and components of the specified
models, which are needed for the successful sharing of the models by users and/or
software facilities.
The registry metamodel is specified as a conceptual data model, i.e. one that describes
how relevant information is structured in the natural world. In other words, it is how the
human mind is accustomed to thinking of the information.
As a conceptual data model, there need be no one-to-one match between the attributes in
the model and fields, columns, objects, et cetera in a database. There may be more than
one field per attribute and some entities and relationships may be implemented as fields.
The model shows constraints on minimum and maximum occurrences of attributes. The
constraints on maximum occurrences are to be enforced at all times. The constraints on
minimum occurrences are to be enforced when the registration status for the metadata
item is "recorded" or higher. In other words, a registration status of "recorded" indicates
that all mandatory attributes have been documented.
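The enforcement rule described above can be sketched in Python. This is a minimal illustration, not part of the standard: the status ordering in `STATUS_ORDER`, the attribute keys, and the function name `check_occurrences` are assumptions made for the example, and only a few rows of the attribute table in 9.2.5 are encoded.

```python
from typing import Dict, List, Optional, Tuple

# Assumed ordering of registration statuses, lowest to highest (illustrative only).
STATUS_ORDER = ["incomplete", "candidate", "recorded", "qualified", "standard"]

# (minimum, maximum) occurrences per classification scheme; None means unbounded.
# A few rows taken from the attribute table in 9.2.5; the key names are hypothetical.
CARDINALITY: Dict[str, Tuple[int, Optional[int]]] = {
    "administration_record.creation_date": (1, 1),
    "administration_record.last_change_date": (0, 1),
    "registrar.identifier": (1, None),
}


def check_occurrences(item: Dict[str, List[str]], status: str) -> List[str]:
    """Return a list of occurrence-constraint violations for one metadata item."""
    enforce_minimum = STATUS_ORDER.index(status) >= STATUS_ORDER.index("recorded")
    problems = []
    for attribute, (minimum, maximum) in CARDINALITY.items():
        count = len(item.get(attribute, []))
        # Maximum occurrences are enforced at all times.
        if maximum is not None and count > maximum:
            problems.append(f"{attribute}: {count} occurrences, maximum is {maximum}")
        # Minimum occurrences are enforced only at "recorded" or higher.
        if enforce_minimum and count < minimum:
            problems.append(f"{attribute}: {count} occurrences, minimum is {minimum}")
    return problems
```

With this sketch, a draft item that is missing mandatory attributes passes at a status below "recorded" as long as no maximum is exceeded, and is reported as incomplete once it is promoted.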
9.2.6.2 APPLICATION OF THE METAMODEL
Some of the objectives of the metamodel for a Metadata Registry are to:
a) provide a unified view of concepts, terms, value domains and value meanings;
b) promote a common understanding of the data described;
c) enable the sharing and reuse of the contents of implementations.
A metamodel is necessary for coordination of data representation between persons and/or
systems that store, manipulate and exchange data. The metamodel will assist registrars in
maintaining consistency among different registries. The metamodel enables systems tools
and information registries to store, manipulate and exchange the metadata for data
attribution, classification, definition, naming, identification, and registration. In this
manner, consistency of data content supports interoperability among systems tools and
information registries.
9.2.6.3 SPECIFICATION OF THE METAMODEL
When using a model to specify another model, it is easy for the reader to become
confused about which model is being referred to at any particular point. To minimize this
confusion, this document deliberately uses different terms in the model being specified
from those used to do the specification.
The registry metamodel is specified using a subset of the Unified Modelling Language
(UML). This document uses the term "metamodel construct" for the model constructs it
uses, but "metadata objects" for the model constructs it specifies. The metamodel
constructs used are: classes, relationships, association classes, attributes, composite
attributes and composite data types.
However, there are certain parallels between the two models. For example, the "Object
Class" specified in the model is equivalent to the metamodel construct "class" used to
specify the model, and the "Property" specified in the model is equivalent to the
metamodel construct "attribute" used to specify the model. The different terms are used
to make it clear which model is being referred to, not because they represent different
concepts. One term that this document uses at both levels is "data type", but the level to
which it applies should be apparent from the context in which it is used.
9.2.6.4 TYPES, INSTANCES AND VALUES
When considering data and metadata, it is important to distinguish between types of
data/metadata, and instances of these types and their associated values. The metamodel
specifies types of classes, attributes and relationships. Any particular instance of one of
these will be of a specific type and at any point in time, that instance will have a specific
value.
A metadata registry will be populated with instances of these metadata objects (metadata
items), which in turn define types of data, e.g. in an application database. In other words,
instances of metadata specify types of application level data. In turn, the application
database will be populated by the real world data as instances of those defined data types.
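The three levels described above (metamodel type, metadata instance, application value) can be illustrated with a short Python sketch. The class `DataElement`, its fields, and the choice of `int` as the datatype are assumptions made for illustration. The name and definition reuse the "Article Number" example given later in this document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataElement:
    """Metamodel level: a type of metadata object (hypothetical, simplified)."""
    name: str
    definition: str
    datatype: type  # the application-level type this metadata item specifies


# Metadata level: an instance held in the registry (a metadata item).
article_number = DataElement(
    name="Article Number",
    definition="A reference number that identifies an article.",
    datatype=int,  # assumed for the example
)


def conforms(element: DataElement, value: object) -> bool:
    """Application level: check that a real-world value is an instance of the defined type."""
    return isinstance(value, element.datatype)
```

Here `article_number` is an instance at the metadata level and, at the same time, the definition of a type at the application level: `conforms(article_number, 40217)` holds, while the string `"40217"` does not conform.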
9.2.6.5 DATE REFERENCES
Dates are important attributes of an Administration Record and of operations of a
registry. Dates based on the Gregorian calendar (see ISO 8601:2000) use the default
representation YYYY-MM-DD (i.e. Year-Month-Day). For example, 12 October 2001,
if referenced in numeric form, should be written as 2001-10-12 and not, for example,
as 12-10-2001 (which might be confused with 10 December 2001).
A standardized Ethiopian date representation system should be used where data and
metadata based on the Ethiopian calendar are encountered.
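As a brief illustration of the Gregorian case, the following Python sketch uses only the standard `datetime` module to produce and parse the YYYY-MM-DD form, and to show why a form such as 12-10-2001 is ambiguous. The variable names are illustrative. The standard library handles Gregorian dates only, so Ethiopian calendar dates are outside the scope of this sketch.

```python
from datetime import date, datetime

creation_date = date(2001, 10, 12)        # 12 October 2001
iso_text = creation_date.isoformat()      # YYYY-MM-DD representation
round_trip = date.fromisoformat(iso_text)  # parses the same form back

# The same numeric string read under two different conventions.
ambiguous = "12-10-2001"
as_day_first = datetime.strptime(ambiguous, "%d-%m-%Y").date()
as_month_first = datetime.strptime(ambiguous, "%m-%d-%Y").date()
```

The two readings of `ambiguous` yield different dates (12 October 2001 and 10 December 2001), whereas the YYYY-MM-DD text has a single reading.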
9.2.6.6 DATA DEFINITION REQUIREMENTS AND RECOMMENDATIONS
A listing of the requirements and recommendations without explanations is provided in
this clause for convenience of the user. The intent is to facilitate ease of use of this
document once an understanding of the requirements and recommendations is achieved.
9.2.6.6.1 REQUIREMENTS
A data definition shall:
a) Be stated in the singular
b) State what the concept is, not only what it is not
c) Be stated as a descriptive phrase or sentence(s)
d) Contain only commonly understood abbreviations
e) Be expressed without embedding definitions of other data or underlying concepts
9.2.6.6.2 RECOMMENDATIONS
A data definition should:
a) State the essential meaning of the concept
b) Be precise and unambiguous
c) Be concise
d) Be able to stand alone
e) Be expressed without embedding rationale, functional usage, or procedural information
f) Avoid circular reasoning
g) Use the same terminology and consistent logical structure for related definitions
h) Be appropriate for the type of metadata item being defined
9.2.6.6.3 PROVISIONS
9.2.6.6.3.1 PREMISES
Data is used for specific purposes. Differences in use require different operational
manifestations of some requirements and recommendations. The primary characteristics
deemed necessary to convey the essential meaning of a particular definition will vary
according to the level of generalization or specialization of the data. Primary and
essential characteristics for defining concepts such as "airport" in the commercial air
transportation industry might be specific, whereas a more general definition may be
adequate in a different context. Within a metadata registry, multiple equivalent
definitions may be written in different languages or, within a single language, for
different audiences such as children, general public, or subject area specialists. For a
discussion of relationships between concepts in different contexts and how characteristics
are used to differentiate concepts, see ISO 704, Clause 5. Definitions should be written to
facilitate understanding by any user and by recipients of shared data.
9.2.6.6.3.2 REQUIREMENTS
To facilitate understanding of the requirements for construction of well-formed data
definitions, explanations and examples are provided below. Each requirement is followed
by a short explanation of its meaning.
Examples are given to support the explanations. In all cases, a good example is provided
to exemplify the explanation. When deemed beneficial, a poor, but commonly used
example is given to show how a definition should NOT be constructed. To further
explain the differences between the good and poor examples, examples are followed by a
statement of rationale behind them. Note that the examples below are definitions for data
elements and these definitions are illustrative.
A data definition shall:
A) BE STATED IN THE SINGULAR
EXPLANATION - The concept expressed by the data definition shall be expressed in
the singular. (An exception is made if the concept itself is plural.)
EXAMPLE - Article Number
1) Good definition: A reference number that identifies an article.
2) Poor definition: Reference number identifying articles.
REASON - The poor definition uses the plural word "articles", which is ambiguous,
since it could imply that an article number refers to more than one article.
B) STATE WHAT THE CONCEPT IS, NOT ONLY WHAT IT IS NOT
EXPLANATION - When constructing definitions, the concept cannot be defined
exclusively by stating what the concept is not.
EXAMPLE - Freight Cost Amount
1) Good definition: Cost amount incurred by a shipper in moving goods from one place
to another.
2) Poor definition: Costs which are not related to packing, documentation, loading,
unloading, and insurance.
REASON - The poor definition does not specify what is included in the meaning of the
data.
C) BE STATED AS A DESCRIPTIVE PHRASE OR SENTENCE(S) (in most languages)
EXPLANATION - A phrase is necessary (in most languages) to form a precise
definition that includes the essential characteristics of the concept. Simply stating one or
more synonym(s) is insufficient. Simply restating the words of the name in a different
order is insufficient. If more than a descriptive phrase is needed, use complete,
grammatically correct sentences.
EXAMPLE - Agent Name
1) Good definition: Name of party authorized to act on behalf of another party.
2) Poor definition: Representative.
REASON - "Representative" is a near-synonym of the data element name, which is not
adequate for a definition.
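A registry tool could screen for the two failure modes named in this requirement (a bare synonym, or the name's words in a different order) before a definition reaches a human reviewer. The Python sketch below is a rough heuristic under stated assumptions, not a rule from the standard: the helper names `words` and `is_descriptive` are hypothetical, and passing the check does not make a definition good.

```python
import re


def words(text: str) -> list:
    """Lower-case alphabetic words of a name or definition."""
    return re.findall(r"[a-z]+", text.lower())


def is_descriptive(name: str, definition: str) -> bool:
    """Heuristic screen for requirement C; False means 'needs rework'."""
    definition_words = words(definition)
    if len(definition_words) < 2:
        return False  # a single word is at best a synonym
    if sorted(definition_words) == sorted(words(name)):
        return False  # only the words of the name, possibly reordered
    return True
```

Applied to the Agent Name example, the good definition passes and "Representative." is rejected. A definition such as "Name agent" is also rejected, because it only reorders the words of the name.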
D) CONTAIN ONLY COMMONLY UNDERSTOOD ABBREVIATIONS
EXPLANATION - Understanding the meaning of an abbreviation, including acronyms
and initialisms, is usually confined to a certain environment. In other environments the
same abbreviation can cause misinterpretation or confusion. Therefore, to avoid
ambiguity, full words, not abbreviations, shall be used in the definition.
Exceptions to this requirement may be made if an abbreviation is commonly understood,
such as "i.e." and "e.g.", or if an abbreviation is more readily understood than the full
form of a complex term and has been adopted as a term in its own right, such as "radar",
standing for "radio detecting and ranging".
All acronyms must be expanded on the first occurrence.
EXAMPLE 1 - Tide Height
1) Good definition: The vertical distance from mean sea level (MSL) to a specific tide
level.
2) Poor definition: The vertical distance from MSL to a specific tide level.
REASON - The poor definition is unclear because the acronym, MSL, is not commonly
understood and some users may need to refer to other sources to determine what it
represents. Without the full word, finding the term in a glossary may be difficult or
impossible.
EXAMPLE 2 - Unit of Density Measurement
1) Good definition: The unit employed in measuring the concentration of matter in terms
of mass per unit (m.p.u.) volume (e.g., pound per cubic foot; kilogram per cubic meter).
2) Poor definition: The unit employed in measuring the concentration of matter in terms
of m.p.u. volume (e.g., pound per cubic foot; kilogram per cubic meter).
REASON - m.p.u. is not a common abbreviation, and its meaning may not be understood
by some users. The abbreviation should be expanded to full words.
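The two examples above suggest a simple automated screen: flag any acronym or dotted abbreviation that is not introduced in parentheses after its full words. The Python sketch below is a heuristic under stated assumptions. The `COMMON` allow-list, the regular expression, and the function name `unexpanded_abbreviations` are illustrative choices. The check only tests for enclosing parentheses and does not verify that the preceding words really are the expansion.

```python
import re

# Abbreviations treated as commonly understood (an assumed, project-specific list).
COMMON = {"i.e.", "e.g."}

# Matches acronyms such as MSL and dotted abbreviations such as m.p.u.
ABBREVIATION = re.compile(r"\b(?:[A-Z]{2,}|(?:[A-Za-z]\.){2,})")


def unexpanded_abbreviations(definition: str) -> list:
    """Return abbreviations that are not introduced as 'full words (ABBR)'."""
    flagged = []
    for match in ABBREVIATION.finditer(definition):
        token = match.group()
        if token in COMMON or token in flagged:
            continue
        start, end = match.span()
        # An abbreviation enclosed in parentheses counts as expanded.
        in_parentheses = (
            definition[start - 1:start] == "(" and definition[end:end + 1] == ")"
        )
        if not in_parentheses:
            flagged.append(token)
    return flagged
```

For the Tide Height example, the good definition yields an empty list and the poor definition yields `MSL`. For the density example, the poor definition yields `m.p.u.`, while "e.g." is accepted as commonly understood.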
E) BE EXPRESSED WITHOUT EMBEDDING DEFINITIONS OF OTHER DATA
OR UNDERLYING CONCEPTS
EXPLANATION - As shown in the following example, the definition of a second data
element or related concept should not appear in the definition proper of the primary data
element. Definitions of terms should be provided in an associated glossary. If the second
definition is necessary, it may be attached by a note at the end of the primary definition's
main text or as a separate entry in the dictionary. Related definitions can be accessed
through relational attributes (e.g., cross-reference).
EXAMPLE 1 - Sample Type Code
1) Good definition: A code identifying the kind of sample.
2) Poor definition: A code identifying the kind of sample collected. A sample is a small
specimen taken for testing. It can be either an actual sample for testing, or a quality
control surrogate sample. A quality control sample is a surrogate sample taken to verify
results of actual samples.
REASON - The poor definition contains two extraneous definitions embedded in it. They
are definitions of "sample" and of "quality control sample".
EXAMPLE 2 - "Issuing Bank Documentary Credit Number"
1) Good definition: Reference number assigned by issuing bank to a documentary credit.
2) Poor definition: Reference number assigned by issuing bank to a documentary credit.
A documentary credit is a document in which a bank states that it has issued a
documentary credit under which the beneficiary is to obtain payment, acceptance, or
negotiation on compliance with certain terms and conditions and against presentation of
stipulated documents and such drafts as may be specified.
REASON - The poor definitio