Achieving Information Sharing in Federal Agencies via Data Services, SOA, and Controlled Vocabularies October 12, 2006 A Presentation for the Federal Data Architecture Subcommittee Chuck Mosher cmosher @ metamatrix.com



2

Agenda

• Company Overview & Value Proposition
• Data Services Rationale & Best Practices
• MetaMatrix Products & Capabilities
• Achieving Information Sharing
  – Service Enabling Data Assets
  – Vocabularies & Semantic Interoperability
  – Bridging Structured/Unstructured Information
• Customer Use Cases
• Summary, Q & A

3

MetaMatrix Company Overview

Uniform access to integrated information

Vision – Universal bridge between information-consuming applications and enterprise information resources.

Products – Lightweight design/deploy environment for project use. Enterprise-caliber information access system for enterprise deployments.

Market
– Global 5000 Organizations
– Government Intelligence Agencies
– Homeland Security
– Financial Services
– Pharmaceutical, Life Sciences
– Manufacturing, Telecommunications
– Independent Software Companies (ISVs)

4

One of the three enablers which drives domain-wide visibility: “… is a standard enterprise data architecture — the foundation for effective and rapid data transfer and the fundamental building block to enable a common logistical picture.”

Army Lt. Gen. Claude Christianson

“If you look at all the trends in the IT arena over the past 30 to 40 years, we’ve moved into an environment where we’ve got faster networks, more powerful processors, but it really comes down to the data”

Michael Todd, DOD CIO office

Data Interoperability Is At The Very Core of The Transformation Sought by the Federal Government

5

Dr. Linton Wells, as quoted in September’s NDIA Magazine, “…data compatibility may be an issue. Enabling digital interaction with nontraditional partners may require middleware or other programs that convert data from totally different formats …”

6

NCES & Data Net-Centricity

“As-is” = Application Silos
“To-be” = SOA Stack

[Diagram labels: Application, DBMS, Server (shown twice); XML-centric Information Abstraction (= Data Services)]

How do you achieve?
• Loose coupling
• Map existing data to XML
• Multi-source requests
• Metadata visibility
• Information security
• Service access
• Service discovery

7

The Data Challenges

Getting the right information to the right person at the right time requires:

• Resolving data semantic and structural mismatches
• Web service enabling legacy data systems (i.e., Net Centricity)
• Mapping data sources to vocabularies like C2IEDM, NIEM, GJXDM, TWPDES, etc.
• Handling multi-source requests (data aggregation, mediation, fusion, federation)
• Minimizing development and maintenance cost of custom code

8

MetaMatrix – Quick Facts

Middle-ware, model-driven, data management

DoD proven (DISA, NSA, TRANSCOM, etc.)

Version 5 – Mature product which is still unique and ahead of the competition

NIAP certified and NSA-credentialed

Can handle the enterprise (or COI) perspective as well as the bottom-up perspective (data service enablement of legacy systems)

Can rapidly implement data integration strategies

9

Some Key Value Propositions

• Lower cost of new application development by 35 to 70%
• Data interoperability is accomplished using COTS vs code
• Reduce application maintenance costs
  – Enable detection of changes in data structures
  – No re-coding needed when data structures change
  – Fewer systems to maintain
• Avoid the need for replication (OHIO)
• Data owners keep control, managed access
• Data abstractions are reusable components that generate tremendous value over time
• More adaptive computing

10

Agenda

• Company Overview & Value Proposition
• Data Services Rationale & Best Practices
• MetaMatrix Products & Capabilities
• Achieving Information Sharing
  – Service Enabling Data Assets
  – Vocabularies & Semantic Interoperability
  – Bridging Structured/Unstructured Information
• Customer Use Cases
• Summary, Q & A

11

Information Challenges

Program Challenges
• Multiple sources
  – Different interfaces/drivers
  – Different physical structures
  – Different semantics
• Single interface to data desired
• Real-time access to data
• Performance
• Maintainability as data changes
• Maintainability as apps change

Mission Challenges
• Time-to-deploy
• Agility - Responsiveness to change
• Automation – Reduce cost of new development and operations
• ROI of enterprise information

Agency Challenges
• 100’s/1000’s of data sources
• 100’s/1000’s of applications
• Multiple access points/modes for apps
• Understanding relationships/semantics
• Data consistency
• Data reuse – bridging data silos
• Support for Web Services & SQL
• Control & manageability, compliance
• Security & auditing

[Diagram labels: Information Resources; Communities of Interest; “?” between them]

12

Information Virtualization

Information Resources

Communities of Interest

Information Virtualization Layer

13

Information Virtualization

Information Virtualization Layer
• Unified Semantic Layer: unification of different concepts across systems
• Data Federation Layer: single-query access to heterogeneous systems
• Data Access/Connectivity Layer: uniform, standardized access to any system

Enterprise Data Sources

14

What is a Data Service?

[Diagram labels: Master Data; Operational Data Store; Agency Application; Data Service; SQL; API Call; XML/SOAP; Bridge the Gap]

• Decouple data sources from application
  – Data implementation shielded from application
• Semantic/Format Mediation
  – Standard vocabulary
• Single access point
  – Web Service/XML
  – SQL
• Federation
  – Single source or multi-source
• Scalability
  – Security, performance
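The decoupling and federation points on this slide can be illustrated with a small sketch. It is not MetaMatrix code: the `PersonDataService` class, the table and column names, and the use of in-memory SQLite databases as stand-ins for a master data source and an operational data store are all assumptions made for illustration.

```python
import sqlite3


def make_source(ddl, insert_sql, rows):
    """Create an in-memory database standing in for one data source."""
    conn = sqlite3.connect(":memory:")
    conn.execute(ddl)
    conn.executemany(insert_sql, rows)
    return conn


class PersonDataService:
    """Hypothetical single access point that hides where the data lives."""

    def __init__(self, master, operational):
        self._master = master            # stand-in for master data
        self._operational = operational  # stand-in for an operational data store

    def get_person(self, person_id):
        """Federate both sources and return one record in a standard vocabulary."""
        row = self._master.execute(
            "SELECT full_name FROM person WHERE id = ?", (person_id,)
        ).fetchone()
        if row is None:
            return None
        # The operational store names its columns differently (PERS_NO, ACCT_NO);
        # the service mediates that difference so callers never see it.
        accounts = [
            acct
            for (acct,) in self._operational.execute(
                "SELECT ACCT_NO FROM ACCT WHERE PERS_NO = ? ORDER BY ACCT_NO",
                (person_id,),
            )
        ]
        return {"person_id": person_id, "name": row[0], "accounts": accounts}


master = make_source(
    "CREATE TABLE person (id INTEGER PRIMARY KEY, full_name TEXT)",
    "INSERT INTO person VALUES (?, ?)",
    [(1, "Pat Example")],
)
operational = make_source(
    "CREATE TABLE ACCT (ACCT_NO TEXT, PERS_NO INTEGER)",
    "INSERT INTO ACCT VALUES (?, ?)",
    [("A-100", 1), ("A-200", 1)],
)
service = PersonDataService(master, operational)
print(service.get_person(1))
```

If either source later changes its schema, only the service body changes; the calling application keeps using `get_person`.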

15

FEA DRM View on Data Services

DRM Version 2 Data Access Services
• Context Awareness Services
• Structural Awareness Services
• Transactional Services
• Data Query Services
• Content Search and Discovery Services
• Retrieval Services
• Subscription Services
• Notification Services

Service Types include:
• Metadata / Data
• Structured / Unstructured
• Read / Write
• Push / Pull

16

Data Service Layer in SOA

[Diagram labels: Client Process & Applications (App ×6); Business Process Services; Business Services; Message Services (ESB); Data Services Layer (Data Service ×5); Data Sources]

17

Data Services: Architecture for the Ages

• Data Services Best Practices
  – Provide transparency across all sources
  – Define known relationships today and accommodate future relationships
  – Support independence of mission systems
  – Support ownership of operational data sources at the source
  – Provide accelerated mechanisms for integrating new sources
  – Support existing security policy and add degrees of security

• The value of a managed metadata abstraction layer
  – "Future Proofing" (future standards, exchange models, platforms)
  – Limited skill set requirements
  – Fixed long term costs for integration middleware

• Building consensus
  – Assure data owners they will continue to have control, and …
  – Vocabulary of existing production systems will not be impacted
  – Offer an option where legacy data migration is not 'required' 1st

18

Data Services Approaches

[Diagram labels: Data, Content Sources; Model-Driven Integration Layer; Logical Data Model (Org/Organization, Person/Customer, Image/Imagery, Location); transformations (T); Agile Information Services; Materialized Logical Model; Enriched Data/Content Store; XML documents (<X>…</X>)]

Data Services for Multiple Purposes:
• Simplified access to value-added (tagged) data in real-time
• Value-added (tagged) data materialized & staged
• Phased-in migration from legacy to new
• Managed archiving via classification, retention tags
• Enhanced search via consistent content tags

19

Information Exchange Topology

[Diagram labels: Federal Agencies; State/Local; Land/Sea; Enterprise Data Services; Distributed Data Services; Enterprise Service Bus / Intranet / Extranet; Master Data (Person / Facility / Vehicle); Search Engine Index / Metadata Catalog; Ontology Mgmt / Reasoning; Stage; SOA Apps]

Data Access Services
• SQL, Web Service/XML
• Staged Data (optional)

Mediation
• XSLT, Multi-source

• Security/Authentication
• Operations Management
• Error / Exception Management
• Orchestration
• Encryption
• High Availability

20

Agenda

• Company Overview & Value Proposition
• Data Services Rationale & Best Practices
• MetaMatrix Products & Capabilities
• Achieving Information Sharing
  – Service Enabling Data Assets
  – Vocabularies & Semantic Interoperability
  – Bridging Structured/Unstructured Information
• Customer Use Cases
• Summary, Q & A

21

MetaMatrix I.P.

MetaMatrix has 2 distinct innovations that work in concert to yield significant business benefits:

Information Modeling
• Model-based
• Extensible
• Sharable, reusable
• Standards-based

Federated Querying
• Cost-based optimizer
• Read/write/transactions
• Uniform API, any source
• Battle-tested/hardened

22

MetaMatrix Enterprise Data Services

• Project-level or Enterprise-wide data services layer
  – Integrated views of data from multiple sources
  – Metadata-driven
  – Optimized performance
  – Interoperable security
• Complements BI, ETL, ESB/EAI, DQ, CDI, Search

23

Designing data services

Modeling Instead of Coding

[Diagram labels: Enterprise Information Sources (EIS): xml, databases, warehouses, spreadsheets, services, geo-spatial, rich media, …; Reusable, Integrated Data Objects (Logistics, Intelligence; sample <sale/> and <value/> elements); Exposed Data Services with <WSDL> (contract), shown three times; access via ODBC, JDBC, SOAP; Information Consumers: Custom Apps; Web Services, Business Processes; Packaged Apps; Reporting, Analytics; EAI, Data warehouses]

24

MetaMatrix Designer

• Shows structural transformations from one or more other classifiers
• Defines transformations with
  – Selects
  – Joins
  – Criteria
  – Functions
  – Unions
  – User Defined

Data Service Abstraction Layers: Broker, translate, aggregate, fuse or integrate data.

[Diagram labels: Virtual Models; Physical Models Representing Actual Data Sources]
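To make the idea of a virtual model defined by a transformation concrete, here is a minimal sketch using a plain SQL view in SQLite. It only stands in for the concept; it does not reproduce MetaMatrix Designer or its transformation language, and the `depot`, `site`, `region_state`, and `location` names are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Physical models: hypothetical source tables with different shapes.
CREATE TABLE depot (depot_number INTEGER, depot_name TEXT, region TEXT);
CREATE TABLE site  (sitenum INTEGER, site_label TEXT, state TEXT);
CREATE TABLE region_state (region TEXT, state TEXT);

INSERT INTO depot VALUES (10, 'north depot', 'R1'), (11, 'south depot', 'R2');
INSERT INTO site  VALUES (20, 'Harbor Site', 'NJ');
INSERT INTO region_state VALUES ('R1', 'NY'), ('R2', 'NJ');

-- Virtual model: exists only as a transformation over the physical models.
CREATE VIEW location AS
    SELECT d.depot_number      AS location_id,
           UPPER(d.depot_name) AS location_name,   -- function
           rs.state            AS state
    FROM depot d
    JOIN region_state rs ON rs.region = d.region   -- join
    UNION ALL                                      -- union
    SELECT s.sitenum, UPPER(s.site_label), s.state
    FROM site s;
""")


def locations_in(state):
    """Query the virtual model with a criterion; callers never name the sources."""
    cur = conn.execute(
        "SELECT location_id, location_name FROM location "
        "WHERE state = ? ORDER BY location_id",
        (state,),
    )
    return cur.fetchall()


print(locations_in("NJ"))
```

The view combines a select, a join, a function, and a union; the criterion is supplied at query time by the consumer.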

25

MetaMatrix Products

MetaMatrix Integration Server

MetaMatrix Designer - Design and deploy data services

[Diagram labels: Information Consumers; access via ODBC, JDBC, SOAP, JMS; MetaMatrix Server; Integration Server with Query Processor (Processor, Optimizer); Virtual DataBases (VDB ×4); Integrated Security (Users, Roles, Entitlements); Access Models (Views, XML Docs, Services with in/out/proc); MetaMatrix Catalog; MetaMatrix Connector Framework; Packaged Connectors; sources: Web Svc, XML, RDBMS, Siebel, SAP, Oracle Apps, CICS, VSAM]

26

Secure Access – Accredited

Username/Password Logon:
• Connector connects with same ID for all queries
• Optional: Integrated with existing authentication system

Trusted Payload Logon:
• Connector uses different credentials per connection, per query
• Optional: Integrated with existing authentication system

[Diagrams of both logon flows. Labels: Client App; MetaMatrix; Membership Provider; Authentication Service; Connector (×4); Data Source; username/password; logon info; trusted payload; “authenticates”; “authenticates, generates payload”; “authenticates, optionally modifies payload”; “Optionally accesses source-specific information”]
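The trusted payload flow can be sketched roughly as follows. The slide does not say how the payload is made trustworthy, so the HMAC signature used here is purely an assumption, and every function name and the shared key are hypothetical rather than part of any MetaMatrix API.

```python
import hashlib
import hmac
import json

# Assumption: the authentication service and membership provider share a key.
SHARED_KEY = b"demo-key-not-for-production"


def issue_payload(user, roles):
    """Authentication service: authenticate (omitted) and generate a signed payload."""
    body = json.dumps({"user": user, "roles": sorted(roles)}, sort_keys=True)
    sig = hmac.new(SHARED_KEY, body.encode(), hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}


def verify_payload(payload):
    """Membership provider: check the signature before trusting the payload."""
    expected = hmac.new(
        SHARED_KEY, payload["body"].encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected, payload["sig"]):
        raise PermissionError("payload signature mismatch")
    return json.loads(payload["body"])


def connector_credentials(payload):
    """Connector: derive per-query source credentials from the verified payload."""
    identity = verify_payload(payload)
    return {"source_user": identity["user"], "source_roles": identity["roles"]}


payload = issue_payload("analyst1", ["reader"])
print(connector_credentials(payload))
```

Unlike the username/password mode, where the connector uses one ID for every query, each query here carries the caller's own identity through to the source.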

27

Managing Data Service Metadata

[Diagram labels: Processes [BPM/BPEL] (Process X, Process Y); Ontologies [OWL/RDF], Taxonomies; Web Services [WSDL] (Service A, Service B); Classification Schemes (Taxonomy A, KeyWords B); Relational; XML; XML Transformations; Datatypes; Domain [UML/ER]; Application/Configuration; MetaMatrix Designer; MetaMatrix Catalog; Generic Typed Relationships; Models & Files [versioned]; Search Index; Web Reporting; sample classification terms: Public Health, Justice, Environment, Geo-spatial, Recreation, Immunization, Warrant, Wildlife, Camping]

28

MetaMatrix Product Lines

MetaMatrix Enterprise
• Web services & SQL
• Modeling enterprise data
• Scalable deployment server
• Metadata management
• Application/legacy connectors

MetaMatrix Dimension
• Web service-enablement of data sources
• Expose business views as XML
• Lightweight modeling – rapid integration
• Standard WAR-based deployment

MetaMatrix Query
• Embeddable Java component
• Federated query engine
• Query optimization
• Standard JDBC to all sources
• Standard SQL to all sources

[Side labels: Enterprise; Project, Node; ISV / Project]

29

Agenda

• Company Overview & Value Proposition
• Data Services Rationale & Best Practices
• MetaMatrix Products & Capabilities
• Achieving Information Sharing
  – Service Enabling Data Assets
  – Vocabularies & Semantic Interoperability
  – Bridging Structured/Unstructured Information
• Customer Use Cases
• Summary, Q & A

30

Mediation: XML From Non-XML Sources

Source: Data Sources containing Information to integrate («Text File», «Relational», «Application»)

MetaMatrix: Mapping from Data to XML (T)

Target: Fixed (potentially complex) XML Schema («XML»)
Need: Data complying to Schema

<person>
  <addresses> … </addresses>
  <accounts>
    <accountID=…> … </accountID>
  </accounts>
</person>
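As a hedged illustration of mapping non-XML data onto a fixed target structure, the sketch below assembles a nested `<person>` document from flat rows with Python's standard `xml.etree.ElementTree`. The row contents, the `name`, `address`, `city`, `state`, and `balance` elements, and the choice to carry the account ID as an attribute are assumptions; the slide elides those details.

```python
import xml.etree.ElementTree as ET

# Hypothetical rows as they might arrive from relational or text-file sources.
person_row = {"person_id": 7, "name": "Pat Example"}
address_rows = [{"person_id": 7, "city": "Reston", "state": "VA"}]
account_rows = [
    {"person_id": 7, "account_id": "A-100", "balance": "250.00"},
    {"person_id": 8, "account_id": "B-900", "balance": "10.00"},
]


def build_person_document(person, addresses, accounts):
    """Assemble a nested <person> document from flat source rows."""
    root = ET.Element("person")
    ET.SubElement(root, "name").text = person["name"]
    addresses_el = ET.SubElement(root, "addresses")
    for row in addresses:
        if row["person_id"] == person["person_id"]:
            addr = ET.SubElement(addresses_el, "address")
            ET.SubElement(addr, "city").text = row["city"]
            ET.SubElement(addr, "state").text = row["state"]
    accounts_el = ET.SubElement(root, "accounts")
    for row in accounts:
        # Rows belonging to other people are left out of this document.
        if row["person_id"] == person["person_id"]:
            acct = ET.SubElement(accounts_el, "account", accountID=row["account_id"])
            ET.SubElement(acct, "balance").text = row["balance"]
    return root


doc = build_person_document(person_row, address_rows, account_rows)
print(ET.tostring(doc, encoding="unicode"))
```

A real deployment would also validate the result against the target schema; that step is omitted here because the standard library has no XML Schema validator.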

31

MetaMatrix Designer – for XML-centric Data Services

Map Data Sources to XML & Deploy

• Model XML Docs, Schemas
• Build XML Doc. models from XML Schemas
• Map XML Doc. models to other data models
• Enable data access via XML

32

Dimension – Choose your approach

• Rapid design & deployment of Web Services
• Expose integrated data as XML-based business views
• Deployment of Web Services as standard Web apps
• Runtime execution optimized through use of MetaMatrix Query Engine

[Diagram labels: Data Sources; Source Models; Business Views (<XML>); Web Service Operations (WSDL, XSD); Dimension Models; Web Server; steps: Import, Map, Model, Deploy (as WAR); “Start Here?” (shown twice)]

33

Agenda

• Company Overview & Value Proposition
• Data Services Rationale & Best Practices
• MetaMatrix Products & Capabilities
• Achieving Information Sharing
  – Service Enabling Data Assets
  – Vocabularies & Semantic Interoperability
  – Bridging Structured/Unstructured Information
• Customer Use Cases
• Summary, Q & A

34

[Diagram labels: Multiple Internal/External Information Sources; transformations (T); XML Document; access via ODBC/JDBC, JDBC, SOAP; consumers: Web Services, Search Applications, Business Intelligence Applications; domains: C2, Logistics, Intelligence, …]

Authoritative Sources:
• Mapped to logical

Logical Data Model:
• Agency or COI-specific
• Rationalize, harmonize, mediate

Application views of information:
• Relational, XML

COI Data Dictionary
• Source field names: bldg_id, SITENUM, Facility_ID, bldg_type, Depot_Number
• Dictionary terms: Location_ID, Location_Type
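One way to picture the role of a COI data dictionary is as a lookup that renames source fields to shared terms. The sketch below assumes, from the slide's layout only, that `bldg_id`, `SITENUM`, and `Facility_ID` correspond to `Location_ID` and that `bldg_type` corresponds to `Location_Type`; `Depot_Number` is left unmapped because its target is not clear from the slide. The sample records and the `to_logical` helper are hypothetical.

```python
# Assumed correspondences between source field names and dictionary terms.
COI_DICTIONARY = {
    "bldg_id": "Location_ID",
    "SITENUM": "Location_ID",
    "Facility_ID": "Location_ID",
    "bldg_type": "Location_Type",
}


def to_logical(record, dictionary=COI_DICTIONARY):
    """Rename source fields to dictionary terms; keep unmapped fields as-is."""
    return {dictionary.get(field, field): value for field, value in record.items()}


source_rows = [
    {"bldg_id": 12, "bldg_type": "warehouse"},  # hypothetical source A
    {"SITENUM": 40},                             # hypothetical source B
    {"Facility_ID": 77, "notes": "leased"},      # hypothetical source C
]
logical_rows = [to_logical(row) for row in source_rows]
print(logical_rows)
```

After the rename, records from all three sources can be queried by `Location_ID` without the consumer knowing the original field names.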

35

Semantic Matching - example

[Diagram labels: Data Sources: FBI, CBP, NYC, NY, NJ; Semantic Data Services; “Gender ID” and “Person Sex Code” matched (confidence of 90%); Ontology: “Sex” semantically related to “Gender”]

Semantic Data Services
– key component of information sharing and interoperability programs
– automated semantic mapping to aid domain experts in quickly reconciling disparate schemas and vocabularies
– more rapid deployment of a mediation solution

MatchIt – an extensible ontology-driven tool
– variety of algorithms for determining semantic equivalence
– discovers similarities between elements of heterogeneous data, automatically exposing potential semantic matches
– matches elements of data sources to target schemas of Data Services, such as TWPDES, GJXDM, NIEM, C2IEDM, HL7
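The general idea of ontology-assisted matching can be sketched with a toy scorer. This is not the MatchIt algorithm, which the slides do not describe, and it does not reproduce the 90% figure shown in the example; the `RELATED_TERMS` groups and the scoring rule are assumptions chosen to keep the example short.

```python
import re
from difflib import SequenceMatcher

# Tiny stand-in for an ontology: groups of terms treated as semantically related.
RELATED_TERMS = [{"sex", "gender"}, {"id", "code", "identifier"}]


def tokens(name):
    """Lower-case word tokens from a field name such as 'Person Sex Code'."""
    return re.findall(r"[a-z]+", name.lower())


def token_similarity(a, b):
    """1.0 for identical or ontology-related tokens, else a string-similarity ratio."""
    if a == b or any(a in group and b in group for group in RELATED_TERMS):
        return 1.0
    return SequenceMatcher(None, a, b).ratio()


def match_confidence(source_field, target_field):
    """Average, over source tokens, of the best similarity to any target token."""
    src, tgt = tokens(source_field), tokens(target_field)
    if not src or not tgt:
        return 0.0
    return sum(max(token_similarity(s, t) for t in tgt) for s in src) / len(src)


print(match_confidence("Gender ID", "Person Sex Code"))
print(match_confidence("Gender ID", "Vehicle Color"))
```

A domain expert would still review candidate matches; a score like this only ranks them.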

36

Automated Term Discovery (Interpret)

A comprehensive list of terms automatically discovered across all sources

All the available definitions found in the MatchIT knowledge-base

All the usage instances where each term was used in any of the sources

Results of the automated tokenization

37

Contextualize (Interpret)

Automated term tokenization

Automated semantic linking using the default knowledge-base contained within MatchIT

[Diagram: the term ArticleAmount with its tokens Amount and Article; other terms shown: Sum, Assets, Creation; relationship labels: Synonym, Type-of]
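Term tokenization followed by a knowledge-base lookup might look like the sketch below. The splitting rules and the `KNOWLEDGE_BASE` entries are assumptions: the slide names the relationship types (Synonym, Type-of) and some related terms, but not which term links to which or how the tool tokenizes.

```python
import re


def tokenize_term(name):
    """Split camelCase, snake_case, and spaced names into lower-case tokens."""
    pieces = []
    for part in re.split(r"[_\s\-]+", name):
        # Acronym runs, capitalized or lower-case words, then digit runs.
        pieces += re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", part)
    return [piece.lower() for piece in pieces]


# Hypothetical knowledge-base entries; the pairings are illustrative only.
KNOWLEDGE_BASE = {
    "amount": [("synonym", "sum")],
    "article": [("type-of", "creation")],
}


def contextualize(name):
    """Return {token: [(relationship, related_term), ...]} for each token."""
    return {tok: KNOWLEDGE_BASE.get(tok, []) for tok in tokenize_term(name)}


print(tokenize_term("ArticleAmount"))
print(contextualize("ArticleAmount"))
```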

38

Semantic Matching (Mediate)

• With relationships pre-established within the knowledge-base…

• Identify the Target and the Source(s) and run the match.

[Diagram: ArticleAmount and ProductShares, automatically linked by a specific % distance]

39

Facilitate Decision Making (Mediate)

Helps facilitate rapid decision making

Target element for matching

Automatically calculated semantic distance between terms

Source candidate for matching

40

Support Multiple Enterprise Semantic Models

Enterprise-wide or COI-driven Data Models
• Rationalization
• Harmonization
• Data Catalogs (DDMS)

Data Sources
- Authoritative
- Redundant
- Overlapping

[Diagram labels: J-1 Manpower / Personnel; J-2 Intelligence; J-3 Operations; J-4 Logistics (GCSS); J-5 Plans & Policy; J-6 C4CS; J-7 Operational Plans; J-8 Force Structure; Multiple Internal/External Information Sources; transformations (T); access via ODBC/JDBC, JDBC, SOAP; consumers: Web Services, Portal Applications, Business Intelligence Applications]

41

Why Vocabulary Management?

– Knowledge lies everywhere - you must involve data from disparate sources

– The volume and disparity of data is too significant - you must enable machine involvement

– Using semantics is not enough - you must be able to leverage domain concepts and terminologies

– You must have the ability to infer relationships across the data

You can’t act on data alone!

42

Benefits of Vocabulary Management

• Develop reusable information models and schemas

• Implicitly improve data integrity

• Capture business and technology requirements in a single vocabulary

• Capture institutional knowledge

• Enable semantic mining techniques for deeper data discovery and information sharing

• Accelerate interoperability, web services and SOA development and deployment

• Establish and maintain a common relationship across data sources

• Establish and maintain compliance with industry exchange models

• Reduce IT expenses by leveraging data in its native source

• Reduce IT expenses associated with building and maintaining partner integration

• Improved information sharing directly enhances decision making

43

Knoodl.com - from Revelytix

• A publicly available wiki for collaborative vocabulary/ontology development

• Extends the wiki metaphor with a formal model for semantic markup

• Ideal for
  – Community of Interest (COI) based OWL development
  – Domain vocabulary creation and management
  – OWL registry/repository

• Scheduled to go live 30 Oct 06

44

Integration Driven By Semantics

• Model & Relate information within any domain
• Relate information in different domains/models
• Search within and across domains for related information

[Diagram labels: Enterprise Model (UML); Data Models (Relational, XML); Ontology Models (e.g. OWL, RDF); Physical Sources]

45

Ontology-Driven Integration Example

[Diagram: three areas labeled Ontology, Logical Views, and Physical Sources. Ontology terms: Transportation; Land; 4 Wheel; 2 Wheel; Truck; Bus; Car; Fuel Truck; Cargo Truck. Four “equivalence” links and four transformations (T) are shown.]
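A rough sketch of how an ontology could drive integration: a request for a broad concept is expanded to its narrower concepts, and each concept with an equivalence link resolves to a logical view. The subclass edges below are one reading of the slide's Transportation example, and the view names in `EQUIVALENT_VIEW` are entirely hypothetical.

```python
# Assumed subclass edges, loosely following the slide's Transportation example.
SUBCLASSES = {
    "Transportation": ["Land"],
    "Land": ["4 Wheel", "2 Wheel"],
    "4 Wheel": ["Truck", "Bus", "Car"],
    "Truck": ["Fuel Truck", "Cargo Truck"],
}

# Hypothetical equivalence links from ontology concepts to logical views.
EQUIVALENT_VIEW = {
    "Fuel Truck": "logistics.fuel_vehicles",
    "Cargo Truck": "logistics.cargo_vehicles",
    "Bus": "transit.buses",
    "Car": "motorpool.sedans",
}


def descendants(concept):
    """Return the concept plus every concept below it, depth-first."""
    found = [concept]
    for child in SUBCLASSES.get(concept, []):
        found.extend(descendants(child))
    return found


def views_for(concept):
    """Logical views to query when a user asks about `concept`."""
    return [EQUIVALENT_VIEW[c] for c in descendants(concept) if c in EQUIVALENT_VIEW]


print(views_for("Truck"))
print(views_for("4 Wheel"))
```

A search for "Truck" then reaches both truck views even though neither view uses that word.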

46

Agenda

• Company Overview & Value Proposition
• Data Services Rationale & Best Practices
• MetaMatrix Products & Capabilities
• Achieving Information Sharing
  – Service Enabling Data Assets
  – Vocabularies & Semantic Interoperability
  – Bridging Structured/Unstructured Information
• Customer Use Cases
• Summary, Q & A

47

Person Search - Conceptual Use Case

Enterprise Information:
• Addresses
• Organizations
• Affiliations
• Accounts
• Transactions
• Call History
• Agreements
• Policies

Relationships inherent in the search results link to enterprise apps, databases, and other repositories

48

Incorporating Enterprise Data into Search

• The usefulness of an organization's data is dependent upon understanding and applying context
  – In a typical text search application, context is supplied by document content, or metadata tags (filename, author, date, etc.)
  – An organization's structured data sources do not usually lend themselves to document-centric approaches

• The context of structured data relies on:
  – metadata (typically implicit) for table names, column names, datatypes, and business descriptions for each
  – implied DB relationships such as foreign keys between tables
  – relationships (mappings) to a business data dictionary

• The volume of structured data requires a combination of indexed and non-indexed approaches
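The three kinds of context listed for structured data (column metadata, foreign-key relationships, and dictionary mappings) can be gathered programmatically. The sketch below does this for SQLite using its `table_info` and `foreign_key_list` pragmas; the schema, the `DATA_DICTIONARY` entries, and the shape of the returned record are assumptions, and a search product's real crawler would differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE person (person_id INTEGER PRIMARY KEY, full_name TEXT);
CREATE TABLE account (
    account_id INTEGER PRIMARY KEY,
    person_id  INTEGER REFERENCES person(person_id),
    balance    REAL
);
""")

# Hypothetical business data dictionary: column name -> business term.
DATA_DICTIONARY = {"full_name": "Person Name", "balance": "Account Balance"}


def describe_table(table):
    """Collect the context a search index would need for one table.

    `table` is interpolated into the PRAGMA text, so pass trusted names only.
    """
    columns = [
        {"name": row[1], "type": row[2], "term": DATA_DICTIONARY.get(row[1])}
        for row in conn.execute(f"PRAGMA table_info({table})")
    ]
    related = [
        {"column": row[3], "ref_table": row[2], "ref_column": row[4]}
        for row in conn.execute(f"PRAGMA foreign_key_list({table})")
    ]
    return {"table": table, "columns": columns, "related": related}


print(describe_table("account"))
```

The `related` entries are what would let a search result for an account link onward to the matching person record.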

49

MetaMatrix and Google

[Diagram labels: Google Search Appliance; MetaMatrix Server; JDBC; Connector Framework; HTML I/F (×2); RDBMS; ERP, CRM…; Legacy Systems; Content Repository (×3); Custom Application; numbered steps 1 to 4]

• Text Search w/ filtering criteria (optional)
• Structured Data crawling & index build
• Navigate to related data from Search UI
• Select & drill down to discover record details, related data links, & metadata
• Field name look-up in Business Data Dictionary

50

Data Source Schema (as is)

51

Enhanced Data Model for Search

• Transformations from one or more sources
• Transformations defined with:
  – Joins/unions
  – Criteria
  – Functions
• Elements mapped to dictionary
• Business definitions captured

52

Agenda

• Company Overview & Value Proposition
• Data Services Rationale & Best Practices
• MetaMatrix Products & Capabilities
• Achieving Information Sharing
  – Service Enabling Data Assets
  – Vocabularies & Semantic Interoperability
  – Bridging Structured/Unstructured Information
• Customer Use Cases
• Summary, Q & A

53

Major US Federal Government Customers

• NSA - Multiple Programs (NES Base-lined)
• In-Q-Tel/CIA
• TRANSCOM – Command Metadata Management System
• Air Force - Command and Control Center
• DISA - Global Combat Support Systems (GCSS)
• DISA – Anti Drug Network (ADNET)
• DLA – Integrated Data Environment (IDE)
• Mitre – Air Force ESC/DoD DDMS work
• UK – NSA Equivalent, CJIT

Page 54

DISA GCSS – Customer Use Case

• Global Combat Support System (GCSS)
– Mission: supply the war-fighter with access to accurate and timely logistics information
• Focused Logistics
– Fusion of information technologies to enable forces of the future to be more mobile and versatile
– Provides the joint war-fighter with a single capability to manage and monitor units, personnel, and equipment
– Deployed at 23 sites around the world
– Networked environment allows DoD users to access shared data & applications, regardless of location
• Conducted a comprehensive evaluation/competitive procurement and selected MetaMatrix

Page 55

GCSS Architectural Overview

[Architecture diagram; recoverable labels only]

• Components: CSDE, MetaMatrix, Portal, WebCOP, Oracle, DoD PKI Directory, Web Services, Portal Clients, Web Browser, BI tool, Query Tools, Electronic Battlebook, Watchboard, Force Closure
• Host platform labels: Sun V880, Solaris 9/10, WL8.1; Sun 280R, Solaris 9, WL8.1; Sun 280R, Solaris 9
• Systems and data sources: GDSS, JTAV, GSORTS, JOPES (JOPES2K), GTN, Lighthouse, NGA, CSDS, FLIS, DMDC, Theater Data Sources (also TMS)

Page 56

GCSS Modeling Approach

[Layered modeling diagram; recoverable labels only]

Layers, top to bottom:

• Virtual Query Layer (VQL) (Exposed Views): Facilities_VQL, Material_VQL, Plans_VQL
• Virtual Mid Layer (VML): Facilities_VML, Material_VML
• Virtual Base Layer (VBL): CSDS_VBL
• Physical Layer (PL): CSDS_PL

Zone labels: Public Data; Private Data and Metadata

Sources: CFDB, CSDS, DMDC, GSORTS, IDE/AV, NGA, FLIS, GDSS, JOPES Classic, JOPES 4.0, GTN
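The layering on this slide can be sketched with plain SQL views. The example below uses Python's built-in sqlite3 module rather than MetaMatrix, and the responsibilities given to each layer (renaming in the base layer, business rules in the mid layer, a narrow exposed view in the query layer) are assumptions made for illustration, as are all column names and data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Physical Layer (PL): the source table as-is (hypothetical columns).
    CREATE TABLE CSDS_PL (FAC_ID TEXT, FAC_NM TEXT, CAP_QTY INTEGER);
    INSERT INTO CSDS_PL VALUES ('F1', 'depot north', 120),
                               ('F2', 'depot south', 0);

    -- Virtual Base Layer (VBL): one-to-one view with readable names.
    CREATE VIEW CSDS_VBL AS
    SELECT FAC_ID AS facility_id,
           FAC_NM AS facility_name,
           CAP_QTY AS capacity
    FROM CSDS_PL;

    -- Virtual Mid Layer (VML): business rules applied over base views.
    CREATE VIEW Facilities_VML AS
    SELECT facility_id, UPPER(facility_name) AS facility_name, capacity
    FROM CSDS_VBL
    WHERE capacity > 0;

    -- Virtual Query Layer (VQL): the exposed view that consumers query.
    CREATE VIEW Facilities_VQL AS
    SELECT facility_id, facility_name
    FROM Facilities_VML;
""")

exposed = conn.execute("SELECT * FROM Facilities_VQL").fetchall()
```

Because consumers only see `Facilities_VQL`, a change to the physical table can be absorbed in the base layer without touching the exposed view, which is the usual motivation for this kind of stacking.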

Page 57

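AFC2ISRC's Air Ops Data Unification

[Diagram relating a UNIFIED VOCABULARY to three groups labeled Communities Of Interest, Data Standards, and Programs; recoverable labels only]

Labels: UID, CMD, METOC, JSA, ISR, JC3IEDM, C2IEDM, VMF, SADL, ADOCS, USMTF, Link-16, TBONE, GCCS, GCSS, DCGS, BFT, TST, Mobility Ops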

Page 58

US TRANSCOM: Metadata Federation

Integrate diverse sources of metadata to achieve enterprise-wide, end-to-end systems analysis and impact awareness

[Diagram; recoverable labels only]

Labels: CRIS (DTS Metadata); MetaBase; ERWin; Metadata Integration Layer; MetaMatrix VDB; Metadata Search & Reporting; Common Metadata Repository Viewer; Import DTS-ERWin Relationships

Page 59

Data Relationships in CMDR Viewer

Source System

Target System

Interface Template

Interface Template Elements

Source System

Related Entity & Attributes from ERWin Model

New Source-to-Interface-to-Target Relationship Submitted to Repository
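One way to picture how source-to-interface-to-target relationships support impact awareness is a small graph walk. The sketch below is an assumption-laden illustration in Python, not the CMDR Viewer's implementation: the system names, interface identifiers, and record layout are all hypothetical.

```python
from collections import defaultdict, deque

# Hypothetical relationship records: (source system, interface template, target system).
RELATIONSHIPS = [
    ("SystemA", "IF-001", "SystemB"),
    ("SystemB", "IF-002", "SystemC"),
    ("SystemA", "IF-003", "SystemD"),
]

def build_graph(relationships):
    """Index relationships by source system."""
    graph = defaultdict(list)
    for source, interface, target in relationships:
        graph[source].append((interface, target))
    return graph

def impacted_systems(graph, changed):
    """Return every system downstream of `changed`, in breadth-first order."""
    seen = {changed}
    queue = deque([changed])
    order = []
    while queue:
        current = queue.popleft()
        for _interface, target in graph.get(current, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order

graph = build_graph(RELATIONSHIPS)
downstream = impacted_systems(graph, "SystemA")
```

With the sample records, a change to `SystemA` reaches `SystemB` and `SystemD` directly and `SystemC` through `SystemB`.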

Page 60

NSA’s E-Space Portal for STRATCOM

[Architecture diagram; recoverable labels only]

• Client(s): Planners, Operators; Browser
• Application Server Host(s): Presentation Layer (Query Page, Results Page, Form Page, Control); Query Service (Business Layer)
• Information Integration Host(s): MetaMatrix Query Engine
• Data Host(s): ODS, Cached Data, Mission Data
• Sources; External Feeds
• Bottom-up (harmonization); Top-down (mapping); Org, Person, Image, Location

MetaBase

• Data dictionary
• Integration paths
• Portal metadata

Portal Metadata

• Name
• Information Context
• Usage
• Description
• Display Name
• Default Value
• Label
• Attribute Units
• Logical Operator(s)
• Presentation Type
• Sort Order
• Visible

Page 61

Agenda

• Company Overview & Value Proposition
• Data Services Rationale & Best Practices
• MetaMatrix Products & Capabilities
• Achieving Information Sharing
– Service Enabling Data Assets
– Vocabularies & Semantic Interoperability
– Bridging Structured/Unstructured Information
• Customer Use Cases
• Summary, Q & A

Page 62

The Path to Information Sharing

GATHER

• Import or reverse engineer resources
• Import exchange models & knowledge-bases
• Metadata repository for storing/relating/querying

INTERPRET

• Discover terminologies
• Relate to embedded knowledge-base
• Inventory, assess, & analyze resources

MEDIATE

• Automate semantic matching
• Decision facilitation
• Data service creation

RELATE

• Create vocabularies
• Domain collaboration
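To make the idea of automated matching against a vocabulary concrete, here is a minimal sketch in Python. It is a simple lexical stand-in (token overlap after abbreviation expansion), not the algorithm used by any product named in this deck; the abbreviation table, field names, vocabulary terms, and the 0.5 threshold are all hypothetical.

```python
import re

# Hypothetical abbreviation table used to normalize source field names.
ABBREVIATIONS = {"fac": "facility", "qty": "quantity", "id": "identifier", "nm": "name"}

def tokens(name):
    """Split snake_case, camelCase, or UPPER_CASE names into normalized tokens."""
    parts = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", name)
    return {ABBREVIATIONS.get(part.lower(), part.lower()) for part in parts}

def jaccard(a, b):
    """Token-set overlap between 0.0 (disjoint) and 1.0 (identical)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def best_match(field, vocabulary, threshold=0.5):
    """Return the best-scoring vocabulary term, or None if below threshold."""
    score, term = max((jaccard(tokens(field), tokens(t)), t) for t in vocabulary)
    return term if score >= threshold else None

VOCABULARY = ["FacilityName", "FacilityIdentifier", "QuantityOnHand"]
SOURCE_FIELDS = ["FAC_NM", "facilityId", "QTY_ON_HAND", "REMARKS"]

matches = {field: best_match(field, VOCABULARY) for field in SOURCE_FIELDS}
```

Fields that score below the threshold map to `None`, which leaves them for a person to resolve; that corresponds to the decision facilitation step rather than full automation.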

Page 63

Synergistic products

Bottom-Up or Top-Down

GATHER: MetaMatrix

• Use importers (JDBC, ODBC, UML, ERwin, Popkin, and XML Schema)
• Store in integrated metadata repository

INTERPRET: MetaMatrix & MatchIT

• Automated symbol discovery
• Present “target” logical model
• Inventory & assess resources

MEDIATE: MetaMatrix & MatchIT

• Import domain vocabularies
• Disambiguate match sets using semantics
• Create enterprise or domain-level data services
• Map to schema-compliant XML documents

RELATE: Knoodl.com & MetaMatrix

• Ontology or knowledge-base mgmt.
• Use domain knowledge for mapping

Page 64

MetaMatrix Value Proposition

Rapid, cost-effective COTS tool for enterprise information integration and exchange

• On-demand information
– Real time data integration
– Information sharing between business units
• Enabling SOA in an evolving world
– Consume and produce Web services
– And still provide full support for ODBC, JDBC, and legacy
• Federation of disparate information
– Rationalized to controlled vocabularies
– Relational + XML + Web Services + Enterprise Apps + Legacy
• Faster time to market
– Integrated information in days, weeks
– Tight coupling of design & implementation phases
– Leveraging the skill-set of the data architects for integration
• Costs across application lifecycle reduced
– Model-driven abstraction layer eases development/maintenance
– Better management of data assets across the enterprise

Page 65

Achieving Information Sharing in Federal Agencies via Data Services, SOA, and

Controlled Vocabularies

October 12, 2006

A Presentation for the Federal Data Architecture Subcommittee

Chuck Mosher

cmosher @ metamatrix.com

Page 66

Additional Technical Material

October 2006

Chuck Mosher

cmosher @ metamatrix.com

Page 67

Certifications

• NIAP Certification in process
– Common Criteria
– Evaluation Assurance Level 2 (EAL2)
– Security Target document completed
– Cygnacom – testing, validation
– http://www.cygnacom.com/labs/sel_epl.htm
• DCID 6/3
– Protection Level 3 (PL3), Sept-Oct 2004
– Ft. Meade Enterprise Information Technology Center
– Working in conjunction with X7 group
– Hosted on Sun Solaris for specific Ft. Meade program

Page 68

Data

Model

Meta-model

Meta Object Facility (MOF)

Page 69

Connector Framework

Connectors use the framework + metadata to integrate new sources quickly – avoids significant cost, time of new wrappers.

[Diagram: the MetaMatrix Query Engine exchanges Request and Result with several Connectors inside the Connector Framework; each Connector reaches Any Information Source]

Each Connector contains:

• Translator: turns MM requests into source-specific requests, and translates results (Translate Input, Translate Output)
• Connection: holds the (pooled) connection, sends requests, receives responses (Write Request, Read Response)
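The Translator/Connection split described on this slide can be sketched as two small interfaces composed by a connector. The Python below is an illustrative assumption, not the MetaMatrix connector API: every class name, method signature, the `GET …` request syntax, and the in-memory source are hypothetical.

```python
from abc import ABC, abstractmethod

class Translator(ABC):
    """Turns engine requests into source-specific requests and translates results."""
    @abstractmethod
    def translate_input(self, request: dict) -> str: ...
    @abstractmethod
    def translate_output(self, raw: list) -> list: ...

class Connection(ABC):
    """Holds the connection to a source, sends requests, receives responses."""
    @abstractmethod
    def write_request(self, native_request: str) -> None: ...
    @abstractmethod
    def read_response(self) -> list: ...

class Connector:
    """Composes one Translator with one Connection."""
    def __init__(self, translator: Translator, connection: Connection):
        self.translator = translator
        self.connection = connection

    def execute(self, request: dict) -> list:
        native = self.translator.translate_input(request)
        self.connection.write_request(native)
        return self.translator.translate_output(self.connection.read_response())

class PipeTranslator(Translator):
    """Hypothetical source dialect: 'GET <entity> <f1,f2>' and pipe-delimited rows."""
    def translate_input(self, request):
        return f"GET {request['entity']} {','.join(request['fields'])}"

    def translate_output(self, raw):
        return [tuple(line.split("|")) for line in raw]

class InMemoryConnection(Connection):
    """Stands in for a real (pooled) connection; serves rows from a dict."""
    def __init__(self, data):
        self._data = data
        self._pending = []

    def write_request(self, native_request):
        _verb, entity, fields = native_request.split(" ")
        names = fields.split(",")
        self._pending = ["|".join(str(rec[n]) for n in names)
                         for rec in self._data[entity]]

    def read_response(self):
        response, self._pending = self._pending, []
        return response

source = {"units": [{"id": 1, "name": "Alpha", "loc": "X"},
                    {"id": 2, "name": "Bravo", "loc": "Y"}]}
connector = Connector(PipeTranslator(), InMemoryConnection(source))
result = connector.execute({"entity": "units", "fields": ["id", "name"]})
```

Supporting a new kind of source in this sketch means writing one new Translator and one new Connection; the `Connector.execute` flow stays the same, which is the cost saving the slide attributes to the framework.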

Page 70

MetaMatrix Complements ESBs

Dimension adds the following capabilities to an ESB…

• Rich, advisor-based, model-driven design tool
• Ability to leverage data models and manage metadata
• Clear way to visualize and define mappings between non-XML sources and XML views (even for complex industry schemas – C2IEDM, NIEM, GJXDM, HL7, XBRL)
• Ability to do SQL-based transformations, not just XSLT (including multi-source, complex joins and unions)
• Query planner/optimizer that makes intelligent decisions about whether to execute transformations “at the source” vs. “on the bus”
• Automated semantic matching & generation of transformations

Data Services to connect ESBs to Enterprise Data
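As a small illustration of a SQL-based transformation feeding an XML view, the sketch below joins two tables and maps the rows to an XML document using only the Python standard library. It is not MetaMatrix code: the two tables live in one in-memory SQLite database as a stand-in for separate sources, and the table names, columns, and XML element names are hypothetical and do not follow any of the industry schemas listed on the slide.

```python
import sqlite3
import xml.etree.ElementTree as ET

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Stand-ins for two separate non-XML sources (hypothetical names).
    CREATE TABLE hr_person (person_id INTEGER, full_name TEXT);
    CREATE TABLE ops_assignment (person_id INTEGER, unit TEXT);
    INSERT INTO hr_person VALUES (1, 'A. Smith'), (2, 'B. Jones');
    INSERT INTO ops_assignment VALUES (1, 'Unit-7'), (2, 'Unit-9');
""")

# The transformation is expressed in SQL (a join), not in XSLT.
SQL = """
    SELECT p.person_id, p.full_name, a.unit
    FROM hr_person p
    JOIN ops_assignment a ON a.person_id = p.person_id
    ORDER BY p.person_id
"""

def to_xml(rows):
    """Map relational rows onto a simple XML view."""
    root = ET.Element("Persons")
    for person_id, full_name, unit in rows:
        person = ET.SubElement(root, "Person", id=str(person_id))
        ET.SubElement(person, "Name").text = full_name
        ET.SubElement(person, "Unit").text = unit
    return ET.tostring(root, encoding="unicode")

xml_doc = to_xml(conn.execute(SQL))
```

Doing the join in SQL before building the document keeps the XML mapping trivial; in a real deployment the planner would decide where that join runs.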