DAMES - Data Management through e-Social Science
DAMES: Data Management through e-Social Science
e-Science approaches in DAMES
Simon Jones
Department of Computing Science and Mathematics
University of Stirling
Rationale
We aim to investigate and develop:
• ‘e-Infrastructure’ services targeted to data management requirements across a rich range of social science data resources
• An internet ‘portal’ that will make available a variety of specific data resources as Grid services
– augmented with portfolios of tools for supporting the processes involved in data management
Approaches (on-going!)
• Social science data resources are distributed, disaggregated and heterogeneous
– Metadata description
– Semantically-based discovery
– Data abstraction/virtual fusion
• Easy but secure access is required
– "Virtualization"/"fusion", workflow support for SS
– Fine-grained authorisation infrastructures
Meta-data support
• Existing metadata standards have been assessed
– Data Documentation Initiative, DDI
– Statistical Data and Metadata eXchange, SDMX
– UK Data Archive
– Nesstar
• Focussing continuing work on exploiting DDI3
– DAMES must engineer compatibility with currently used metadata approaches
Semantically based data discovery
• To extend data discovery beyond plain metadata search, DAMES will develop techniques for data discovery through semantic queries
• OWL ontology framework: to give meaning to data resources
• OWL-S: for developing semantic grid services
• DAMES will support a Grid service for registration and discovery of data resources using semantic grid techniques
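The idea of registration and semantic discovery can be sketched in a few lines. This is only an illustration under stated assumptions: it uses a plain Python dictionary as a stand-in for an ontology, not OWL or OWL-S, and the concept names, resource names and the `Registry` class are all hypothetical.

```python
from typing import Dict, List, Optional

# Hypothetical concept hierarchy (child -> parent), standing in for an ontology.
SUBCLASS_OF: Dict[str, str] = {
    "OccupationalData": "SocialScienceData",
    "CensusData": "SocialScienceData",
    "OccupationalCoding": "OccupationalData",
}


def is_a(concept: str, ancestor: str) -> bool:
    """Return True if `concept` equals `ancestor` or is one of its descendants."""
    current: Optional[str] = concept
    while current is not None:
        if current == ancestor:
            return True
        current = SUBCLASS_OF.get(current)
    return False


class Registry:
    """Toy registry: resources are registered with the concept they describe."""

    def __init__(self) -> None:
        self._entries: Dict[str, str] = {}

    def register(self, name: str, concept: str) -> None:
        self._entries[name] = concept

    def discover(self, concept: str) -> List[str]:
        """Find resources whose concept matches the query or specialises it."""
        return sorted(n for n, c in self._entries.items() if is_a(c, concept))


registry = Registry()
registry.register("coding-scheme-lookup", "OccupationalCoding")
registry.register("census-extract", "CensusData")

# A query for the broader concept also finds the more specific resource.
matches = registry.discover("OccupationalData")
```

The point of the sketch is that a query for a broad concept finds resources annotated with narrower ones, which keyword matching on metadata alone would miss.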
Data from heterogeneous sources
• Data abstraction can help with heterogeneity:
– Support for accessing data content without regard to detailed representation
– Metadata support is essential
• Extending current work using OGSA-DAI to deal with a wider variety of SS formats: e.g. SPSS, Stata
• A Grid service to give uniform access to underlying data
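As a rough illustration of uniform access, the sketch below hides a CSV source and a relational source behind one interface. It is not OGSA-DAI code: the `DataResource` interface, the class names and the sample rows are assumptions made for this example, using only the Python standard library.

```python
import csv
import io
import sqlite3
from typing import Dict, Iterator, Protocol


class DataResource(Protocol):
    """Hypothetical uniform interface: every resource yields rows as dicts."""

    def rows(self) -> Iterator[Dict[str, str]]: ...


class CsvResource:
    """Wraps CSV text; the header line supplies the column names."""

    def __init__(self, text: str) -> None:
        self._text = text

    def rows(self) -> Iterator[Dict[str, str]]:
        for row in csv.DictReader(io.StringIO(self._text)):
            yield dict(row)


class SqlResource:
    """Wraps one table of a relational database (table name is trusted here)."""

    def __init__(self, conn: sqlite3.Connection, table: str) -> None:
        self._conn = conn
        self._table = table

    def rows(self) -> Iterator[Dict[str, str]]:
        cur = self._conn.execute(f"SELECT * FROM {self._table}")
        names = [d[0] for d in cur.description]
        for rec in cur:
            yield {n: str(v) for n, v in zip(names, rec)}


def count_by(resource: DataResource, column: str) -> Dict[str, int]:
    """Analysis code written once, independent of the underlying format."""
    counts: Dict[str, int] = {}
    for row in resource.rows():
        counts[row[column]] = counts.get(row[column], 0) + 1
    return counts


# Made-up sample data in two different representations.
csv_res = CsvResource("person,occupation\n1,teacher\n2,nurse\n3,teacher\n")

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE people (person INTEGER, occupation TEXT)")
conn.executemany("INSERT INTO people VALUES (?, ?)", [(4, "nurse"), (5, "clerk")])
sql_res = SqlResource(conn, "people")

csv_counts = count_by(csv_res, "occupation")
sql_counts = count_by(sql_res, "occupation")
```

Supporting a further format such as SPSS or Stata would then mean adding one more wrapper class, with `count_by` left unchanged.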
Data from multiple sources
• Related data sources may need processing as if combined
– The sources may be distributed and heterogeneous
• DAMES will investigate "virtual fusion" techniques
– Leveraging data abstraction and effective metadata support
• Uniform query processing Grid services will be developed
– Related to DQP
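One way to picture "virtual fusion" is a query that runs over several sources as if they were one table, without copying them into a merged dataset. The sketch below is an assumption-laden illustration, not the DAMES or DQP design: the function names, the column mappings and the sample rows are hypothetical.

```python
from itertools import chain
from typing import Callable, Dict, Iterable, Iterator, List, Tuple

Row = Dict[str, str]


def mapped(rows: Iterable[Row], mapping: Dict[str, str]) -> Iterator[Row]:
    """Rename source-specific columns to shared names using a metadata mapping."""
    for row in rows:
        yield {common: row[local] for local, common in mapping.items()}


def fused_query(
    sources: List[Tuple[Iterable[Row], Dict[str, str]]],
    predicate: Callable[[Row], bool],
) -> List[Row]:
    """Filter across all sources lazily, as if they were a single table."""
    views = (mapped(rows, mapping) for rows, mapping in sources)
    return [row for row in chain.from_iterable(views) if predicate(row)]


# Two made-up sources that describe the same things with different column names.
survey_a = [{"id": "a1", "job": "teacher"}, {"id": "a2", "job": "nurse"}]
survey_b = [{"pid": "b1", "occ": "teacher"}]

sources = [
    (survey_a, {"id": "person", "job": "occupation"}),
    (survey_b, {"pid": "person", "occ": "occupation"}),
]

teachers = fused_query(sources, lambda r: r["occupation"] == "teacher")
```

The per-source mappings play the role of the metadata support mentioned above: they are what lets one predicate apply to sources with different layouts.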
Support for e-Social Science: Workflows
• This research will focus on adapting and extending workflow modelling approaches
– BPEL, ebXML, Taverna, WHIP
• Typical social science applications will be supported by workflows, e.g.
– Occupational analysis, census analysis, social care
• A visual design tool will be developed for defining new workflows in e-Social Science
– Integrated into the DAMES Portal
– With execution support
GEODE: Grid Enabled Occupational Data Environment
• Previous SS/CS collaboration at Stirling
• Occupational scheme linking is a common practice for researchers
• GEODE enables a virtual community of occupational information researchers
– Portal gateway for occupational information
– Data abstraction
– Uniform access to resources
– Occupational matching services
• Demonstrates viability of the DAMES approach
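Occupational scheme linking can be thought of as a lookup that translates codes from one classification into another. The sketch below shows only that general idea, not GEODE's matching services: the crosswalk, the codes and the `link` function are invented for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical crosswalk from one coding scheme to another (codes are made up).
CROSSWALK: Dict[str, str] = {"A10": "X1", "A20": "X2", "A21": "X2"}


def link(
    records: List[Tuple[str, str]], crosswalk: Dict[str, str]
) -> Tuple[List[Tuple[str, str]], List[str]]:
    """Translate each (person, source_code) record into the target scheme.

    Returns the linked records and the persons whose code had no match,
    so that unmatched cases are reported rather than silently dropped.
    """
    linked: List[Tuple[str, str]] = []
    unmatched: List[str] = []
    for person, code in records:
        target = crosswalk.get(code)
        if target is None:
            unmatched.append(person)
        else:
            linked.append((person, target))
    return linked, unmatched


records = [("p1", "A10"), ("p2", "A21"), ("p3", "A99")]
linked, unmatched = link(records, CROSSWALK)
```

Offered as a shared service, a lookup of this kind would save each researcher from maintaining a private copy of the translation table.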
GEODE prototype
• Windows environment
• Java
• GridSphere Portal Framework
• Globus Toolkit 4
– Index Service (Virtual Organization)
– OGSA-DAI WSRF (Data Access Middleware)
• Custom OGSA-DAI resources and activities
• Accesses CSV and relational data resources
Example: Grid Enabled Occupational Data Environment (GEODE)
Summary
• Distributed, disaggregated, heterogeneous data sources need:
– Metadata
– Semantically-based discovery
– Data abstraction/virtual fusion
– Specialised SS workflows
– Security (later in workshop)
• GEODE gives a springboard for GE*DE