©2016 Allotrope Foundation
Allotrope Foundation: Driving Metadata & Master Data Management through Improved Data Modeling with Semantic Technologies
Dana Vanderwall, Ph.D.
Director, Biology & Preclinical IT (BMS)
Vice Chair, Board of Directors (Allotrope)
Eric Little, Ph.D.
Vice President, Data Science (OSTHUS)
Adjunct Professor (NYU Polytechnic School of Engineering)
The Current Situation in the Lab
Many challenges exist for data to be captured, integrated and shared
• Data silos
• Incompatible instruments and software systems
• Legacy architectures are brittle and rigid
• SME knowledge resides in people’s heads
• Data schemas are not explicitly understood
• Lack of common vision between business units and scientists
How do we change that?
Data in Standard Format
Metadata in a Standard Vocabulary: Regulatory Guidance, Methods, Recipes, SOPs, …
Vendor-Specific Formats
Process, Material, Equipment, Result
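The move from vendor-specific formats to metadata in a standard vocabulary can be sketched as a simple key mapping. This is a minimal illustration only: the vendor names, field names, and "standard" terms below are hypothetical and are not taken from the Allotrope taxonomies.

```python
# Hypothetical mapping from vendor-specific metadata keys to shared terms.
# Vendor names, keys and standard terms are illustrative assumptions.
STANDARD_TERMS = {
    "vendor_a": {"SampleName": "sample identifier", "InstName": "equipment name"},
    "vendor_b": {"sample_id": "sample identifier", "device": "equipment name"},
}


def to_standard(vendor: str, record: dict) -> dict:
    """Rename the keys of one vendor record to the shared vocabulary."""
    mapping = STANDARD_TERMS[vendor]
    unknown = sorted(key for key in record if key not in mapping)
    if unknown:
        # Fail loudly rather than silently dropping unmapped metadata.
        raise KeyError(f"no standard term for {unknown} from {vendor}")
    return {mapping[key]: value for key, value in record.items()}


a = to_standard("vendor_a", {"SampleName": "S-001", "InstName": "HPLC-7"})
b = to_standard("vendor_b", {"sample_id": "S-001", "device": "HPLC-7"})
print(a == b)  # both records now use the same terms
```

Once both records use the same terms, they can be compared or merged without vendor-specific handling.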
Allotrope Foundation: Driving the Change
Member Companies
• Subject Matter Experts
• Project Funding
• AbbVie, Amgen, Baxter, Bayer, Biogen, Boehringer Ingelheim, Bristol-Myers Squibb, Eli Lilly, Genentech/Roche, GlaxoSmithKline, Merck & Co., Pfizer
Secretariat
• Project Management
• Legal & Logistical Support
Professional Software Firm
• Framework Development
• Technical Leadership
Partner Network
• Requirements & Specifications
• Contributions, PoC Applications
• ACD/Labs, Agilent, Biovia, Bruker, BSSN, IDBS, LabAnswer, LabVantage, LEAP Technologies, Mestrelab Research, Mettler Toledo, PerkinElmer, Persistent Systems, Riffyn, Sartorius, Shimadzu, Tetra Science, Thermo Scientific, Waters
• Erasmus Univ. Med Center, J. Paul Getty Trust, (UK) Science and Technology Facilities Council, University of Southampton, University of Strathclyde
Allotrope Data Format (ADF)
Allotrope Data Format
HDF5: Platform Independent File Format, specifically designed to store and organize large amounts of numerical data.
• Data Description (RDF Model). Contains:
  – Method, instrument, sample, process, result, etc.
  – Data cube metadata
  – Data package metadata
  – …
• Data Cubes (Universal data container): Analytical data represented by one- or multidimensional arrays.
• Data Package (Virtual file system *): Analytical data represented by arbitrary formats, incl. native instrument formats, images, PDF, video, etc.
APIs (Java & .NET class libraries)
v1.0 ADF, Taxonomies, Class Libraries released Sept 2015; v1.1 April 2016
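The three parts of the format (data description, data cubes, data package) can be pictured with a small in-memory stand-in. This is a sketch under stated assumptions: the class `AdfLikeFile`, its methods, and the example predicates are hypothetical, and it does not use or imitate the actual ADF class libraries or HDF5.

```python
from dataclasses import dataclass, field


@dataclass
class AdfLikeFile:
    """In-memory stand-in for the three parts named above (not the ADF API)."""

    description: set = field(default_factory=set)  # (subject, predicate, object) triples
    cubes: dict = field(default_factory=dict)      # cube name -> nested lists of numbers
    package: dict = field(default_factory=dict)    # virtual path -> raw bytes

    def describe(self, subject, predicate, obj):
        """Record one statement in the data description."""
        self.description.add((subject, predicate, obj))

    def about(self, subject):
        """Return every (predicate, object) pair recorded for a subject."""
        return {(p, o) for s, p, o in self.description if s == subject}


adf = AdfLikeFile()
adf.cubes["chromatogram"] = [[0.0, 0.1, 0.2], [1.5, 9.8, 2.1]]  # time, signal
adf.package["/raw/run1.vendor"] = b"\x00\x01"                    # native file kept as-is
adf.describe("run1", "used instrument", "HPLC-UV")
adf.describe("run1", "has data cube", "chromatogram")
adf.describe("run1", "has native file", "/raw/run1.vendor")
print(sorted(adf.about("run1")))
```

The point of the layout is that the description links the numerical arrays and the native files, so one file answers both "what was measured" and "how".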
Moving from Data Format to Semantics
• Semantics has its origins in philosophy and is generally understood as the abstract study of meaning
• It is distinguished from syntax, which is the rules-based grammar of a language
“Washington”
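A single label such as “Washington” is syntactically one string but can carry several meanings. The sketch below assumes the slide uses the word as an example of that ambiguity; the identifiers (`ex:...`) and hint words are made up for illustration.

```python
# One label, several candidate meanings. Identifiers and hints are hypothetical.
CANDIDATES = {
    "Washington": [
        {"id": "ex:WashingtonState", "hints": {"state", "seattle"}},
        {"id": "ex:WashingtonDC", "hints": {"capital", "district"}},
        {"id": "ex:GeorgeWashington", "hints": {"president", "general"}},
    ]
}


def resolve(label, context):
    """Pick the candidate whose hint words overlap most with the context words."""
    words = set(context.lower().split())
    scored = [(len(c["hints"] & words), c["id"]) for c in CANDIDATES[label]]
    best_score, best_id = max(scored)
    # Without any overlapping hint the label stays ambiguous.
    return best_id if best_score > 0 else None


print(resolve("Washington", "the first president of the United States"))
```

The string never changes; only the surrounding context decides which concept is meant.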
Allotrope Foundation Taxonomies (AFT)
Allotrope Taxonomies Standardize our Metadata
Result, Process, Equipment
Utilizing the Semantic Spectrum (Moving Beyond Taxonomies)
• Code Lists, Terms (Soil, Plant, etc.)
• Controlled Vocabulary (Agreed Upon Terms)
• Taxonomy (Hierarchy)
• Thesaurus (Preferred Labels, Synonyms, etc.)
• RDF Models (Triples as Graphs)
• OWL Ontologies (RDF + Axioms)
• Reasoning (Rule-based Logics: Discover New Patterns)
Ontologies and Reasoning add Axioms and Advanced Logic
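The steps from taxonomy to triples to reasoning can be sketched in a few lines. This is a plain-Python illustration, not an RDF or OWL toolkit: the terms are invented rather than taken from AFT, and the two rules are a simplified version of subclass inference.

```python
# Taxonomy as child -> parent. Terms are illustrative, not taken from AFT.
TAXONOMY = {"HPLC-UV": "HPLC", "HPLC": "Chromatography", "Chromatography": "Technique"}

# The same hierarchy written as RDF-style (subject, predicate, object) triples.
triples = {(child, "subClassOf", parent) for child, parent in TAXONOMY.items()}
triples.add(("run1", "type", "HPLC-UV"))


def infer(facts):
    """Apply two rules repeatedly until no new triple is derived."""
    facts = set(facts)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                if p2 == "subClassOf" and o == s2:
                    if p == "subClassOf":   # A sub B and B sub C  =>  A sub C
                        new.add((s, "subClassOf", o2))
                    elif p == "type":       # x type A and A sub B  =>  x type B
                        new.add((s, "type", o2))
        if new <= facts:
            return facts
        facts |= new


inferred = infer(triples)
print(("run1", "type", "Chromatography") in inferred)
```

The stated data only says that `run1` is an HPLC-UV run; the rules derive that it is also a chromatography run, which is the kind of new pattern the reasoning step is meant to surface.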
Understanding the 4V’s of Big Data
Normally the focus: Big Data analysis is more than just size
Performance is critical to success
Data complexity is increasing: model complexity
Uncertainty abounds: requires statistics and probabilities
Majority of Big Data analytics approaches treat these two V’s
Semantic technologies provide clear advantages
Mathematical clustering techniques provide clear advantages
Why Semantics Matters for Data Analytics
Big Data approaches require proper metadata and terminologies to integrate information well
Relationships matter in the data
Understanding perspective (context) is crucial for success in today’s world
Semantics provides better data models/schemas
The Foundation for Real Data Analytics on the Laboratory Workflow and Data
Workflow steps: Plan Analysis; Prepare Samples; Submit Samples; Control Inst., Acquire Data; Process Data; Analyze Data; Report Results; Store, Archive Data; Request Report; Search & Reuse Data
Data along the workflow: Analytical Method, Sample Prep Data, Instrument Instructions, Instrument Data, Processed Data, Analyzed Data, Reported Results, Stored Data
Allotrope Data Format:
• Data Description (RDF Model)
• Data Cubes (Universal data container)
• Data Package (Virtual file system)
How is the Framework Being Used? Implementation by Member Companies
Column headers: Member; Research, Development, Commercial; non-GMP, GMP; Instrument
BMS
Bayer
Baxter
Merck & Co.
Amgen
Boehringer-Ingelheim
GSK: Drug Substance Release & Stability
Structure ID, Purification, In vitro bioanalysis
Method Screening: HPLC-UV/MS
HPLC-UV, Balance
HPLC-UV/MS
Structure ID: HPLC-MS
Fermentation, Process Control: Bioanalyzer
Small and Large Molecule CMC
Genentech
Elemental Impurities: ICP-MS
Assay, Purity: HPLC-UV
Biogen: CRO Integration, HPLC-UV
Pfizer: LC Data to ADF Converter/Adapter, HPLC-UV
pH, Weighing, GC, Karl Fischer, TGA, NMR, Cell Density/Viability, Blood Gas Analyzer, Cell Culture Analyzer, Capillary Electrophoresis…
How is the Framework Being Used? Implementations by Member Companies
Column headers: Member; Research, Development, Commercial; non-GMP, GMP; Instrument
Member 6
Member 3
Member 2
Member 9
Member 1
Member 5
Member 8: Drug Substance Release & Stability
Structure ID, Purification, In vitro bioanalysis
Method Screening: HPLC-UV/MS
HPLC-UV, Balance
HPLC-UV/MS
Structure ID: HPLC-MS
Fermentation, Process Control: Bioanalyzer
Small and Large Molecule CMC: Multiple types
Member 7
Elemental Impurities: ICP-MS
Assay, Purity: HPLC-UV
pH, Weighing, GC, Karl Fischer, TGA, NMR, Cell Density/Viability, Blood Gas Analyzer, Cell Culture Analyzer, Capillary Electrophoresis…
Member 4: CRO Integration, HPLC-UV
Member 10: LC Data to ADF Converter/Adapter, HPLC-UV
Research, Development, Commercial
Member 6
Member 9
Member 8: Drug Substance Release & Stability
Structure ID, Purification, In vitro bioanalysis
Method Screening: HPLC-UV/MS
HPLC-UV, Balance
HPLC-UV/MS
Member 10: LC Data to ADF Converter/Adapter, HPLC-UV
Member 1: Small and Large Molecule CMC, Multiple types
taxonomies, methods repository, data repository adapter, instrument adapter
Smart Labs for the 21st Century
Smart labs in the future will provide the enterprise with:
• Integrated Data – common reference data structures (vocabularies)
• Sharable Data – easier interaction across teams and business units
• Scalability – Big data applications that can be highly elastic
• Conceptual Representations – context and perspective are captured
• Advanced Analytics – complex & automated problem-solving capabilities
Thank you!
• For any questions, please contact the Secretariat at [email protected] or [email protected]
• 2016 Workshops
  – January 20, 2016: San Francisco, CA @ Genentech
  – June 2016: Ingelheim, Germany @ Boehringer Ingelheim
  – September 2016: Indianapolis, IN @ Eli Lilly and Co.
Visit http://www.allotrope.org for more information and to register