eNanoMapper data model

Dr. Nina Jeliazkova
Ideaconsult Ltd., Sofia, Bulgaria
https://www.ideaconsult.net/

NanoInformaTIX receives funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 814426


Introduction

Nina Jeliazkova
• https://orcid.org/0000-0002-4322-6179
• Ideaconsult Ltd. Sofia, Bulgaria
• We are developing (mostly open source) tools for data management and modeling for
  • Chemical substances (safety, pharma, etc.)
    • FP7 OpenTox, FP7 CADASTER, FP7 ToxBank, H2020 FET ExCAPE, LIFE Concert REACH, AMBIT LRI toolbox (CEFIC), Toxtree, a number of industry projects
  • Nanomaterial safety (nanomaterials are chemical substances!)
    • FP7 eNanoMapper, H2020 NanoReg2, caLIBRAte, GRACIOUS, NanoInformaTIX, Gov4Nano, RiskGone, European observatory of nanomaterials (EUON)
• Will talk about the eNanoMapper data model


The eNanoMapper data model

• Has concepts and relationships
• Is not a taxonomy
  • But can use (multiple) existing taxonomies to annotate entities
• Is not an ontology
  • But can use (multiple) existing ontologies to annotate entities
  • The data entries based on the eNanoMapper model can be converted to different ontologies. Examples:
    • RDF serialization using BioAssay ontology classes and properties (relationships)
    • https://isa-tools.org/ data model, which itself has RDF serialization using several ontologies
• It could be converted to and from different data models and formats
  • Different representations are appropriate for UI, data capture from instruments, big data integration and analysis, modelling, AI, etc.


2. Application domain: data about chemical substances


• Yes, Excel files, plenty of them (majority of NanoSafety Cluster data)
• OECD Harmonized templates / IUCLID (mandatory for REACH dossiers)
• CODATA VAMAS Uniform Description System
• Don’t forget ISO standards, this is what industry uses
• BioAssay ontology
• https://isa-tools.org/


3. Intended purpose: Organising the nanosafety data

• Challenges
  • Diverse data sources
  • Diverse data input formats
  • Different data organization
  • Diverse modelling tools
• Approach:
  • Enable mappings!
    • i.e. eNanoMapper

• Physico-chemical identity: Different analytic techniques, manufacturing conditions, batch effects, mixtures, impurities, size distribution, differences in the amount of surface modification, etc.

• Biological identity: Wide variety of measurements, toxicity pathways, effects of ENM coronas, modes-of-action, interactions (cell lines, assays).

• Support for data analysis: Requires “spreadsheet” or matrix view of data. The experimental data in the public datasets is usually not in a form appropriate for modelling (merging multiple values, conditions, similar experiments into matrix form is a challenge).
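
The matrix view mentioned in the last bullet can be sketched in plain Python. This is only an illustration under stated assumptions: the record layout (`material`, `endpoint`, `value`), the sample values and the choice to average replicate values are hypothetical and are not taken from the eNanoMapper implementation.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical long-form records: one row per measured value.
# Field names and numbers are illustrative, not real data.
records = [
    {"material": "M1", "endpoint": "size_nm", "value": 20.0},
    {"material": "M1", "endpoint": "size_nm", "value": 22.0},
    {"material": "M1", "endpoint": "zeta_mV", "value": -30.0},
    {"material": "M2", "endpoint": "size_nm", "value": 50.0},
]


def to_matrix(rows):
    """Pivot long-form rows into (row labels, column labels, matrix).

    Replicate values for the same (material, endpoint) pair are averaged
    (an assumption; other merge rules are possible). Missing combinations
    are filled with None.
    """
    cells = defaultdict(list)
    for row in rows:
        cells[(row["material"], row["endpoint"])].append(row["value"])
    materials = sorted({m for m, _ in cells})
    endpoints = sorted({e for _, e in cells})
    matrix = [
        [mean(cells[(m, e)]) if (m, e) in cells else None for e in endpoints]
        for m in materials
    ]
    return materials, endpoints, matrix


materials, endpoints, matrix = to_matrix(records)
```

Each row of `matrix` corresponds to one material and each column to one endpoint. The merge rule is where most of the difficulty sits in practice, since measurements made under different conditions cannot always be averaged.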


4. How do you represent the world

• (as a continuum? as discrete particles? with quantum mechanics?)
  • Irrelevant question. Depends on the use case and data available
• Material representation
  • A material is represented as a REST resource.
  • A REST resource may have many different representations
  • Material is composed of components with specified role (multiple serialisations)
  • A component is represented as e.g. chemical structure, including, but not limited to, name, SMILES, connection table, crystal structure format, any digital representation deemed important (multiple, requested by MIME type)
• Data about material
  • A measurement is the result of applying a (measurement) protocol to a material sample.
  • Simulation data is the result of applying an (in-silico) protocol to a digital representation of a material
  • Again a REST resource with multiple serializations (including semantic)
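
The idea of one resource with several representations selected by MIME type can be sketched as follows. This is a minimal, self-contained illustration and not the eNanoMapper API: the `material` dictionary, the two serializers and the `represent` function are hypothetical, and the Accept-style parsing deliberately ignores quality values.

```python
import json

# Hypothetical material with two components; names and roles are illustrative.
material = {
    "name": "example material",
    "components": [
        {"role": "core", "name": "titanium dioxide"},
        {"role": "coating", "name": "stearic acid"},
    ],
}


def as_json(m):
    """Machine-readable representation."""
    return json.dumps(m, sort_keys=True)


def as_text(m):
    """Human-readable one-line representation."""
    parts = ", ".join(f"{c['role']}: {c['name']}" for c in m["components"])
    return f"{m['name']} ({parts})"


# One resource, several representations, keyed by MIME type.
SERIALIZERS = {"application/json": as_json, "text/plain": as_text}


def represent(m, accept):
    """Return (mime_type, body) for the first supported type in `accept`.

    `accept` is a comma-separated list like an HTTP Accept header;
    parameters such as ";q=0.9" are stripped and otherwise ignored.
    """
    for mime in (part.split(";")[0].strip() for part in accept.split(",")):
        if mime in SERIALIZERS:
            return mime, SERIALIZERS[mime](m)
    raise ValueError(f"no supported representation for {accept!r}")


mime, body = represent(material, "application/rdf+xml, application/json;q=0.9")
```

Here the first requested type is not supported by this toy registry, so the JSON representation is returned. Adding a semantic serialization would mean registering one more entry in `SERIALIZERS`.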


5/8. Concepts and relations

• Concepts:
  • Substance/Material
    • Nanomaterials are chemical substances
    • Composition, components, structure, structure properties
  • Measurements
    • Protocol applications (protocol, protocol parameters, factors)
    • Results (what is measured and what is the result)
• Relationships
  • Examples: material components (part of, role)
  • A measurement is the result of applying a protocol to a material.
  • The protocols have attributes (e.g. instrument, cell model)
  • The outcomes of a measurement are data values (numeric, scalar, vector, categorical, text, etc.).
  • The data entities can be related to each other (e.g. IC50 is defined by dose response). Measurements can be related to each other as well.
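
The concepts and relationships above can be written down as a small set of Python dataclasses. This is a sketch for orientation only: the class and field names are assumptions chosen to mirror the slide wording, not the schema used by eNanoMapper, and the example values are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Union


@dataclass
class Component:
    name: str
    role: str  # relationship to the substance, e.g. "core" or "coating"


@dataclass
class Substance:
    name: str
    components: list[Component] = field(default_factory=list)


@dataclass
class Result:
    endpoint: str  # what is measured
    value: Union[float, str, list[float]]  # numeric, categorical/text, or vector
    unit: str = ""


@dataclass
class ProtocolApplication:
    """A protocol applied to a substance, with its parameters and results."""

    substance: Substance
    protocol: str
    parameters: dict[str, str] = field(default_factory=dict)
    results: list[Result] = field(default_factory=list)


# Hypothetical example: names and numbers are illustrative, not real data.
nm = Substance(
    "example nanomaterial",
    [Component("titanium dioxide", "core"), Component("stearic acid", "coating")],
)
measurement = ProtocolApplication(
    substance=nm,
    protocol="dynamic light scattering",
    parameters={"instrument": "hypothetical instrument"},
    results=[Result("hydrodynamic diameter", 120.0, "nm")],
)
```

The measurement does not live on the substance itself. It is a separate object that links a substance, a protocol and its results, which follows the statement that a measurement is the result of applying a protocol to a material.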


6. Industrial use cases

• AMBIT LRI tool http://ambitlri.ideaconsult.net (REACH dossiers of chemical substances, same data model)

• EUON https://euon.echa.europa.eu/enanomapper

• Largest integration of nanosafety data https://search.data.enanomapper.net


7. Overlap with other taxonomy and ontologies

• We use (multiple!) existing taxonomies and ontologies to annotate data entries

• Different data models, standards (ISO, OECD HT, etc.), data integration, different tools, synonym search (via query expansion)
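
Synonym search via query expansion can be illustrated with a few lines of Python. This is a hedged sketch, not the search implementation behind eNanoMapper: the synonym groups are hard-coded here (the two pairs are the density terms equated in the properties slide of this deck), whereas a real system would presumably derive them from ontology annotations. The function names and sample documents are hypothetical.

```python
# Hypothetical synonym groups; hard-coded for illustration only.
SYNONYM_GROUPS = [
    {"apparent particle density", "skeletal density"},
    {"true density", "theoretical density"},
]


def expand_query(term):
    """Return the normalized term plus all its known synonyms, sorted."""
    key = term.strip().lower()
    expanded = {key}
    for group in SYNONYM_GROUPS:
        if key in group:
            expanded |= group
    return sorted(expanded)


def search(term, documents):
    """Return documents containing the term or any of its synonyms."""
    terms = expand_query(term)
    return [d for d in documents if any(t in d.lower() for t in terms)]


# Illustrative documents, not real dataset entries.
docs = [
    "Skeletal density measured by He-pycnometry",
    "Theoretical density from crystal structure",
    "Aspect ratio from electron microscopy",
]
hits = search("apparent particle density", docs)
```

A query for one term also finds entries annotated with the other name, without the user having to know which vocabulary a given dataset used.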


Representation

• 9. What is the knowledge your specific ontology represents?
  • Knowledge necessary for a pragmatic description of current practices. Flexible model based on integration of ideas from several approaches of representing data on chemical substances
• 10. How does your ontology represent the relations between different granularity views on the same object?
  • Different REST representations, if needed, denoted by MIME type
• 11. How does your ontology represent materials?
  • See previous slides
• 15. What is the representation language and implementation?
  • REST resources with multiple representations (JSON, JSON-LD, RDF, other domain specific formats)


Representation

• 12. What type of processes do you address? How does your ontology represent these processes?
  • Annotation with external ontologies/taxonomies
• 13. How does your ontology represent manufacturing?
  • Annotation with external ontologies/taxonomies
• 14. How does your ontology address the circular connection between physical properties, materials models (see definition in RoMM Review of Materials Modelling VI) and measurement?
  • We represent models and measurements as application of a protocol (computational or experimental) to a material. The protocol is defined as a procedure to assess a physical property of the material (or approximation of it).


14. Properties and measurements

• Properties definitions
  • may differ, see table on the right
• Of interest in nanosafety:
  • Properties that can be measured
  • Properties relevant (e.g. for transport and fate)
• Examples
  • “On powders, He-pycnometry is the most appropriate method for powders (standardized, available, highly reproducible), and it measures the mass of the particle divided by its apparent volume = Apparent particle density = Skeletal density. This is relevant for transport and fate by aerosol and in suspension.”
  • “In contrast, true density = theoretical density is less relevant for nanosafety purposes, because the biological processes do not break up closed pores within the particles. Further, it requires sophisticated methods.”
  • More examples in the extra slides


Thank you!

Discussions


14. Properties and measurements. Shape


14. Properties and measurements. Aspect ratio

[Figure panels: ISO; eNanoMapper ontology]


14. Properties and measurements. Feret diameter

[Figure panels: ISO; eNanoMapper ontology]


Composition

[Figure labels: Coating; Core]