Dealing with software: the research data issues
http://dx.doi.org/10.6084/m9.figshare.1150298

26 August 2014, Dealing with Data Conference
Neil Chue Hong (@npch), Software Sustainability Institute (www.software.ac.uk)
ORCID: 0000-0002-8876-7606

[Logos not captured: "Where indicated slides licensed under", "Supported by", "Project funding from"]


Page 1: Dealing with software: the research data issues

Page 2: “Re-” is the new black

Page 3: The Research Cycle

[Figure: cycle of Create, Test, Interpret, Publish, Revise. Research outputs: Paper, Data, Software.]

Research is a continuous cycle. When we publish we are contributing to the body of knowledge.

Page 4: Research/Reuse/Reward Cycle

[Figure: two linked cycles. Research: Create, Test, Interpret, Publish, Revise. Reuse: Index, Identify, Cite, Reward.]

Reuse is also a cycle. We build our research on the work of others.

Reward mechanisms should encourage reuse.

Page 5: The current process

[Figure: process diagram with steps: Start research; Write software; Use software; Produce results; Publish research paper (which mentions software and data); Release data; Release software.]

This process is simple but does not reward production or reuse of good software and data.

It also has a long contribution cycle.

Page 6: “Re-”positories

Backup | Sharing | Archiving of software

Page 7: Differing roles, different repositories

Backup, sharing, archiving

Timescales, Policy, Licensing

Ingest, Metadata, Assurance

Page 8: Versioning

[Figure: version history with personal versions (v1, v2, v2a, v3, v3a) and public versions (v1, v2, v3).]

Why do we version?
- To indicate a change
- To allow sharing
- To confer special status

Version control systems make this easy, and the concepts of a person and an output are there but not unique.

Page 9: Granularity

[Figure: levels of granularity: Algorithm, Function, Program, Library / Suite / Package.]

What do we define?
- Useful units of reuse

Page 10: Boundary

What do we choose to identify:
- Workflow?
- Software that runs workflow?
- Software referenced by workflow?
- Software dependencies?

What’s the minimum citable part?

Page 11: Authorship

• Which authors have had what impact on each version of the software?
• Who had the largest contribution to the scientific results in a paper?
• Can micro-attribution work? Can track author, but not contribution?

http://beyond-impact.org/?p=175

[Figure: OGSA-DAI project statistics from Ohloh]

Why do we identify?
- To measure
- To restrict
- To communicate
- To include
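The gap between tracking an author and measuring a contribution can be made concrete with a small sketch. The commit records, author names, and the `lines_by_author` helper below are hypothetical and not taken from the OGSA-DAI statistics; the sketch only shows that per-version tallies are easy to produce, while saying nothing about scientific impact.

```python
from collections import Counter, defaultdict

# Hypothetical commit records: (version tag, author, lines changed).
commits = [
    ("v1", "alice", 120),
    ("v1", "bob", 30),
    ("v2", "alice", 10),
    ("v2", "carol", 200),
]


def lines_by_author(records):
    """Sum lines changed per author within each version."""
    totals = defaultdict(Counter)
    for version, author, lines in records:
        totals[version][author] += lines
    return {version: dict(counts) for version, counts in totals.items()}


summary = lines_by_author(commits)
```

A tally like this records activity per author per version. It is a proxy at best: lines changed do not say who contributed most to the results in a paper.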

Page 12: Code as a Research Object

• What if you could assign DOIs to code easily?

• Could we make software more reusable?
• http://mozillascience.org/code-as-a-research-object-a-new-project/
• https://guides.github.com/activities/citable-code/

Page 13: A better process?

[Figure: process diagram with steps: Start research; Identify existing software; Write software; Adapt/extend software; Use software; Produce results; Publish research paper (which references software and data papers); Release data; Release software; Publish software paper; Publish data paper.]

Software and data papers are needed as proxies for rewarding reuse, but this process enables a shorter contribution cycle for data and software.

Page 14: Alternative Metrics

Page 15: One-click challenge

• “One-click” archiving of a significant version of software in a code repository to a suitable institutional repository

• “Suitable” repository:
  - Clear access / deposit / preservation policy
  - Adherence to standards
  - Ability to easily “transfer” in / out
  - Allows use of appropriate licenses for code
  - Sustainability of hosting organisation
  - Ability to monitor, check integrity
  - Provides permanent unique identifiers

• Proposing a hackday to make this happen
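One of the criteria for a suitable repository, the ability to monitor and check integrity, could be sketched as a checksum manifest recorded at deposit time and re-checked later. This is a minimal illustration using only the Python standard library. The function names `build_manifest` and `changed_files` and the idea of a per-release directory are assumptions for the example, not part of any repository's actual API.

```python
from __future__ import annotations

import hashlib
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(release_dir: Path) -> dict[str, str]:
    """Map each file's relative path to its checksum (hypothetical deposit manifest)."""
    return {
        path.relative_to(release_dir).as_posix(): sha256_of(path)
        for path in sorted(release_dir.rglob("*"))
        if path.is_file()
    }


def changed_files(release_dir: Path, manifest: dict[str, str]) -> list[str]:
    """List manifest entries that are now missing or whose checksum differs."""
    current = build_manifest(release_dir)
    return sorted(
        name for name, digest in manifest.items() if current.get(name) != digest
    )
```

A deposit tool might store the manifest alongside the archived version and run `changed_files` periodically; an empty list would indicate that the deposited files are unchanged.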

Page 16: Summary

• Software is an important output of the research cycle, and should be rewarded

• Repositories play an important role in the research cycle, including software

• But software has specific issues with regard to research data management

• Tooling is needed to lower barriers to deposit

Page 17: Further information

• This presentation:
  Slides: http://dx.doi.org/10.6084/m9.figshare.1150298
  Abstract: http://dx.doi.org/10.6084/m9.figshare.1150299

• Where does it go from here: the place of software in digital repositories
  http://www.research.ed.ac.uk/portal/en/publications/where-does-it-go-from-here-the-place-of-software-in-digital-repositories(ab6130c6-aee6-4972-9256-8ea0eb1862c9).html

• Software Papers: improving the reusability and sustainability of scientific software
  http://dx.doi.org/10.6084/m9.figshare.795303

• Software Sustainability Institute
  http://www.software.ac.uk/

Supported by EPSRC Grant EP/H043160/1