Dealing with software: the research data issues. http://dx.doi.org/10.6084/m9.figshare.1150298. 26 August 2014, Dealing with Data Conference. Neil Chue Hong (@npch), Software Sustainability Institute. ORCID: 0000-0002-8876-7606 | [email protected]
Software Sustainability Institute
www.software.ac.uk
Dealing with software: the research data issues
http://dx.doi.org/10.6084/m9.figshare.1150298
26 August 2014, Dealing with Data Conference
Neil Chue Hong (@npch), Software Sustainability Institute
ORCID: 0000-0002-8876-7606 | [email protected]
“Re-” is the new black
The Research Cycle
[Figure: cycle of Create, Test, Interpret, Publish, Revise; research outputs: Paper, Data, Software]
Research is a continuous cycle.
When we publish we are contributing to the body of knowledge.
Research/Reuse/Reward Cycle
[Figure: Research cycle of Create, Test, Interpret, Publish, Revise, linked to a Reuse cycle of Index, Identify, Cite, Reward]
Reuse is also a cycle. We build our research on the work of others.
Reward mechanisms should encourage reuse.
The current process
[Figure: process steps: Start research, Write software, Use software, Produce results, Publish research paper (which mentions software and data), Release data, Release software]
This process is simple but does not reward production or reuse of good software and data.
It also has a long contribution cycle.
“Re-”positories
Backup | Sharing | Archiving of software
Differing roles, different repositories
Backup | Sharing | Archiving
Timescales | Policy | Licensing
Ingest | Metadata | Assurance
Versioning
[Figure: version history with personal versions (v1, v2, v2a, v3, v3a) and public versions (v1, v2, v3)]
Why do we version?
- To indicate a change
- To allow sharing
- To confer special status
Version control systems make this easy, and the concepts of a person and an output are there but are not unique.
Granularity
[Figure: levels of granularity: Algorithm, Function, Program, Library / Suite / Package, …]
What do we define?
- Useful units of reuse
Boundary
What do we choose to identify?
- Workflow?
- Software that runs workflow?
- Software referenced by workflow?
- Software dependencies?
What’s the minimum citable part?
Authorship
• Which authors have had what impact on each version of the software?
• Who had the largest contribution to the scientific results in a paper?
• Can micro-attribution work? Can track author, but not contribution?
http://beyond-impact.org/?p=175
[Figure: OGSA-DAI project statistics from Ohloh]
Why do we identify?
- To measure
- To restrict
- To communicate
- To include
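The gap between tracking authors and tracking contribution can be illustrated with a minimal Python sketch. It assumes a hypothetical list of commit records (author, lines changed); the names and numbers are invented for illustration and are not the OGSA-DAI statistics. The two simple measures rank the authors differently, which is one reason neither is a safe proxy for scientific contribution.

```python
from collections import Counter

# Hypothetical commit records: (author, lines changed). Invented data for
# illustration only; not taken from any real project.
COMMITS = [
    ("alice", 10),
    ("alice", 5),
    ("alice", 8),
    ("bob", 400),
    ("carol", 60),
    ("carol", 40),
]


def shares(commits, by_lines):
    """Return each author's fraction of the total, by commits or by lines."""
    totals = Counter()
    for author, lines in commits:
        totals[author] += lines if by_lines else 1
    grand_total = sum(totals.values())
    return {author: count / grand_total for author, count in totals.items()}


by_commits = shares(COMMITS, by_lines=False)
by_lines = shares(COMMITS, by_lines=True)

# The "top contributor" depends entirely on which measure is chosen.
top_by_commits = max(by_commits, key=by_commits.get)
top_by_lines = max(by_lines, key=by_lines.get)
print(top_by_commits, top_by_lines)
```

Here the author with the most commits is not the author with the most changed lines, and neither figure says who shaped the scientific result.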
Code as a Research Object
• What if you could assign DOIs to code easily?
• Could we make software more reusable?
• http://mozillascience.org/code-as-a-research-object-a-new-project/
• https://guides.github.com/activities/citable-code/
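As a sketch of what a DOI on code makes possible, the Python below formats a citation that pins a specific version of a piece of software. The `SoftwareRelease` fields and the citation layout are assumptions made for illustration, not a standard, and the example release and its DOI are placeholders rather than a real record.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SoftwareRelease:
    """Minimal metadata for one citable version (fields are an assumption)."""
    authors: tuple
    title: str
    version: str
    year: int
    doi: str


def doi_url(doi):
    """Turn a bare DOI into a resolvable URL via the doi.org proxy."""
    return "https://doi.org/" + doi


def cite(release):
    """Format a plain-text citation that pins both the version and the DOI."""
    names = ", ".join(release.authors)
    return (f"{names} ({release.year}). {release.title} "
            f"(version {release.version}). {doi_url(release.doi)}")


# Hypothetical release; the DOI below is a placeholder, not a real record.
example = SoftwareRelease(
    authors=("A. Researcher", "B. Developer"),
    title="example-analysis-tool",
    version="1.2.0",
    year=2014,
    doi="10.5555/example.12345",
)
print(cite(example))
```

Keeping the version in the citation matters because a DOI assigned to one release should not be read as covering later changes to the code.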
A better process?
[Figure: process steps: Start research, Identify existing software, Write software, Adapt/extend software, Use software, Produce results, Publish research paper (which references software and data papers), Release data, Release software, Publish software paper, Publish data paper]
Software and data papers are needed as proxies for rewarding reuse.
But this process enables a shorter contribution cycle for data and software.
Alternative Metrics
One-click challenge
• “One-click” archiving of a significant version of software in a code repository to a suitable institutional repository
• “Suitable” repository:
- Clear access / deposit / preservation policy
- Adherence to standards
- Ability to easily “transfer” in / out
- Allows use of appropriate licenses for code
- Sustainability of hosting organisation
- Ability to monitor, check integrity
- Provides permanent unique identifiers
• Proposing a hackday to make this happen
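One part of the challenge, packaging a significant version and recording what a repository would need in order to check integrity, can be sketched in Python using only the standard library. The function name `package_for_deposit`, the manifest fields, and the file layout are hypothetical; a real deposit would also go through the target repository's own ingest interface, which is not shown here.

```python
import hashlib
import json
import tarfile
import tempfile
from pathlib import Path


def package_for_deposit(source_dir, version, out_dir):
    """Archive source_dir and write a manifest holding its SHA-256 digest.

    The manifest layout is an assumption for illustration, not a standard.
    """
    source_dir, out_dir = Path(source_dir), Path(out_dir)
    archive = out_dir / f"{source_dir.name}-{version}.tar.gz"
    with tarfile.open(archive, "w:gz") as tar:
        tar.add(source_dir, arcname=source_dir.name)
    digest = hashlib.sha256(archive.read_bytes()).hexdigest()
    manifest = {"archive": archive.name, "version": version, "sha256": digest}
    manifest_path = out_dir / (archive.name + ".manifest.json")
    manifest_path.write_text(json.dumps(manifest, indent=2))
    return archive, manifest_path


def verify(archive, manifest_path):
    """Recompute the digest and compare it with the recorded one."""
    recorded = json.loads(Path(manifest_path).read_text())["sha256"]
    return hashlib.sha256(Path(archive).read_bytes()).hexdigest() == recorded


# Demonstration on a throwaway directory with one hypothetical source file.
with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "mycode"
    src.mkdir()
    (src / "analysis.py").write_text("print('hello')\n")
    out = Path(tmp) / "deposit"
    out.mkdir()
    archive_path, manifest_file = package_for_deposit(src, "v1.0", out)
    intact = verify(archive_path, manifest_file)
    archive_name = archive_path.name
print(archive_name, intact)
```

The checksum covers only the "check integrity" criterion; policy, licensing, and permanent identifiers remain properties of the repository itself.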
Summary
• Software is an important output of the research cycle, and should be rewarded
• Repositories play an important role in the research cycle, including for software
• But software has specific issues with regard to research data management
• Tooling is needed to lower barriers to deposit
Further information
• This presentation:
- Slides: http://dx.doi.org/10.6084/m9.figshare.1150298
- Abstract: http://dx.doi.org/10.6084/m9.figshare.1150299
• Where does it go from here: the place of software in digital repositories http://www.research.ed.ac.uk/portal/en/publications/where-does-it-go-from-here-the-place-of-software-in-digital-repositories(ab6130c6-aee6-4972-9256-8ea0eb1862c9).html
• Software Papers: improving the reusability and sustainability of scientific software http://dx.doi.org/10.6084/m9.figshare.795303
• Software Sustainability Institute http://www.software.ac.uk/
Supported by EPSRC Grant EP/H043160/1