Wikis at work
A short introduction to 'wikis' and wikis for biology
Dan Bolser and Paolo Romano
URL for slides: http://
Overview of today's course
Session 1 (35 min)
What is a wiki? – A gentle introduction to the wiki concept, and a look at Wikipedia.
Session 2 (35 min)
Biological wikis! – A review of some of the most important wikis for biology (BioWikis).
BREAK (15 min)
Session 3 (35 min)
Semantic MediaWiki – Software for 'data wikis'.
Cell Lines Wiki – A pilot scientific wiki.
Practical session (60 min)
Editing Wikipedia (for the first time?)
Working with Cell Lines Wiki – A gentle introduction to some of the key features.
Session 1
What is a wiki?
Wiki means QUICK!
Wiki is a quicker way to let people put content on the web
Old way:
Host
HTML editor
FTP software
Domain name
Wiki way:
Wiki means QUICK!
There is no longer one single 'point of control' for managing web content.
Content is managed by a decentralized community of participating users.
Wiki is radically different!
Is this good or bad?
Other advantages of wiki
There are many other advantages over 'traditional' web publishing...
Notification of changes
History of changes
Discussion of changes
The rise of the wiki
Condensed history
1994: Cunningham coined the term 'wiki' for a site (for software developers) with pages that can be edited via the browser, each with a page history.
Over the next five years, it spawned alternative wiki applications and websites (wiki culture).
By 2000, it had developed lots of spin-off content, most notably MeatballWiki (for general discussion).
2001: Wikipedia launched.
2007: Wikipedia is in the world's top 10 websites.
Wikipedia
Size: Stats.
Growth: Research.
Rules: e.g. Deleting content.
Culture: e.g. Anatomy of a talk page.
Almost all of its articles can be edited by anyone with access to the site, and it has about 90,000 regularly active contributors.
Wikipedia
Size
19.7 million articles
3.7 million in English
847,069 in Italian (4th out of 282)
2.7 billion monthly hits from the US alone.
7th most popular site in the world.
The largest and most popular general reference work.
Wikipedia
Growth
Wikipedia has been growing exponentially since 2002 (Voss).
“Wisdom of crowds” or “elite users”? (Kittur)
A large number of people with a small number of edits, or
a core group who do most of the work?
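One way to make the “crowd or elite” question concrete is to ask what share of all edits comes from the most active users. The sketch below is only an illustration: the function name `top_share` and the edit counts are hypothetical toy values, not Wikipedia data or the method used by Kittur et al.

```python
def top_share(edits_by_user, top_fraction=0.1):
    """Return the fraction of all edits made by the most active users."""
    # Sort per-user edit counts from most to least active.
    counts = sorted(edits_by_user.values(), reverse=True)
    if not counts:
        return 0.0
    # Always include at least one user in the "top" group.
    n_top = max(1, round(len(counts) * top_fraction))
    return sum(counts[:n_top]) / sum(counts)


# Hypothetical toy data: user -> number of edits (NOT real Wikipedia figures).
toy_edits = {"u1": 500, "u2": 300, "u3": 20, "u4": 10, "u5": 10,
             "u6": 5, "u7": 5, "u8": 5, "u9": 3, "u10": 2}

share = top_share(toy_edits, top_fraction=0.2)
print(f"Top 20% of users made {share:.0%} of edits")
```

A share close to 100% would point to a core group doing most of the work; a share close to the chosen fraction (here 20%) would point to work spread evenly across the crowd.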
Wikipedia
Rules
There are no rules!
The 'five pillars' of Wikipedia
Standard deletion procedure...
Wikipedia
Culture
There are no rules!
The 'five pillars' of Wikipedia
Discussion! (Not bureaucracy)
Moving on...
Is Wikipedia (WP), or something like it, the future for science?
Let's find out in Session 2!
Session 1, References
http://wikipedia.org/wiki/History_of_wikis
http://wikipedia.org/wiki/Wikipedia
1) Voss, J. Measuring Wikipedia. In Proceedings of the ISSI 2005 (Stockholm, Sweden, July 24-28, 2005).
2) Kittur A, Chi EH, Pendleton BA, Suh B, Mytkowicz T. Power of the few vs. wisdom of the crowd: Wikipedia and the rise of the bourgeoisie. CHI 2007.
http://wikipedia.org/wiki/Wikipedia:Five_pillars
http://meta.wikimedia.org/wiki/Research:Newsletter/2011-07-25
Session 2
Scientific wikis
Community annotation
Annotation
Community annotation
Driven by two key factors:
The vast increase in biological data
The clear success of Wikipedia
BioMoore's Law
Over time:
Cost per unit of information can be decreased by orders of magnitude.
Throughput is increased by orders of magnitude.
Fan et al. 2006. Nat Rev Genet.
Comprehensive disease studies that might require ~1bn genotypes would now cost only a few million dollars. Revolution in human genetics.
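The figures above imply a very small cost per genotype. The sketch below works through that arithmetic. The slide only says "a few million dollars", so the 3 million dollar total is an assumption chosen for illustration, and the function name is hypothetical.

```python
def cost_per_genotype(total_cost_usd, n_genotypes):
    """Return the average cost of a single genotype in US dollars."""
    return total_cost_usd / n_genotypes


N_GENOTYPES = 1_000_000_000    # ~1bn genotypes, as quoted on the slide
ASSUMED_TOTAL_USD = 3_000_000  # ASSUMPTION: "a few million dollars" read as 3 million

unit_cost = cost_per_genotype(ASSUMED_TOTAL_USD, N_GENOTYPES)
print(f"~${unit_cost:.4f} per genotype")
```

Under that assumption, each genotype costs a fraction of a cent, which is the kind of unit cost that makes studies at this scale feasible.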
Community annotation is essential
Centralised databases can't cope with annotating the influx of data.
Less investment in more specialised data. Fewer people with a stake. Specialists more disparate.
Communities are smaller and more focused.
Do wikis hold the answer? Wikipedia as a model…
But Wikipedia isn’t always the answer ...
• Wikipedia is an educational resource.
– All articles are encyclopaedic in style.
– Explicitly forbids data from ‘original research’:
• http://wikipedia.org/wiki/Wikipedia:No_original_research
– “Wikipedia does not publish original research”.
– No tools for the specific analysis, presentation, or collection of ‘biological’ data.
• BioWikis!
BioWikis
Wikis with a biological subject matter, customized for analysis, presentation and collection of specific biological data and biological data types:
Wikis for biology
Proteopedia
http://proteopedia.org/
Aim – To make knowledge about proteins accessible.
Features: Interactive 3D visualization. Contributions linked to publications.
Problems: Doesn't work on all browsers. Can be slow.
Reference: Hodis, E. et al., 2008. Proteopedia - a scientific “wiki” bridging the rift between three-dimensional structure and function of biomacromolecules. Genome Biology, 9(8), p.R121.
WikiPathways
http://wikipathways.org
Aim – The curation of biological pathways.
Features: Interactive pathway editing. Integrated with many biological databases.
Problems: Doesn't work on all browsers. Can be slow.
Reference: Pico, A.R. et al., 2008. WikiPathways: pathway editing for the people. PLoS Biology, 6(7), p.e184.
EcoliWiki
http://ecoliwiki.net/
Aim – Share information related to non-pathogenic E. coli.
Features: Very extensive and well-structured domain information. Referencing is good.
Problems: Big resource. Specific focus; could it be applied to others?
Reference: EcoliWiki, 2007. EcoliHub’s subsystem for community annotation. http://ecoliwiki.net.
CHDWiki
http://goo.gl/info/Zxg8H
Aim(?) – Geneticists, clinicians, and molecular biologists working on Congenital Heart Defects.
Features: Curated gene lists. PPI browser.
Problems: Very old MediaWiki fork. Used?
Reference: Barriot, R. et al., 2010. Collaboratively charting the gene-to-phenotype network of human congenital heart defects. Genome Medicine, 2(3), p.16.
SEQwiki
http://seqwiki.org
Aim – A catalogue of analysis tools and technologies for NGS.
Features: Structured data. Linked to an established forum.
Problems: Data can become error prone. Data is difficult to centrally manage.
Reference: Li, J.W. et al., 2012. The SEQanswers wiki: A wiki database of tools for high throughput sequencing analysis. Nucleic Acids Research, 2012 Database special issue.
Listing of biowikis...
http://bioinformatics.org/wiki/BioWiki
BioWikis
There is a variety of different systems.
All seek to 'recognize' contributors (biologists) in a way more familiar to scientists than Wikipedia.
Most have features not found in Wikipedia.
Some projects use the base of Wikipedia to successfully build integrated 'sub-projects'...
The MCB Portal in Wikipedia
http://en.wikipedia.org/wiki/Wikipedia:MCB
Aim – Better organize information related to molecular and cell biology on Wikipedia.
Features: Integrated with WP.
Problems: The project can be confusing to those unfamiliar with Wikipedia.
Reference ...
Gene Wiki
http://en.wikipedia.org/wiki/Portal:Gene_Wiki
Aim – Applying community intelligence to gene annotation.
Special features: Semi-automatic gene curation system.
Problems: The project can be confusing to those unfamiliar with Wikipedia.
Reference: Huss, J.W. et al., 2008. A gene wiki for community annotation of gene function. PLoS Biology, 6(7), p.e175.
Wikis for Cancer?
Some additional references
1. Fan JB, Chee MS, Gunderson KL. Highly parallel genomic assays. Nature Reviews Genetics 7, 632-44 (2006).
2. The Molecular modelling blog post http://rosettadesigngroup.com/blog/373/scientific-wikis-part-i/
3. http://bifx.org/wiki/BioWiki
4. Daub, J. et al., 2008. The RNA WikiProject: community annotation of RNA families. RNA (New York, N.Y.), 14(12), pp.2462-4.
5. Stehr, H. et al., 2010. PDBWiki: added value through community annotation of the Protein Data Bank. Database, p.baq009.
6. Flórez, L.A. et al., 2009. A community-curated consensual annotation that is continuously updated: the Bacillus subtilis centred wiki SubtiWiki. Database: the journal of biological databases and curation, p.bap012.
Session 3
Semantic MediaWiki
Semantic MediaWiki (a data wiki engine)
What is SMW?
Motivation
Frontend: What you see as a user of SMW.
Backend: What you do as an SMW site developer...
Data
Properties and types
Classes
Templates
Forms
Queries
References
SMW Homepage: http://semantic-mediawiki.org
MW Templates: http://www.mediawiki.org/wiki/Help:Templates
SF Homepage: http://www.mediawiki.org/wiki/Extension:Semantic_Forms
SMW #ask: http://semantic-mediawiki.org/wiki/Help:Inline_queries
SMW demo sites:
http://pgscdemo.referata.com
http://discoursedb.org
http://sandbox.referata.com
Cell Lines Wiki