AINL 2016: Ustalov

A Web Interface for Microtask-Based Crowdsourcing

Ilya Sukhopluev, Dmitry Ustalov

NLPub

Outline
• Introduction
• Related Work
• Mechanical Tsar
• Case Study: RSR
• Conclusion

Introduction
• Everybody loves crowdsourcing:
  • data enrichment;
  • data validation;
  • solving “AI-hard” problems;
  • making the world a better place.
• Crowdsourcing needs infrastructure.
• This talk is about neither AI nor NLP…
  …but we have a matter to discuss.

Related Work
• TurKit
  • http://groups.csail.mit.edu/uid/turkit/
• psiTurk
  • https://psiturk.org/
• Troia
  • https://github.com/ipeirotis/Troia-Server
• PyBossa
  • http://pybossa.com/

Mechanical Tsar
• A crowdsourcing engine.
  • Runs microtasks.
  • Collects the answers.
  • Aggregates them!
• Different front-ends:
  • Web, Telegram, etc.
• http://mtsar.nlpub.org/
• https://nlpub.ru/Mechanical_Tsar
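
To make the "collects the answers, aggregates them" step concrete, here is a minimal sketch of majority voting, one common aggregation scheme. The slides do not say which aggregation methods Mechanical Tsar applies, so this is an illustration only and not the engine's code. The function name `majority_vote` and the `(task_id, worker_id, label)` tuple layout are assumptions made for this example.

```python
from collections import Counter, defaultdict


def majority_vote(answers):
    """Aggregate crowd answers by simple majority voting.

    `answers` is an iterable of (task_id, worker_id, label) tuples.
    This layout is hypothetical and chosen only for illustration.
    Returns a dict mapping each task_id to its winning label,
    or to None when the top labels are tied.
    """
    votes = defaultdict(Counter)
    for task_id, _worker_id, label in answers:
        votes[task_id][label] += 1

    result = {}
    for task_id, counter in votes.items():
        ranked = counter.most_common()  # labels sorted by vote count, descending
        top_label, top_count = ranked[0]
        tied = len(ranked) > 1 and ranked[1][1] == top_count
        result[task_id] = None if tied else top_label
    return result


if __name__ == "__main__":
    example_answers = [
        ("t1", "w1", "yes"),
        ("t1", "w2", "yes"),
        ("t1", "w3", "no"),
        ("t2", "w1", "no"),
        ("t2", "w2", "yes"),
    ]
    print(majority_vote(example_answers))  # {'t1': 'yes', 't2': None}
```

Ties are reported as None rather than broken arbitrarily, so an unresolved microtask can be sent to additional workers.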

Architecture
• Mechanical Tsar is the engine.
• Boyarin is a front-end application.
• PostgreSQL is the database.
• Piwik is the telemetry system.

Example: RDT (2016)
• Continuation of the RUSSE study.
  • http://russe.nlpub.ru/
• Evaluating word relatedness.

Panchenko A. et al. (2016) Human and Machine Judgements for Russian Semantic Relatedness. To appear in Springer CCIS vol. 661.

Mechanical Tsar: Stages

Mechanical Tsar: Setup

Boyarin: Annotation

Conclusion
• It is free. No reason not to use it.
• Cooperation wanted!!
• Plans:
  • track & analyze the user activity;
  • try more workflows.
• Cases:
  • shared task annotation;
  • private crowdsourcing.

Thank You!
• Dmitry Ustalov, [email protected] I provide a LinkedIn link here.
• http://mtsar.nlpub.org/

The reported study was funded by the Russian Foundation for Basic Research according to the research project № 16-37-00354 мол_а “Adaptive Crowdsourcing Methods for Linguistic Resources”. This work was supported by the Russian Foundation for the Humanities project № 13-04-12020 “New Open Electronic Thesaurus for Russian” and project № 16-04-12019 “RussNet and YARN thesauri integration”. The present work is also supported by a short-term grant provided by the Deutscher Akademischer Austauschdienst.