Kaggle – A platform for data scientists 160106 陳琤

Page 1: 陳琤 20160106 kaggle

Kaggle – A platform for data scientists

160106陳琤

Page 2: 陳琤 20160106 kaggle

Introduction

Page 3: 陳琤 20160106 kaggle

Introduction

Data providers

Kaggle

Data scientists

Page 4: 陳琤 20160106 kaggle

Interface

Page 5: 陳琤 20160106 kaggle

Competitions

• Featured – commercial problems with prize money
• Masters – open only to elite Kagglers
• Recruiting
• Research
• Playground
• Getting Started
• (Public Datasets)
• (In class)

Page 6: 陳琤 20160106 kaggle

Competitions

Page 7: 陳琤 20160106 kaggle

Competitions

Page 8: 陳琤 20160106 kaggle

Public Datasets

Page 9: 陳琤 20160106 kaggle

Public Datasets

• Available data:
  - .csv
  - SQLite database
  - raw data files (e.g. PDF)
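As a rough illustration of working with the first two formats, the sketch below reads a .csv file and queries a SQLite database using only the Python standard library. The file names (`papers.csv`, `database.sqlite`), the table layout, and the sample rows are hypothetical stand-ins created on the fly; they are not taken from any actual Kaggle dataset.

```python
import csv
import sqlite3
import tempfile
from pathlib import Path


def load_csv_rows(csv_path):
    """Read a .csv file into a list of dicts keyed by the header row."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))


def load_sqlite_rows(db_path, query):
    """Run a query against a SQLite file and return all result rows."""
    conn = sqlite3.connect(db_path)
    try:
        return conn.execute(query).fetchall()
    finally:
        conn.close()


def make_sample_files(folder):
    """Create tiny stand-in files (hypothetical names and contents)."""
    csv_path = Path(folder) / "papers.csv"
    csv_path.write_text("id,title\n1,First paper\n2,Second paper\n", encoding="utf-8")

    db_path = Path(folder) / "database.sqlite"
    conn = sqlite3.connect(db_path)
    conn.execute("CREATE TABLE papers (id INTEGER, title TEXT)")
    conn.executemany(
        "INSERT INTO papers VALUES (?, ?)",
        [(1, "First paper"), (2, "Second paper")],
    )
    conn.commit()
    conn.close()
    return csv_path, db_path


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        csv_path, db_path = make_sample_files(folder)
        print(load_csv_rows(csv_path))
        print(load_sqlite_rows(db_path, "SELECT id, title FROM papers ORDER BY id"))
```

With a real dataset, only the two `load_*` helpers would be needed, pointed at the downloaded files.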

Page 10: 陳琤 20160106 kaggle

Kaggle Script

• Enables users to run R, Python, Julia, and R Markdown code directly on the provided datasets, with some restrictions.

• Any generated images, CSV files, or HTML files are displayed directly.

• Other users can vote and comment on your script.

• https://www.kaggle.com/benhamner/d/benhamner/nips-2015-papers/exploring-the-nips-2015-papers
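To make the "generated files are displayed" point concrete, here is a minimal sketch of what such a script might look like in Python: it reads a dataset file, computes a small summary, and writes a CSV file and an HTML file as output. The input file, the `event_type` column, and the output names `counts.csv` and `report.html` are all hypothetical, and the assumption that the platform picks up files written to the output folder is based only on the description above.

```python
import csv
import html
import tempfile
from collections import Counter
from pathlib import Path


def count_by_column(csv_path, column):
    """Count how often each value appears in one column of a CSV file."""
    with open(csv_path, newline="", encoding="utf-8") as f:
        return Counter(row[column] for row in csv.DictReader(f))


def write_outputs(counts, out_dir):
    """Write the counts as a CSV file and as a small HTML table."""
    out_dir = Path(out_dir)
    items = sorted(counts.items())

    csv_out = out_dir / "counts.csv"  # hypothetical output name
    with open(csv_out, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["value", "count"])
        writer.writerows(items)

    table_rows = "".join(
        f"<tr><td>{html.escape(value)}</td><td>{count}</td></tr>"
        for value, count in items
    )
    html_out = out_dir / "report.html"  # hypothetical output name
    html_out.write_text(
        f"<table><tr><th>value</th><th>count</th></tr>{table_rows}</table>",
        encoding="utf-8",
    )
    return csv_out, html_out


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as folder:
        # Stand-in for a provided dataset file (made-up contents).
        data = Path(folder) / "papers.csv"
        data.write_text(
            "id,event_type\n1,Poster\n2,Oral\n3,Poster\n", encoding="utf-8"
        )
        counts = count_by_column(data, "event_type")
        csv_out, html_out = write_outputs(counts, folder)
        print(csv_out.read_text(encoding="utf-8"))
        print(html_out.read_text(encoding="utf-8"))
```

Escaping the values with `html.escape` keeps the report well-formed even if a dataset field contains markup characters.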

Page 11: 陳琤 20160106 kaggle

Community

• User ranking and tier system
  - Users gain points from their performance in competitions.
  - Users are separated into three tiers: novice/kaggler/master

• Forum

Page 12: 陳琤 20160106 kaggle

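Conclusion

• During a competition, Kaggle acts as an important intermediary: besides maintaining the platform, it negotiates and communicates with data providers and handles data cleaning.

• To build a platform with an open-source spirit, users must be given a strong sense of participation and accomplishment, and their ties to the community must be strengthened.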