Py “Baseball” Data PyCon mini Hiroshima 2016
Python
Shinichi Nakagawa(Baseball Analyst&Pythonista)
Starting Member
• Who am I?( )
• PyData
• PyData / #
• Python
• PyData + (FIP/RC27)
•
Who am I?
• Shinichi Nakagawa(@shinyorke)
• Python , Hack ※ Python
• HR .
• Python/Agile/PyData/SABRmetrics( )
•
• ( ) .
• ( ) HR .
• 1 2
.
• (Django) Python .
• https://service.visasq.com
• https://tech.visasq.com
•
•
•
• &
• etc…
• Web Python
•
• IPython + pandas
(Hello World )
•
•
.
.
• Deep Learning ,
.
• (Pandas )
& .
PyData / #
PyData
""" PyData
Python Python Library
"""
※@iktakahiro http://www.slideshare.net/iktakahiro/pydata-67913897
PyData
• , ,Python
&( ) .
• , or .
• Excel Python, Deep Learning,
etc… PyData
PyData ( )
( )
"""
""" https://ja.wikipedia.org/wiki/
( )
• ,
• 1970
, &
•
( , )
•
• ( , ,FA)
• ( )
•
• ( , etc…)
• ( , J )
× ( ) ※ × +
× ( ) ※ × +
※
5
• ( - ) = 5 ( )
•
•
•
•
•
.
• ( - ) 5 5
(ry .
• Pythagorean winning percentage = (runs scored^2) ÷ (runs scored^2 + runs allowed^2)
•
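As a worked example of the Pythagorean formula, here is a minimal sketch in plain Python. The function names, the `exponent` parameter, and the 600/500 run totals are illustrative assumptions, not figures from the talk. The 143-game season length and the truncation to whole wins follow the pandas code later in the slides.

```python
def pythagorean_win_pct(runs_scored, runs_allowed, exponent=2.0):
    """Expected winning percentage from runs scored and runs allowed."""
    rs = runs_scored ** exponent
    ra = runs_allowed ** exponent
    if rs + ra == 0:
        # No runs either way (e.g. before the season starts): avoid dividing by zero.
        return 0.0
    return rs / (rs + ra)


def expected_record(runs_scored, runs_allowed, games=143):
    """Expected (wins, losses) over a season of `games` games."""
    pct = pythagorean_win_pct(runs_scored, runs_allowed)
    wins = int(pct * games)  # truncate, like astype(np.int64) in the slides
    return wins, games - wins


# Hypothetical team: 600 runs scored, 500 runs allowed.
pct = pythagorean_win_pct(600, 500)
record = expected_record(600, 500)
print(round(pct, 3), record)
```

A team that scores exactly as many runs as it allows comes out at .500, which is a quick sanity check on the formula.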
Python×Pandas
Python×pandas
# Python 3 (3.4): install the required libraries
$ pip install ipython pandas beautifulsoup4 numpy lxml html5lib
# Start ipython (Jupyter also works)
$ ipython
Python×pandas
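# Import the libraries
import pandas as pd
import numpy as np
# Fetch the standings page; read_html returns a list of DataFrames, one per table
df = pd.read_html('http://baseball.yahoo.co.jp/npb/standings/')
# Take the first table and drop its first row
df_cl = df[0].drop([0])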
Python×pandas
# Assign column names
df_cl.columns = ['rank', 'name', 'games', 'win', 'lose', 'draw', 'pct', 'gb', 're_games', 'r', 'er', 'hr', 'sb', 'ba', 'era']
# Convert the columns used below to numeric types
df_cl['win'] = df_cl['win'].fillna(0).astype(np.int64)    # wins
df_cl['lose'] = df_cl['lose'].fillna(0).astype(np.int64)  # losses
df_cl['pct'] = df_cl['pct'].fillna(0).astype(np.float64)  # winning percentage
df_cl['r'] = df_cl['r'].fillna(0).astype(np.int64)        # runs scored
df_cl['er'] = df_cl['er'].fillna(0).astype(np.int64)      # runs allowed
Python×pandas
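# Run differential (runs scored minus runs allowed)
df_cl['difference'] = df_cl['r'] - df_cl['er']
# Pythagorean winning percentage
df_cl['pythagorean_win_per'] = (df_cl['r'] ** 2) / (df_cl['r'] ** 2 + df_cl['er'] ** 2)
# Expected wins and losses over a 143-game season
df_cl['pythagorean_win'] = (df_cl['pythagorean_win_per'] * 143).fillna(0).astype(np.int64)
df_cl['pythagorean_lose'] = 143 - df_cl['pythagorean_win']
# Sort by Pythagorean winning percentage, best first
df_cl.sort_values(by='pythagorean_win_per', ascending=False)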
https://gist.github.com/Shinichi-Nakagawa/8ff55af83390fcd2e2dd34bcb914868c
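The scraping code above depends on the live page layout, so the following offline sketch may be useful for trying the same steps. It makes two assumptions that do not come from the talk: scraped cells can arrive as text, and a missing value can appear as "-". In that case `astype(np.int64)` raises an error, so the hypothetical helper `to_int_column` uses `pd.to_numeric(errors="coerce")` first. The team names and run totals are made up.

```python
import pandas as pd

# Hypothetical rows shaped like a scraped standings table (all cells are text).
raw = pd.DataFrame({
    "name": ["Team A", "Team B", "Team C"],
    "r": ["600", "500", "-"],      # runs scored; "-" stands for a missing cell
    "er": ["500", "550", "520"],   # runs allowed, same column name as the slides
})


def to_int_column(series):
    """Convert a text column to int64, mapping unparseable cells to 0."""
    return pd.to_numeric(series, errors="coerce").fillna(0).astype("int64")


clean = raw.copy()
for col in ["r", "er"]:
    clean[col] = to_int_column(clean[col])

clean["pythagorean_win_per"] = clean["r"] ** 2 / (clean["r"] ** 2 + clean["er"] ** 2)

# sort_values returns a new DataFrame, so keep the result if it is needed later.
ranked = clean.sort_values(by="pythagorean_win_per", ascending=False)
print(ranked[["name", "r", "er", "pythagorean_win_per"]])
```

Mapping a missing cell to 0 keeps the pipeline running but also drags that team's percentage to zero; dropping such rows instead may be the better choice for real data.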
( )
×
•
• (+187)
• 5
• /
• DeNA ,
•
• ( )
?
( )
• & (& )
• , , ,
•
×PyData
• FIP
• (RC27)
• scrapy CSV
• CSV pandas, seaborn, jupyter &
( )
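The CSV-to-pandas step listed above might look roughly like the sketch below. Everything specific in it is an assumption: the column names (`name`, `ip`, `fip`), the sample rows, the 50-inning cutoff, and the bin edges are invented for illustration, and the CSV text is inlined rather than produced by a scrapy crawl. The talk used seaborn for charts; this version prints a text histogram so it runs without a plotting backend.

```python
import io

import numpy as np
import pandas as pd

# Hypothetical CSV contents; a real file would come from the scraper's output.
CSV_TEXT = """name,ip,fip
Pitcher A,180.0,2.45
Pitcher B,150.0,3.10
Pitcher C,60.0,3.85
Pitcher D,30.0,4.20
Pitcher E,95.0,3.40
"""

pitchers = pd.read_csv(io.StringIO(CSV_TEXT))

# Keep pitchers with enough innings (assumed cutoff of 50).
qualified = pitchers[pitchers["ip"] >= 50]

# Count pitchers per FIP range; the last bin includes its right edge.
counts, edges = np.histogram(qualified["fip"], bins=[2.0, 3.0, 4.0, 5.0])

for low, high, n in zip(edges[:-1], edges[1:], counts):
    print(f"{low:.1f}-{high:.1f}: {'#' * int(n)}")
```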
FIP(Fielding Independent Pitching)
• , ( )
• , (+ ),
• ( )
• xFIP
FIP .
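The formula on the FIP slides did not survive extraction, so the sketch below uses the commonly cited form, FIP = (13×HR + 3×(BB + HBP) − 2×K) ÷ IP + constant. It may differ in detail from what the talk used (some variants subtract intentional walks). The default constant of 3.10 and the sample stat lines are placeholders, not values from the slides.

```python
def league_constant(lg_era, lg_hr, lg_bb, lg_hbp, lg_so, lg_ip):
    """Constant that puts league-wide FIP on the same scale as league ERA."""
    return lg_era - (13 * lg_hr + 3 * (lg_bb + lg_hbp) - 2 * lg_so) / lg_ip


def fip(hr, bb, hbp, so, ip, constant=3.10):
    """Fielding Independent Pitching from home runs, walks, hit batters, strikeouts.

    `ip` is innings pitched as a plain number (180 and one third is 180.333...,
    not the box-score notation 180.1).
    """
    if ip <= 0:
        raise ValueError("innings pitched must be positive")
    return (13 * hr + 3 * (bb + hbp) - 2 * so) / ip + constant


# Hypothetical pitcher: 10 HR, 40 BB, 5 HBP, 150 K over 180 innings.
print(round(fip(hr=10, bb=40, hbp=5, so=150, ip=180), 2))
```

Only outcomes the defense cannot influence enter the numerator, which is why hits on balls in play are absent.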
FIP( TOP 20)
FIP( & )
FIP(50 Histogram)
FIP
•
•
•
• FIP
•
FIP ( )
RC27
• 9 1
?
• VS , ?
• RC(Run Created, ) 1
•
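To make the RC27 idea concrete, here is a minimal sketch. It uses the basic Runs Created formula, RC = (H + BB) × TB ÷ (AB + BB), which is an assumption: the slides do not preserve which RC version the talk used, and more detailed versions add terms for steals, sacrifices, and strikeouts. RC27 then scales RC to 27 outs, the length of one nine-inning game. The sample batting line is made up.

```python
def runs_created(h, bb, tb, ab):
    """Basic Runs Created: (hits + walks) * total bases / (at bats + walks)."""
    opportunities = ab + bb
    if opportunities == 0:
        return 0.0
    return (h + bb) * tb / opportunities


def rc27(h, bb, tb, ab, sh=0, sf=0, cs=0, gidp=0):
    """Runs a lineup of nine copies of this batter would score per 27 outs."""
    outs = ab - h + sh + sf + cs + gidp
    if outs <= 0:
        raise ValueError("the batter must have made at least one out")
    return runs_created(h, bb, tb, ab) / outs * 27


# Hypothetical batter: 150 hits, 50 walks, 250 total bases in 500 at bats.
print(round(rc27(h=150, bb=50, tb=250, ab=500), 2))
```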
RC27 (350 )
RC27 TOP30(350 )
RC27(Histogram)
RC27
• 1-6
• RC27 Top30 6
•
•
•
• ( )
•
6 Top30
• ,
• ,
, FIP ( )
•
[ ]
• ,
FIP, WHIP, K/BB, etc…
• ,
RC27 3 ( 6 )
•
Py "Baseball" Data - Python ※pandas, Re:dash (& )
MonotaRO TechTalk #4
http://www.kokuchpro.com/event/monotarotech4/
&
Shinichi Nakagawa(Twitter/Facebook/visasQ:@shinyorke)