airtoxin-ishii
Joining DataFrames
http://pandas.pydata.org/pandas-docs/stable/merging.html
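pysqldf
yhat/pandasql had already published a library for issuing SQL against DataFrames, but it was not being maintained.
pysqldf builds on pandasql, adding more data formats that can be queried and support for UDFs.
It supports SQLite3 syntax.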
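A minimal sketch of the general idea behind this kind of library, assuming the DataFrames are copied into an in-memory SQLite database and the query result is read back as a DataFrame. The helper `run_sql` and the `people` table are hypothetical names for illustration, not part of pysqldf.

```python
import sqlite3

import pandas as pd


def run_sql(query, frames):
    """Run `query` against the DataFrames in `frames` (table name -> DataFrame)."""
    conn = sqlite3.connect(":memory:")
    try:
        for name, frame in frames.items():
            # Each DataFrame becomes a SQLite table with the same name.
            frame.to_sql(name, conn, index=False)
        # The result set comes back as a new DataFrame.
        return pd.read_sql_query(query, conn)
    finally:
        conn.close()


# Hypothetical sample data.
people = pd.DataFrame({"name": ["alice", "bob"], "age": [30, 25]})
result = run_sql("select name from people where age > 26;", {"people": people})
```

Because the query runs inside SQLite, whatever SQLite3 accepts as syntax is what the caller can write.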
How to use
from pysqldf import SQLDF, load_iris
sqldf = SQLDF(globals())
iris = load_iris()
sqldf.execute("select * from iris;")
union
buyer = sqldf.execute("""
    select name, sex, age from buyer1
    union all
    select name, sex, age from buyer2;
""")
join
purchaser_log = sqldf.execute("""
    select *
    from buyer as b
    inner join purchase_log as p on b.name = p.buyer;
""")
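For comparison, the union and join queries above can be sketched in plain pandas with `concat` and `merge`. The sample rows are hypothetical; only the table and column names come from the slides.

```python
import pandas as pd

# Hypothetical sample data using the column names from the queries above.
buyer1 = pd.DataFrame({"name": ["alice"], "sex": ["F"], "age": [30]})
buyer2 = pd.DataFrame({"name": ["bob"], "sex": ["M"], "age": [25]})
purchase_log = pd.DataFrame({
    "buyer": ["alice", "alice", "carol"],
    "item": ["apple", "pen", "apple"],
    "quantity": [2, 1, 5],
})

# "union all": stack the rows of both frames, keeping duplicates.
buyer = pd.concat([buyer1, buyer2], ignore_index=True)

# "inner join ... on b.name = p.buyer": keep only rows with a match on both sides.
purchaser_log = buyer.merge(
    purchase_log, how="inner", left_on="name", right_on="buyer"
)
```

Here bob has no purchases and carol is not a known buyer, so only alice's two purchases survive the inner join.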
UDF
def is_royal_customer(name):
    if name == "alice":
        return True
    else:
        return False

sqldf = SQLDF(globals(), udfs={
    "is_royal_customer": is_royal_customer
})
sqldf.execute("""
    select *, is_royal_customer(name) as royal
    from (
        select name, sex, age from buyer1
        union all
        select name, sex, age from buyer2
    )
""")
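A minimal sketch of how a scalar UDF like this can be exposed to SQLite using only the standard library's `sqlite3.Connection.create_function`. That pysqldf registers its `udfs` in this way is an assumption; the `buyer` table contents are hypothetical.

```python
import sqlite3


def is_royal_customer(name):
    return name == "alice"


conn = sqlite3.connect(":memory:")
# Register the Python function under a SQL name; 1 is the number of arguments.
conn.create_function("is_royal_customer", 1, is_royal_customer)

# Hypothetical sample table.
conn.execute("create table buyer (name text)")
conn.executemany("insert into buyer values (?)", [("alice",), ("bob",)])

rows = conn.execute(
    "select name, is_royal_customer(name) as royal from buyer order by name"
).fetchall()
conn.close()
```

SQLite has no boolean type, so the returned `True`/`False` comes back as the integers 1 and 0, which is why later code can test for `1` in the `royal` column.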
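UDF (aggregate)
Create an aggregate class or a function.
For aggregate classes, see the sqlite3 documentation.
A function receives a list of the column's values and returns a single value.
As with UDFs, pass the function or class to the SQLDF constructor to make it available.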
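A minimal sketch of the aggregate-class form as described in the sqlite3 documentation: a class with `step` (called once per row) and `finalize` (called once per group), registered with `create_aggregate`. The class name `IsRoyalBought` and the sample rows are hypothetical.

```python
import sqlite3


class IsRoyalBought:
    """Aggregate that reports whether any row in the group has royal = 1."""

    def __init__(self):
        self.found = False

    def step(self, royal):
        # Called once for every row in the group.
        if royal == 1:
            self.found = True

    def finalize(self):
        # Called once per group; the return value is the aggregate result.
        return int(self.found)


conn = sqlite3.connect(":memory:")
conn.create_aggregate("is_royal_bought", 1, IsRoyalBought)

# Hypothetical sample table.
conn.execute("create table purchaser_log (item text, quantity integer, royal integer)")
conn.executemany(
    "insert into purchaser_log values (?, ?, ?)",
    [("apple", 2, 1), ("apple", 5, 0), ("pen", 1, 0)],
)

rows = conn.execute(
    "select item, sum(quantity), is_royal_bought(royal) "
    "from purchaser_log group by item order by item"
).fetchall()
conn.close()
```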
def is_royal_bought(royals):
    if 1 in royals:
        return True
    else:
        return False

sqldf = SQLDF(globals(), udafs={
    "is_royal_bought": is_royal_bought
})
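One way a list-taking function could be adapted to SQLite's aggregate protocol is to collect the group's values in `step` and call the function in `finalize`. This is a sketch under that assumption, not pysqldf's actual implementation; `make_aggregate` is a hypothetical helper.

```python
import sqlite3


def make_aggregate(func):
    """Wrap a function taking a list of values into a sqlite3 aggregate class."""

    class ListAggregate:
        def __init__(self):
            self.values = []

        def step(self, value):
            # Collect every value of the group.
            self.values.append(value)

        def finalize(self):
            # Hand the whole list to the wrapped function.
            return func(self.values)

    return ListAggregate


def is_royal_bought(royals):
    return 1 in royals


conn = sqlite3.connect(":memory:")
conn.create_aggregate("is_royal_bought", 1, make_aggregate(is_royal_bought))

# Hypothetical sample table.
conn.execute("create table log (item text, royal integer)")
conn.executemany(
    "insert into log values (?, ?)",
    [("apple", 0), ("apple", 1), ("pen", 0)],
)
rows = conn.execute(
    "select item, is_royal_bought(royal) from log group by item order by item"
).fetchall()
conn.close()
```

The trade-off of this design is memory: the whole group is held in a list, whereas a hand-written aggregate class can keep only a running state.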
sqldf.execute("""
    select item, sum(quantity), is_royal_bought(royal)
    from purchaser_log
    group by item
""")
purchaser_log = sqldf.execute("""
    select *
    from (
        select *, is_royal_customer(name) as royal
        from (
            select name, sex, age from buyer1
            union all
            select name, sex, age from buyer2
        )
    ) as b
    inner join purchase_log as p on b.name = p.buyer;
""")
sqldf.execute("""
    select name, item, sum(quantity) as cnt
    from purchaser_log
    group by name, item
""")
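# Keyword arguments are used for pivot because pandas 2.0 and later no longer accept them positionally.
sqldf.execute("""
    select name, item, sum(quantity) as cnt
    from purchaser_log
    group by name, item
""").pivot(index="name", columns="item", values="cnt")
sqldf.execute("""
    select name, item, sum(quantity) as cnt
    from purchaser_log
    group by name, item
""").pivot(index="name", columns="item", values="cnt").fillna(0)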
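A small sketch of what the final `pivot` and `fillna(0)` steps do, using a hand-written frame in place of the query result. The rows are hypothetical; the column names match the query above.

```python
import pandas as pd

# Hypothetical stand-in for the grouped query result (one row per name/item pair).
counts = pd.DataFrame({
    "name": ["alice", "alice", "bob"],
    "item": ["apple", "pen", "apple"],
    "cnt": [2, 1, 3],
})

# Rows become names, columns become items, cells hold the counts.
# Missing name/item combinations appear as NaN after the pivot.
wide = counts.pivot(index="name", columns="item", values="cnt")

# Replace the missing combinations with 0 purchases.
table = wide.fillna(0)
```

Bob never bought a pen, so that cell is NaN after `pivot` and 0 after `fillna(0)`.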