PRML復々習レーン#9 LT Kernel regression with Python __youki__

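Before getting to the main topic

• For those who find PRML's explanation of kernel methods hard to follow:

“Nonparametric Econometrics: A Primer”, Jeffrey S. Racine

Foundations and Trends® in Econometrics, Vol. 3, No. 1 (2008), DOI: 10.1561/0800000009

It is in English, but very easy to follow!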

Outline

• Kernel Regression with instant code

– Kernel regression with local constant estimator

• 1-D kernel regression

• 2-D kernel regression

• Kernel Regression with statsmodels

– Kernel regression with local linear estimator

• 1-D kernel regression

• 2-D kernel regression

Kernel Regression with Instant Code

Implementation of local constant estimator

• y: Local constant estimator, PRML (6.45)

• g: Generalized product kernel density estimator

• k: Gaussian kernel for continuous variables

    import numpy as np

    def get_gaussian_kernel(h, X, x):
        # k: Gaussian kernel evaluated at the scaled distances (X - x) / h
        return (np.sqrt(2*np.pi) ** -1) * np.exp(-.5 * ((X - x)/h) ** 2)

    def get_gpke(h, X, x):
        # g: product of the per-dimension kernels between the rows of X and point x
        K = np.empty(X.shape)
        for j in range(len(x)):
            K[:, j] = get_gaussian_kernel(h, X[:, j], x[j])
        gpke = K.prod(axis=1) / h ** len(x)
        return gpke

    def get_local_constant_estimator(h, X, Y, x):
        # y: kernel-weighted average of the targets Y at each query point x[i]
        y = np.empty(x.shape[0])
        for i in range(x.shape[0]):
            K = get_gpke(h, X, x[i])
            y[i] = (Y * K).sum() / K.sum()
        return y
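
Written out, the three functions above correspond to the following formulas, where $d$ is the input dimension, $h$ the single bandwidth shared by all dimensions, and $(x_n, Y_n)$ the $N$ training pairs. The notation is chosen here to match the code; PRML writes the targets as $t_n$.

```latex
\begin{align}
k(z) &= \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{z^{2}}{2}\right) \\
g(x, x_n) &= \frac{1}{h^{d}} \prod_{j=1}^{d} k\left(\frac{x_{nj} - x_j}{h}\right) \\
y(x) &= \frac{\sum_{n=1}^{N} g(x, x_n)\, Y_n}{\sum_{m=1}^{N} g(x, x_m)}
\end{align}
```

The factor $1/h^{d}$ appears in both the numerator and the denominator of $y(x)$, so it cancels and does not affect the regression estimate.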

Kernel regression for 1-D and 2-D data

2-D Gaussian mixture distribution

1-D sine function

DEMO with instant Python code
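
The demo itself is not reproduced in the slides. The following is a minimal sketch of what such a demo could look like, assuming noisy samples of a sine curve for the 1-D case and a surface made of two Gaussian bumps for the 2-D case. The data, seed, noise levels and bandwidths are illustrative choices, not the author's, and `local_constant` is a hypothetical vectorized equivalent of `get_local_constant_estimator`.

```python
import numpy as np


def gaussian_kernel(u):
    """Standard Gaussian kernel k(u)."""
    return np.exp(-0.5 * u ** 2) / np.sqrt(2 * np.pi)


def local_constant(h, X, Y, x):
    """Local constant estimate at each row of x.

    X: (n, d) training inputs, Y: (n,) targets, x: (m, d) query points.
    """
    d = X.shape[1]
    # Pairwise scaled differences, shape (m, n, d).
    u = (X[None, :, :] - x[:, None, :]) / h
    # Product kernel over the d dimensions, shape (m, n).
    K = gaussian_kernel(u).prod(axis=2) / h ** d
    return (K * Y).sum(axis=1) / K.sum(axis=1)


def mixture(P):
    """Sum of two Gaussian bumps centred at (-1, -1) and (1, 1) (assumed target)."""
    c1, c2 = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
    return (np.exp(-0.5 * ((P - c1) ** 2).sum(axis=1))
            + np.exp(-0.5 * ((P - c2) ** 2).sum(axis=1)))


rng = np.random.default_rng(0)

# 1-D: noisy sine curve (assumed demo data).
X1 = rng.uniform(0, 2 * np.pi, size=(200, 1))
Y1 = np.sin(X1[:, 0]) + rng.normal(scale=0.2, size=200)
x1 = np.linspace(0.5, 2 * np.pi - 0.5, 50)[:, None]
y1 = local_constant(0.3, X1, Y1, x1)
rmse_1d = np.sqrt(np.mean((y1 - np.sin(x1[:, 0])) ** 2))

# 2-D: noisy two-bump surface (assumed demo data), evaluated on a 15 x 15 grid.
X2 = rng.uniform(-3, 3, size=(500, 2))
Y2 = mixture(X2) + rng.normal(scale=0.05, size=500)
grid = np.linspace(-2, 2, 15)
x2 = np.array([(a, b) for a in grid for b in grid])
y2 = local_constant(0.4, X2, Y2, x2)
rmse_2d = np.sqrt(np.mean((y2 - mixture(x2)) ** 2))

print(rmse_1d, rmse_2d)
```

The same function handles both cases because the product kernel is taken over whatever number of columns `X` has; only the bandwidth `h` needs to be chosen per data set.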

Kernel Regression with statsmodels

local linear estimator and bandwidth estimation

statsmodels?

• Statistics in Python
– http://statsmodels.sourceforge.net/devel/

• A data-analysis toolkit based on statistical models
– Linear Regression
– Generalized Linear Models
– Robust Linear Models
– Regression with Discrete Dependent Variable
– Time Series Analysis
– Statistics
– Nonparametric Methods (kernel regression included!!)
– Generalized Method of Moments
– Empirical Likelihood

• Still far behind R, but simple tasks can be done with it!

• For machine learning, the library to use is scikit-learn rather than statsmodels.

Kernel regression in statsmodels

Additional functionalities

Estimator: how does the local linear estimator work?

Edge bias occurs in the local constant (LC) estimator

Estimator: how does the local linear estimator work?

Adverse effect in the local linear (LL) estimator

Local Constant vs. Local Linear
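
To make the edge-bias comparison concrete, here is a minimal 1-D sketch, assuming a Gaussian kernel and noise-free data on the line y = 2x. Both are chosen for illustration, and `local_constant_1d` and `local_linear_1d` are hypothetical helper names. At the boundary the local constant estimate averages only points from one side and is pulled toward the interior, whereas the local linear fit reproduces a straight line exactly.

```python
import numpy as np


def kernel_weights(h, X, x0):
    """Gaussian kernel weights of the training points X around x0."""
    return np.exp(-0.5 * ((X - x0) / h) ** 2)


def local_constant_1d(h, X, Y, x0):
    """Kernel-weighted average of Y around x0."""
    w = kernel_weights(h, X, x0)
    return (w * Y).sum() / w.sum()


def local_linear_1d(h, X, Y, x0):
    """Intercept of a kernel-weighted least-squares line fitted around x0."""
    w = kernel_weights(h, X, x0)
    # Design matrix [1, X - x0]; the fitted intercept is the estimate at x0.
    A = np.column_stack([np.ones_like(X), X - x0])
    AtW = A.T * w
    beta = np.linalg.solve(AtW @ A, AtW @ Y)
    return beta[0]


X = np.linspace(0.0, 1.0, 101)
Y = 2.0 * X   # noise-free straight line
h = 0.1

lc_edge = local_constant_1d(h, X, Y, 0.0)   # biased upward at the left edge
ll_edge = local_linear_1d(h, X, Y, 0.0)     # recovers the true value 0
lc_mid = local_constant_1d(h, X, Y, 0.5)    # symmetric neighbourhood, no bias

print(lc_edge, ll_edge, lc_mid)
```

The price of the local linear fit is a small linear system per query point, which can become ill-conditioned when few training points fall inside the kernel window.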

Bandwidth optimization: how do AIC and CV work?

Plug-in: Ruppert, D., S. J. Sheather, and M. P. Wand (1995), ‘An effective bandwidth selector for local least squares regression’. Journal of the American Statistical Association 90, 1257–1270.

AIC & CV: Hurvich, C. M., J. S. Simonoff, and C. L. Tsai (1998), ‘Smoothing parameter selection in nonparametric regression using an improved Akaike information criterion’. Journal of the Royal Statistical Society Series B 60, 271–293.

Besides the above, there are also rule-of-thumb methods for computing the bandwidth.

“Nonparametric Econometrics: A Primer”, p. 43
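
As an illustration of how cross-validation picks a bandwidth, here is a minimal sketch of least-squares leave-one-out CV for the local constant estimator over a grid of candidate values. The data, the candidate grid and the helper name `loo_cv_score` are assumptions for illustration. This is not the statsmodels implementation, and the AIC criterion of Hurvich et al. is not shown.

```python
import numpy as np


def loo_cv_score(h, X, Y):
    """Mean squared leave-one-out prediction error for bandwidth h (1-D X)."""
    # Pairwise Gaussian weights, shape (n, n).
    W = np.exp(-0.5 * ((X[:, None] - X[None, :]) / h) ** 2)
    # Zero the diagonal so each point is left out of its own prediction.
    np.fill_diagonal(W, 0.0)
    pred = (W @ Y) / W.sum(axis=1)
    return np.mean((Y - pred) ** 2)


rng = np.random.default_rng(1)
X = np.sort(rng.uniform(0, 2 * np.pi, 150))
Y = np.sin(X) + rng.normal(scale=0.2, size=150)

# Candidate bandwidths on a log scale, roughly 0.03 to 3.2.
candidates = np.logspace(-1.5, 0.5, 30)
scores = np.array([loo_cv_score(h, X, Y) for h in candidates])
h_cv = candidates[np.argmin(scores)]

print(h_cv)
```

Very small bandwidths score badly because each prediction rests on only a few noisy neighbours, and very large ones because the estimate flattens toward the global mean; the CV score trades the two off.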

Kernel regression for 1-D and 2-D data

Using statsmodels-0.5.0

2-D Gaussian mixture distribution

1-D sine function

DEMO with statsmodels
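
A minimal sketch of such a demo for the 1-D sine case, assuming a statsmodels version that provides `KernelReg` in `statsmodels.nonparametric.kernel_regression`. The data, seed and sample size are illustrative, not the author's. `reg_type='ll'` selects the local linear estimator and `bw='cv_ls'` least-squares cross-validation; passing `bw='aic'` instead selects the AIC criterion.

```python
import numpy as np
from statsmodels.nonparametric.kernel_regression import KernelReg

rng = np.random.default_rng(0)
n = 60  # kept small because cross-validation re-evaluates the fit many times
X = np.sort(rng.uniform(0, 2 * np.pi, n))
Y = np.sin(X) + rng.normal(scale=0.1, size=n)

# var_type='c' declares the single regressor as continuous.
model = KernelReg(endog=Y, exog=X, var_type='c', reg_type='ll', bw='cv_ls')

x_new = np.linspace(0.5, 2 * np.pi - 0.5, 50)
# fit() returns the estimated conditional mean and the marginal effects.
mean, mfx = model.fit(x_new)

bandwidth = float(np.squeeze(model.bw))
rmse = np.sqrt(np.mean((mean - np.sin(x_new)) ** 2))
print(bandwidth, rmse)
```

Setting `reg_type='lc'` on the same data gives the local constant estimator, which makes it easy to compare the two fits near the ends of the interval.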