Atmospheric dynamics final project: A Trivial Test on a Bayesian Statistic Distribution Model for...

Preview:

Citation preview

Atmospheric Dynamics Final Project: A Trivial Test on a Bayesian Statistical Distribution Model for Air Pollution — Written Report — B02602005 顏東白

This final project was a small test of the accuracy of a statistical distribution model under development. It aimed to explore further properties of the model, which may be used in a future air-pollution project.

1. Theory

The model is a Bayesian model, assuming a Gaussian distribution. The velocity at each observation site gives the expected position for the next time interval, while the variance is estimated from the data of all the observation sites.

Two key concepts were used to optimize the estimation: accuracy and credibility.

Accuracy refers to the minimization of the squared error:

$$\sum_j \left( \frac{\sum_i \varphi(\vec{x}_i, t_n)\, p\big(\vec{x}_j - \vec{x}_i,\ \vec{v}(\vec{x}_i, t)\big)}{\sum_i p\big(\vec{x}_j - \vec{x}_i,\ \vec{v}(\vec{x}_i, t)\big)} - \varphi(\vec{x}_j, t_{n+1}) \right)^2$$

Credibility refers to maximization of the ability of the sample points to represent each other. It can be done by maximizing the logarithm of the likelihood function:

$$\ln L\big(\sigma^2 \mid (x_i)_n\big) = -n^2 \ln(2\pi\sigma^2) - \frac{1}{2\sigma^2} \sum_{ij} d_{ij}^2$$

(The prefactor $n^2$ counts the $n \times n$ site pairs; it is the choice consistent with the MLE result $\hat{\sigma}^2 = \frac{1}{2n^2}\sum_{ij} d_{ij}^2$ given below.)

Minimization of the squared error yields:

$$\sum_{ij} \varphi_j\, d_{ij}^2\, e^{-d_{ij}^2 / 2\sigma^2} \sum_{ij} e^{-d_{ij}^2 / 2\sigma^2} = \sum_{ij} \varphi_j\, e^{-d_{ij}^2 / 2\sigma^2} \sum_{ij} d_{ij}^2\, e^{-d_{ij}^2 / 2\sigma^2}$$

Maximization of the logarithm of the likelihood function yields:

$$\hat{\sigma}^2 = \frac{1}{2n^2} \sum_{ij} d_{ij}^2$$

An appropriate selection of sites is to select the sites for which

$$E\!\left[\sum_{ij} \varphi_j\, d_{ij}^2\, e^{-d_{ij}^2 / 2\hat{\sigma}^2} \sum_{ij} e^{-d_{ij}^2 / 2\hat{\sigma}^2}\right] = E\!\left[\sum_{ij} \varphi_j\, e^{-d_{ij}^2 / 2\hat{\sigma}^2} \sum_{ij} d_{ij}^2\, e^{-d_{ij}^2 / 2\hat{\sigma}^2}\right]$$

where $\hat{\sigma}$ is the standard deviation estimated by MLE. Such a selection was not used due to its complexity; instead, a more naïve estimation was adopted. We tried to minimize the function

$$\left( -\ln(2\pi\sigma^2) - \frac{1}{2 n^2 \sigma^2} \sum_{ij} d_{ij}^2 \right)^2 + \left( \sum_{ij} \varphi_j\, d_{ij}^2\, e^{-d_{ij}^2 / 2\sigma^2} \sum_{ij} e^{-d_{ij}^2 / 2\sigma^2} - \sum_{ij} \varphi_j\, e^{-d_{ij}^2 / 2\sigma^2} \sum_{ij} d_{ij}^2\, e^{-d_{ij}^2 / 2\sigma^2} \right)^2$$

as a tradeoff between the MSE and MLE considerations.

2. Code

1

2

3

4

5

6

7

8

9

10

11

12

13

14

15

##Environment Setting
# The original hard-coded lib.loc to one machine's paths
# ("C:/Program Files/R/R-3.2.2/library" and "~/R/win-library/3.2"),
# which breaks on any other installation. Dropping lib.loc lets R search
# its normal library paths, so the script is portable.
library("KernSmooth")  # bkde2D: 2-D kernel density estimation
library("fields")      # loaded by the original; not obviously used below

##Parameters Setting

#Gaussian Function

# Gaussian-shaped weight function used throughout the script:
#   G(x) = exp(-x^2) / sqrt(2*pi)
# Vectorized over x.
# Fixed: the original raised a truncated literal 2.71828182846 to a power
# instead of calling exp(), which is exact and clearer.
# NOTE(review): the standard normal pdf would be exp(-x^2 / 2) / sqrt(2*pi);
# this function omits the 1/2 in the exponent. Left unchanged because the
# error comparisons and the calibrated sigma values later in the script
# depend on this exact shape — confirm the intent with the author.
G <- function(x) {
  exp(-x^2) / sqrt(2 * pi)
}

# Time step for the advection integrator.
# NOTE: `t` shadows base::t (matrix transpose); kept because the rest of
# the script refers to it by this name.
t <- 0.01

#random generator: 200 points drawn from a standard bivariate normal
x_0 <- rnorm(200, mean = 0, sd = 1)
y_0 <- rnorm(200, mean = 0, sd = 1)

16

17

18

19

20

21

22

23

24

25

26

27

28

29

30

31

data_0 <- cbind(x_0, y_0)

#process initial est.
# 2-D kernel density estimate of the initial sample on a 1001 x 1001 grid.
dis_0 <- bkde2D(data_0, c(0.5, 0.5), grid = c(1001L, 1001L))
# Analytic reference density on the same grid. The original filled se/rse
# with a 1001 x 1001 double loop; the equivalent vectorized outer product
# gives identical values and is far faster. (The empty matrix() calls with
# byrow=TRUE were also pointless — byrow has no effect without data.)
truth_0 <- outer(G(dis_0$x1), G(dis_0$x2))
se <- (truth_0 - dis_0$fhat)^2   # squared error of the KDE vs the reference
rse <- se / truth_0^2            # relative squared error

##Run advection using RK4

#velocity field

# Velocity field of a solid-body rotation about the origin with angular
# speed 0.5: (u, v) = (-0.5 * y, 0.5 * x).
# Takes a length-2 position (x, y) and returns the velocity as a 1 x 2
# matrix, matching what the RK4 stepper expects.
ode <- function(x) {
  matrix(c(-0.5 * x[2], 0.5 * x[1]), nrow = 1, ncol = 2)
}

#the Runge-Kutta Method

# One classical fourth-order Runge-Kutta step of the velocity field `ode`,
# advancing position `x` by the global time step `t` (both defined earlier
# in this script). Returns the new position.
RK <- function(x) {
  k1 <- t * ode(x)
  k2 <- t * ode(x + k1 / 2)
  k3 <- t * ode(x + k2 / 2)
  k4 <- t * ode(x + k3)
  # Standard RK4 weighted average of the four slopes.
  x + (k1 + 2 * k2 + 2 * k3 + k4) / 6
}

48

49

50

51

52

53

54

55

56

57

58

59

60

61

62

63

#Run Advection
# Advect every sample point through 5000 RK4 steps of the rotation field.
# (The original pre-allocated `data` with matrix() and immediately
# overwrote it with data_0 — the dead allocation is removed.)
data <- data_0
for (i in 1:5000) {
  for (j in 1:200) {
    data[j, ] <- RK(data[j, ])
  }
}

#process final est.
dis <- bkde2D(data, c(0.5, 0.5), grid = c(1001L, 1001L))
# Analytic reference density on the FINAL grid.
# Fixed two copy-paste slips from the initial-estimate section: the
# original compared the reference against dis_0$fhat (the initial KDE)
# in se, and mixed dis_0$x2 into the dis-based rse expression. The final
# estimate dis$fhat / dis$x1 / dis$x2 is now used consistently.
truth <- outer(G(dis$x1), G(dis$x2))
se <- (truth - dis$fhat)^2   # squared error of the final KDE
rse <- se / truth^2          # relative squared error

##Kriging Estimation

#Gaussian Function

# Gaussian-shaped weight function, G(x) = exp(-x^2) / sqrt(2*pi).
# This re-definition is identical to the one in the setup section
# (apparently repeated so the kriging section can run standalone).
# Fixed: use exp() instead of the truncated literal 2.71828182846^(...).
# NOTE(review): the standard normal pdf would use exp(-x^2 / 2); the 1/2
# is omitted here as in the rest of the script — confirm with the author.
G <- function(x) {
  exp(-x^2) / sqrt(2 * pi)
}

# Time step between the two observation snapshots.
t <- 0.01

#input observed data
ns <- 31  # number of observation sites
# One row per site: column 1 = observed field value, 2 = u, 3 = v.
obs <- matrix(NA, nrow = ns, ncol = 3)

80

81

82

83

84

85

86

87

88

89

90

91

92

93

94

95

# Observation sites: three concentric rings of 10 points at radii 1, 2 and
# 0.5, plus one site at the origin (31 sites total).
# Fixed: the original indexed into x_obs / y_obs without ever creating
# them, which errors with "object 'x_obs' not found" in a clean session.
x_obs <- numeric(31)
y_obs <- numeric(31)
for (i in 1:10) {
  theta <- 2 * pi * i / 10
  x_obs[i] <- cos(theta)
  x_obs[i + 10] <- 2 * cos(theta)
  x_obs[i + 20] <- 0.5 * cos(theta)
  y_obs[i] <- sin(theta)
  y_obs[i + 10] <- 2 * sin(theta)
  y_obs[i + 20] <- 0.5 * sin(theta)
}
x_obs[31] <- 0
y_obs[31] <- 0
# Pairwise squared "prediction distance" between sites, filled below.
d <- matrix(nrow = length(x_obs), ncol = length(y_obs))
sigma <- c()

# Field value and advecting velocity at each site. The original filled obs
# one row at a time; whole-column assignment computes identical values.
obs[, 1] <- G(x_obs) * G(y_obs)  # observed field value phi at the site
obs[, 2] <- -0.5 * y_obs         # u component of the rotation field
obs[, 3] <- 0.5 * x_obs          # v component of the rotation field

# d[j, k]: squared distance between site k and the position site j is
# expected to advect to over one time step t. Vectorized over k; identical
# to the original element-by-element double loop.
for (j in seq_along(x_obs)) {
  d[j, ] <- ((x_obs - x_obs[j]) - obs[j, 2] * t)^2 +
            ((y_obs - y_obs[j]) - obs[j, 3] * t)^2
}

#MSE
# Scan candidate values int*i over [L*int, U*int] = [0.410, 0.440] and
# record, for each, the squared mismatch of the constraint
#   sum(phi*d*w) * sum(w) == sum(phi*w) * sum(d*w)
# derived in the theory section. The minimizer is read off the plot.
f <- c()
int <- 0.001  # scan step
L <- 410      # lower scan index (candidate 0.410)
U <- 440      # upper scan index (candidate 0.440)
for (i in L:U) {
  # Kernel weight for every site pair; the original recomputed this same
  # G(-sqrt(...)) four times per (j, k) cell inside a double loop.
  # NOTE(review): since d already holds squared distances, this weight is
  # exp(-d / (int*i)) / sqrt(2*pi), so the scanned quantity plays the role
  # of 2*sigma^2 from the theory section rather than sigma — confirm.
  w <- G(-sqrt(d / (int * i)))
  c_1 <- obs[, 1] * d * w  # phi_j * d_jk^2 * w  (obs[,1] recycles down rows)
  c_2 <- w
  c_3 <- obs[, 1] * w
  c_4 <- d * w
  f[i] <- (sum(c_1) * sum(c_2) - sum(c_4) * sum(c_3))^2
}
plot(seq(int * L, int * U, int), f[c(L:U)],
     main = "Constrain Distribution", xlab = "sigma", ylab = "f(x)")

128

129

130

131

132

133

134

135

136

137

138

139

140

141

142

143

# Value read off the MSE scan plot (sorting function requires).
# Note: immediately overwritten by the MLE estimate below, as in the
# original script.
sigma <- 0.42
#MLE
# sigma-hat^2 = (1 / (2 n^2)) * sum_ij d_ij^2 from the theory section
# (d already holds squared distances). Despite the name, this is the
# variance estimate, not the standard deviation.
sigma <- sum(d) / 2 / ns^2
#hybrid
# Fresh accumulators for the hybrid (MSE + MLE) scan.
f <- c()
g <- c()
h <- c()
int <- 0.00001  # scan step
L <- 43590      # lower scan index (candidate 0.43590)
U <- 43600      # upper scan index (candidate 0.43600)
# Work matrices for the per-pair terms, one cell per site pair.
c_1 <- matrix(nrow = length(x_obs), ncol = length(y_obs))
c_2 <- matrix(nrow = length(x_obs), ncol = length(y_obs))
c_3 <- matrix(nrow = length(x_obs), ncol = length(y_obs))
c_4 <- matrix(nrow = length(x_obs), ncol = length(y_obs))

144

145

146

147

148

149

150

151

152

153

154

155

156

157

158

159

# Hybrid scan: for each candidate s = int*i, combine the squared MSE
# constraint mismatch (f) with a squared MLE score term (g) and plot their
# sum h. Same hoisting/vectorization as the MSE scan: the original
# recomputed the identical kernel weight four times per (j, k) cell.
for (i in L:U) {
  s <- int * i            # candidate value in the scan
  w <- G(-sqrt(d / s))    # kernel weight for every site pair, computed once
  c_1 <- obs[, 1] * d * w # phi_j * d_jk^2 * w (obs[,1] recycles down rows)
  c_2 <- w
  c_3 <- obs[, 1] * w
  c_4 <- d * w
  f[i] <- (sum(c_1) * sum(c_2) - sum(c_4) * sum(c_3))^2  # MSE constraint
  g[i] <- (log(2 * pi * s^2) + sum(d) / 2 / s / ns^2)^2  # MLE score term
  h[i] <- g[i] + f[i]                                    # hybrid objective
}
plot(seq(int * L, int * U, int), h[c(L:U)],
     main = "Constrain Distribution", xlab = "sigma", ylab = "h(x)")

160

161

162

163

164

165

166

167

168

169

170

171

172

173

174

175

# Minimizer read off the hybrid scan plot.
sigma <- 0.43595
#set grid
x_est <- seq(-3, 3, 0.1)
y_est <- seq(-3, 3, 0.1)
# d_est[i, j, s]: squared distance from grid point (i, j) to the advected
# position of observation site s.
d_est <- array(dim = c(length(x_est), length(y_est), length(x_obs)))
#estimation
est <- matrix(nrow = length(x_est), ncol = length(y_est))  # kriged field
err <- matrix(nrow = length(x_est), ncol = length(y_est))  # squared error
rel <- matrix(nrow = length(x_est), ncol = length(y_est))  # relative sq. error
for (i in seq_along(x_est)) {
  for (j in seq_along(y_est)) {
    # Vectorized over sites: the original filled d_est with an inner s loop.
    d_est[i, j, ] <- ((x_est[i] - x_obs) - obs[, 2] * t)^2 +
                     ((y_est[j] - y_obs) - obs[, 3] * t)^2
    # Weighted average of the observed values; weight sum hoisted so the
    # kernel is evaluated once per grid point instead of twice.
    w <- G(-sqrt(d_est[i, j, ] / sigma))
    est[i, j] <- sum(obs[, 1] * w) / sum(w)
    err[i, j] <- (G(x_est[i]) * G(y_est[j]) - est[i, j])^2
    rel[i, j] <- err[i, j] / (G(x_est[i]) * G(y_est[j]))^2
  }
}

3. Results and Discussion

1) The error is greatest in between the middle of the sample points.
2) The relative error is acceptable only within the circle of radius 1, no matter how the sites are selected.
3) Site selection only alters the predictive ability within the r < 1 circle.
4) Inside r < 0.5, prediction ability decreases; this can be improved when the site selection is closer to the origin.