Post-K Research Report Meeting, 2016.10.14
MACHINE LEARNING, AN EMULATOR, AND THE FUTURE OF COSMOLOGICAL SIMULATIONS
BY U-TOKYO GROUP (YOSHIDA, NISHIMICHI, OOGI)
▸ Cosmology with Subaru HSC survey
▸ A large number of cosmological N-body simulations
▸ Parameter space exploration and the Gaussian process
▸ Direct integration of collisionless Boltzmann equation
▸ Future prospects
SUB-C ACTIVITIES
▸ Plenary meetings roughly every 3 months (2016.3, 6, 9); reports from each institution and cross-field collaborations (e.g., BH accretion + semi-analytic methods) are promoted. Considering making the next meeting a domestic workshop
(avoiding the Rironkon meeting, the CfCA Users' Meeting, the Post-K Issue 9 plenary, etc.)
▸ The Vlasov, N-body, plasma, and BH teams are all making steady progress on code development; however, about 1M node-hours on the K computer went unused in the first half of the term
G-G LENSING SIMULATIONS SLIDE MADE BY T. NISHIMICHI
THE SIMULATION DESIGN IN HIGH-DIMENSIONAL SPACE
▸ 6D cosmological parameter space: Latin hypercube (40 sampling points) + Gaussian process
▸ other dependences (time, scale, mass, …): spline, PCA, model fitting (sampling points: 21, 81, 80)
▸ quick predictions to be compared with given data when running an MCMC chain
G-G LENSING SIMULATIONS
EFFICIENT SAMPLING IN MULTI-DIMENSIONAL SPACE: LATIN HYPERCUBE
‣ Each sample is the only one in each axis-aligned hyperplane containing it
‣ Many realizations of such a design exist (e.g., the diagonal design)
‣ Impose an additional condition such as "the sum of the distances to the nearest design point is maximal" (maximin distance)
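As a concrete illustration, a Latin hypercube with a crude maximin refinement takes only a few lines of numpy. This is a sketch of the general technique, not the design code actually used for these runs; the function names and the best-of-many-trials refinement are illustrative.

```python
import numpy as np

def latin_hypercube(n, d, rng):
    """n points in [0,1)^d; each axis-aligned slab of width 1/n holds exactly one point."""
    samples = np.empty((n, d))
    for j in range(d):
        strata = rng.permutation(n)              # which stratum each point occupies
        samples[:, j] = (strata + rng.random(n)) / n
    return samples

def min_pairwise_distance(x):
    """Distance between the closest pair of design points."""
    diff = x[:, None, :] - x[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    return np.sqrt(d2[~np.eye(len(x), dtype=bool)].min())

def maximin_lh(n, d, n_trials=200, seed=0):
    """Crude maximin design: keep the random LH whose closest pair is farthest apart."""
    rng = np.random.default_rng(seed)
    return max((latin_hypercube(n, d, rng) for _ in range(n_trials)),
               key=min_pairwise_distance)

design = maximin_lh(40, 6)                       # 40 sampling points in 6D
w = -1.0 + 0.2 * (2 * design[:, 5] - 1)          # e.g. map axis 5 to w = -1 +/- 20%
```

Each unit-cube axis is then rescaled to the prior range of one cosmological parameter, as in the last line.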
[Figure: four example 2D designs (cosmological parameter 1 vs. cosmological parameter 2), illustrating different Latin hypercube realizations.]
G-G LENSING SIMULATIONS
ωb = Ωbh2: ±5%, ωc = Ωch2: ±10%, ΩΛ: ±20%, ln(1010 As): ±20%, ns: ±5%, w: ±20%

varied cosmology
‣ "sliced" LH design (Ba, Brenneman & Myers '15)
‣ generate 100 samples eventually
‣ maximin-distance LH design for every 20 models (e.g., red/blue points)
‣ 2 types of sims:
‣ keep the initial random-number seed (20 done)
‣ independent seeds (40 done)
‣ red: emulator; blue: validation

fiducial model
‣ PLANCK15 flat ΛCDM
‣ 24 realizations done
‣ assess statistical error
‣ check the accuracy of the emulator

84 sims are already available in total
G-G LENSING SIMULATIONS
SIMULATION SPEC
✓ N of particles: 2048^3
✓ box size: 1 h^-1 Gpc → resolves a 10^12 h^-1 Msolar halo with ~100 particles
✓ 2nd-order Lagrangian PT initial conditions at zin = 59 (varied slightly between cosmologies to keep the RMS displacement at about 25% of the inter-particle separation)
✓ Tree-PM force by L-Gadget2 (with a 4096^3 PM mesh)
✓ 21 outputs in 0 ≤ z ≤ 1.5 (equispaced in linear growth factor)
✓ Data compression (256 GB → 48 GB per snapshot)
✓ positions → displacements (16 bits per dimension; accuracy ~1 h^-1 kpc)
✓ velocities: discarded after halo identification
✓ IDs: particles reordered by ID, then IDs discarded
✓ already consuming ~200 TB in half a year (comparable to the observational data volume)
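The position compression above can be sketched as plain fixed-point quantization: store each particle's displacement from its initial lattice site at 16 bits per dimension. The ±32.768 h^-1 Mpc range below is an assumption chosen so that one quantization step is ~1 h^-1 kpc, matching the quoted accuracy; all names and the toy data are illustrative.

```python
import numpy as np

LO, HI = -32.768, 32.768   # assumed displacement range in Mpc/h -> step ~1 kpc/h

def compress(pos, lattice):
    """Quantize displacements from the initial lattice to uint16 (16 bits/dimension)."""
    disp = pos - lattice
    q = np.round((disp - LO) / (HI - LO) * 65535)
    return np.clip(q, 0, 65535).astype(np.uint16)

def decompress(q, lattice):
    """Recover positions from the quantized displacements."""
    return lattice + LO + q.astype(np.float64) / 65535 * (HI - LO)

# toy setup: random "lattice" sites in a 1 h^-1 Gpc box, few-Mpc displacements
rng = np.random.default_rng(0)
lattice = rng.random((1000, 3)) * 1000.0
pos = lattice + rng.normal(scale=5.0, size=(1000, 3))
q = compress(pos, lattice)
err = np.abs(decompress(q, lattice) - pos).max()   # ~0.5 kpc/h quantization error
```

The round-trip error is bounded by half a quantization step, i.e. about 0.5 h^-1 kpc here, consistent with the ~1 h^-1 kpc accuracy quoted on the slide.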
GAUSSIAN PROCESS
‣ A kind of machine learning that interpolates in function space
‣ non-parametric Bayesian inference
‣ quick in high-dimensional input spaces
‣ Basic quantities (cf. the normal distribution):
‣ mean function (cf. mean)
‣ covariance function (cf. variance)
‣ the mean function can be anything; usually set to zero
‣ the covariance function is characterized by a simple function with a few hyperparameters
Excerpt from D. J. C. MacKay, Information Theory, Inference, and Learning Algorithms (Cambridge University Press, 2003), §45.4:

…matrix for the vector $t_{N+1} \equiv (t_1, \ldots, t_{N+1})^T$. We define submatrices of $C_{N+1}$ as follows:

$$C_{N+1} \equiv \begin{bmatrix} C_N & k \\ k^T & \kappa \end{bmatrix}. \qquad (45.35)$$

The posterior distribution (45.34) is given by

$$P(t_{N+1} \mid t_N) \propto \exp\!\left[-\tfrac{1}{2}\begin{bmatrix} t_N & t_{N+1} \end{bmatrix} C_{N+1}^{-1} \begin{bmatrix} t_N \\ t_{N+1} \end{bmatrix}\right]. \qquad (45.36)$$

We can evaluate the mean and standard deviation of the posterior distribution of $t_{N+1}$ by brute-force inversion of $C_{N+1}$. There is a more elegant expression for the predictive distribution, however, which is useful whenever predictions are to be made at a number of new points on the basis of the data set of size $N$. We can write $C_{N+1}^{-1}$ in terms of $C_N$ and $C_N^{-1}$ using the partitioned inverse equations (Barnett, 1979):

$$C_{N+1}^{-1} = \begin{bmatrix} M & \mathbf{m} \\ \mathbf{m}^T & m \end{bmatrix} \qquad (45.37)$$

where

$$m = \left(\kappa - k^T C_N^{-1} k\right)^{-1} \qquad (45.38)$$
$$\mathbf{m} = -m\, C_N^{-1} k \qquad (45.39)$$
$$M = C_N^{-1} + \frac{1}{m}\,\mathbf{m}\mathbf{m}^T. \qquad (45.40)$$

When we substitute this matrix into equation (45.36) we find

$$P(t_{N+1} \mid t_N) = \frac{1}{Z} \exp\!\left[-\frac{(t_{N+1} - \hat{t}_{N+1})^2}{2\sigma^2_{\hat{t}_{N+1}}}\right] \qquad (45.41)$$

where

$$\hat{t}_{N+1} = k^T C_N^{-1} t_N \qquad (45.42)$$
$$\sigma^2_{\hat{t}_{N+1}} = \kappa - k^T C_N^{-1} k. \qquad (45.43)$$

The predictive mean at the new point is given by $\hat{t}_{N+1}$ and $\sigma_{\hat{t}_{N+1}}$ defines the error bars on this prediction. Notice that we do not need to invert $C_{N+1}$ in order to make predictions at $x^{(N+1)}$. Only $C_N$ needs to be inverted. Thus Gaussian processes allow one to implement a model with a number of basis functions $H$ much larger than the number of data points $N$, with the computational requirement being of order $N^3$, independent of $H$. [We'll discuss ways of reducing this cost later.]

The predictions produced by a Gaussian process depend entirely on the covariance matrix $C$. We now discuss the sorts of covariance functions one might choose to define $C$, and how we can automate the selection of the covariance function in response to data.

45.4 Examples of covariance functions

The only constraint on our choice of covariance function is that it must generate a non-negative-definite covariance matrix for any set of points $\{x_n\}_{n=1}^N$.
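In code, the predictive equations (45.42) and (45.43) are a few lines of linear algebra. Below is a minimal numpy sketch for 1D inputs with a squared-exponential covariance and hand-picked hyperparameters; it illustrates the equations, not the tuned emulator, and all names and values are illustrative.

```python
import numpy as np

def sq_exp(x1, x2, ell=0.3, sigma=1.0):
    """Squared-exponential covariance function C(x, x') for 1D inputs."""
    r2 = (x1[:, None] - x2[None, :]) ** 2
    return sigma ** 2 * np.exp(-0.5 * r2 / ell ** 2)

def gp_predict(x_train, t_train, x_new, jitter=1e-8):
    """Predictive mean (45.42) and variance (45.43); only C_N is inverted."""
    C_N = sq_exp(x_train, x_train) + jitter * np.eye(len(x_train))
    k = sq_exp(x_train, x_new)                        # cross-covariances, N x M
    kappa = sq_exp(x_new, x_new).diagonal()           # prior variance at new points
    Cinv_k = np.linalg.solve(C_N, k)                  # C_N^{-1} k
    mean = Cinv_k.T @ t_train                         # t_hat = k^T C_N^{-1} t_N
    var = kappa - np.einsum('ij,ij->j', k, Cinv_k)    # kappa - k^T C_N^{-1} k
    return mean, var

x = np.linspace(0.0, 1.0, 8)
t = np.sin(2 * np.pi * x)
mean, var = gp_predict(x, t, np.array([0.5]))
```

As the excerpt notes, predictions at any number of new points reuse the same factorization of $C_N$; nothing of size $N+1$ is ever inverted.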
Answer, by example:
✓ infer the hyperparameters θ from the training data (x_i, t_i)
✓ given any new input x_{N+1}, infer t_{N+1} from θ and (x_i, t_i)

[Figure: worked example with six training points (x_1, t_1), …, (x_6, t_6).]
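The two-step recipe above maps directly onto scikit-learn, which the deck mentions using (with a planned move to george). A toy sketch: the kernel choice and the stand-in target below are illustrative, in place of real simulation outputs.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
x_train = rng.random((40, 6))                  # 40 design points in a 6D unit cube
t_train = np.sin(2 * np.pi * x_train[:, 0])    # stand-in for a measured statistic

# step 1: infer hyperparameters theta by maximizing the marginal likelihood
kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(6))
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(x_train, t_train)

# step 2: given any new input, predict t together with an error bar
x_new = rng.random((5, 6))
mean, std = gp.predict(x_new, return_std=True)
```

A per-dimension `length_scale` vector gives one hyperparameter per input axis, so the fit can learn which cosmological parameters the statistic is sensitive to.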
SUMMARY + FUTURE
▸ Modeling the halo mass function and galaxy-galaxy lensing signal
▸ Latin hypercube design + fitting/GP/spline
▸ handy emulator in Python almost ready
▸ accuracy test underway; naively expect 5% accuracy
▸ To come
▸ scikit-learn → george
▸ RSD emulator to combine g-g lensing and 3D clustering (needs bigger volume)
▸ further extensions under discussion, e.g., non-flat and w0-wa cosmologies
G-G LENSING SIMULATIONS
▸ (6++)-D cosmological parameter space
▸ other dependences (time, scale, mass, …)
[Figure: Evolution of cosmological simulations, 1970-2020. Number of mass elements N grows from ~10^3 to >10^13: Peebles, Miyoshi & Kihara; Aarseth, Gott, Turner; Davis, Efstathiou, Frenk, White; Suginohara, Suto, Bouchet, Hernquist; Gelb & Bertschinger; Y. P. Jing; Hubble Volume; Millennium simulation; Horizon Run; Dark Sky. "Numerical" prediction: N > 10^12 now, N ~ 10^15 by 2020.]
DIRECT INTEGRATION OF THE VLASOV-POISSON EQUATIONS
▸ Direct integration of the collisionless Boltzmann (Vlasov) equation coupled with the Poisson equation (Yoshikawa, Yoshida & Umemura 2013)
▸ Semi-Lagrangian advection with a positive-flux-conservation scheme (Filbet et al. 2001) and positivity limiters
▸ Vlasov vs. N-body comparison: 128^6 phase-space simulation, 100 Mpc on a side, hot dark matter and cold dark matter runs; the matter power spectrum P(k) is compared between the two methods
▸ Mass conservation holds by construction of the scheme
▸ SIMD-optimized kernels; performance measured on Intel Xeon and Xeon Phi (Knights Landing) and Skylake processors