Post-K Research Report Meeting, 2016.10.14
MACHINE LEARNING, AN EMULATOR, AND THE FUTURE OF COSMOLOGICAL SIMULATIONS
BY U-TOKYO GROUP (YOSHIDA, NISHIMICHI, OOGI)
▸ Cosmology with Subaru HSC survey
▸ A large number of cosmological N-body simulations
▸ Parameter space exploration and the Gaussian process
▸ Direct integration of collisionless Boltzmann equation
▸ Future prospects
SUB-C ACTIVITIES
▸ Plenary meetings roughly every 3 months (2016.3, 6, 9); reports from each institution and cross-field collaborations (e.g., BH accretion + semi-analytic methods) are promoted. Considering making the next meeting a domestic workshop
(avoiding the Rironkon meeting, the CfCA Users' Meeting, the Post-K Issue 9 plenary, etc.)
▸ The Vlasov, N-body, plasma, and BH teams are all making steady progress on code development; however, about 1M node-hours on the K computer went unused in the first half of the term
G-G LENSING SIMULATIONS SLIDE MADE BY T. NISHIMICHI
THE SIMULATION DESIGN IN HIGH-DIMENSIONAL SPACE
▸ 6D cosmological parameter space: Latin hypercube (40 sampling points) + Gaussian process
▸ other dependences (time, scale, mass, …): spline, PCA, model fitting (sampling points: 21, 81, 80)
▸ quick predictions to be compared with given data when running an MCMC chain
G-G LENSING SIMULATIONS
EFFICIENT SAMPLING IN MULTI-DIMENSIONAL SPACE: LATIN HYPERCUBE
‣ Each sample is the only one in each axis-aligned hyperplane containing it
‣ Many realizations of such a design exist (e.g., the diagonal design)
‣ Impose an additional condition such as "the sum of the distances to the nearest design point is maximal" (maximin distance)
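As a concrete illustration, a Latin hypercube with a crude maximin refinement takes only a few lines of numpy. This is a sketch of the general technique, not the design code actually used for these runs; the function names and the best-of-many-trials refinement are illustrative.

```python
import numpy as np

def latin_hypercube(n, d, rng):
    """n points in [0,1)^d; each axis-aligned slab of width 1/n holds exactly one point."""
    samples = np.empty((n, d))
    for j in range(d):
        strata = rng.permutation(n)              # which stratum each point occupies
        samples[:, j] = (strata + rng.random(n)) / n
    return samples

def min_pairwise_distance(x):
    """Distance between the closest pair of design points."""
    diff = x[:, None, :] - x[None, :, :]
    d2 = (diff ** 2).sum(axis=-1)
    return np.sqrt(d2[~np.eye(len(x), dtype=bool)].min())

def maximin_lh(n, d, n_trials=200, seed=0):
    """Crude maximin design: keep the random LH whose closest pair is farthest apart."""
    rng = np.random.default_rng(seed)
    return max((latin_hypercube(n, d, rng) for _ in range(n_trials)),
               key=min_pairwise_distance)

design = maximin_lh(40, 6)                       # 40 sampling points in 6D
w = -1.0 + 0.2 * (2 * design[:, 5] - 1)          # e.g. map axis 5 to w = -1 +/- 20%
```

Each unit-cube axis is then rescaled to the prior range of one cosmological parameter, as in the last line.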
[Figure: four example 2D designs (cosmological parameter 1 vs. cosmological parameter 2), illustrating different Latin hypercube realizations.]
G-G LENSING SIMULATIONS
ωb = Ωbh2: ±5%, ωc = Ωch2: ±10%, ΩΛ: ±20%, ln(1010 As): ±20%, ns: ±5%, w: ±20%

varied cosmology
‣ "sliced" LH design (Ba, Brenneman & Myers '15)
‣ generate 100 samples eventually
‣ maximin-distance LH design for every 20 models (e.g., red/blue points)
‣ 2 types of sims:
‣ keep the initial random-number seed (20 done)
‣ independent seeds (40 done)
‣ red: emulator; blue: validation

fiducial model
‣ PLANCK15 flat ΛCDM
‣ 24 realizations done
‣ assess statistical error
‣ check the accuracy of the emulator

84 sims are already available in total
G-G LENSING SIMULATIONS
SIMULATION SPEC
✓ N of particles: 2048^3
✓ box size: 1 h^-1 Gpc → resolves a 10^12 h^-1 Msolar halo with ~100 particles
✓ 2nd-order Lagrangian PT initial conditions at zin = 59 (varied slightly between cosmologies to keep the RMS displacement at about 25% of the inter-particle separation)
✓ Tree-PM force by L-Gadget2 (with a 4096^3 PM mesh)
✓ 21 outputs in 0 ≤ z ≤ 1.5 (equispaced in linear growth factor)
✓ Data compression (256 GB → 48 GB per snapshot)
✓ positions → displacements (16 bits per dimension; accuracy ~1 h^-1 kpc)
✓ velocities: discarded after halo identification
✓ IDs: particles reordered by ID, then IDs discarded
✓ already consuming ~200 TB in half a year (comparable to the observational data volume)
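The position compression above can be sketched as plain fixed-point quantization: store each particle's displacement from its initial lattice site at 16 bits per dimension. The ±32.768 h^-1 Mpc range below is an assumption chosen so that one quantization step is ~1 h^-1 kpc, matching the quoted accuracy; all names and the toy data are illustrative.

```python
import numpy as np

LO, HI = -32.768, 32.768   # assumed displacement range in Mpc/h -> step ~1 kpc/h

def compress(pos, lattice):
    """Quantize displacements from the initial lattice to uint16 (16 bits/dimension)."""
    disp = pos - lattice
    q = np.round((disp - LO) / (HI - LO) * 65535)
    return np.clip(q, 0, 65535).astype(np.uint16)

def decompress(q, lattice):
    """Recover positions from the quantized displacements."""
    return lattice + LO + q.astype(np.float64) / 65535 * (HI - LO)

# toy setup: random "lattice" sites in a 1 h^-1 Gpc box, few-Mpc displacements
rng = np.random.default_rng(0)
lattice = rng.random((1000, 3)) * 1000.0
pos = lattice + rng.normal(scale=5.0, size=(1000, 3))
q = compress(pos, lattice)
err = np.abs(decompress(q, lattice) - pos).max()   # ~0.5 kpc/h quantization error
```

The round-trip error is bounded by half a quantization step, i.e. about 0.5 h^-1 kpc here, consistent with the ~1 h^-1 kpc accuracy quoted on the slide.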
GAUSSIAN PROCESS
‣ A kind of machine learning that interpolates in function space
‣ non-parametric Bayesian inference
‣ quick in high-dimensional input spaces
‣ Basic quantities (cf. the normal distribution):
‣ mean function (cf. mean)
‣ covariance function (cf. variance)
‣ the mean function can be anything; usually set to zero
‣ the covariance function is characterized by a simple function with a few hyperparameters
Excerpt from D. J. C. MacKay, Information Theory, Inference, and Learning Algorithms (Cambridge University Press, 2003), §45.4:

…matrix for the vector $t_{N+1} \equiv (t_1, \ldots, t_{N+1})^T$. We define submatrices of $C_{N+1}$ as follows:

$$C_{N+1} \equiv \begin{bmatrix} C_N & k \\ k^T & \kappa \end{bmatrix}. \qquad (45.35)$$

The posterior distribution (45.34) is given by

$$P(t_{N+1} \mid t_N) \propto \exp\!\left[-\tfrac{1}{2}\begin{bmatrix} t_N & t_{N+1} \end{bmatrix} C_{N+1}^{-1} \begin{bmatrix} t_N \\ t_{N+1} \end{bmatrix}\right]. \qquad (45.36)$$

We can evaluate the mean and standard deviation of the posterior distribution of $t_{N+1}$ by brute-force inversion of $C_{N+1}$. There is a more elegant expression for the predictive distribution, however, which is useful whenever predictions are to be made at a number of new points on the basis of the data set of size $N$. We can write $C_{N+1}^{-1}$ in terms of $C_N$ and $C_N^{-1}$ using the partitioned inverse equations (Barnett, 1979):

$$C_{N+1}^{-1} = \begin{bmatrix} M & \mathbf{m} \\ \mathbf{m}^T & m \end{bmatrix} \qquad (45.37)$$

where

$$m = \left(\kappa - k^T C_N^{-1} k\right)^{-1} \qquad (45.38)$$
$$\mathbf{m} = -m\, C_N^{-1} k \qquad (45.39)$$
$$M = C_N^{-1} + \frac{1}{m}\,\mathbf{m}\mathbf{m}^T. \qquad (45.40)$$

When we substitute this matrix into equation (45.36) we find

$$P(t_{N+1} \mid t_N) = \frac{1}{Z} \exp\!\left[-\frac{(t_{N+1} - \hat{t}_{N+1})^2}{2\sigma^2_{\hat{t}_{N+1}}}\right] \qquad (45.41)$$

where

$$\hat{t}_{N+1} = k^T C_N^{-1} t_N \qquad (45.42)$$
$$\sigma^2_{\hat{t}_{N+1}} = \kappa - k^T C_N^{-1} k. \qquad (45.43)$$

The predictive mean at the new point is given by $\hat{t}_{N+1}$ and $\sigma_{\hat{t}_{N+1}}$ defines the error bars on this prediction. Notice that we do not need to invert $C_{N+1}$ in order to make predictions at $x^{(N+1)}$. Only $C_N$ needs to be inverted. Thus Gaussian processes allow one to implement a model with a number of basis functions $H$ much larger than the number of data points $N$, with the computational requirement being of order $N^3$, independent of $H$. [We'll discuss ways of reducing this cost later.]

The predictions produced by a Gaussian process depend entirely on the covariance matrix $C$. We now discuss the sorts of covariance functions one might choose to define $C$, and how we can automate the selection of the covariance function in response to data.

45.4 Examples of covariance functions

The only constraint on our choice of covariance function is that it must generate a non-negative-definite covariance matrix for any set of points $\{x_n\}_{n=1}^N$.
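In code, the predictive equations (45.42) and (45.43) are a few lines of linear algebra. Below is a minimal numpy sketch for 1D inputs with a squared-exponential covariance and hand-picked hyperparameters; it illustrates the equations, not the tuned emulator, and all names and values are illustrative.

```python
import numpy as np

def sq_exp(x1, x2, ell=0.3, sigma=1.0):
    """Squared-exponential covariance function C(x, x') for 1D inputs."""
    r2 = (x1[:, None] - x2[None, :]) ** 2
    return sigma ** 2 * np.exp(-0.5 * r2 / ell ** 2)

def gp_predict(x_train, t_train, x_new, jitter=1e-8):
    """Predictive mean (45.42) and variance (45.43); only C_N is inverted."""
    C_N = sq_exp(x_train, x_train) + jitter * np.eye(len(x_train))
    k = sq_exp(x_train, x_new)                        # cross-covariances, N x M
    kappa = sq_exp(x_new, x_new).diagonal()           # prior variance at new points
    Cinv_k = np.linalg.solve(C_N, k)                  # C_N^{-1} k
    mean = Cinv_k.T @ t_train                         # t_hat = k^T C_N^{-1} t_N
    var = kappa - np.einsum('ij,ij->j', k, Cinv_k)    # kappa - k^T C_N^{-1} k
    return mean, var

x = np.linspace(0.0, 1.0, 8)
t = np.sin(2 * np.pi * x)
mean, var = gp_predict(x, t, np.array([0.5]))
```

As the excerpt notes, predictions at any number of new points reuse the same factorization of $C_N$; nothing of size $N+1$ is ever inverted.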
Answer, by example:
✓ infer the hyperparameters θ from the training data (x_i, t_i)
✓ given any new input x_{N+1}, infer t_{N+1} from θ and (x_i, t_i)

[Figure: worked example with six training points (x_1, t_1), …, (x_6, t_6).]
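The two-step recipe above maps directly onto scikit-learn, which the deck mentions using (with a planned move to george). A toy sketch: the kernel choice and the stand-in target below are illustrative, in place of real simulation outputs.

```python
import numpy as np
from sklearn.gaussian_process import GaussianProcessRegressor
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(0)
x_train = rng.random((40, 6))                  # 40 design points in a 6D unit cube
t_train = np.sin(2 * np.pi * x_train[:, 0])    # stand-in for a measured statistic

# step 1: infer hyperparameters theta by maximizing the marginal likelihood
kernel = ConstantKernel(1.0) * RBF(length_scale=np.ones(6))
gp = GaussianProcessRegressor(kernel=kernel, normalize_y=True)
gp.fit(x_train, t_train)

# step 2: given any new input, predict t together with an error bar
x_new = rng.random((5, 6))
mean, std = gp.predict(x_new, return_std=True)
```

A per-dimension `length_scale` vector gives one hyperparameter per input axis, so the fit can learn which cosmological parameters the statistic is sensitive to.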
SUMMARY + FUTURE
▸ Modeling the halo mass function and galaxy-galaxy lensing signal
▸ Latin hypercube design + fitting/GP/spline
▸ handy emulator in Python almost ready
▸ accuracy test underway; naively expect 5% accuracy
▸ To come
▸ scikit-learn → george
▸ RSD emulator to combine g-g lensing and 3D clustering (needs bigger volume)
▸ further extensions under discussion, e.g., non-flat and w0-wa cosmologies
G-G LENSING SIMULATIONS
▸ (6++)-D cosmological parameter space
▸ other dependences (time, scale, mass, …)
[Figure: Evolution of cosmological simulations, 1970-2020. Number of mass elements N grows from ~10^3 to >10^13: Peebles, Miyoshi & Kihara; Aarseth, Gott, Turner; Davis, Efstathiou, Frenk, White; Suginohara, Suto, Bouchet, Hernquist; Gelb & Bertschinger; Y. P. Jing; Hubble Volume; Millennium simulation; Horizon Run; Dark Sky. "Numerical" prediction: N > 10^12 now, N ~ 10^15 by 2020.]
DIRECT INTEGRATION OF THE VLASOV-POISSON EQUATIONS
▸ Direct integration of the collisionless Boltzmann (Vlasov) equation coupled with the Poisson equation (Yoshikawa, Yoshida & Umemura 2013)
▸ Semi-Lagrangian advection with a positive-flux-conservation scheme (Filbet et al. 2001) and positivity limiters
▸ Vlasov vs. N-body comparison: 128^6 phase-space simulation, 100 Mpc on a side, hot dark matter and cold dark matter runs; the matter power spectrum P(k) is compared between the two methods
▸ Mass conservation holds by construction of the scheme
▸ SIMD-optimized kernels; performance measured on Intel Xeon and Xeon Phi (Knights Landing) and Skylake processors