
Sparse modeling 2

Shiro Ikeda

The Institute of Statistical Mathematics

3 July 2015


Sparsity and Information Processing

min_x [ ∥y − Ax∥²_ℓ2 + λ∥x∥_ℓ1 ]

▶ Reflection seismology: estimation, model selection

▶ fMRI and EMG: estimation, prediction, model selection

▶ Imaging of black hole: inverse problem, imaging

▶ Movie analysis: sparse matrix, X ≃ L + M (L: low-rank, M: sparse)


▶ Some applications
  ▶ Reflection seismology
  ▶ fMRI and EMG
  ▶ Imaging of black hole
  ▶ Sparse matrix
▶ How to solve the problems
▶ Conclusion


Reflection seismology

min_x [ ∥y − Ax∥²_ℓ2 + λ∥x∥_ℓ1 ]

▶ Investigate the structure underground.

▶ Make a sound with a loud-speaker and record the reflections.

▶ Reflections occur only at the boundaries.

Taylor, Banks, & McCoy (1979). “Deconvolution with the ℓ1 norm,” Geophysics, 44(1), 39-52.

Santosa & Symes (1986). “Linear inversion of band-limited reflection seismograms,” SIAM J. Sci. Stat. Comp., 7(4), 1307-1330.


Reflection seismology

g(t): Sound recorded by the microphone

f(t): Sound from loud-speaker

h(t): Impulse response

[Figure: impulse response along the time axis]

h(t) is non-zero only when there is a reflection. The number of boundaries is not large, so h(t) is sparse.


Reflection seismology

g(t) = ∫₀ᵗ f(t − τ) h(τ) dτ

Discretize as follows:

g = (g₁, · · · , gₙ)ᵀ, gᵢ = g(i∆t)
F = (Fᵢⱼ), Fᵢⱼ = f((i − j + 1)∆t)
h = (h₁, · · · , hₙ)ᵀ, hᵢ = h(i∆t)

then

g(i∆t) = Σⱼ f((i − j + 1)∆t) h(j∆t).


Reflection seismology

g(i∆t) = Σⱼ f((i − j + 1)∆t) h(j∆t), i.e., g = Fh.

h is sparse, and we can use the LASSO framework:

min_h [ ∥g − Fh∥²_ℓ2 + λ∥h∥_ℓ1 ].
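A minimal numerical sketch of this deconvolution (illustrative only; the wavelet f, the grid size, and the regularization weight below are hypothetical choices, and scikit-learn's Lasso minimizes (1/(2n))∥g − Fh∥²_ℓ2 + α∥h∥_ℓ1, so its α corresponds to λ/(2n) above):

```python
import numpy as np
from scipy.linalg import toeplitz
from sklearn.linear_model import Lasso

n, dt = 200, 1.0
t = np.arange(n) * dt

# Hypothetical source waveform f(t): a short decaying wavelet.
f = (1 - 0.5 * t**2) * np.exp(-0.25 * t**2)

# F[i, j] = f((i - j) * dt) (0-based analogue of F_ij = f((i - j + 1)∆t)):
# a lower-triangular Toeplitz matrix, i.e. causal convolution.
F = toeplitz(f, np.zeros(n))

# Ground truth: a few reflections, so h is sparse.
h_true = np.zeros(n)
h_true[[30, 85, 140]] = [1.0, -0.6, 0.4]
g = F @ h_true + 0.01 * np.random.default_rng(0).normal(size=n)

h_hat = Lasso(alpha=1e-3, fit_intercept=False, max_iter=10000).fit(F, g).coef_
print("recovered reflection times:", np.flatnonzero(np.abs(h_hat) > 0.05))
```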


fMRI and EMG

min_x [ ∥y − Ax∥²_ℓ2 + λ∥x∥_ℓ1 ]

Ask the subject to control the sum of the EMG, and record the EMG of agonist and antagonist muscles.

Ganesh, Burdet, Haruno, & Kawato (2008). “Sparse linear regression for reconstructing muscle activity from human cortical fMRI,” NeuroImage, 42(4), 1463-1472.


fMRI and EMG

[Figure: torque corresponding to EMG]


fMRI and EMG

[Figure: voxels of the brain related to the EMG]


fMRI and EMG

[Figure: estimated EMG]


Black hole (Joint work with Mareki Honma at NAOJ)

VLBI: Very Long Baseline Interferometer

▶ The resolution Θ of a telescope depends on the aperture D and the wavelength λ as Θ ∝ λ/D. For example, for λ ≃ 1 mm and D ≃ 8000 km, Θ ≃ 25 µarcsecond.


Interferometer


Target of VLBI


EHT: Event Horizon Telescope


Imaging of black hole

VLBI: Very Long Baseline Interferometer

▶ Our goal is to take an image of a black hole with the interferometer.

▶ A black hole is “black,” but the surrounding area emits light.


Size of black holes


Resolution of a telescope


Interferometer

Basic equation

The relation between the image I(x, y) and the observation S(u, v) at position (u, v) is

I(x, y) = ∫∫ S(u, v) e^{−2πi(ux+vy)} du dv
S(u, v) = ∫∫ I(x, y) e^{2πi(ux+vy)} dx dy.

The observation and the image are related by the Fourier transform.
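On a discrete pixel grid this pair becomes a 2-D DFT; a minimal numpy sketch of the round trip (illustrative only):

```python
import numpy as np

# Discretized version of the pair above: S(u, v) is the 2-D DFT of
# I(x, y), and I(x, y) is recovered by the inverse DFT.
rng = np.random.default_rng(0)
I = rng.random((32, 32))        # toy image
S = np.fft.fft2(I)              # "observation" S(u, v)
I_back = np.fft.ifft2(S).real   # inverse transform recovers the image

assert np.allclose(I, I_back)
```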


Interferometer

Measurements and S(u, v)


Interferometer

Problem

Ideally, we would compute the image I(x, y) by applying the inverse Fourier transform to S(u, v), but the observation is only partially measured, which makes the problem difficult.

[Diagram: S(u, v) ⟷ I(x, y) via the 2-dim Fourier transform]


Interferometer

Problem

Applying LASSO.


Inverse problem

Problem

Simulate a black hole image and compute the observation, where the observed points are 2/5 of all the points. Then reconstruct the image with LASSO.

[Figures: simulated black hole image; image reconstructed with LASSO]

Real observation data will be provided in 2015–2016. We are preparing for the data analysis.
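A toy version of this experiment (a sketch, not the actual analysis pipeline: the image, the sampling pattern, and the regularization weight are made up): observe a random 2/5 of the 2-D Fourier coefficients of a sparse, nonnegative image and reconstruct it with LASSO, stacking real and imaginary parts so the regression stays real-valued.

```python
import numpy as np
from sklearn.linear_model import Lasso

rng = np.random.default_rng(1)
n = 16
I_true = np.zeros((n, n))                 # sparse toy "emission" image
I_true[rng.integers(0, n, 8), rng.integers(0, n, 8)] = 1.0

# 2-D DFT of the flattened image as a matrix (Kronecker product of 1-D DFTs),
# then keep a random 2/5 of its rows: the partially observed S(u, v).
F1 = np.fft.fft(np.eye(n))
A_c = np.kron(F1, F1)[rng.choice(n * n, int(0.4 * n * n), replace=False)]
y_c = A_c @ I_true.ravel()

# Stack real and imaginary parts to keep the LASSO real-valued.
A = np.vstack([A_c.real, A_c.imag])
y = np.concatenate([y_c.real, y_c.imag])

I_hat = Lasso(alpha=1e-3, fit_intercept=False, positive=True,
              max_iter=50000).fit(A, y).coef_.reshape(n, n)
print("recovered pixels:", np.flatnonzero(I_hat.ravel() > 0.5))
```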


Separation of a matrix

M = L + S

min_{L,S} [ ∥L∥_* + λ∥S∥_ℓ1 ] subject to M = L + S.

∥L∥_* = tr(√(LᵀL)) = Σ σ(L) is the nuclear norm, i.e. the sum of the singular values, and ∥S∥_ℓ1 = Σᵢⱼ |Sᵢⱼ|.
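This problem is robust PCA (principal component pursuit); a common solver is ADMM, alternating singular-value thresholding on L with entrywise soft thresholding on S. A minimal sketch (the penalty parameter mu, the choice of lam, and the iteration count are ad hoc choices, not from the slides):

```python
import numpy as np

def rpca_admm(M, lam, mu=1.0, n_iter=200):
    """Minimize ||L||_* + lam * ||S||_l1 subject to L + S = M, by ADMM."""
    soft = lambda X, t: np.sign(X) * np.maximum(np.abs(X) - t, 0.0)
    L = np.zeros_like(M); S = np.zeros_like(M); Y = np.zeros_like(M)
    for _ in range(n_iter):
        # L-step: singular-value thresholding.
        U, sig, Vt = np.linalg.svd(M - S + Y / mu, full_matrices=False)
        L = (U * soft(sig, 1.0 / mu)) @ Vt
        # S-step: entrywise soft thresholding.
        S = soft(M - L + Y / mu, lam / mu)
        # Dual update for the constraint L + S = M.
        Y += mu * (M - L - S)
    return L, S

rng = np.random.default_rng(0)
L0 = rng.normal(size=(40, 5)) @ rng.normal(size=(5, 40))   # low-rank part
S0 = np.zeros((40, 40))
S0[rng.random((40, 40)) < 0.05] = 10.0                     # sparse corruption
L_hat, S_hat = rpca_admm(L0 + S0, lam=1.0 / np.sqrt(40))
print("rank of L_hat:", np.linalg.matrix_rank(L_hat, tol=1e-6))
```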


Application

Candès, Li, Ma, & Wright (2011). “Robust principal component analysis?,” J. ACM, 58(3).


▶ Some applications
▶ How to solve the problems
  ▶ Compressed Sensing: Linear Programming
  ▶ LASSO: LARS
  ▶ LASSO: Quadratic Programming
  ▶ LASSO: Iterative Shrinking Algorithm
▶ Conclusion


Linear Programming

Let u, c ∈ ℜn be n-dimensional real vectors, A ∈ ℜm×n, and b ∈ ℜm.

Standard form

min_u cᵀu subject to Au ≤ b.

▶ A basic problem in optimization theory.

▶ Many numerical packages exist.

▶ Solved by simplex or interior-point methods.

▶ Large-scale problems can be solved efficiently.

Solve compressed sensing with linear programming.


Linear Programming

Compressed Sensing

min_x ∥x∥_ℓ1 subject to Ax = y

Let xⱼ = xⱼ⁺ − xⱼ⁻, xⱼ⁺ ≥ 0, xⱼ⁻ ≥ 0; then

min_{x⁺,x⁻} 1ₙᵀ(x⁺ + x⁻) subject to A(x⁺ − x⁻) = y


Linear Programming

Let γ = (x₁⁺, · · · , xₙ⁺, x₁⁻, · · · , xₙ⁻)ᵀ. For the optimal γ, at least one of xⱼ⁺ and xⱼ⁻ becomes 0 for every j.

min_γ 1₂ₙᵀγ subject to [ A, −A; −A, A ] γ ≤ [ y; −y ], −E₂ₙγ ≤ 0₂ₙ
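A sketch with scipy.optimize.linprog, using its equality-constraint interface directly instead of the stacked inequalities above (linprog's default bounds already enforce γ ≥ 0; the data are synthetic):

```python
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(0)
m, n = 40, 100
A = rng.normal(size=(m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, 5, replace=False)] = rng.normal(size=5)
y = A @ x_true

# min 1'(x+ + x-)  subject to  A(x+ - x-) = y,  x+, x- >= 0.
res = linprog(c=np.ones(2 * n), A_eq=np.hstack([A, -A]), b_eq=y,
              bounds=(0, None), method="highs")
x_hat = res.x[:n] - res.x[n:]
print("max error:", np.max(np.abs(x_hat - x_true)))
```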


Solving LASSO with LARS

LASSO problem

min_x [ ∥y − Ax∥²_ℓ2 + λ∥x∥_ℓ1 ], λ > 0

LARS-LASSO

Solve the above problem for many λ’s.

Efron, Hastie, Johnstone, & Tibshirani (2004). “Least angle regression,” The Annals of Statistics, 32(2), 407-499.

Rosset & Zhu (2007). “Piecewise linear regularized solution paths,” The Annals of Statistics, 35(3), 1012-1030.


Solving LASSO with LARS

LASSO

Let xⱼ = xⱼ⁺ − xⱼ⁻, xⱼ⁺ ≥ 0, xⱼ⁻ ≥ 0, and

min_{x⁺,x⁻} [ ∥y − A(x⁺ − x⁻)∥²_ℓ2 + λ Σⱼ₌₁ⁿ (xⱼ⁺ + xⱼ⁻) ]
subject to xⱼ⁺, xⱼ⁻ ≥ 0, j = 1, · · · , n.

Lagrange function

min_{x⁺,x⁻} [ L(x) + λ Σⱼ (xⱼ⁺ + xⱼ⁻) − Σⱼ λⱼ⁺xⱼ⁺ − Σⱼ λⱼ⁻xⱼ⁻ ]

where L(x) = ∥y − A(x⁺ − x⁻)∥²_ℓ2. Let x(λ) be the optimal x for a given λ.


Solving LASSO with LARS

Karush-Kuhn-Tucker Condition

∂xⱼL(x(λ)) + λ − λⱼ⁺ = 0
−∂xⱼL(x(λ)) + λ − λⱼ⁻ = 0
λⱼ⁺xⱼ⁺ = 0, λⱼ⁺ ≥ 0
λⱼ⁻xⱼ⁻ = 0, λⱼ⁻ ≥ 0, j = 1, · · · , n.

Study the above conditions, where

∂xⱼL(x) = ∂L(x)/∂xⱼ = −2(y − Ax)ᵀaⱼ

and aⱼ = (a₁ⱼ, · · · , a_mⱼ)ᵀ is the j-th column vector of A.


Solving LASSO with LARS

The optimal xⱼ(λ) falls into one of the following cases (λ > 0):

(i) xⱼ⁺ = 0, xⱼ⁻ = 0: |∂xⱼL(x)| ≤ λ.

(ii) xⱼ⁺ > 0, xⱼ⁻ = 0: ∂xⱼL(x) = −λ (λⱼ⁺ = 0, λⱼ⁻ > 0).

(iii) xⱼ⁻ > 0, xⱼ⁺ = 0: ∂xⱼL(x) = λ (λⱼ⁺ > 0, λⱼ⁻ = 0).


Solving LASSO with LARS

Depending on λ, the set of “active” components, i.e. the non-zero xⱼ, changes. Define this set as A = {j : xⱼ(λ) ≠ 0}.

A member of the active set:

j ∈ A ⇒ ∂xⱼL(x(λ)) = −sgn(xⱼ(λ))λ

A component which is not active:

j ∉ A ⇒ |∂xⱼL(x(λ))| ≤ λ


Solving LASSO with LARS

Let x_A be the vector of the active components, j ∈ A. From the fact that ∂x_A∂x_A L(x(λ)) = 2A_AᵀA_A, we can derive the following relation:

∂x_A∂x_A L(x(λ)) · ∂x_A(λ)/∂λ = −sgn(x_A(λ))

∂x_A(λ)/∂λ = −( ∂x_A∂x_A L(x(λ)) )⁻¹ sgn(x_A(λ)) = −(1/2) (A_AᵀA_A)⁻¹ sgn(x_A(λ))

While A stays fixed, this derivative is constant, so the solution path x(λ) is piecewise linear in λ.


Solving LASSO with LARS

Algorithm

1. x = 0, A = {argmaxⱼ |yᵀaⱼ|}, d_A = −sgn(−yᵀa_A), d_{Aᶜ} = 0.
2. If max |∂xⱼL(x)| > 0, do the following:
   i. s₁ = min{s > 0 : |∂xⱼL(x + sd)| = |∂x_A L(x + sd)|, j ∉ A},
      s₂ = min{s > 0 : (x + sd)ⱼ = 0, j ∈ A},
      s = min(s₁, s₂).
   ii. x ← x + sd.
      If s = s₁, add the j which achieves s₁ to A.
      If s = s₂, remove the j which achieves s₂ from A.
   iii. Update d_A = −(A_AᵀA_A)⁻¹ sgn(∂x_A L(x)), d_{Aᶜ} = 0.
   iv. Go to step 2.
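In practice the path is rarely implemented by hand; scikit-learn's lars_path computes the whole piecewise-linear LASSO path. A sketch (note sklearn scales the squared error by 1/(2n), so the returned alphas correspond to λ/(2n) in the notation above):

```python
import numpy as np
from sklearn.linear_model import lars_path

rng = np.random.default_rng(0)
m, n = 50, 20
A = rng.normal(size=(m, n))
x_true = np.zeros(n); x_true[:3] = [3.0, -2.0, 1.5]
y = A @ x_true + 0.1 * rng.normal(size=m)

# alphas are the breakpoints where the active set changes;
# coefs[:, k] is the LASSO solution at alphas[k].
alphas, active, coefs = lars_path(A, y, method="lasso")
print("order in which variables became active:", active)
print("solution at the smallest alpha:", np.round(coefs[:, -1], 2))
```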


An example of LARS

[Figure: LARS coefficient paths, each βⱼ plotted against Σ|βⱼ|. (Σ|βⱼ| is equivalent to ∥x∥_ℓ1.)]


An example of LARS

[Figure: Σ|βⱼ| plotted against log λ. (Σ|βⱼ| is equivalent to ∥x∥_ℓ1.)]


Quadratic Programming (QP)

Let u, f ∈ ℜn be n-dimensional real vectors, H ∈ ℜn×n be a positive semidefinite matrix, A ∈ ℜm×n, and b ∈ ℜm.

Standard form

min_u (1/2)uᵀHu + fᵀu subject to Au ≤ b.

▶ A basic problem in optimization theory.

▶ Many numerical packages exist.

▶ Large-scale problems can be solved.

Solve LASSO with quadratic programming.


Solving LASSO with Quadratic Programming

LASSO

min_x [ ∥y − Ax∥²_ℓ2 + λ∥x∥_ℓ1 ]
= min_x [ (y − Ax)ᵀ(y − Ax) + λ Σⱼ |xⱼ| ]
= yᵀy + min_x [ xᵀAᵀAx − 2yᵀAx + λ Σⱼ |xⱼ| ]


Solving LASSO with Quadratic Programming

Let xⱼ = xⱼ⁺ − xⱼ⁻, xⱼ⁺ ≥ 0, xⱼ⁻ ≥ 0.

LASSO

min_{x⁺,x⁻} [ (x⁺ − x⁻)ᵀAᵀA(x⁺ − x⁻) − 2yᵀA(x⁺ − x⁻) + λ1ₙᵀ(x⁺ + x⁻) ]
subject to xⱼ⁺, xⱼ⁻ ≥ 0, j = 1, · · · , n.

At the optimum, one of xⱼ⁺ and xⱼ⁻ becomes 0 because of the last term of the above formulation, so xⱼ = xⱼ⁺ − xⱼ⁻ is satisfied.


Solving LASSO with Quadratic Programming

Let γ = (x₁⁺, · · · , xₙ⁺, x₁⁻, · · · , xₙ⁻)ᵀ.

LASSO

min_γ [ γᵀ [ AᵀA, −AᵀA; −AᵀA, AᵀA ] γ + ( −2Aᵀy + λ1ₙ; 2Aᵀy + λ1ₙ )ᵀ γ ]
subject to −E₂ₙγ ≤ 0₂ₙ.
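Since the only constraints are γ ≥ 0, this is a bound-constrained QP; a minimal sketch using scipy's L-BFGS-B as a stand-in for a dedicated QP package (the data and λ are synthetic):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(0)
m, n, lam = 50, 20, 1.0
A = rng.normal(size=(m, n))
x_true = np.zeros(n); x_true[:3] = [3.0, -2.0, 1.5]
y = A @ x_true + 0.1 * rng.normal(size=m)

# The QP data in gamma = (x+, x-): H from A'A blocks, f from -2A'y and lambda.
AtA, Aty = A.T @ A, A.T @ y
H = np.block([[AtA, -AtA], [-AtA, AtA]])
f = np.concatenate([-2 * Aty + lam, 2 * Aty + lam])

res = minimize(lambda g: g @ H @ g + f @ g, np.zeros(2 * n),
               jac=lambda g: 2 * H @ g + f,
               method="L-BFGS-B", bounds=[(0, None)] * (2 * n))
x_hat = res.x[:n] - res.x[n:]
print("estimate of the first coefficients:", np.round(x_hat[:5], 2))
```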


LASSO: Iterative Shrinking Algorithm

Quadratic programming cannot solve very large problems. We need a different approach.

Surrogate function

f(x) = ∥y − Ax∥²_ℓ2 + λ∥x∥_ℓ1
d(x, x₀) = c∥x − x₀∥²_ℓ2 − ∥Ax − Ax₀∥²_ℓ2

Set c larger than the largest eigenvalue of AᵀA; then ∂²d(x, x₀)/∂x∂x is a positive definite matrix.

New cost function

Q(x, x₀) = f(x) + d(x, x₀)
x̂ = argmin_x Q(x, x₀)


LASSO: Iterative Shrinking Algorithm

Q(x, x₀) = Const − 2xᵀ[Aᵀ(y − Ax₀) + cx₀] + λ∥x∥_ℓ1 + c∥x∥²_ℓ2

Minimize Q(x, x₀) with respect to x. Let v = (1/c)Aᵀ(y − Ax₀) + x₀, and

x̂ = argmin_x Q(x, x₀) = argmin_x [ −2c xᵀv + λ∥x∥_ℓ1 + c∥x∥²_ℓ2 ]
= argmin_x [ λ∥x∥_ℓ1 + c∥x − v∥²_ℓ2 ]
= argmin_x [ (λ/c)∥x∥_ℓ1 + ∥x − v∥²_ℓ2 ]


LASSO: Iterative Shrinking Algorithm

x̂ = argmin_x [ (λ/c) Σᵢ |xᵢ| + Σᵢ (xᵢ − vᵢ)² ]

Each component becomes independent:

x̂ᵢ = argmin_{xᵢ} [ (λ/c)|xᵢ| + (xᵢ − vᵢ)² ]


LASSO: Iterative Shrinking Algorithm

Define Sω(v) as follows:

Sω(v) = 0 if |v| < ω/2, and Sω(v) = v − sgn(v)·ω/2 if |v| ≥ ω/2.

Then

x̂ᵢ = S_{λ/c}(vᵢ),

so x̂ = argmin_x Q(x, x₀) can be solved efficiently.
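Sω is the soft-thresholding operator; a one-line numpy version (the threshold is ω/2 because the quadratic term here is ∥x − v∥²_ℓ2 rather than (1/2)∥x − v∥²_ℓ2):

```python
import numpy as np

def soft_threshold(v, omega):
    """S_omega(v): zero inside [-omega/2, omega/2], shrink toward 0 outside."""
    return np.sign(v) * np.maximum(np.abs(v) - omega / 2.0, 0.0)

print(soft_threshold(np.array([-1.0, -0.2, 0.1, 0.8]), omega=1.0))
# [-0.5 -0.  0.  0.3]
```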


LASSO: Iterative Shrinking Algorithm

Consider the following update rule:

x_{k+1} = argmin_x Q(x, x_k) = S_{λ/c}( (1/c)Aᵀ(y − Ax_k) + x_k )

Because Q(x, x_k) = f(x) + d(x, x_k) and d(x, x_k) ≥ 0,

f(x_{k+1}) ≤ Q(x_{k+1}, x_k) = min_x Q(x, x_k) ≤ Q(x_k, x_k) = f(x_k).

Therefore, by repeating the update, f(x) is minimized with respect to x.

Daubechies, Defrise, & De Mol (2004). “An iterative thresholding algorithm for linear inverse problems with a sparsity constraint,” Commun. on Pure and Applied Math., LVII, 1413-1457.
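Putting the pieces together gives the full iterative shrinking loop; a minimal sketch using the soft_threshold above (the step constant, λ, and the iteration count are illustrative choices):

```python
import numpy as np

def soft_threshold(v, omega):
    return np.sign(v) * np.maximum(np.abs(v) - omega / 2.0, 0.0)

def ista(A, y, lam, n_iter=500):
    """Minimize ||y - Ax||^2 + lam * ||x||_l1 by iterative shrinking."""
    c = 1.01 * np.linalg.eigvalsh(A.T @ A).max()   # c > largest eigenvalue of A'A
    x = np.zeros(A.shape[1])
    for _ in range(n_iter):
        x = soft_threshold(A.T @ (y - A @ x) / c + x, lam / c)
    return x

rng = np.random.default_rng(0)
A = rng.normal(size=(50, 100))
x_true = np.zeros(100); x_true[[3, 30, 77]] = [2.0, -1.5, 1.0]
y = A @ x_true + 0.05 * rng.normal(size=50)
x_hat = ista(A, y, lam=0.5)
print("recovered support:", np.flatnonzero(np.abs(x_hat) > 0.1))
```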


Conclusion

Solving Compressed Sensing and LASSO

▶ Introduced strategies for solving Compressed Sensing and LASSO.

▶ Compressed sensing can be solved with Linear Programming.

▶ LASSO can be solved with LARS, Quadratic Programming, or the Iterative Shrinking Algorithm, depending on the size of the problem.


Report

Questions

▶ Explain what sparse modeling is.

▶ Explain an example of a sparse modeling application.

▶ Propose a new application where sparse modeling is effective.
