IMAGE RESTORATION WITH 2-D NON-SEPARABLE OVERSAMPLED LAPPED TRANSFORMS
Shogo MURAMATSU∗
Niigata University, Dept. of Electrical and Electronic Eng.
8050 2-no-cho Ikarashi, Nishi-ku, Niigata 950-2181, JAPAN
Natsuki AIZAWA
Niigata University, Graduate School of Science & Technology
8050 2-no-cho Ikarashi, Nishi-ku, Niigata 950-2181, [email protected]
ABSTRACT
This work proposes to apply a two-dimensional (2-D) non-separable
oversampled lapped transform (NSOLT) to image restoration.
NSOLT is a lattice-structure-based redundant transform that satisfies
the symmetric, real-valued and compact-support properties. The
lattice structure can constitute a Parseval frame with rational
redundancy and produce a dictionary with directional atoms. In this
study, the performance for deblurring, super-resolution and inpainting
is evaluated. The iterative-shrinkage/thresholding algorithm
(ISTA) is adopted to show the significance of NSOLT in image
restoration applications. It is verified that the six-level NSOLT with
redundancy less than two yields superior or comparable restoration
performance to the two-level non-subsampled Haar transform of
redundancy seven in terms of both PSNR and SSIM.
Index Terms— Non-separable oversampled lapped transform,
Deblurring, Super-resolution, Inpainting, ISTA
1. INTRODUCTION
Fig. 1 shows a parallel structure of $P$-channel 2-D non-separable
filter banks. The system consists of an analysis bank and a synthesis bank,
where $\mathbf{z} \in \mathbb{C}^2$ denotes the $2 \times 1$ complex variable vector $(z_y, z_x)^T$
in the 2-D $z$-transform domain, and $H_p(\mathbf{z})$ and $F_p(\mathbf{z})$ are the transfer
functions of the $p$-th analysis and synthesis filters, respectively. The
symbols $\downarrow\mathbf{M}_p$ and $\uparrow\mathbf{M}_p$ denote the downsampler and upsampler with
factor $\mathbf{M}_p \in \mathbb{Z}^{2 \times 2}$, respectively [1, 2]. The sampling ratio of the
$p$-th channel, $M_p$, is given by $M_p = |\det(\mathbf{M}_p)|$. The sum of the
reciprocals of $\{M_p\}_{p=0}^{P-1}$, i.e. $R = \sum_{p=0}^{P-1} M_p^{-1}$, is referred to as the
redundancy. When $R > 1$, the system is oversampled (OS).
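As a small illustration of the redundancy definition above, the following Python sketch computes $R$ from a list of 2 × 2 integer sampling matrices. The helper names `det2` and `redundancy` are introduced here for illustration only, and the seven-channel example with $\mathbf{M}_p = \mathrm{diag}(2, 2)$ is an assumed configuration (it matches the design example discussed later in the paper).

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2x2 integer matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def redundancy(sampling_matrices):
    """R = sum over channels of 1/|det(M_p)|, kept exact with Fraction."""
    return sum(Fraction(1, abs(det2(m))) for m in sampling_matrices)


# Assumed example: P = 7 channels, each decimated by diag(2, 2).
channels = [[[2, 0], [0, 2]]] * 7
R = redundancy(channels)
print(R, R > 1)  # R exceeds 1, so this configuration is oversampled
```

With four channels instead of seven, the same function returns exactly 1, which corresponds to the critically-sampled case.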
OS filter banks are closely related to the frame theory of vector
spaces [3–5]. With this rigorous theory and the recent development of
optimization techniques, these systems have found many image processing
applications such as denoising, deblurring, super-resolution
and inpainting, as well as compressive sensing [6–10]. In most of
these applications, filter banks are used to sparsely represent a given
image as a linear combination of image prototypes (atoms). Thus,
the selection of atoms is an important task, since they determine
the model of the given images.
Two simple ways to construct OS filter banks are known. One
is a mixture construction of multiple critically-sampled (CS) systems
[6, 11–13], and the other is a non-subsampled or shift-invariant
construction, which is realized by removing the downsamplers and
upsamplers from a CS system. These two approaches, however, can
hardly exploit the design freedom of OS systems. In addition,
the redundancy R is restricted to integers, and the latter
construction in particular tends to have large redundancy. The larger
the redundancy, the farther the system is from practical applications,
especially embedded vision systems, where computational resources
are severely restricted and special attention must be paid to
memory access. The contourlet transform proposed by Do et al. can have
rational redundancy as well as the directional, linear-phase (LP) and
FIR properties [14]. However, due to its hierarchical tree structure
with CS systems, it faces almost the same restriction.
∗This work was supported by JSPS KAKENHI (23560443).
Fig. 1. Parallel structure of a P-channel filter bank.
We have recently proposed a novel filter bank construction to
obtain 2-D non-separable OS LP perfect reconstruction (PR) FIR
systems [15]. The system is regarded as an extension of the 1-D OS
LP PR filter bank to the 2-D non-separable case [16], and also as a
generalization of the non-separable CS LP PR filter bank to the OS
case [2, 11, 17]. We refer to the proposed filter bank as a
non-separable oversampled lapped transform (NSOLT). One of the
features of NSOLT is its high degree of freedom in the redundancy ratio.
In this work, we propose to apply NSOLT to image restoration
in order to demonstrate the merit of the new OS system. To
show its significance, simulation results of deblurring,
super-resolution and inpainting are shown and compared with the
performance of the non-subsampled Haar wavelet transform [6, 7, 9, 18].
2. IMAGE RESTORATION WITH ISTA
In this section, we review a formulation of the image restoration
problem, and then introduce the iterative-shrinkage/thresholding
algorithm (ISTA) as a solver for the problem [8, 19, 20].
978-1-4799-2341-0/13/$31.00 ©2013 IEEE ICIP 2013
Fig. 2. Framework of problem setting
2.1. Problem Formulation
Fig. 2 shows the framework of our problem setting. Let $\mathbf{x} \in \mathbb{R}^N$ be
an observed image represented by
$$\mathbf{x} = \mathbf{P}\mathbf{u}^\ast + \mathbf{w},$$
where $\mathbf{u}^\ast \in \mathbb{R}^M$ ($M \geq N$) is an unknown original image,
$\mathbf{P} \in \mathbb{R}^{N \times M}$ is a linear discrete operator which represents degradation
and pixel loss through the measurement process, and $\mathbf{w} \in \mathbb{R}^N$ is
measurement noise modeled as zero-mean additive white Gaussian
noise (AWGN) [8, 18–20].
Image restoration is the problem of finding a good candidate image
$\mathbf{u} \in \mathbb{R}^M$ of the unknown high-resolution clean image $\mathbf{u}^\ast$ only
from the observed image $\mathbf{x}$. Since the operator $\mathbf{P}$ is in general not
invertible, the problem is ill-posed. In this situation, sparsity works
well for restoring the original image [6, 7, 9, 18].
In the sparse representation approach, the candidate $\mathbf{u}$ is
expressed by a linear combination of atoms in a dictionary
$\mathbf{D} \in \mathbb{R}^{M \times L}$, i.e.
$$\mathbf{u} = \mathbf{D}\mathbf{y},$$
where $\mathbf{y} \in \mathbb{R}^L$ is a candidate coefficient vector. Due to the
measurement process $\mathbf{P}$, the matrix $\mathbf{A} = \mathbf{P}\mathbf{D} \in \mathbb{R}^{N \times L}$ ($N \leq L$), which
relates the observed image $\mathbf{x}$ to the coefficient vector, is not
invertible. In order to find $\mathbf{y}$, we adopt the following formulation of a
regularized least squares problem:
$$\hat{\mathbf{y}} = \arg\min_{\mathbf{y}} \frac{1}{2}\|\mathbf{x} - \mathbf{P}\mathbf{D}\mathbf{y}\|_2^2 + \lambda\rho(\mathbf{y}), \qquad (1)$$
where $\|\cdot\|_2$ is the $\ell_2$-norm of a vector, $\rho(\cdot)$ is a regularization term
and $\lambda \in \mathbb{R}_+$ is a scalar parameter that controls the trade-off between
reconstruction fidelity and sparsity, which is referred to as the
regularization parameter. When $\rho(\cdot)$ is a convex
function, the proximal forward-backward algorithm can be used to
solve (1) [21]. For $\rho(\mathbf{y}) = \|\mathbf{y}\|_1$, i.e. when the $\ell_1$-norm regularization is
selected, the solver reduces to ISTA, which guarantees convergence
to an exact solution and is applicable to large data [8, 19, 20].
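2.2. Iterative Shrinkage/Thresholding Algorithm (ISTA)
If we select the $\ell_1$-norm as the sparsity measure $\rho(\cdot)$, we can use
ISTA shown in Algorithm 1 [8, 19, 20], where $\mathcal{T}_\lambda(\cdot)$ is the vector
function that performs the element-wise scalar soft-shrinkage operation
$$\mathcal{T}_\lambda(\mathbf{v}) = \mathrm{diag}(\mathrm{sign}(\mathbf{v})) \cdot (|\mathbf{v}| - \lambda\mathbf{1})_+,$$
where $\mathrm{sign}(\cdot)$ and $|\cdot|$ take the element-wise signs and absolute
values, respectively, and $(\cdot)_+$ replaces negative elements with zeros
and leaves the other elements unchanged. The Lipschitz constant $L$ is
determined only by the degradation process when $\mathbf{D}$ is a tight frame, since
$$\lambda_{\max}(\mathbf{A}^T\mathbf{A}) = \lambda_{\max}(\mathbf{A}\mathbf{A}^T) = B\lambda_{\max}(\mathbf{P}\mathbf{P}^T) = B\lambda_{\max}(\mathbf{P}^T\mathbf{P})$$
holds, where $\lambda_{\max}(\cdot)$ and $B$ denote the maximum eigenvalue and the
frame bound of $\mathbf{D}$, respectively.
Algorithm 1: ISTA, where $\mathbf{A} = \mathbf{P}\mathbf{D}$ and $L = \lambda_{\max}(\mathbf{A}^T\mathbf{A}) = B\lambda_{\max}(\mathbf{P}^T\mathbf{P})$, i.e. the Lipschitz constant
of the gradient of $\frac{1}{2}\|\mathbf{x} - \mathbf{A}\mathbf{y}\|_2^2$ [8].
  Data: Observed picture $\mathbf{x} \in \mathbb{R}^N$
  Result: Restored picture $\mathbf{u} \in \mathbb{R}^M$
  Initialization:
    $i \leftarrow 0$
    $\mathbf{y}^{(0)} \leftarrow \mathbf{A}^T\mathbf{x}$
  Main iteration to find $\mathbf{y}$ that minimizes $f(\mathbf{y}) = \frac{1}{2}\|\mathbf{x} - \mathbf{A}\mathbf{y}\|_2^2 + \lambda\|\mathbf{y}\|_1$:
  repeat
    $i \leftarrow i + 1$
    $\mathbf{y}^{(i)} \leftarrow \mathcal{T}_{\lambda/L}\left(\mathbf{y}^{(i-1)} - \frac{1}{L}\mathbf{A}^T(\mathbf{A}\mathbf{y}^{(i-1)} - \mathbf{x})\right)$
  until $\|\mathbf{y}^{(i)} - \mathbf{y}^{(i-1)}\|_2^2 / \|\mathbf{y}^{(i)}\|_2^2 < \epsilon$
  $\mathbf{u} \leftarrow \mathbf{D}\mathbf{y}^{(i)}$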
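The iteration in Algorithm 1 can be sketched in a few lines of NumPy. This is a minimal illustration and not the authors' implementation: it assumes that P and D are available as small dense matrices (in practice they would be applied as operators), and the function names, the iteration cap `max_iter` and the default tolerance are choices made here for the example.

```python
import numpy as np


def soft_threshold(v, t):
    """Element-wise soft shrinkage: sign(v) * max(|v| - t, 0)."""
    return np.sign(v) * np.maximum(np.abs(v) - t, 0.0)


def ista(x, P, D, lam, eps=1e-8, max_iter=5000):
    """Minimize 0.5*||x - P D y||_2^2 + lam*||y||_1; returns (u, y)."""
    A = P @ D
    # Lipschitz constant of the gradient: largest eigenvalue of A^T A,
    # i.e. the squared spectral norm of A.
    L = np.linalg.norm(A, 2) ** 2
    y = A.T @ x                      # initialization y^(0) = A^T x
    for _ in range(max_iter):        # max_iter is a safety cap (assumption)
        y_prev = y
        y = soft_threshold(y_prev - A.T @ (A @ y_prev - x) / L, lam / L)
        denom = np.sum(y ** 2)
        # Stop on small relative change; also stop if y collapsed to zero.
        if denom == 0.0 or np.sum((y - y_prev) ** 2) / denom < eps:
            break
    return D @ y, y


# Toy usage with hypothetical data: identity degradation and dictionary.
x = np.array([3.0, -0.5, -2.0, 0.2])
u, y = ista(x, np.eye(4), np.eye(4), lam=1.0)
print(u)
```

In this trivial case A is the identity, so the minimizer is simply the soft-thresholded observation, which gives a quick sanity check of the update rule.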
3. NON-SEPARABLE OS LAPPED TRANSFORMS
The selection of the dictionary D, i.e. the modeling of the original
signal, is an important task in solving the problem in (1), since
it influences both the restoration quality and the computational
complexity. In this section, we introduce NSOLT and propose to
adopt it as the dictionary D in (1).
3.1. Lattice Structure
NSOLT is a 2-D non-separable redundant transform which we
recently proposed [15]. NSOLT can be oversampled,
overlapping, directional and paraunitary, with symmetric,
real-valued and compact-support atoms. In addition, the boundary operation
proposed in [22], which was originally developed for 2-D
non-separable CS LP paraunitary systems, is available.
NSOLTs are categorized into the following two types according
to the number of symmetric atoms ps and the number of
antisymmetric atoms pa = P − ps:
1. Type-I: ps = pa,
2. Type-II: ps ≠ pa.
In the following, we only introduce the Type-II case of ps > pa.
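Fig. 3 shows a lattice structure of an analysis bank of Type-II
NSOLT, where we assume a uniform decomposition case, i.e. $\mathbf{M}_p = \mathbf{M}$
for $p = 0, 1, \cdots, P - 1$, and the sampling factor is diagonal,
i.e. $\mathbf{M} = \mathrm{diag}(M_y, M_x)$. The polyphase matrix $\mathbf{E}(\mathbf{z})$ of Type-II
NSOLT is represented by
$$\mathbf{E}(\mathbf{z}) = \prod_{\ell_y=1}^{N_y/2} \left\{ \mathbf{R}^{\{y\}}_{E\ell_y} \mathbf{Q}_E(z_y) \mathbf{R}^{\{y\}}_{O\ell_y} \mathbf{Q}_O(z_y) \right\} \times \prod_{\ell_x=1}^{N_x/2} \left\{ \mathbf{R}^{\{x\}}_{E\ell_x} \mathbf{Q}_E(z_x) \mathbf{R}^{\{x\}}_{O\ell_x} \mathbf{Q}_O(z_x) \right\} \cdot \mathbf{R}_0 \mathbf{E}_0, \qquad (2)$$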
Fig. 3. A lattice structure of analysis bank of a 2-D Type-II non-separable OS lapped transform (NSOLT).
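where, for each direction $d \in \{y, x\}$,
$$\mathbf{Q}_E(z_d) = \mathbf{B}_P^{(p_a)} \begin{pmatrix} \mathbf{I}_{P-p_a} & \mathbf{O} \\ \mathbf{O} & z_d^{-1}\mathbf{I}_{p_a} \end{pmatrix} \mathbf{B}_P^{(p_a)}, \quad \mathbf{R}^{\{d\}}_{E\ell} = \begin{pmatrix} \mathbf{W}^{\{d\}}_\ell & \mathbf{O} \\ \mathbf{O} & \mathbf{I}_{p_a} \end{pmatrix},$$
$$\mathbf{Q}_O(z_d) = \mathbf{B}_P^{(p_a)} \begin{pmatrix} \mathbf{I}_{p_a} & \mathbf{O} \\ \mathbf{O} & z_d^{-1}\mathbf{I}_{P-p_a} \end{pmatrix} \mathbf{B}_P^{(p_a)}, \quad \mathbf{R}^{\{d\}}_{O\ell} = \begin{pmatrix} \mathbf{I}_{p_s} & \mathbf{O} \\ \mathbf{O} & \mathbf{U}^{\{d\}}_\ell \end{pmatrix},$$
and
$$\mathbf{B}_P^{(m)} = \frac{1}{\sqrt{2}} \begin{pmatrix} \mathbf{I}_m & \mathbf{O} & \mathbf{I}_m \\ \mathbf{O}^T & \sqrt{2}\,\mathbf{I}_{P-2m} & \mathbf{O}^T \\ \mathbf{I}_m & \mathbf{O} & -\mathbf{I}_m \end{pmatrix}.$$
$\mathbf{W}^{\{d\}}_\ell \in \mathbb{R}^{p_s \times p_s}$ and $\mathbf{U}^{\{d\}}_\ell \in \mathbb{R}^{p_a \times p_a}$ are arbitrary invertible
matrices. We adopt the initial matrix defined by the product of
the matrix representation of the 2-D discrete cosine transform (DCT),
$\mathbf{E}_0 \in \mathbb{R}^{M \times M}$, and
$$\mathbf{R}_0 = \begin{pmatrix} \mathbf{W}_0 & \mathbf{O} \\ \mathbf{O} & \mathbf{U}_0 \end{pmatrix} \begin{pmatrix} \mathbf{I}_{\lceil M/2 \rceil} & \mathbf{O} \\ \mathbf{O} & \mathbf{O} \\ \mathbf{O} & \mathbf{I}_{\lfloor M/2 \rfloor} \\ \mathbf{O} & \mathbf{O} \end{pmatrix} \in \mathbb{R}^{P \times M}, \qquad (3)$$
where $\mathbf{W}_0 \in \mathbb{R}^{p_s \times p_s}$ and $\mathbf{U}_0 \in \mathbb{R}^{p_a \times p_a}$ are arbitrary invertible
matrices.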
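To make the building blocks of the lattice concrete, the following NumPy sketch constructs the butterfly matrix $\mathbf{B}_P^{(m)}$ and evaluates $\mathbf{Q}_E(z_d)$ at a point on the unit circle. The function names `butterfly` and `q_even` are introduced here for illustration, and the sizes P = 7, pa = 2 are taken from the design example; this is a sketch of the definitions, not the authors' code.

```python
import numpy as np


def butterfly(P, m):
    """B_P^(m): P x P block butterfly matrix from the lattice definition."""
    I = np.eye(m)
    Z = np.zeros((m, P - 2 * m))
    B = np.block([
        [I, Z, I],
        [Z.T, np.sqrt(2.0) * np.eye(P - 2 * m), Z.T],
        [I, Z, -I],
    ])
    return B / np.sqrt(2.0)


def q_even(P, pa, z):
    """Q_E(z_d) evaluated at a complex point z (delay on the last pa channels)."""
    B = butterfly(P, pa)
    delay = np.diag(np.r_[np.ones(P - pa), np.full(pa, 1.0 / z)])
    return B @ delay @ B


B = butterfly(7, 2)
Q = q_even(7, 2, np.exp(1j * 0.3))
print(np.allclose(B @ B, np.eye(7)), np.allclose(Q.conj().T @ Q, np.eye(7)))
```

The butterfly matrix is symmetric and orthonormal, so it is its own inverse, and $\mathbf{Q}_E$ is unitary on the unit circle. This is why orthonormal parameter matrices are sufficient to make the whole product in (2) paraunitary.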
3.2. Parseval Frame Construction
If all of the parameter matrices $\mathbf{W}^{\{d\}}_\ell$, $\mathbf{U}^{\{d\}}_\ell$, $\mathbf{W}_0$ and $\mathbf{U}_0$ are
orthonormal, then $\mathbf{E}(\mathbf{z})$ becomes paraunitary. From the
frame-theoretic point of view, a paraunitary system corresponds to a tight
frame [4]. Furthermore, NSOLT yields a 1-tight frame, i.e. a Parseval
frame. If we have a paraunitary analysis bank, then we can obtain a
paraunitary synthesis bank by the para-conjugation of $\mathbf{E}(\mathbf{z})$:
$$\mathbf{R}(\mathbf{z}) = \mathbf{z}^{-\mathbf{n}} \mathbf{E}^T(\mathbf{z}^{-\mathbf{I}}). \qquad (4)$$
In the OS case, there are infinitely many PR combinations of analysis
and synthesis banks [23, 24]. It can be verified that the above pair
of filter banks constitutes a PR system. The synthesis process
with $\mathbf{R}(\mathbf{z})$ corresponds to a linear operation with the dictionary
$\mathbf{D}$, and the analysis process with $\mathbf{E}(\mathbf{z})$ corresponds to the adjoint
operator $\mathbf{D}^T$, which is required for $\mathbf{A}^T = \mathbf{D}^T\mathbf{P}^T$ in Algorithm 1.
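The Parseval property can be checked numerically in the simplest setting. The sketch below builds only the order-0 part $\mathbf{R}_0\mathbf{E}_0$ of the lattice (no overlapping stages) for M = 4 and ps + pa = 5 + 2, with randomly drawn orthonormal $\mathbf{W}_0$ and $\mathbf{U}_0$. The DCT coefficient ordering, the random parameter matrices and all variable names are assumptions made for this illustration; a designed NSOLT would use optimized parameters and nonzero polyphase order.

```python
import numpy as np

rng = np.random.default_rng(0)


def random_orthonormal(n):
    """Random n x n orthonormal matrix obtained from a QR decomposition."""
    q, _ = np.linalg.qr(rng.standard_normal((n, n)))
    return q


M, ps, pa = 4, 5, 2                    # My = Mx = 2, P = ps + pa = 7
P = ps + pa
ceil_half, floor_half = (M + 1) // 2, M // 2

# 2-D DCT on a 2x2 block as a 4x4 orthonormal matrix
# (the coefficient ordering used here is an assumption).
c2 = np.array([[1.0, 1.0], [1.0, -1.0]]) / np.sqrt(2.0)
E0 = np.kron(c2, c2)

# P x M zero-padding matrix, i.e. the second factor of R0 in (3).
S = np.zeros((P, M))
S[:ceil_half, :ceil_half] = np.eye(ceil_half)
S[ps:ps + floor_half, ceil_half:] = np.eye(floor_half)

W0, U0 = random_orthonormal(ps), random_orthonormal(pa)
R0 = np.block([[W0, np.zeros((ps, pa))], [np.zeros((pa, ps)), U0]]) @ S

E = R0 @ E0      # analysis operator D^T acting on one block
D = E.T          # synthesis operator (dictionary)
u = rng.standard_normal(M)
y = E @ u
print(np.allclose(D @ E, np.eye(M)), np.isclose(y @ y, u @ u))
```

Both checks express the same fact: the seven analysis coefficients preserve the energy of the four input samples, and synthesis with the transpose recovers the input exactly.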
3.3. Design Example
Fig. 4 shows a design example of Type-II NSOLT with rational
redundancy [15]. The design specification is summarized as follows:
• Sampling ratio: My = Mx = 2 (M = 4)
• Number of channels: P = ps + pa = 5 + 2 = 7 (R = P/M = 7/4)
• Polyphase order: Ny = Nx = 2
• Paraunitary
• No DC-leakage
From Fig. 4, it is observed that some atoms have diagonal
characteristics. Note that the redundancy is less than two, and that the
redundancy under iterative decomposition of the lowest-frequency
component approaches two but never exceeds it.
Fig. 4. Design example of Type-II NSOLT. The impulse and frequency
amplitude responses in $(\omega_y, \omega_x)^T \in [-\pi, \pi)^2$ are shown,
where P = ps + pa = 5 + 2 = 7, My = Mx = 2 and Ny = Nx = 4.
(a) goldhill (b) lena (c) barbara (d) baboon
Fig. 5. Original pictures u∗ of size 128× 128, 8-bit grayscale.
4. PERFORMANCE EVALUATION
In order to verify the significance of NSOLT, let us evaluate the
performance with the ISTA-based image restoration [8, 19, 20]. In
the following, we deal with the deblurring, super-resolution and
inpainting problems [18].
4.1. Simulation Condition
Let us compare the image restoration performance of NSOLT
with that of the non-subsampled Haar transform (NSHT). The details
of the adopted transforms are summarized as follows:
Non-subsampled Haar transform (NSHT): Two-level DWT construction,
separable, tight, nondirectional, $R = 3 + 4 = 7$.
Non-separable oversampled lapped transform (NSOLT): Six-level DWT
construction, non-separable, tight, directional,
$R = 6\left(\sum_{\ell=1}^{6} 4^{-\ell}\right) + 4^{-6} = 1.999755859375$.
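The redundancy quoted for the six-level NSOLT follows from counting channels per level: at each level the P − 1 = 6 non-lowpass channels are kept at the accumulated decimation ratio, and only the lowpass channel is decomposed again. The short sketch below reproduces this arithmetic exactly with rational numbers; the function name `tree_redundancy` is introduced here for illustration.

```python
from fractions import Fraction


def tree_redundancy(P, M, levels):
    """Redundancy when only the lowpass channel is decomposed at each level."""
    highpass = sum(Fraction(P - 1, M ** l) for l in range(1, levels + 1))
    lowpass = Fraction(1, M ** levels)
    return highpass + lowpass


R6 = tree_redundancy(7, 4, 6)
print(R6, float(R6))  # 8191/4096 1.999755859375
```

For P = 7 and M = 4 the value after τ levels equals 2 − 4^(−τ), so it approaches the limit (P − 1)/(M − 1) = 2 from below without reaching it, whereas the two-level NSHT already has redundancy seven.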
Note that NSHT is a special case of Type-I NSOLT [15]. Thus, the
basis termination method for the boundary operation can be applied
[22]. In this simulation, the basis termination method is adopted
for both transforms. Fig. 5 shows the pictures used as unknown
clean images, u∗, and Tabs. 1 and 2 summarize the performance
evaluations in terms of the peak signal-to-noise ratio (PSNR) and
structural similarity (SSIM) indices¹ [25].
¹The MATLAB function ssim_index.m from http://www.cns.nyu.edu/~lcv/ssim/ was used.
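For reference, PSNR can be computed as in the following sketch. The paper does not spell out its PSNR convention, so the peak value of 255 (8-bit pixel range) and the function name `psnr` are assumptions made here; if the images are scaled to [0, 1], the peak should be set to 1 instead.

```python
import numpy as np


def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB; peak=255 assumes 8-bit images."""
    ref = np.asarray(reference, dtype=np.float64)
    res = np.asarray(restored, dtype=np.float64)
    mse = np.mean((ref - res) ** 2)
    return np.inf if mse == 0 else 10.0 * np.log10(peak ** 2 / mse)


# Hypothetical usage: a constant error of one gray level on every pixel.
clean = np.full((128, 128), 100.0)
print(psnr(clean, clean + 1.0))
```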
Table 1. Comparison of PSNRs [dB] between the two transforms for various
pictures and measurement processes. The experimentally chosen value of
the parameter λ is given in parentheses.
Process            Picture    NSHT (R = 7)      NSOLT (R < 2)
Deblurring         goldhill   27.32 (0.0001)    27.60 (0.0017)
                   lena       26.97 (0.0001)    27.20 (0.0014)
                   barbara    25.27 (0.0001)    25.54 (0.0112)
                   baboon     21.49 (0.0001)    21.46 (0.0001)
Super Resolution   goldhill   28.05 (0.0001)    27.96 (0.0003)
                   lena       27.54 (0.0001)    27.50 (0.0005)
                   barbara    25.51 (0.0001)    25.63 (0.0010)
                   baboon     21.61 (0.0001)    21.56 (0.0001)
Inpainting         goldhill   24.69 (0.0160)    33.31 (0.0080)
                   lena       23.52 (0.0110)    33.33 (0.0110)
                   barbara    21.37 (0.0120)    32.30 (0.0140)
                   baboon     21.30 (0.0320)    26.66 (0.0130)
Table 2. Comparison of SSIM indices between the two transforms for
various pictures and measurement processes. The experimentally chosen
value of the parameter λ is given in parentheses.
Process            Picture    NSHT (R = 7)      NSOLT (R < 2)
Deblurring         goldhill   0.626 (0.0014)    0.636 (0.0008)
                   lena       0.752 (0.0010)    0.786 (0.0015)
                   barbara    0.700 (0.0025)    0.745 (0.0053)
                   baboon     0.546 (0.0101)    0.545 (0.0001)
Super Resolution   goldhill   0.684 (0.0030)    0.671 (0.0001)
                   lena       0.823 (0.0001)    0.821 (0.0002)
                   barbara    0.768 (0.0001)    0.767 (0.0002)
                   baboon     0.584 (0.0040)    0.555 (0.0001)
Inpainting         goldhill   0.691 (0.0090)    0.918 (0.0060)
                   lena       0.687 (0.0010)    0.945 (0.0080)
                   barbara    0.612 (0.0010)    0.945 (0.0060)
                   baboon     0.747 (0.0170)    0.899 (0.0070)
4.2. Deblurring
Deblurring is the problem of restoring a clear picture from a blurred one,
where AWGN is often assumed. In the framework shown in Fig. 2,
P is modeled as a convolution matrix which consists of the impulse
response, i.e. the point-spread function (PSF), with spatial shifts. The
adjoint operator PT required by ISTA is realized by convolution
with the spatially reversed system, i.e. the 180-degree rotated filter [13].
As the PSF h[ny, nx], we employed the 2-D Gaussian filter with
standard deviation σh = 2.0. AWGN with standard deviation σn = 5
is also assumed. Fig. 6 shows an observed picture of “barbara”
and the two restoration results. From Tabs. 1 and 2, it is
observed that NSOLT shows better performance than NSHT in terms
of PSNR and SSIM index for goldhill, lena and barbara, while the two
are nearly equal for baboon (21.46 dB versus 21.49 dB, and 0.545
versus 0.546).
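The relation between the blur operator and its adjoint can be illustrated as follows. This sketch assumes circular (periodic) boundary handling so that convolution can be done with the FFT, and a 9 × 9 truncated Gaussian kernel; neither choice is stated in the paper, and the function names are introduced here for illustration.

```python
import numpy as np


def gaussian_psf(size, sigma):
    """Normalized 2-D Gaussian kernel of shape (size, size)."""
    n = np.arange(size) - (size - 1) / 2.0
    g = np.exp(-n ** 2 / (2.0 * sigma ** 2))
    h = np.outer(g, g)
    return h / h.sum()


def psf_spectrum(h, shape):
    """2-D DFT of the PSF zero-padded to `shape` and centred at the origin."""
    padded = np.zeros(shape)
    padded[:h.shape[0], :h.shape[1]] = h
    padded = np.roll(padded, (-(h.shape[0] // 2), -(h.shape[1] // 2)), axis=(0, 1))
    return np.fft.fft2(padded)


def blur(u, H):
    """P u: circular convolution with the PSF."""
    return np.real(np.fft.ifft2(H * np.fft.fft2(u)))


def blur_adjoint(v, H):
    """P^T v: circular convolution with the 180-degree rotated PSF."""
    return np.real(np.fft.ifft2(np.conj(H) * np.fft.fft2(v)))


rng = np.random.default_rng(0)
u, v = rng.standard_normal((16, 16)), rng.standard_normal((16, 16))
H = psf_spectrum(gaussian_psf(9, 2.0), u.shape)
# Adjoint test: <P u, v> must equal <u, P^T v>.
print(np.isclose(np.sum(blur(u, H) * v), np.sum(u * blur_adjoint(v, H))))
```

Because the kernel is nonnegative and sums to one, the largest magnitude of its spectrum is one, so under these assumptions λmax(PTP) = 1 and the Lipschitz constant in Algorithm 1 reduces to the frame bound B.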
(a) Observed (b) NSHT (c) NSOLT
Fig. 6. Results of deblurring for “barbara.”
(a) Observed (b) NSHT (c) NSOLT
Fig. 7. Results of super-resolution for “barbara.”
(a) Observed (b) NSHT (c) NSOLT
Fig. 8. Results of inpainting for “barbara.”
4.3. Super Resolution
Super-resolution is the problem of restoring a clear high-resolution
picture from a decimated or low-resolution one. In Fig. 2, P is modeled
as a convolution and downsampling matrix. The adjoint operator
PT is composed of the upsampling and 180-degree-rotated
convolution matrix [13]. In this simulation, we assumed the 2-D Gaussian
filter with standard deviation σh = 2.0 as a PSF f [ny, nx], followed by
vertical downsampling with factor two and horizontal downsampling
with factor two. No noise is explicitly added. In Fig. 7 the
super-resolution performances of the two methods are compared
for “barbara.” From Tabs. 1 and 2, it is observed that the
performances of NSHT and NSOLT are comparable.
NSHT is slightly superior to NSOLT in most cases, the exception
being the PSNR for barbara.
Our conjecture is that the Haar transform is suitable for the assumed
degradation process with the 2 × 2 rectangular subsampling.
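The composition of blur and downsampling, and the corresponding adjoint, can be sketched as below. As in any such illustration, the boundary handling (circular, via the FFT), the 9 × 9 kernel truncation and the function names are assumptions made here rather than details given in the paper.

```python
import numpy as np


def gaussian_spectrum(shape, sigma, size=9):
    """DFT of a normalized, origin-centred Gaussian PSF (size is an assumption)."""
    n = np.arange(size) - size // 2
    g = np.exp(-n ** 2 / (2.0 * sigma ** 2))
    h = np.outer(g, g)
    h /= h.sum()
    padded = np.zeros(shape)
    padded[:size, :size] = h
    return np.fft.fft2(np.roll(padded, (-(size // 2), -(size // 2)), axis=(0, 1)))


def degrade(u, H, factor=2):
    """P u: blur, then keep every `factor`-th sample vertically and horizontally."""
    blurred = np.real(np.fft.ifft2(H * np.fft.fft2(u)))
    return blurred[::factor, ::factor]


def degrade_adjoint(x, H, factor=2):
    """P^T x: zero-insertion upsampling, then blur with the rotated PSF."""
    up = np.zeros((x.shape[0] * factor, x.shape[1] * factor))
    up[::factor, ::factor] = x
    return np.real(np.fft.ifft2(np.conj(H) * np.fft.fft2(up)))


rng = np.random.default_rng(0)
u = rng.standard_normal((16, 16))      # high-resolution candidate
x = rng.standard_normal((8, 8))        # low-resolution observation
H = gaussian_spectrum(u.shape, 2.0)
# Adjoint test: <P u, x> must equal <u, P^T x>.
print(np.isclose(np.sum(degrade(u, H) * x), np.sum(u * degrade_adjoint(x, H))))
```

The order of operations is reversed in the adjoint: the forward model blurs and then discards samples, so the adjoint first restores the grid by zero insertion and then applies the rotated filter.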
4.4. Inpainting
Inpainting is the problem of restoring missing pixels from the
remaining observed pixels. P is simply modeled as a diagonal matrix
whose elements are either 0 or 1, which denote missing and
remaining pixel positions, respectively. Thus, the adjoint operator PT
is exactly the same as P, since PT = P [13]. Fig. 8 compares the
inpainting performances of the two methods for “barbara.” The
observed picture in Fig. 8 has lost 20% of its pixels at random. From
Tabs. 1 and 2, NSOLT shows a significant performance improvement in
inpainting. This is because the atoms of NSOLT have a larger extent
than those of NSHT.
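The masking operator for inpainting is simple enough to write down directly. In the sketch below the mask is drawn with an independent 20% loss probability per pixel, which only approximates a loss of exactly 20% of the pixels; the random seed, image size and function name are likewise assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
shape = (128, 128)
loss_rate = 0.2   # each pixel is lost with probability 0.2 (assumption)

# 1 marks a remaining pixel, 0 marks a missing one (the diagonal of P).
mask = (rng.random(shape) >= loss_rate).astype(np.float64)


def apply_mask(u):
    """P u and P^T u coincide: element-wise product with the mask."""
    return mask * u


u, v = rng.standard_normal(shape), rng.standard_normal(shape)
# P is self-adjoint and idempotent, so its eigenvalues are 0 or 1.
print(np.isclose(np.sum(apply_mask(u) * v), np.sum(u * apply_mask(v))),
      np.allclose(apply_mask(apply_mask(u)), apply_mask(u)))
```

Since the eigenvalues of P are 0 or 1, λmax(PTP) = 1 whenever at least one pixel remains, so for a Parseval dictionary the Lipschitz constant in Algorithm 1 is simply one.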
5. CONCLUSIONS
A novel image restoration technique was proposed by introducing a
non-separable oversampled lapped transform (NSOLT). Through the
application to deblurring, super-resolution and inpainting with ISTA,
it was verified that the proposed dictionary shows superior or
comparable performance to the non-subsampled Haar transform (NSHT)
with much smaller redundancy.
As future work, we plan to investigate regularization other
than the ℓ1-norm, the kernel-based approach [26], the dictionary
learning approach [27] and restoration of Poissonian images [28].
6. REFERENCES
[1] P. P. Vaidyanathan, Multirate Systems and Filter Banks, Prentice Hall, 1993.
[2] Shogo Muramatsu, Akihiko Yamada, and Hitoshi Kiya, “A design method of multidimensional linear-phase paraunitary filter banks with a lattice structure,” IEEE Trans. Signal Process., vol. 47, no. 3, pp. 690–700, Mar. 1999.
[3] Zoran Cvetkovic and Martin Vetterli, “Oversampled filter banks,” IEEE Trans. Signal Process., vol. 46, no. 5, pp. 1245–1255, May 1998.
[4] H. Bolcskei, F. Hlawatsch, and H.G. Feichtinger, “Frame-theoretic analysis of oversampled filter banks,” IEEE Trans. Signal Process., vol. 46, no. 12, pp. 3256–3268, Dec. 1998.
[5] Jelena Kovacevic and Amina Chebira, “Life beyond bases: The advent of frames (part I),” IEEE Signal Process. Mag., July 2007.
[6] Jean-Luc Starck, Fionn Murtagh, and Jalal M. Fadili, Sparse Image and Signal Processing: Wavelets, Curvelets, Morphological Diversity, Cambridge University Press, 2010.
[7] Stephane Mallat, A Wavelet Tour of Signal Processing, Third Edition: The Sparse Way, Academic Press, 2008.
[8] Daniel P. Palomar and Yonina C. Eldar, Eds., Convex Optimization in Signal Processing and Communications, Cambridge University Press, 2009.
[9] Michael Elad, Sparse and Redundant Representations: From Theory to Applications in Signal and Image Processing, Springer, 2010.
[10] Yonina C. Eldar and Gitta Kutyniok, Eds., Compressed Sensing: Theory and Applications, Cambridge University Press, 2012.
[11] Shogo Muramatsu, Dandan Han, Tomoya Kobayashi, and Hisakazu Kikuchi, “Directional lapped orthogonal transform: Theory and design,” IEEE Trans. Image Process., vol. 21, no. 5, pp. 2434–2448, May 2012.
[12] Shogo Muramatsu, “SURE-LET image denoising with multiple directional LOTs,” in IEEE Proc. of PCS, May 2012.
[13] Shogo Muramatsu, Natsuki Aizawa, and Masahiro Yukawa, “Image restoration with union of directional orthonormal DWTs,” in Proc. of APSIPA ASC, Dec. 2012.
[14] Minh N. Do and Martin Vetterli, “The contourlet transform: An efficient directional multiresolution image representation,” IEEE Trans. Image Process., vol. 14, no. 12, pp. 2091–2106, Dec. 2005.
[15] Shogo Muramatsu and Natsuki Aizawa, “Lattice structures for 2-D non-separable oversampled lapped transforms,” in IEEE Proc. of ICASSP, May 2013.
[16] Lu Gan and Kai-Kuang Ma, “Oversampled linear-phase perfect reconstruction filterbanks: theory, lattice structure and parameterization,” IEEE Trans. Signal Process., vol. 51, no. 3, pp. 744–759, Mar. 2003.
[17] Lu Gan and Kai-Kuang Ma, “A simplified lattice factorization for linear-phase perfect reconstruction filter bank,” IEEE Signal Process. Lett., vol. 8, no. 7, pp. 207–209, July 2001.
[18] Michael Elad, Mario A. T. Figueiredo, and Yi Ma, “On the role of sparse and redundant representations in image processing,” Proc. IEEE, vol. 98, no. 6, pp. 972–982, June 2010.
[19] Ingrid Daubechies, Michel Defrise, and Christine De Mol, “An iterative thresholding algorithm for linear inverse problems with a sparsity constraint,” Communications on Pure and Applied Mathematics, vol. LVII, pp. 1413–1457, 2004.
[20] Mario A. T. Figueiredo, Jose M. Bioucas-Dias, and Robert D. Nowak, “Majorization-minimization algorithms for wavelet-based image restoration,” IEEE Trans. Image Process., vol. 16, no. 12, pp. 2980–2991, 2007.
[21] Patrick L. Combettes and Valerie R. Wajs, “Signal recovery by proximal forward-backward splitting,” Multiscale Modeling and Simulation, vol. 4, no. 4, pp. 1168–1200, 2005.
[22] Shogo Muramatsu, Tomoya Kobayashi, Minoru Hiki, and Hisakazu Kikuchi, “Boundary operation of 2-D non-separable linear-phase paraunitary filter banks,” IEEE Trans. Image Process., vol. 21, no. 4, pp. 2314–2318, Apr. 2012.
[23] Toshihisa Tanaka, “A direct design of oversampled perfect reconstruction FIR filter banks,” IEEE Trans. Signal Process., vol. 54, no. 8, pp. 3011–3022, Aug. 2006.
[24] Li Chai, Jingxin Zhang, and Yuxia Sheng, “Optimal design of oversampled synthesis FBs with lattice structure constraints,” IEEE Trans. Signal Process., vol. 59, no. 8, pp. 3549–3559, Aug. 2011.
[25] Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, and Eero P. Simoncelli, “Image quality assessment: From error visibility to structural similarity,” IEEE Trans. Image Process., vol. 13, no. 4, pp. 600–612, Apr. 2004.
[26] Peyman Milanfar, “A tour of modern image filtering,” IEEE Signal Process. Mag., vol. 30, no. 1, pp. 106–128, Jan. 2013.
[27] Ivana Tosic and Pascal Frossard, “Dictionary learning,” IEEE Signal Process. Mag., vol. 28, no. 2, pp. 27–38, Mar. 2011.
[28] Mario A. T. Figueiredo and Jose M. Bioucas-Dias, “Restoration of Poissonian images using alternating direction optimization,” IEEE Trans. Image Process., vol. 19, no. 12, pp. 3133–3145, Dec. 2010.