Fusion of Face and Iris Features Extraction Based on Steerable Pyramid Representation for
Multimodal Biometrics
Khalid Fakhar1, Mohamed El Aroussi1,2, Rachid Saadane1,2, Mohammed Wahbi2 & Driss Aboutajdine1
1 LRTT, Unité Associée au CNRST (URAC 29), Faculty of Sciences, Mohammed V University-Agdal, Rabat, Morocco
2 LETI-EHTP, Casablanca
Abstract—In this paper, we make a first attempt to combine face and iris biometrics using an efficient local appearance feature extraction method based on the steerable pyramid (S-P). To capture the intrinsic geometrical structure of face and iris images, the method decomposes each image into a set of directional sub-bands, with texture details captured in different orientations at various scales. Local information is extracted from the S-P sub-bands using block-based statistics to reduce the amount of data to be stored. The obtained local features are combined at the score level to develop a multimodal biometric approach, which diminishes the drawbacks of single-biometric approaches and improves the performance of the authentication system. We combine the FERET face database and the CASIA (version 1) iris database to construct a multimodal biometric experimental database, with which we validate the proposed approach and evaluate its performance. The experimental results reveal that multimodal biometric authentication is much more reliable and precise than a single-biometric approach.
Index Terms—Multimodal biometrics, Face, Iris, Steerable pyramid.
I. INTRODUCTION
Biometric authentication identifies a person using physiological
and/or behavioral characteristics such as the iris, face,
fingerprints, hand geometry, handwriting, retina, veins, and
speech. It is more reliable and capable than knowledge-based
(e.g., password) or token-based (e.g., key) techniques, since
biometric features can hardly be stolen or forgotten. However,
recognition based on any one of these modalities may not be
sufficiently robust, or may not be acceptable to a particular
user group or in a particular situation. Therefore, the
performance of single-biometric systems needs to be improved,
and multimodal biometric techniques offer a feasible way to
solve the problems of single-biometric systems [1].
A multimodal biometric system makes use of several biometric
traits simultaneously to authenticate a person's identity. The
key technique of a multimodal biometric system is information
fusion: an integration scheme is required to fuse the
information presented by the individual modalities. There are
three main strategies for building multimodal biometric systems.
The first is decision fusion, which combines the accept/reject
decisions of the unimodal systems. The second is feature fusion,
in which features extracted by multiple sensors are
concatenated. Finally, there is score-level fusion, which is
the strategy we use in this paper.
In this paper, the face and iris biometric traits are selected
to construct the multimodal biometric system because face
recognition is the most natural and acceptable modality for
identity authentication, whereas iris recognition is one of the
most accurate biometrics. Wang et al. proposed a method for
combining face and iris biometrics for identity verification in
2003, studying information fusion at the matching-score level
[2]. They also compared fusion methods in depth, including the
sum rule, Fisher Linear Discriminant analysis (FLD) and a
Radial Basis Function Neural Network (RBFNN). Some other
fusion strategies for face and iris [3], [4] have been proposed
in the past few years, but the use of steerable pyramids for
feature extraction in both modalities has never been explored.
Therefore, this paper presents a new method called block-based
steerable pyramids [5], which gives better performance and
accuracy for both traits (face & iris) [5], [6]. The proposed
approach also shows that the integration of face and iris
biometrics can achieve higher performance than may be possible
using a single biometric indicator alone.
The rest of this paper is organized as follows. Section 2
presents the multimodal biometric system, which is used to
increase the performance of the individual biometric traits.
Section 3 presents image preprocessing and feature extraction
using block-based steerable pyramids. The experimental results
are given in Section 4, and conclusions in the last section.
II. THE MULTIMODAL BIOMETRIC SYSTEM
The proposed multimodal biometric system for face and iris is
shown in Figure 1. First, the image preprocessing and feature
extraction module employs image processing algorithms to
demarcate the region of interest in the input image (eye or
face). For both face and iris image preprocessing, the paper
uses a new approach called block-based steerable pyramids for
feature extraction [5], [6]. Second, the matching score for
each trait is computed using the city-block distance. Finally,
the final score is generated using the sum of normalized
(Z-Score, Min-Max) scores at the fusion level, and is then
passed to the decision module.
978-1-61284-732-0/11/$26.00 ©2010 IEEE
Fig. 1. Block diagram of the face and iris multimodal biometric system (eye and face images → preprocessing and feature extraction modules → matching modules against the iris and face templates → fusion module → decision module → yes or no).
III. IMAGE PREPROCESSING AND FEATURE EXTRACTION
A. Image Preprocessing
Both face and iris images are preprocessed to extract only the
relevant part of the image containing useful information. For
the face, all images are aligned with respect to manually
detected eye coordinates, scaled and histogram equalized. The
iris images are segmented using the technique proposed by
Wildes, which is based on the circular Hough transform. For
more details of the image preprocessing, the reader can refer
to our previous works [5], [6].
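As an illustrative sketch (not the authors' code), the histogram-equalization step applied to both modalities can be written in a few lines of Python/NumPy; the toy image below is hypothetical:

```python
import numpy as np

def histogram_equalize(img):
    """Spread an 8-bit grayscale image's intensities over the full 0-255 range."""
    hist, _ = np.histogram(img.ravel(), bins=256, range=(0, 256))
    cdf = hist.cumsum()
    # Map every gray level through the normalized cumulative histogram.
    lut = (cdf - cdf.min()) * 255.0 / (cdf.max() - cdf.min())
    return lut[img].astype(np.uint8)

# Toy example: a low-contrast image (values 0..15) becomes full range.
img = (np.arange(64, dtype=np.uint8) // 4).reshape(8, 8)
eq = histogram_equalize(img)
print(eq.min(), eq.max())  # -> 0 255
```

Equalization flattens the intensity distribution, which reduces the effect of uneven illumination before feature extraction.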
B. Face and Iris Feature Extraction Method
Face and iris images contain many small, irregular, strongly
directional blocks [5], [6]. Therefore, the steerable pyramid
[7] decomposition can be used to split the features of face and
iris images into different sub-bands at different levels, with
'approximations' and 'details'. Since a one-level decomposition
may not be adequate to effectively isolate these features, it
is necessary to explore different combinations of sub-bands at
higher levels to obtain a suitable isolation. To better capture
multi-orientation information in face and iris images, a
straightforward solution is to calculate derivatives in all
directions, but this incurs a high computational cost. Based on
the steerable filter theorem [8], the derivative of an image in
any direction can instead be interpolated from several basis
derivative functions.
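As a minimal illustration of this steerability property, a first-derivative-of-Gaussian filter (the simplest steerable pair, not the third-order filters the paper uses) at any orientation can be interpolated from just two basis filters:

```python
import numpy as np

# Basis pair: x- and y-derivatives of a 2-D Gaussian. These are the two
# first-order basis filters; the paper itself uses third-order filters.
size, sigma = 9, 1.5
ax = np.arange(size) - size // 2
x, y = np.meshgrid(ax, ax)
g = np.exp(-(x**2 + y**2) / (2 * sigma**2))
g_x = -x / sigma**2 * g   # derivative filter oriented at 0 degrees
g_y = -y / sigma**2 * g   # derivative filter oriented at 90 degrees

def steer(theta):
    """Interpolate the derivative filter at angle theta from the basis pair."""
    return np.cos(theta) * g_x + np.sin(theta) * g_y

# The steered filter matches the analytic directional derivative exactly,
# so derivatives in every direction cost only two basis convolutions.
theta = np.pi / 3
analytic = -(x * np.cos(theta) + y * np.sin(theta)) / sigma**2 * g
print(np.allclose(steer(theta), analytic))  # -> True
```

This is why steering is cheap: one filters the image with the fixed basis pair once and then mixes the responses per orientation.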
Figures 2 and 3 show the analysis and synthesis representation
of the steerable pyramid transform for face and iris. The face
and iris images are decomposed into a steerable pyramid by six
oriented third-order band-pass basis filters. At the first
level, six sub-band images are obtained. In both figures, we
can see that each oriented filter is most sensitive to oriented
information (e.g. edges) perpendicular to the direction of the
filter. The steerable pyramid combines the spatial multi-scale
features of face and iris images with multi-orientation local
features. Such features are precisely those perceived by area
V1 (the first visual area) of the human visual cortex. It is
therefore reasonable to choose the steerable pyramid as the
local low-level feature for face and iris images.
Fig. 2. Three-stage & 6-orientation steerable pyramid transform of a face image.
Fig. 3. Three-stage & 6-orientation steerable pyramid transform of an iris image.
Feature Vectors: For both face and iris, to generate the image
database each image is decomposed into 3-level and
6-orientation sub-bands. The direct use of S-P coefficients may
not extract the most discriminative features, as these
coefficients contain much redundant and irrelevant information.
For an efficient and local representation of the face and iris
image, each S-P sub-band is first partitioned into a set of
equally-sized, non-overlapping blocks. It is commonly accepted
that statistical measures such as the mean, energy, variance
and entropy of the energy distribution of the S-P coefficients
for each sub-band at each decomposition level can be used to
identify a texture. Let $I_{ij}(x, y)$ be the image at block
$j$ of sub-band $i$; the resulting feature vector is
$\nu_{ij} = \{\mathrm{Mean}, \mathrm{Variance}, \mathrm{Energy\ or\ Entropy}\}$, where

$$\mathrm{Mean} = \frac{1}{M \times N} \sum_{x=1}^{M} \sum_{y=1}^{N} |I_{ij}(x, y)| \quad (1)$$

$$\mathrm{Variance} = \frac{1}{M \times N} \sum_{x=1}^{M} \sum_{y=1}^{N} |I_{ij}(x, y) - \mu_{ij}|^2 \quad (2)$$

$$\mathrm{Energy} = \sum_{x=1}^{M} \sum_{y=1}^{N} |I_{ij}(x, y)|^2 \quad (3)$$
where $M \times N$ is the size of $I_{ij}(x, y)$. Entropy is a
statistical measure of randomness that can be used to
characterize the texture of the input image. Entropy is defined as

$$\mathrm{Entropy} = -\sum_{p} \, p \log(p) \quad (4)$$
where $p$ ranges over the normalized histogram counts. The
feature vector of a face or iris image is then constructed by
concatenating the block measures into one large feature vector
$V = \bigcup_{i=1}^{k} \bigcup_{j=1}^{k_i} \{\nu_{ij}\}$, where
$k$ is the number of S-P sub-bands and $k_i$ is the number of
blocks in the $i$th sub-band. Therefore, we can extract the best features
and reduce the size of the data while keeping only the principal
discriminant features. Figure 4 shows the overall diagram of
the face and iris feature extraction.
Fig. 4. Diagram of the block-based S-P feature extraction process.
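The block-based statistics step above can be sketched as follows, assuming one S-P sub-band stored as a NumPy array and the 16 × 16 block size retained later in Section IV-B; the function name and toy data are illustrative, not from the authors' implementation:

```python
import numpy as np

def block_features(subband, block=16):
    """Partition one S-P sub-band into non-overlapping blocks and compute
    the per-block statistics of Eqs. (1)-(4): mean, variance, energy, entropy."""
    h, w = subband.shape
    feats = []
    for r in range(0, h - h % block, block):
        for c in range(0, w - w % block, block):
            b = subband[r:r + block, c:c + block]
            mean = np.abs(b).mean()                    # Eq. (1)
            var = (np.abs(b - b.mean()) ** 2).mean()   # Eq. (2)
            energy = (np.abs(b) ** 2).sum()            # Eq. (3)
            hist, _ = np.histogram(np.abs(b), bins=16)
            p = hist / hist.sum()
            p = p[p > 0]                               # 0*log(0) taken as 0
            entropy = -(p * np.log2(p)).sum()          # Eq. (4)
            feats.append((mean, var, energy, entropy))
    return np.asarray(feats)

# A 128x128 sub-band with 16x16 blocks yields an 8x8 grid of 64 blocks,
# i.e. 64 small feature tuples instead of 16384 raw coefficients.
subband = np.random.default_rng(0).standard_normal((128, 128))
V = block_features(subband)
print(V.shape)  # -> (64, 4)
```

This makes concrete how the block statistics shrink the stored template while preserving the local texture description.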
C. Face and Iris Matching
In this study, the city-block distance between two feature
vectors $V^p$ and $V^q$ is defined as

$$d_{pq} = \sum_{i=1}^{k} \sum_{j=1}^{k_i} |\nu^p_{ij} - \nu^q_{ij}| \quad (5)$$
The classification performance evaluation is based on the
pairwise distance matrix. If there are m training and n test
samples, the distance matrix is of size m × n, with each column
representing the distances from the corresponding test sample
to all training samples (classes). The lower the distance, the
closer the two samples.
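A minimal sketch of this matching step, assuming feature vectors stored as NumPy row vectors; the function names and toy data are illustrative:

```python
import numpy as np

def city_block_matrix(train, test):
    """Pairwise city-block distances (Eq. 5): entry (i, j) is the distance
    from training sample i to test sample j, giving an m x n matrix."""
    return np.abs(train[:, None, :] - test[None, :, :]).sum(axis=2)

def classify(train, train_labels, test):
    """Label each test sample with the class of its closest training sample."""
    D = city_block_matrix(train, test)          # m x n distance matrix
    return train_labels[np.argmin(D, axis=0)]   # column-wise minimum

# Toy example: two well-separated classes in a 2-D feature space.
train = np.array([[0.0, 0.0], [10.0, 10.0]])
labels = np.array([0, 1])
test = np.array([[0.5, 0.2], [9.0, 11.0]])
preds = classify(train, labels, test)
print(preds)  # -> [0 1]
```

The same column-wise minimum rule scales directly to the m × n matrices used in the experiments.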
IV. EXPERIMENTAL RESULTS
A. Biometric Database
We evaluate the proposed multimodal system on a data set of 108
subjects. Since the FERET database does not come with
corresponding iris images, we assign an arbitrary (but fixed)
iris class to each face class. In this way, we obtain a database
of 108 subjects, with 4 face images and 7 iris images per
subject. For the face:
• Our experiments are performed on a subset of the FERET face
database [9], using pairs of frontal images from the fa and fb
image sets (fa and fb are two views taken at the same time with
different expressions). The subset contains images of 108
individuals, with four frontal views of each: a neutral
expression and a change of expression from one session, and a
neutral expression and a change of expression from a second
session that took place three weeks after the first. All images
are aligned with respect to manually detected eye coordinates,
scaled to 128x128 pixels and histogram equalized. For each
individual in the set, the fa images are used for training and
the fb images for testing. Figure 5 depicts some sample images
from the FERET database.
Fig. 5. Faces from the FERET Face Database
• The iris database is used with authorization from the CASIA
iris image database collected by the Institute of Automation,
Chinese Academy of Sciences [10]. Figure 6 shows four iris
samples from the CASIA database. The database comprises 108
classes of eye images, each with 7 images (756 images in
total). The classes come from 108 different eyes, and each
image is a gray-level image with a resolution of 320 × 280.
Each class contains two groups, captured at different times:
the first has 3 images and the second has 4. The first three
images of each class constitute the training set and the
remaining images are used as the test set. For the
preprocessing stage, we adapted the open-source code provided
by Libor Masek [11] for Hough-transform segmentation to our
algorithm; 88% successful localization was achieved.
Fig. 6. Iris samples from CASIA Iris Database.
B. Optimal Block Size and Statistical Measures
To extract the S-P features we applied the procedure described
in Section III-B. Extensive experiments were conducted on
CASIA and FERET to determine the optimal block size and the
best statistical measure. As demonstrated in [5], [6], the
optimum block size is 8 × 8, but we prefer a block size of
16 × 16 since it provides the best tradeoff between accuracy
and execution time. Among the statistical measures, we retain
the Entropy as it gives the highest recognition accuracy. We
note that the experiments were carried out in Matlab 7.5.
C. Face and Iris Fusion
1) Normalization: An important aspect that has to be addressed
in fusion at the matching-score level is the normalization of
the scores obtained from the different domain experts [12].
Normalization typically involves mapping the scores obtained
from multiple domains into a common domain before combining
them. This can be viewed as a two-step process in which the
distribution of scores for each domain is first estimated using
robust statistical techniques [13], and these distributions are
then scaled or translated into a common domain. The most widely
used normalization methods in the literature are Z-Score and
Min-Max.
2) Combination: The simplest form of combination is to take
the sum of the scores from the two modalities. In this
experiment, both traits are combined at the matching-score
level using the sum-of-scores technique, preceded by Z-Score
or Min-Max normalization.
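A minimal sketch of the two normalization schemes and the sum rule, assuming distance-type scores (lower is better); the toy scores below are illustrative, not measured values:

```python
import numpy as np

def min_max(s):
    """Min-Max normalization: map scores into [0, 1]."""
    return (s - s.min()) / (s.max() - s.min())

def z_score(s):
    """Z-Score normalization: zero mean, unit standard deviation."""
    return (s - s.mean()) / s.std()

# Toy distance scores (lower = better match) from the two unimodal matchers.
face_scores = np.array([12.0, 30.0, 18.0, 25.0])
iris_scores = np.array([0.40, 0.90, 0.20, 0.70])

# Sum-rule fusion after Min-Max normalization: the fused decision is the
# candidate with the smallest combined score.
fused = min_max(face_scores) + min_max(iris_scores)
print(np.argmin(fused))  # -> 0
```

Normalizing first is what makes the raw sum meaningful: without it, the face distances (tens) would swamp the iris distances (fractions).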
Fig. 7. Recognition accuracy of fusion by the sum rule preceded by Min-Max normalization.
Fig. 8. Recognition accuracy of fusion by the sum rule preceded by Z-Score normalization.
Figures 7 and 8 show the recognition performance of the
different modalities. The results are very encouraging and
promising for research in this field. The overall accuracy of
the system is 99.50%, compared with 86.3% for the face and
94.1% for the iris, which means improvement rates of
approximately 13.2% and 5.4% respectively.
V. CONCLUSIONS
In this paper, we have proposed a multibiometric system based
on the fusion of face and iris at the score level, using an
efficient local appearance feature extraction method based on
the S-P. We combined the FERET face database and the CASIA
(version 1) iris database to construct a multimodal biometric
experimental database, with which we validated the proposed
approach and evaluated its performance. The experimental
results reveal that multimodal biometric verification is much
more reliable and precise than a single-biometric approach. In
the future, methods for fusing more biometric traits should be
explored for multimodal biometric identity authentication.
Future work should also validate the performance of the
block-based S-P approach for the fusion of face and iris
recognition on different and larger databases, and study the
effect of different statistical measures and classification
methods.
REFERENCES
[1] A. A. Ross, K. Nandakumar, and A. K. Jain, Handbook of Multibiometrics (International Series on Biometrics), Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2006.
[2] Y. Wang, T. Tan, and A. K. Jain, "Combining face and iris biometrics for identity verification," in AVBPA '03: Proceedings of the 4th International Conference on Audio- and Video-Based Biometric Person Authentication, 2003, pp. 805–813.
[3] Z. Zhang, R. Wang, K. Pan, S. Z. Li, and P. Zhang, "Fusion of near infrared face and iris biometrics," in ICB '07: Proceedings of the International Conference on Biometrics, vol. 4642, Seoul, Korea, 2007, pp. 172–180.
[4] A. Rattani and M. Tistarelli, "Robust multi-modal and multi-unit feature level fusion of face and iris biometrics," in ICB '09: Proceedings of the International Conference on Biometrics, vol. 5558, Alghero, Italy, 2009, pp. 960–969.
[5] M. El Aroussi, S. Ghouzali, M. El Hassouni, M. Rziza, and D. Aboutajdine, "Local appearance based face recognition method using block based steerable pyramid transform," Signal Processing, 2010.
[6] M. El Aroussi, K. Fakhar, M. Wahbi, and D. Aboutajdine, "Iris feature extraction based on steerable pyramid representation," in ISIVC '10: Proceedings of the 2010 5th International Symposium on I/V Communications and Mobile Networks, 2010, pp. 1–4.
[7] E. P. Simoncelli and W. T. Freeman, "The steerable pyramid: a flexible architecture for multi-scale derivative computation," in ICIP '95: Proceedings of the 1995 International Conference on Image Processing, vol. 3, Washington, DC, USA, 1995, p. 444, IEEE Computer Society.
[8] W. T. Freeman and E. H. Adelson, "The design and use of steerable filters," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 9, pp. 891–906, 1991.
[9] P. J. Phillips, H. Moon, P. J. Rauss, and S. Rizvi, "The FERET evaluation methodology for face recognition algorithms," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090–1104, Oct. 2000.
[10] CASIA Iris Image Database, Institute of Automation, Chinese Academy of Sciences.
[11] L. Masek, "Recognition of human iris patterns for biometric identification," Bachelor of Engineering thesis, The University of Western Australia, 2003.
[12] R. Brunelli and D. Falavigna, "Person identification using multiple cues," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 17, no. 10, pp. 955–966, 1995.
[13] F. Hampel, E. Ronchetti, P. Rousseeuw, and W. Stahel, Robust Statistics: The Approach Based on Influence Functions, John Wiley & Sons, 1986.