Fusion of Face and Iris Features Extraction Based on Steerable Pyramid Representation for
Multimodal Biometrics
Khalid Fakhar1, Mohamed El Aroussi1,2, Rachid Saadane1,2, Mohammed Wahbi2 & Driss Aboutajdine1
1 LRTT, Unité Associée au CNRST (URAC 29), Faculty of Sciences, Mohammed V University-Agdal, Rabat, Morocco
2 LETI-EHTP, Casablanca
Abstract—In this paper, we make a first attempt to combine face and iris biometrics using an efficient local appearance feature extraction method based on the steerable pyramid (S-P). To capture the intrinsic geometrical structure of face and iris images, the method decomposes each image into a set of directional sub-bands, with texture details captured in different orientations at various scales. Local information is extracted from the S-P sub-bands using block-based statistics to reduce the amount of data to be stored. The obtained local features are combined at the score level to develop a multimodal biometric approach, which diminishes the drawbacks of single-biometric approaches and improves the performance of the authentication system. We combine the FERET face database and the CASIA (version 1) iris database to construct a multimodal biometric experimental database, with which we validate the proposed approach and evaluate its performance. The experimental results reveal that multimodal biometric authentication is much more reliable and precise than a single-biometric approach.
Index Terms—Multimodal biometrics, Face, Iris, Steerable pyramid.
I. INTRODUCTION
Biometric authentication identifies a person using physiological
and/or behavioral characteristics such as the iris, face,
fingerprints, hand geometry, handwriting, retina, veins, and
speech. It is more reliable and capable than knowledge-based
(e.g., password) or token-based (e.g., key) techniques, since
biometric features can hardly be stolen or forgotten. However,
recognition based on any one of these modalities may not be
sufficiently robust, or may not be acceptable to a particular
user group or in a particular situation. Therefore, the
performance of single-biometric systems needs to be improved,
and multimodal biometric techniques offer a feasible way to
solve the problems of single-biometric systems [1].
A multimodal biometric system makes use of several biometric
traits simultaneously to authenticate a person's identity. The
key technique of a multimodal biometric system is information
fusion: an integration scheme is required to fuse the
information presented by the individual modalities. There are
three main strategies for building multimodal biometric systems.
The first is decision fusion, which combines the accept/reject
decisions of the unimodal systems. The second is feature fusion,
in which features extracted by multiple sensors are
concatenated. Finally, there is score-level fusion, which is
the strategy we use in this paper.
In this paper, the face and iris biometric traits are selected
to construct the multimodal biometric system because face
recognition is the most natural and acceptable modality for
identity authentication, whereas iris recognition is one of the
most accurate biometrics. Wang et al. proposed a method for
combining face and iris biometrics for identity verification in
2003, studying information fusion at the matching-score level
[2]. They also compared fusion methods in depth, including the
sum rule, Fisher Linear Discriminant analysis (FLD) and a
Radial Basis Function Neural Network (RBFNN). Some other
fusion strategies for face and iris [3], [4] have been proposed
in the past few years, but the use of steerable pyramids for
feature extraction in both modalities has never been explored.
Therefore, this paper presents a new method called block-based
steerable pyramids [5], which gives better performance and
accuracy for both traits (face & iris) [5], [6]. The proposed
approach also shows that the integration of face and iris
biometrics can achieve higher performance than may be possible
using a single biometric indicator alone.
The rest of this paper is organized as follows. Section 2
presents the multimodal biometric system, which is used to
increase the performance of the individual biometric traits.
Section 3 presents image preprocessing and feature extraction
using block-based steerable pyramids. The experimental results
are given in Section 4, and conclusions in the last section.
II. THE MULTIMODAL BIOMETRIC SYSTEM
The proposed multimodal biometric system for face and iris is
shown in Figure 1. First, the image preprocessing and feature
extraction module employs image processing algorithms to
demarcate the region of interest in the input image (eye or
face). For both face and iris image preprocessing, the paper
uses a new approach called block-based steerable pyramids for
feature extraction [5], [6]. Second, the matching score for
each trait is computed using the city-block distance. Finally,
the final score is generated using the sum of normalized
(Z-Score, Min-Max) scores at the fusion level, and is then
passed to the decision module.
978-1-61284-732-0/11/$26.00 ©2010 IEEE
Fig. 1. Block diagram of the face and iris multimodal biometric system (eye and face images → preprocessing and feature extraction modules → matching modules against the iris and face templates → fusion module → decision module → yes or no).
III. IMAGE PREPROCESSING AND FEATURE EXTRACTION
A. Image Preprocessing
Both face and iris images are preprocessed to extract only the
relevant part of the image containing useful information. For
the face, all images are aligned with respect to manually
detected eye coordinates, scaled and histogram equalized. The
iris images are segmented using the technique proposed by
Wildes, which is based on the circular Hough transform. For
more details of the image preprocessing, the reader can refer
to our previous works [5], [6].
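As an illustrative sketch (not the authors' code), the histogram-equalization step applied to both modalities can be written in a few lines of Python/NumPy; the toy image below is hypothetical:

```python
import numpy as np

def histogram_equalize(img):
    """Spread an 8-bit grayscale image's intensities over the full 0-255 range."""
    hist, _ = np.histogram(img.ravel(), bins=256, range=(0, 256))
    cdf = hist.cumsum()
    # Map every gray level through the normalized cumulative histogram.
    lut = (cdf - cdf.min()) * 255.0 / (cdf.max() - cdf.min())
    return lut[img].astype(np.uint8)

# Toy example: a low-contrast image (values 0..15) becomes full range.
img = (np.arange(64, dtype=np.uint8) // 4).reshape(8, 8)
eq = histogram_equalize(img)
print(eq.min(), eq.max())  # -> 0 255
```

Equalization flattens the intensity distribution, which reduces the effect of uneven illumination before feature extraction.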
B. Face and Iris Feature Extraction Method
Face and iris images contain many small, irregular, strongly
directional blocks [5], [6]. Therefore, the steerable pyramid
[7] decomposition can be used to split the features of face and
iris images into different sub-bands at different levels, with
'approximations' and 'details'. Since a one-level decomposition
may not be adequate to effectively isolate these features, it
is necessary to explore different combinations of sub-bands at
higher levels to obtain a suitable isolation. To better capture
multi-orientation information in face and iris images, a
straightforward solution is to calculate derivatives in all
directions, but this incurs a high computational cost. Based on
the steerable filter theorem [8], the derivative of an image in
any direction can instead be interpolated from several basis
derivative functions.
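As a minimal illustration of this steerability property, a first-derivative-of-Gaussian filter (the simplest steerable pair, not the third-order filters the paper uses) at any orientation can be interpolated from just two basis filters:

```python
import numpy as np

# Basis pair: x- and y-derivatives of a 2-D Gaussian. These are the two
# first-order basis filters; the paper itself uses third-order filters.
size, sigma = 9, 1.5
ax = np.arange(size) - size // 2
x, y = np.meshgrid(ax, ax)
g = np.exp(-(x**2 + y**2) / (2 * sigma**2))
g_x = -x / sigma**2 * g   # derivative filter oriented at 0 degrees
g_y = -y / sigma**2 * g   # derivative filter oriented at 90 degrees

def steer(theta):
    """Interpolate the derivative filter at angle theta from the basis pair."""
    return np.cos(theta) * g_x + np.sin(theta) * g_y

# The steered filter matches the analytic directional derivative exactly,
# so derivatives in every direction cost only two basis convolutions.
theta = np.pi / 3
analytic = -(x * np.cos(theta) + y * np.sin(theta)) / sigma**2 * g
print(np.allclose(steer(theta), analytic))  # -> True
```

This is why steering is cheap: one filters the image with the fixed basis pair once and then mixes the responses per orientation.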
Figures 2 and 3 show the analysis and synthesis representation
of the steerable pyramid transform for face and iris. The face
and iris images are decomposed into a steerable pyramid by six
oriented third-order band-pass basis filters. At the first
level, six sub-band images are obtained. In both figures, we
can see that each oriented filter is most sensitive to oriented
information (e.g. edges) perpendicular to the direction of the
filter. The steerable pyramid combines the spatial multi-scale
features of face and iris images with multi-orientation local
features. Such features are precisely those perceived by area
V1 (the first visual area) of the human visual cortex. It is
therefore reasonable to choose the steerable pyramid as the
local low-level feature for face and iris images.
Fig. 2. Three-stage & 6-orientation steerable pyramid transform of a face image.
Fig. 3. Three-stage & 6-orientation steerable pyramid transform of an iris image.
Feature Vectors: For both face and iris, to generate the image
database each image is decomposed into 3-level and
6-orientation sub-bands. The direct use of S-P coefficients may
not extract the most discriminative features, as these
coefficients contain much redundant and irrelevant information.
For an efficient and local representation of the face and iris
image, each S-P sub-band is first partitioned into a set of
equally-sized, non-overlapping blocks. It is commonly accepted
that statistical measures such as the mean, energy, variance
and entropy of the energy distribution of the S-P coefficients
for each sub-band at each decomposition level can be used to
identify a texture. Let $I_{ij}(x, y)$ be the image at block
$j$ of sub-band $i$; the resulting feature vector is
$\nu_{ij} = \{\mathrm{Mean}, \mathrm{Variance}, \mathrm{Energy\ or\ Entropy}\}$, where

$$\mathrm{Mean} = \frac{1}{M \times N} \sum_{x=1}^{M} \sum_{y=1}^{N} |I_{ij}(x, y)| \quad (1)$$

$$\mathrm{Variance} = \frac{1}{M \times N} \sum_{x=1}^{M} \sum_{y=1}^{N} |I_{ij}(x, y) - \mu_{ij}|^2 \quad (2)$$

$$\mathrm{Energy} = \sum_{x=1}^{M} \sum_{y=1}^{N} |I_{ij}(x, y)|^2 \quad (3)$$
where $M \times N$ is the size of $I_{ij}(x, y)$. Entropy is a
statistical measure of randomness that can be used to
characterize the texture of the input image. Entropy is defined as

$$\mathrm{Entropy} = -\sum_{p} \, p \log(p) \quad (4)$$
where $p$ ranges over the normalized histogram counts. The
feature vector of a face or iris image is then constructed by
concatenating the block measures into one large feature vector
$V = \bigcup_{i=1}^{k} \bigcup_{j=1}^{k_i} \{\nu_{ij}\}$, where
$k$ is the number of S-P sub-bands and $k_i$ is the number of
blocks in the $i$th sub-band. Therefore, we can extract the best features
and reduce the size of the data while keeping only the principal
discriminant features. Figure 4 shows the overall diagram of
the face and iris feature extraction.
Fig. 4. Diagram of the block-based S-P feature extraction process.
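The block-based statistics step above can be sketched as follows, assuming one S-P sub-band stored as a NumPy array and the 16 × 16 block size retained later in Section IV-B; the function name and toy data are illustrative, not from the authors' implementation:

```python
import numpy as np

def block_features(subband, block=16):
    """Partition one S-P sub-band into non-overlapping blocks and compute
    the per-block statistics of Eqs. (1)-(4): mean, variance, energy, entropy."""
    h, w = subband.shape
    feats = []
    for r in range(0, h - h % block, block):
        for c in range(0, w - w % block, block):
            b = subband[r:r + block, c:c + block]
            mean = np.abs(b).mean()                    # Eq. (1)
            var = (np.abs(b - b.mean()) ** 2).mean()   # Eq. (2)
            energy = (np.abs(b) ** 2).sum()            # Eq. (3)
            hist, _ = np.histogram(np.abs(b), bins=16)
            p = hist / hist.sum()
            p = p[p > 0]                               # 0*log(0) taken as 0
            entropy = -(p * np.log2(p)).sum()          # Eq. (4)
            feats.append((mean, var, energy, entropy))
    return np.asarray(feats)

# A 128x128 sub-band with 16x16 blocks yields an 8x8 grid of 64 blocks,
# i.e. 64 small feature tuples instead of 16384 raw coefficients.
subband = np.random.default_rng(0).standard_normal((128, 128))
V = block_features(subband)
print(V.shape)  # -> (64, 4)
```

This makes concrete how the block statistics shrink the stored template while preserving the local texture description.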
C. Face and Iris Matching
In this study, the city-block distance between two feature
vectors $V^p$ and $V^q$ is defined as

$$d_{pq} = \sum_{i=1}^{k} \sum_{j=1}^{k_i} |\nu^p_{ij} - \nu^q_{ij}| \quad (5)$$
The classification performance evaluation is based on the
pairwise distance matrix. If there are m training and n test
samples, the distance matrix is of size m × n, with each column
representing the distances from the corresponding test sample
to all training samples (classes). The lower the distance, the
closer the two samples.
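A minimal sketch of this matching step, assuming feature vectors stored as NumPy row vectors; the function names and toy data are illustrative:

```python
import numpy as np

def city_block_matrix(train, test):
    """Pairwise city-block distances (Eq. 5): entry (i, j) is the distance
    from training sample i to test sample j, giving an m x n matrix."""
    return np.abs(train[:, None, :] - test[None, :, :]).sum(axis=2)

def classify(train, train_labels, test):
    """Label each test sample with the class of its closest training sample."""
    D = city_block_matrix(train, test)          # m x n distance matrix
    return train_labels[np.argmin(D, axis=0)]   # column-wise minimum

# Toy example: two well-separated classes in a 2-D feature space.
train = np.array([[0.0, 0.0], [10.0, 10.0]])
labels = np.array([0, 1])
test = np.array([[0.5, 0.2], [9.0, 11.0]])
preds = classify(train, labels, test)
print(preds)  # -> [0 1]
```

The same column-wise minimum rule scales directly to the m × n matrices used in the experiments.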
IV. EXPERIMENTAL RESULTS
A. Biometric Database
We evaluate the proposed multimodal system on a data set of 108
subjects. Since the FERET database does not come with
corresponding iris images, we assign an arbitrary (but fixed)
iris class to each face class. In this way, we obtain a database
of 108 subjects, with 4 face images and 7 iris images per
subject. For the face:
• Our experiments are performed on a subset of the FERET face
database [9], using pairs of frontal images from the fa and fb
image sets (fa and fb are two views taken at the same time with
different expressions). The subset contains images of 108
individuals, with four frontal views of each: a neutral
expression and a change of expression from one session, and a
neutral expression and a change of expression from a second
session that took place three weeks after the first. All images
are aligned with respect to manually detected eye coordinates,
scaled to 128x128 pixels and histogram equalized. For each
individual in the set, the fa images are used for training and
the fb images for testing. Figure 5 depicts some sample images
from the FERET database.
Fig. 5. Faces from the FERET Face Database
• The iris database is used with authorization from the CASIA
iris image database collected by the Institute of Automation,
Chinese Academy of Sciences [10]. Figure 6 shows four iris
samples from the CASIA database. The database comprises 108
classes of eye images, each with 7 images (756 images in
total). The classes come from 108 different eyes, and each
image is a gray-level image with a resolution of 320 × 280.
Each class contains two groups, captured at different times:
the first has 3 images and the second has 4. The first three
images of each class constitute the training set and the
remaining images are used as the test set. For the
preprocessing stage, we adapted the open-source code provided
by Libor Masek [11] for Hough-transform segmentation to our
algorithm; 88% successful localization was achieved.
Fig. 6. Iris samples from CASIA Iris Database.
B. Optimal Block Size and Statistical Measures
To extract the S-P features we applied the procedure described
in Section III-B. Extensive experiments were conducted on
CASIA and FERET to determine the optimal block size and the
best statistical measure. As demonstrated in [5], [6], the
optimum block size is 8 × 8, but we prefer a block size of
16 × 16 since it provides the best tradeoff between accuracy
and execution time. Among the statistical measures, we retain
the Entropy as it gives the highest recognition accuracy. We
note that the experiments were carried out in Matlab 7.5.
C. Face and Iris Fusion
1) Normalization: An important aspect that has to be addressed
in fusion at the matching-score level is the normalization of
the scores obtained from the different domain experts [12].
Normalization typically involves mapping the scores obtained
from multiple domains into a common domain before combining
them. This can be viewed as a two-step process in which the
distribution of scores for each domain is first estimated using
robust statistical techniques [13], and these distributions are
then scaled or translated into a common domain. The most widely
used normalization methods in the literature are Z-Score and
Min-Max.
2) Combination: The simplest form of combination is to take
the sum of the scores from the two modalities. In this
experiment, both traits are combined at the matching-score
level using the sum-of-scores technique, preceded by Z-Score
or Min-Max normalization.
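A minimal sketch of the two normalization schemes and the sum rule, assuming distance-type scores (lower is better); the toy scores below are illustrative, not measured values:

```python
import numpy as np

def min_max(s):
    """Min-Max normalization: map scores into [0, 1]."""
    return (s - s.min()) / (s.max() - s.min())

def z_score(s):
    """Z-Score normalization: zero mean, unit standard deviation."""
    return (s - s.mean()) / s.std()

# Toy distance scores (lower = better match) from the two unimodal matchers.
face_scores = np.array([12.0, 30.0, 18.0, 25.0])
iris_scores = np.array([0.40, 0.90, 0.20, 0.70])

# Sum-rule fusion after Min-Max normalization: the fused decision is the
# candidate with the smallest combined score.
fused = min_max(face_scores) + min_max(iris_scores)
print(np.argmin(fused))  # -> 0
```

Normalizing first is what makes the raw sum meaningful: without it, the face distances (tens) would swamp the iris distances (fractions).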
Fig. 7. Recognition accuracy of fusion by the sum rule preceded by Min-Max normalization.
Fig. 8. Recognition accuracy of fusion by the sum rule preceded by Z-Score normalization.
Figures 7 and 8 show the recognition performance of the
different modalities. The results are very encouraging and
promising for research in this field. The overall accuracy of
the system is 99.50%, compared with 86.3% for the face and
94.1% for the iris, which means improvement rates of
approximately 13.2% and 5.4% respectively.
V. CONCLUSIONS
In this paper, we have proposed a multibiometric system based
on the fusion of face and iris at the score level, using an
efficient local appearance feature extraction method based on
the S-P. We combined the FERET face database and the CASIA
(version 1) iris database to construct a multimodal biometric
experimental database, with which we validated the proposed
approach and evaluated its performance. The experimental
results reveal that multimodal biometric verification is much
more reliable and precise than a single-biometric approach. In
the future, methods for fusing more biometric traits should be
explored for multimodal biometric identity authentication.
Future work should also validate the performance of the
block-based S-P approach for the fusion of face and iris
recognition on different and larger databases, and study the
effect of different statistical measures and classification
methods.
REFERENCES
[1] A. A. Ross, K. Nandakumar, and A. K. Jain, Handbook of Multibiometrics (International Series on Biometrics), Springer-Verlag New York, Inc., Secaucus, NJ, USA, 2006.
[2] Y. Wang, T. Tan, and A. K. Jain, "Combining face and iris biometrics for identity verification," in AVBPA '03: Proceedings of the 4th International Conference on Audio- and Video-Based Biometric Person Authentication, 2003, pp. 805–813.
[3] Z. Zhang, R. Wang, K. Pan, S. Z. Li, and P. Zhang, "Fusion of near infrared face and iris biometrics," in ICB '07: Proceedings of the International Conference on Biometrics, vol. 4642, Seoul, Korea, 2007, pp. 172–180.
[4] A. Rattani and M. Tistarelli, "Robust multi-modal and multi-unit feature level fusion of face and iris biometrics," in ICB '09: Proceedings of the International Conference on Biometrics, vol. 5558, Alghero, Italy, 2009, pp. 960–969.
[5] M. El Aroussi, S. Ghouzali, M. El Hassouni, M. Rziza, and D. Aboutajdine, "Local appearance based face recognition method using block based steerable pyramid transform," Signal Processing, 2010.
[6] M. El Aroussi, K. Fakhar, M. Wahbi, and D. Aboutajdine, "Iris feature extraction based on steerable pyramid representation," in ISIVC '10: Proceedings of the 2010 5th International Symposium on I/V Communications and Mobile Networks, 2010, pp. 1–4.
[7] E. P. Simoncelli and W. T. Freeman, "The steerable pyramid: a flexible architecture for multi-scale derivative computation," in ICIP '95: Proceedings of the 1995 International Conference on Image Processing, vol. 3, Washington, DC, USA, 1995, p. 444, IEEE Computer Society.
[8] W. T. Freeman and E. H. Adelson, "The design and use of steerable filters," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 13, no. 9, pp. 891–906, 1991.
[9] P. J. Phillips, H. Moon, P. J. Rauss, and S. Rizvi, "The FERET evaluation methodology for face recognition algorithms," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 22, no. 10, pp. 1090–1104, Oct. 2000.
[10] CASIA Iris Image Database, Institute of Automation, Chinese Academy of Sciences.
[11] L. Masek, "Recognition of human iris patterns for biometric identification," Bachelor of Engineering thesis, The University of Western Australia, 2003.
[12] R. Brunelli and D. Falavigna, "Person identification using multiple cues," IEEE Trans. Pattern Analysis and Machine Intelligence, vol. 17, no. 10, pp. 955–966, 1995.
[13] F. Hampel, E. Ronchetti, P. Rousseeuw, and W. Stahel, Robust Statistics: The Approach Based on Influence Functions, John Wiley & Sons, 1986.