
School of Engineering and Architecture

Department for Electrical Engineering

Quellgasse 21, CH-2501 Biel

www.hta-bi.bfh.ch

Bachelor's thesis and Preliminary studies

Smart Image System

Bettler Matthias & Zahnd Samuel

Supervisors: Dr. J. Goette & Dr. M. Jacomet

23rd January 2001

Author: Bettler Matthias & Zahnd Samuel

Supervisors: Dr. J. Goette & Dr. M. Jacomet

Web publication: www.microlab.ch/academics/r and d/diplom

Word processor: LaTeX

For all who contributed to the success of this work.

For my parents and their support during my studies.

Matthias

For finitude - for eternity: decide today whom you will serve.

But as for me and my house, we will serve the Lord. Joshua 24:15.

Samuel

Abstract

Since Jack S. Kilby realised the first integrated circuit in 1958 in the Texas Instruments laboratories, integrated circuit technology has improved at an amazing pace. Today, CMOS technology is the state of the art in semiconductor manufacturing, and various well-proven processes are cheaply available. Nevertheless, CMOS technology has been used to manufacture image sensors for only a few years. The image-sensor market has been, and still is, dominated by sensors known as charge-coupled devices (CCDs). CCD sensors use special manufacturing processes that are not compatible with standard CMOS technology.

The main advantage of CMOS-based image sensors is the ability to place additional electric circuits on the same chip, e.g. gain circuits at pixel level or circuits for signal processing. Another advantage is the ongoing development of CMOS technology by numerous companies.

In the first part of our work we developed a CMOS image sensor with a high dynamic range covering many orders of magnitude of illumination. By using a logarithmic pixel circuit we extended the dynamic range to 6 decades. The circuit consists of a photosensitive area and a gain circuit. The most demanding part of our image-sensor design was deriving accurate dimensions for the transistors of the pixel gain circuit. By calculation and simulation we finally found the optimal transistor sizes and drew the pixel layout. The chip was manufactured in the Alcatel Mietec 0.5 µm process.

While the chip was being manufactured we dealt with image processing algorithms. We concentrated on a completely new approach, cellular neural networks (CNNs). CNNs implemented on a chip provide an ideal structure for fast analog image processing: image data are processed in parallel, and in terms of speed CNNs outperform any digital processor. Leon O. Chua developed the architecture of these networks in 1988, and many papers have been published since. First we studied how CNNs work, then we tried to configure the networks using genetic algorithms. We then decided to buy a so-called CNN Universal Machine, which allows us to verify our template designs and to process the data from our image sensor. The CNN Universal Machine consists essentially of an integrated circuit on which numerous configurable CNNs are implemented.

Another important part of our work was the measurement of our image sensor. First of all, we focused on measuring the light intensity range, and we successfully verified our simulated data. Another challenge was the elimination of the so-called fixed pattern noise. This noise is well known in connection with logarithmic sensors but can be eliminated with suitable methods. Our image sensor was developed as a prototype and optimized for a large dynamic range of light intensity. Within these constraints our sensor fulfils the requirements.

Thesis Report { Smart Image System i

Summary

Since Jack S. Kilby realised the first integrated circuits at Texas Instruments in 1958, integrated circuit technology has been developed and improved at breakneck speed. Today CMOS technology is the driving force in semiconductor manufacturing and is cheaply available in numerous well-established processes. Nevertheless, this technology has been used to manufacture image sensors for only a few years, and commercial products have reached the market only recently. Until now, image sensing has been dominated by sensors that work on the charge-coupling principle (CCD). Such image sensors require specially adapted manufacturing processes.

The main advantage of CMOS-based image sensors is that additional circuit elements, for example amplifier circuits at pixel level or image processing algorithms, can be integrated directly on one and the same chip. A further advantage is the continuous development of CMOS technology by industry. In the first step of our work we developed a CMOS image sensor with a very high light-intensity dynamic range of about 6 decades. We achieved this with a so-called logarithmic pixel which, besides the usual photosensitive element, contains additional circuit elements. The precise dimensioning of the transistors required for the circuit was the greatest challenge. Through calculations and accurate simulations we finally determined the optimal transistor sizes and created the pixel layout. The chip was realised in the Alcatel Mietec 0.5 µm process.

While our sensor was being produced we studied image processing algorithms and came across the entirely new field of cellular neural networks (CNNs). At the circuit level, CNNs are analog circuits that are interconnected with each other, and their structure therefore makes them virtually predestined for image processing. Image data are processed in parallel and in analog form, leaving digital processors far behind in terms of speed. The architecture of these networks was developed by Leon O. Chua in 1988, and numerous papers on the topic have been published since. We first studied how these networks function and, in a second step, tried to configure the CNNs. To compute the configuration data, the so-called templates, we used a genetic algorithm. To test the CNNs in hardware as well, we decided to buy a so-called CNN Universal Machine. The CNN Universal Machine essentially consists of a chip on which programmable CNNs are implemented and which can be configured from a computer. With this system we were able to verify our template design and to process images from our chip.

Another significant part of our work was the characterisation of our image sensor. The measurement of the light intensity was of particular interest; we successfully verified our simulated data. A further challenge was the removal of the so-called fixed pattern noise. This noise is known in connection with logarithmic sensors but can be eliminated with suitable methods. Our image sensor was optimised as a prototype for a maximal light-intensity range and met the requirements.

iv M. Bettler & S. Zahnd

Foreword

The main goal of this work was to combine an image sensing system with an image processing

system in a smart image system. Figure 0.2 shows the time schedule of our work. We

completed and updated the documentation of all the parts in this paper during the bachelor's

thesis.

Figure 0.1: Smart Image System.

The project is divided into three main parts as follows.

Figure 0.2: Project time schedule.

Image Sensing (Preliminary 1) In this part we accomplished our first implementation of a CMOS image sensor. This enables us to capture our first pictures, to verify our simulations and calculations, and finally to determine unknown physical parameters such as the actual dependence of the electric signals on light.

Image Processing (Preliminary 2) In this part we studied the properties of cellular

neural networks. We presented a complete method to design CNN templates in the frequency

domain by using genetic algorithms as an optimization algorithm. Finally we introduced the

concept of the CNN Universal Machine that we will use in the bachelor's thesis for image

processing.

Smart Image System (Bachelor's Thesis) In the bachelor's thesis we measured many of the important properties of a CMOS image sensor. To perform these measurements we built specific hardware and software. To complete our work we combined the two systems.


Additional Information

for Experts

Figure 0.3 presents the job scheduling of our bachelor's thesis. It covers the eight weeks of the final project. We started the first project at the end of October and finished it in the middle of December 2000.

Each of the preliminary studies covers a period of half a year; they were carried out in winter 1999 and summer 2000.

Figure 0.3: Job scheduling.


Contents

I Image Sensing
  1 Basics of a CMOS image sensor
    1.1 The perfect model
    1.2 The photodiode
  2 Logarithmic APS chip
    2.1 The sensor core
      2.1.1 Simulation of the sensor core
      2.1.2 Implementation of the sensor core
    2.2 Input/Output
      2.2.1 Input (digital part)
      2.2.2 Input (analog part)
      2.2.3 Output (analog part)
    2.3 The complete chip
  3 Chip development at MicroLab
    3.1 Process Fabrication
    3.2 Design-flow for the digital part
    3.3 Design-flow for the analog part
    3.4 Assembling (analog & digital)

II Image Processing
  1 Introduction to Cellular Neural Networks
    1.1 Cellular Neural Networks
      1.1.1 Network Structure
    1.2 Mathematical Description
      1.2.1 Cell Dynamics
      1.2.2 Spatial Invariance
    1.3 Electrical Description
      1.3.1 Cell Schematic
      1.3.2 Network operation
    1.4 Types of Processing Tasks
    1.5 Design of CNN Templates and their Robustness
  2 CNN Template Design for Image Processing
    2.1 Convolution Formulation
    2.2 Spatial Frequency Formulation
    2.3 Template Design Examples
      2.3.1 Low-Pass Filter Design
  3 CNN Optimization Techniques
    3.1 Genetic Algorithms
      3.1.1 How Genetic Algorithms are Different from Traditional Methods
      3.1.2 Genetic Search Mechanism
      3.1.3 Some Mathematical Foundations
      3.1.4 Design of the Fitness Function
      3.1.5 GA Based Template Learning
  4 Hardware Implementation
    4.1 CNN Universal Machine

III Smart Image System
  1 Obscura - Image Sensing Unit
    1.1 PCB Hardware
    1.2 Hardware Interface Card - dSpace
    1.3 Optics
    1.4 Software
  2 Obscura - Measurement
    2.1 Introduction to Fixed Pattern Noise (FPN)
      2.1.1 Transistor Mismatch in Weak Inversion
      2.1.2 Fixed Pattern Noise Correction in Logarithmic Image Sensors
    2.2 Photoreceptor response and fixed pattern noise
      2.2.1 Response curves
      2.2.2 Remaining fixed pattern noise
      2.2.3 Slope variations
    2.3 Complementing measurements
      2.3.1 Crosstalk
      2.3.2 Temporal Noise
    2.4 Spectral characteristic
  3 Aladdin - Image Processing Unit
  4 Conclusion
    4.1 Current state of the project
    4.2 Post-script

A The Layout Structure
B Analysis of the pixel circuit
C Content of the CD-ROM

Bibliography

Part I

Image Sensing


Preface

The goal of the first part of our work is to develop a single-chip image sensor that provides a high dynamic range with respect to light intensity. We use a CMOS technology in order to achieve better results than standard CCD implementations. By using a CMOS technology we expect to approximate well the characteristics of the human eye with respect to light intensity.

With this project we test the current, cheap technology of CMOS image sensors in order to gain knowledge for further implementations in embedded systems at the MicroLab¹.

The main ideas for our work come from [23]. In that article, CMOS technology is used to build an image sensor for a retina-implant system that will provide visual sensations to patients suffering from photoreceptor degeneration.

¹ The MicroLab is the laboratory for microelectronics at the School of Engineering and Architecture in Biel.


1 Basics of a CMOS image sensor

The overall aim of our work is to approximate the human eye by a microelectronic system based on a CMOS technology.

1.1 The perfect model

The human eye is a brilliant creation. Its main part is the retina, a thin sheet of neural tissue that partially lines the orb of the eye. This tiny outpost of the central nervous system is responsible for collecting all the visual information about properties of objects in the world over many orders of magnitude of illumination.

The high degree to which a perceived image is independent of the absolute illumination

level is, in a large part, the result of the initial analog stages of retinal processing, from the

photoreceptors through the outer plexiform layer. This processing relies on lateral inhibition

to adapt the system to a wide range of viewing conditions, and to produce an output that

is independent of the absolute illumination level.

The major parts of the retina are shown in cross-section in Figure 1.1. Light is transduced into electrical potential by the photoreceptors at the top. After further processing and combining, the signals leave the retina by way of the ganglion cells.

Figure 1.1: Cross-section of a primate retina, indicating the primary cell types and signal pathways. The outer-plexiform layer is beneath the foot of the photoreceptors. The invagination into the foot of the photoreceptor is the site of the triad synapse. In the center of the invagination is a bipolar-cell process, flanked by two horizontal-cell processes. R: photoreceptor, H: horizontal cell, IB: invaginating bipolar cell, FB: flat bipolar cell, A: amacrine cell, IP: inter plexiform cell, G: ganglion cell. (Source: [17], p. 258)

The primary function of the photoreceptor is to transduce light into an electrical signal.



For intermediate levels of illumination, this signal is proportional to the logarithm of the

incoming light intensity. The logarithmic nature of the output of the biological photoreceptor

is supported by psychological and electro-physical evidence. This has two important system-

level consequences:

1. An intensity range of many orders of magnitude is compressed into a manageable range

in signal level.

2. The voltage difference between two points depends only on the contrast ratio between the corresponding points in the image (it is proportional to the logarithm of that ratio), not on the absolute illumination level.
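Both consequences can be checked numerically. The following is a minimal sketch, assuming an idealised photoreceptor with output V = V0 + S log10(I / I0); the slope S, the offset V0 and the reference intensity I0 are illustrative values chosen here, not measured data from this work.

```python
import math

def log_response(intensity, slope=0.07, v0=2.0, i_ref=1.0):
    """Idealised logarithmic photoreceptor output in volts.

    slope (volts per decade), v0 and i_ref are assumed illustration values.
    """
    return v0 + slope * math.log10(intensity / i_ref)

def voltage_difference(i_a, i_b, slope=0.07):
    """Voltage difference between two image points with intensities i_a and i_b."""
    return log_response(i_a, slope) - log_response(i_b, slope)

# 1. Six decades of intensity are compressed into a swing of 6 * slope volts.
swing = log_response(1e3) - log_response(1e-3)

# 2. The same 2:1 contrast gives the same voltage difference,
#    whether the scene is dim or bright.
dim = voltage_difference(2e-3, 1e-3)
bright = voltage_difference(2e2, 1e2)

print(f"swing over 6 decades:       {swing:.3f} V")
print(f"2:1 contrast, dim scene:    {dim * 1e3:.2f} mV")
print(f"2:1 contrast, bright scene: {bright * 1e3:.2f} mV")
```

The absolute level cancels in the difference, which is why a logarithmic sensor reports contrast independently of the illumination.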

We discuss our implementation of a CMOS image sensor in Section 2.1.1.

1.2 The photodiode

To detect optical radiation (photons) a photodiode is needed. The basic detection process is illustrated in Figure 1.2, which shows a p-n photodiode. The device is reverse biased, and the electric field developed across the p-n junction sweeps mobile carriers (holes and electrons) to their respective majority sides. A depletion layer is therefore created on either side of the junction. This barrier stops the majority carriers from crossing the junction in the direction opposite to the field. However, the field accelerates minority carriers from both sides to the opposite side of the junction, forming the reverse leakage current of the diode. Thus, intrinsic conditions are created in the depletion region.

A photon incident in or near the depletion region of this device that has an energy greater than or equal to the bandgap energy E_G of the fabricating material (i.e. hf ≥ E_G) will excite an electron from the valence band into the conduction band. This process leaves an empty hole in the valence band and is known as the photo-generation of an electron-hole pair, as shown in Figure 1.2(a). Carrier pairs generated near the junction are separated and swept (drift) under the influence of the electric field, producing a displacement current in the external circuit in excess of any reverse leakage current (Figure 1.2(b)). Figure 1.2(c) shows the photo-generation and the separation of a carrier pair in the depletion region of this reverse biased p-n junction.
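The condition hf ≥ E_G can be turned into a cutoff wavelength. The sketch below assumes a silicon bandgap of 1.12 eV, a common room-temperature textbook value that is not taken from this report, and checks whether a photon of a given wavelength can generate an electron-hole pair.

```python
PLANCK = 6.62607015e-34              # Planck's constant h in J*s
LIGHT_SPEED = 2.99792458e8           # speed of light c in m/s
ELEMENTARY_CHARGE = 1.602176634e-19  # q in C (equals J per eV)

SILICON_BANDGAP_EV = 1.12  # assumed textbook value for silicon at room temperature

def photon_energy_ev(wavelength_m):
    """Photon energy hf = hc / lambda, expressed in electron volts."""
    return PLANCK * LIGHT_SPEED / wavelength_m / ELEMENTARY_CHARGE

def can_generate_pair(wavelength_m, bandgap_ev=SILICON_BANDGAP_EV):
    """True if the photon energy reaches the bandgap energy (hf >= E_G)."""
    return photon_energy_ev(wavelength_m) >= bandgap_ev

def cutoff_wavelength_m(bandgap_ev=SILICON_BANDGAP_EV):
    """Longest wavelength whose photons still satisfy hf >= E_G."""
    return PLANCK * LIGHT_SPEED / (bandgap_ev * ELEMENTARY_CHARGE)

for nm in (400, 700, 1200):
    wavelength = nm * 1e-9
    print(f"{nm} nm: {photon_energy_ev(wavelength):.2f} eV, "
          f"pair generation possible: {can_generate_pair(wavelength)}")
print(f"cutoff wavelength: {cutoff_wavelength_m() * 1e9:.0f} nm")
```

Under this assumption, both wavelengths used in Figure 1.3 (400 nm and 700 nm) lie well below the cutoff.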

The absorption of photons in a photodiode to produce carrier pairs, and thus a photocurrent, depends on the quantum efficiency QE, the area of the depletion region, the wavelength, and the power of the incident light. Formula (1.1) gives a definition of the quantum efficiency. Formula (1.2) shows that, for a specific wavelength, the photocurrent is proportional to the power of the radiation. Note that one of the major factors that determines the quantum efficiency is the semiconductor material. Finally, the QE is a function of the photon wavelength and must be quoted for a specific wavelength.

\[ QE = \frac{\text{number of electrons collected}}{\text{number of incident photons}}. \qquad (1.1) \]

\[ I_p = \frac{QE(\lambda)\, q\, \lambda\, P_L A_D}{h c}. \qquad (1.2) \]


Figure 1.2: Operation of the p-n photodiode. (a) Photo-generation of an electron-hole pair;

(b) the structure of the reverse biased p-n junction; (c) energy band diagram of the reverse

biased p-n junction. (Source: [24], p. 423)

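In formula (1.2), q is the electronic charge, h is Planck's constant, c is the speed of light and λ is the wavelength of the incident light. Judging from the axes and parameters of Figure 1.3, P_L is the irradiance and A_D the photodiode area, so that P_L A_D is the incident optical power.

Figure 1.3 plots formula (1.2) for two given wavelengths.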

Figure 1.3: Photocurrent versus the power of the incident light, see formula (1.2). [Plot of the photocurrent in A against the irradiance in W/m², both on logarithmic axes, for λ = 400 nm and λ = 700 nm, with QE = 0.8 and area = 1714 µm².]
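The curves of Figure 1.3 can be reproduced approximately from formula (1.2). This is a minimal sketch, assuming that P_L is the irradiance in W/m², that A_D is the photodiode area, and that QE = 0.8 holds at both wavelengths as the plot legend suggests; the function name is chosen here for illustration.

```python
PLANCK = 6.62607015e-34              # Planck's constant h in J*s
LIGHT_SPEED = 2.99792458e8           # speed of light c in m/s
ELEMENTARY_CHARGE = 1.602176634e-19  # electronic charge q in C

def photocurrent(irradiance_w_m2, wavelength_m, qe=0.8, area_m2=1714e-12):
    """Photocurrent in A according to formula (1.2).

    Defaults follow the legend of Figure 1.3 (QE = 0.8, area = 1714 um^2).
    """
    optical_power = irradiance_w_m2 * area_m2  # P_L * A_D in W
    return (qe * ELEMENTARY_CHARGE * wavelength_m * optical_power
            / (PLANCK * LIGHT_SPEED))

# Sweep the irradiance over the range of the plot, one value per decade.
for wavelength in (400e-9, 700e-9):
    for exponent in range(-4, 4):
        irradiance = 10.0 ** exponent
        current = photocurrent(irradiance, wavelength)
        print(f"lambda = {wavelength * 1e9:.0f} nm, "
              f"P_L = 1e{exponent} W/m^2, I_p = {current:.3e} A")
```

Because the relation is linear in P_L, each curve is a straight line of slope one on the double-logarithmic axes, and the two curves differ only by the wavelength ratio 700/400.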

In our image sensor the CMOS-compatible photodiode is formed between the drain diffusion and the p⁻-substrate. For further information about our implementation see Section 2.1.2.


2 Logarithmic APS chip

The following sections present the integration of a CMOS image sensor with an active pixel sensor (APS) structure on a chip.

2.1 The sensor core

2.1.1 Simulation of the sensor core

The chosen circuit topology is based on [23] and has to meet the following requirements.

1. Logarithmic light detection and conversion into a voltage in a useful linear range.

2. Minimal power consumption by operating the MOSFETs in subthreshold mode.

3. Minimal power dissipation by cutting off inactive cells.

4. Maximal dynamic range over many orders of magnitude of illumination.

Figure 2.1 shows the circuit for one picture element. For a detailed mathematical description see Appendix B. By simulating¹ the circuit, we found the optimal sizes for each MOSFET. Table 2.1 shows the results. Figure 2.2 shows the simulated signals (Vout, Iphoto) of the circuit versus the light irradiance, which is modeled by varying the photodiode area.

Transistor   Width [µm]   Length [µm]
Q1           40           1.2
Q2           1.2          1.2
Q3           1.2          1.2

Table 2.1: Optimal sizes for the MOSFETs of the pixel circuit.

2.1.2 Implementation of the sensor core

This was one of the most important parts of our work. The goal was to implement the simulated cell structure; see Section 2.1.1. A further aim was to optimize the fill factor, that is, the relation between the light-sensitive area and the rest of the pixel's area.

For our first chip we chose a pixel size of 50 × 50 µm. This size is very large, but useful for a first implementation. The result of our effort is the layout of one pixel, presented in Figure 2.3.

¹ We are not able to simulate the influence of light on the circuit directly, because there are unknown physical parameters that depend on the fabrication process. We therefore model the influence of light by varying the area of the photoreceptor in the simulation: according to formula (1.2) of Chapter 1, the photocurrent is proportional to the product of irradiance and area.



Figure 2.1: Schematic of the circuit for one pixel. Of all picture elements, exactly one row is active at a time, depending on the digital Row Select signal. All the generated output signals then flow along the Column Readout lines and are later multiplexed. By turning off the analog reference signal Row Preselect, we minimize the power consumption.

Figure 2.2: Simulated pixel circuit, which works linearly over more than seven decades. The influence of the light intensity was modeled by varying the area of the photodiode.



Figure 2.3: Layout of the pixel circuit.

2.2 Input/Output

The global structure of the input and output configuration is shown in Figure 2.4. For further information on the implementation, see Appendix A.

Figure 2.4: Block diagram of the on-chip I/O system, divided into analog and digital parts and into input and output sections.



2.2.1 Input (digital part)

The digital part of our chip consists of a 6-to-48 demultiplexer (Demux) and a NAND unit. The Demux, with its active-low² outputs, switches on the p-pass-transistors that supply each row of our chip with the analog reference voltage (VpsRef). A further NAND combination with VsEna performs the selection of each row (vs<0:47>) of our 48 × 48 pixel image sensor.
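The row addressing can be sketched behaviourally. This is not the gate-level VHDL design: the function names are hypothetical, and the enable polarity is an assumption taken from Table 2.2 ("only when EnaVs is low, the addressed row is selected").

```python
N_ROWS = 48  # rows of the 48 x 48 pixel array

def demux_active_low(row_address):
    """6-to-48 demultiplexer with active-low outputs.

    Returns 48 logic levels; the addressed row carries 0, all others 1,
    so that exactly one p-pass-transistor is switched on.
    """
    if not 0 <= row_address < N_ROWS:
        raise ValueError("row address must be in 0..47")
    return [0 if row == row_address else 1 for row in range(N_ROWS)]

def row_select(row_address, ena_vs):
    """Behavioural model of the row selection vs<0:47>.

    Assumption: a row is selected only if it is addressed and the
    enable signal is low. Returns a list of booleans (True = selected).
    """
    preselect = demux_active_low(row_address)
    return [line == 0 and ena_vs == 0 for line in preselect]

preselect_lines = demux_active_low(5)
print("preselected rows:", [r for r, level in enumerate(preselect_lines) if level == 0])
print("selected, enable low: ", [r for r, s in enumerate(row_select(5, 0)) if s])
print("selected, enable high:", [r for r, s in enumerate(row_select(5, 1)) if s])
```

Six address bits cover 64 codes, of which only 0 to 47 are valid rows; the sketch rejects the rest, whereas the behaviour of the real chip for those codes is not stated in the text.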

2.2.2 Input (analog part)

The only task of this simple circuitry is to supply the analog reference signal (VpsRef) to the selected row. In this way we keep the power consumption low. Instead of a transmission gate we chose a p-pass-transistor design, so the row selection is active-low.

2.2.3 Output (analog part)

With an analog multiplexer (MUX) the 48 column signals are merged into 3 lines, because the dSpace board³ has only 4 analog input ports. The MUX is implemented as a 4-level transmission-gate structure.
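The numbers fit together: 48 columns on 3 lines give 16 columns per line, and 16 = 2^4 inputs need a 4-level binary tree. The sketch below models this; the assignment of contiguous thirds to the lines and the use of the least significant address bit at the first level are assumptions made for illustration.

```python
COLUMNS = 48
LINES = 3
PER_LINE = COLUMNS // LINES  # 16 columns share one output line
LEVELS = 4                   # 2**4 == 16, one address bit per tree level

def column_to_line(column):
    """Map a column index to (output line, 4-bit address within that line)."""
    if not 0 <= column < COLUMNS:
        raise ValueError("column must be in 0..47")
    return divmod(column, PER_LINE)

def tree_mux(signals, address):
    """Select one of 16 signals with a 4-level binary tree.

    At each level, neighbouring pairs pass one signal on, chosen by one
    address bit, so the number of candidates halves per level.
    """
    if len(signals) != PER_LINE:
        raise ValueError("expected 16 input signals")
    selected = list(signals)
    for level in range(LEVELS):
        bit = (address >> level) & 1
        selected = [selected[i + bit] for i in range(0, len(selected), 2)]
    return selected[0]

def read_columns(row_signals, address):
    """Return the three output-line values for one 4-bit column address."""
    return [tree_mux(row_signals[line * PER_LINE:(line + 1) * PER_LINE], address)
            for line in range(LINES)]

row = list(range(COLUMNS))   # stand-in signal: each column carries its own index
print(read_columns(row, 5))  # three columns are read out in parallel
print(column_to_line(37))
```

With this scheme a complete row is read in 16 address steps, three pixels at a time.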

2.3 The complete chip

Figure 2.5 shows a photograph of our bonded image sensor chip. The pin-out is presented in Figure 2.6, and Table 2.2 provides an overview of the features and specifications of the chip.

Figure 2.5: The whole CMOS image sensor chip.

² An active-low output of the Demux is needed because of the p-pass-transistor structure in the analog input part.

³ See Section 1.2 in Part III for an introduction to dSpace.



Figure 2.6: Bonding plan.

Parameter            Specification                Unit

Features of the single-chip CMOS image sensor
Resolution           48 × 48                      pixel
Dynamic range        130                          dB
Slope                69.5                         mV/decade

Chip and package
Name                 CH011
Die size             11.27                        mm²
Package              LCC44

Electrical properties
Supply (Vdd)         3.3                          V
Reference (VpsRef)   1.1                          V
Digital inputs       3.3 V CMOS                   V
Analog outputs       1.8 to 2.25 (linear range)   V

Pins and usage
Row[0..5]            Binary coded row selection. 0b000000 (all bits low) means the top row.
Row[0..3]            Binary coded column selection. 0b0000 (all bits low) means the leftmost column of one third.
VpsRef               Reference voltage for the pixel circuit. Only the selected row is supplied.
EnaVs                Digital control signal for the internal row selection. Only when EnaVs is low is the addressed row selected.
Vdd                  Power supply
Vss                  Ground

Table 2.2: Chip features and description.
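The dynamic range, slope and linear output range listed in Table 2.2 can be cross-checked against each other. This is a minimal sketch, assuming that the dynamic range in dB follows the 20 log10 convention (20 dB per decade of light intensity), which the table does not state explicitly.

```python
def decades_from_db(dynamic_range_db):
    """Convert a dynamic range in dB to decades, assuming 20 dB per decade."""
    return dynamic_range_db / 20.0

def output_swing_v(dynamic_range_db, slope_mv_per_decade):
    """Output voltage swing in V of a logarithmic sensor over its dynamic range."""
    return decades_from_db(dynamic_range_db) * slope_mv_per_decade / 1000.0

decades = decades_from_db(130)     # dynamic range from Table 2.2
swing = output_swing_v(130, 69.5)  # slope from Table 2.2
linear_range = 2.25 - 1.8          # analog output range from Table 2.2

print(f"dynamic range: {decades:.1f} decades")
print(f"expected output swing: {swing:.3f} V")
print(f"specified linear output range: {linear_range:.2f} V")
```

Under this assumption the three table entries agree: 6.5 decades at 69.5 mV per decade span almost exactly the specified 0.45 V output range.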


3 Chip development at MicroLab

3.1 Process Fabrication

We made use of a multi-chip module (MCM) provided by Europractice¹. An MCM is a complete electronic system with complex functionality that uses bare (unpackaged) ICs to achieve a very high integration density.

The final production was done by Alcatel using the CMOS 0.5 µm Mietec C05M-D process.

3.2 Design-flow for the digital part

We described the whole Demux/NAND design in VHDL and performed the simulation as well as the synthesis to the Alcatel 0.5 µm CMOS technology with Synopsys tools. The design-flow of the digital part is shown in Figure 3.1 below.

Figure 3.1: Design-flow of the digital part.

The written VHDL code first needs to be checked with a simulation on the functional level. For that reason, and for further simulations on other levels of the design-flow, a test bench was written. Following the verification of the Demux on the functional level, the design was synthesized to the Register Transfer Level (RTL). The next step is the so-called post-synthesis simulation, which should verify the design including the cell delays. The final task is the floor-planning with Silicon Ensemble from Cadence.

¹ Europractice is a European organisation, distributed across many universities and supported by the European Union. Europractice gives access to design tools from vendors such as Cadence and allows microchips to be produced at affordable prices.



3.3 Design-flow for the analog part

All the analog parts of our chip are designed as full-custom parts. The layout design, the verification of the layout design rules, the schematic entry and the layout-versus-schematic test are all performed with the design tools of Cadence.

3.4 Assembling (analog & digital)

To link all the parts of our design together, Silicon Ensemble and the library development tool Auto Abgen as well as the Synopsys synthesis tools were used. The synthesis tool generates the VHDL source for simulation and the Verilog source for floor-planning. For the analog part, a LEF file² needs to be generated by the library development tool mentioned above. Figure 3.2 shows a very simple representation of the design-flow around the floor-planning/place-and-route tool Silicon Ensemble.

Figure 3.2: Simplified representation of the design-flow for the place-and-route task of our design.

² A Library Exchange Format (LEF) file contains library information for a class of designs. Library data includes layer, via, placement site type, and macrocell definitions.


Part II

Image Processing


Preface

In our first preliminary study (Part I) we presented the development of a single-chip CMOS image sensor. It enables us to capture grayscale images with a resolution of 48 × 48 pixels and a high dynamic range. For physical reasons these raw data are noisy, so the noise needs to be removed. We therefore proposed to develop a signal processing algorithm that can later be implemented on the same chip. A possible structure for this is a pseudo-resistive diffusive averaging network based on MOS transistors, as presented in [23].

A completely new approach to eliminating noise would be the use of Cellular Neural Networks (CNNs). Furthermore, this technique could be useful for pattern recognition and more. In this way we are able to build an intelligent sensor system. The implementation of the CNN and the imager on the same chip will be more difficult. In the end, we preferred CNNs to an averaging network because of their flexibility and topicality.

This minor change in the project realisation has a major effect on the contents of the bachelor's thesis. We will therefore not develop the final chip as previously suggested. But with a CNN Universal Machine Prototyping System linked to our imager, we can show the way to build a powerful and intelligent sensor system.


1 Introduction to Cellular Neural Networks

In this chapter we give a brief [15] introduction to cellular neural networks, their description,

and their processing. A general introduction can be found, e.g., in [5, 2].

1.1 Cellular Neural Networks

The Cellular Neural Network (CNN) architecture was invented by Leon O. Chua¹ and his graduate student Lin Yang in 1988 [5, 4]. Such a network consists of nonlinear, continuous-time dynamic elements placed in a cellular array. The result is a spatially distributed nonlinear system, which is very complex to handle. The inventors, however, showed that these networks can be designed and used for a wide variety of engineering purposes, while maintaining stability and keeping the dynamic range within well-designed limits.

Since then many studies have been presented on this subject. There is also a biennial conference dedicated to CNNs and their applications². CNNs are of interest in many applications, e.g., 2-D processing and recognition (pictures, differential equations, ...) and 1-D processing and representation (audio, cryptography, data compression, ...).

Performing such processing with digital signal processors (DSPs) requires fast and/or parallel machines, because there are usually many pixels to process. Using CNNs, the processing may be performed by analog circuits. In applications, such circuits do not require extreme accuracy to function correctly. Furthermore, CNNs can be built using less silicon area and less power for the same task and throughput rate.

1.1.1 Network Structure

The cells in a CNN are arranged and connected in a certain way, which can be characterised
by the following attributes. There are basically two types of dimensionality in CNNs,
one-dimensional and planar. Further, we distinguish between several types of
topology, e.g., square or hexagonal grids. The connections from a particular cell to the others
are defined by a set of neighbours which lie within a certain number of connection units

¹ Leon O. Chua received the M.S. degree from the Massachusetts Institute of Technology in 1961 and the
Ph.D. degree from the University of Illinois, Urbana, in 1964.
He is currently a Professor of Electrical Engineering and Computer Sciences at the University of California,
Berkeley. His research interests are in the areas of general nonlinear network and system theory.
He has been a consultant in the areas of network analysis, modeling, and computer-aided design. He is the
author of several books and papers.
Professor Chua is the holder of five U.S. patents and has been awarded multiple honorary doctorates.

² The IEEE International Workshop on Cellular Neural Networks and their Applications.



(Fig. 1.1). Each connection should be understood as being bi-directional, i.e., connected cells
influence each other.

Figure 1.1: Network structure of a planar, square grid CNN with nearest neighbor connections.

1.2 Mathematical Description

CNNs may be studied from a purely mathematical point of view.

1.2.1 Cell Dynamics

State Voltage and Output Function. The dynamics of the simplest CNN, as presented
in Chapter 1.3, is described by

\[ \frac{d}{dt} x_{i,j}(t) = -x_{i,j}(t) + \sum_{k,l \in N} A_{k,l}\, y_{i+k,j+l}(t) + \sum_{k,l \in N} B_{k,l}\, u_{i+k,j+l} + I \tag{1.1} \]

with the output nonlinearity, called unity gain,

\[ y(x) = \frac{1}{2} \left[ |x + 1| - |x - 1| \right] \tag{1.2} \]

as shown in Fig. 1.2. The input, state, and output, represented by $u_{i,j}$, $x_{i,j}$, and $y_{i,j}$, respectively,
are defined on $0 \le i \le N_1$ and $0 \le j \le N_2$. $N$ (or $N_r$) denotes the set of all cells
to which cell $(i, j)$ is directly connected, where $r$ is the neighborhood radius.
$A_{k,l}$, $B_{k,l}$ and $I$ are the network coefficients.

Other output functions have been proposed [2], such as rather complicated, multi-

breakpoint piecewise linear, Gaussian or simply thresholding output functions.

Block diagram. The emphasis on the A template can be easily understood by writing (1.1)
in block diagram form, as shown in Fig. 1.3. From the diagram, it can be seen that the B
template forms a simple feed-forward finite impulse response (FIR) filtered version of the
input, while the A template operates in a feedback loop along with a nonlinearity.

Figure 1.2: The unity gain output function, a piecewise-linear (PWL) function.

Figure 1.3: A block diagram showing the standard CNN.

Initial State. $x_{i,j}(0)$ represents the initial state. It is convenient to restrict its range
to [-1, +1].

Boundary Values. In order to guarantee that all pixels have the same number of
neighbors, it is necessary to surround the image with a ring of boundary pixels. Their state
is fixed to a boundary value.

Stability issues. In connection with linear system theory, "stability" means that the
effect of a sufficiently small disturbance will decay in time and the network will return to
the same equilibrium state. Many reports describe stability analysis and stability
conditions.

Settling time. In the case of a nonlinear dynamical system, the processing speed is
defined by its settling time, i.e., the time it takes the system to reach its equilibrium state.
The settling time depends on the input $u$, the initial state $x_{i,j}(0)$ and, in a complex and
highly nonlinear manner, on the template set {A, B, I}.

Having an estimate of the settling time at one's disposal allows template optimization
with respect to processing speed [10]. Design rules for faster templates can be derived.



1.2.2 Spatial Invariance

Each term in (1.1) carries the $(i, j)$ index, implying that all quantities depend on the grid
position. Usually this is not the case, because we work with a subclass of CNNs, the spatially
invariant networks. Spatial invariance implies that each cell has identical controlled
sources, i.e., that the nature of the source $(i, j; k, l)$ depends only on the relative position
of $(i, j)$ and $(k, l)$, and that the constant sources $I(i, j)$ are all identical. One can speak
of a cloning template that is repeated at each cell. The operation of the CNN is then defined by
specifying the controlled sources for each difference $(k - i, l - j)$ and the constant
source, i.e., by defining the feedback template A, the control template B, and the constant
source I.

1.3 Electrical Description

In order to use the mathematical description of a cellular nonlinear network in practical

problems mentioned before, this network has to be implemented in hardware.

1.3.1 Cell Schematic

The translation of the original CNN publication [5] results in the simple CNN cell circuit

shown in Fig.1.4.

Figure 1.4: The CNN cell schematic. The output function is shown as a functional block

rather than a current source and a non-linear resistor.

The cell at the grid location $(i, j)$ consists of the parallel RC circuit ($R_x$, $C_x$), several
current sources, and a voltage source. The cell input is represented by the independent
voltage source $E_{ij}$ whose output voltage is $u_{ij}$. The capacitor voltage $x_{ij}$ is the state voltage
of the cell. The state voltage observed through the PWL output function is the output
voltage $y_{ij}$. The PWL block can be built with Chua's circuit [1].

The controlled current sources enable the interconnections between the current cell $(i, j)$
and the neighboring cells $(k, l)$. The voltage-controlled current sources (VCCS) B take
$u_{kl}$ as their controlling voltage. The VCCS A take $y_{kl}$ as their controlling voltage. The current
source I is a constant current source.

1.3.2 Network operation

Generally, a CNN is programmed by choosing a template set {A, B, I} and assigning the
appropriate initial data to $u_{ij}$ and $x_{ij}(0)$. The CNN circuit then operates as follows: at time
$t = 0^-$ the state voltage of each cell is set to some initial value $x_{ij}(0)$, all the current sources
A, B, and I are inhibited, i.e., output no current, and the voltage $u_{ij}$ is provided in each cell.

At $t = 0^+$ the current sources are switched on and allowed to operate. The state voltage
will evolve in time, as per (1.1). Provided that the network is stable, starting from an initial
state, an equilibrium state will eventually be reached after transients have decayed. The
evolution to an equilibrium state will be called a CNN transient. Either one or both of the
voltages $x_{ij}(0)$ and $u_{ij}$ can be considered to represent the input image(s), depending on the
desired processing task.
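The network operation described above can be illustrated with a minimal simulation sketch. It integrates (1.1) with a forward Euler step, which is an assumption made here for illustration only (no numerical method is prescribed in the text). The function names `pwl` and `cnn_transient`, the step size, the number of steps, the boundary value and the example template are all hypothetical choices.

```python
def pwl(x):
    """Unity gain (piecewise-linear) output function, cf. (1.2)."""
    return 0.5 * (abs(x + 1.0) - abs(x - 1.0))


def cnn_transient(A, B, I, u, x0, dt=0.05, steps=400, boundary=0.0):
    """Forward-Euler integration of (1.1) on a planar, square grid.

    A, B  : square templates of odd size, indexed [k + r][l + r]
    u, x0 : input image and initial state as lists of rows
    Cells outside the grid contribute the fixed value `boundary`
    for both their input and their output (assumes |boundary| <= 1).
    Returns the output image y after `steps` Euler steps.
    """
    rows, cols = len(x0), len(x0[0])
    r = len(A) // 2
    x = [row[:] for row in x0]

    def at(img, i, j):
        # Ring of boundary pixels with a fixed value
        if 0 <= i < rows and 0 <= j < cols:
            return img[i][j]
        return boundary

    for _ in range(steps):
        y = [[pwl(v) for v in row] for row in x]
        new_x = [[0.0] * cols for _ in range(rows)]
        for i in range(rows):
            for j in range(cols):
                acc = -x[i][j] + I
                for k in range(-r, r + 1):
                    for l in range(-r, r + 1):
                        acc += A[k + r][l + r] * at(y, i + k, j + l)
                        acc += B[k + r][l + r] * at(u, i + k, j + l)
                new_x[i][j] = x[i][j] + dt * acc
        x = new_x
    return [[pwl(v) for v in row] for row in x]


# Assumed example (not from the thesis): self-feedback only, no input.
A = [[0, 0, 0], [0, 2, 0], [0, 0, 0]]
B = [[0, 0, 0], [0, 0, 0], [0, 0, 0]]
u = [[0.0, 0.0], [0.0, 0.0]]
x0 = [[0.3, -0.2], [-0.7, 0.1]]
y = cnn_transient(A, B, 0.0, u, x0)
```

With this assumed template each cell is driven away from zero and saturates, so the output image settles to the sign of the initial state. Here the initial state plays the role of the input image.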

1.4 Types of Processing Tasks

CNN processing tasks may be classified in two ways: by the type of interaction between
cells, and by the type of input data.

Coupled and Uncoupled Processing Tasks. One can divide processing tasks into
coupled and uncoupled ones, based on the type of interaction implied by the A template.
Coupled processing tasks have non-zero off-center A template entries, i.e., the interaction
between cells involves feedback. On the other hand, uncoupled tasks have, at most, the
self-feedback template entry non-zero, so the interaction between cells is feed-forward only.
Coupled templates can exhibit propagation behavior, and thus perform operations of a global
nature, in contrast to uncoupled templates, which usually have fast settling times.

Bipolar and Gray-Scale Processing. Some tasks assume that the input pixels take on
only the values +1 and -1 (bipolar), with no intermediate values. For other tasks, particularly when
"real world" data is involved, the inputs and initial states take on gray-scale values between
-1 and +1.
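The distinction between coupled and uncoupled tasks depends only on the off-center entries of the A template, so it can be checked mechanically. The helper name `is_coupled` and both example templates are illustrative assumptions.

```python
def is_coupled(A, tol=0.0):
    """Return True if the A template has a non-zero off-center entry."""
    r = len(A) // 2
    return any(
        abs(A[k][l]) > tol
        for k in range(len(A))
        for l in range(len(A[k]))
        if (k, l) != (r, r)
    )


# Assumed example templates (not taken from the thesis)
uncoupled = [[0, 0, 0], [0, 2, 0], [0, 0, 0]]   # self-feedback only
coupled = [[0, 1, 0], [1, 2, 1], [0, 1, 0]]     # feedback from neighbors
```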

1.5 Design of CNN Templates and their Robustness

CNNs are a subclass of neural networks and resemble a Hopfield network. But they
differ in many respects from general neural networks, e.g., in implementation issues and
in the way they are programmed. CNNs do not normally go through a training phase, and the
connection weights for a certain task can be deduced or even computed in closed form. One can
therefore use different tools, such as statistical methods, genetic algorithms, experience,
linear algebra and ad hoc reasoning. Unlike conventional artificial neural networks (ANN),
CNNs are not usually used as approximators of an unknown function, but rather to perform
a well-defined mapping.

However, the absence of training and the fact that the connection weights are
"prescribed" rather than found by learning mean that the circuit has to realise these
weights with a high degree of precision. Considering the physical limitations that analog
implementations entail, robust operation of a CNN chip with respect to parameter
variations has to be ensured. So far, not all mathematically possible CNN tasks can be carried
out reliably on an analog chip. By applying other techniques, CNN templates can be found
which guarantee satisfactory, optimal robustness [10].

The design of CNN templates, and even more so the design of optimally robust templates,
is a key element in CNN research.


2 CNN Template Design for Image Processing

Many image processing and pattern formation effects of the simple Cellular Neural Network
(CNN)¹ can be understood by means of a common approach, as shown in [6]. By examining
the dynamics in the frequency domain, when all CNN cells are in the linear region, the
mechanism for IIR spatial filtering, pattern formation, morphogenesis, and synergetics can
be shown to be present, even though each cell has only first-order dynamics. In addition,
the method allows many of the standard CNN templates, such as the nonlinear "averaging",
"halftoning" and "diffusion" templates, to be explained in a new light. With one example in
Chapter 2.3 it is shown how generalizations of these templates can be used to design linear
filters² for image processing tasks. Another, more direct approach to designing templates is
briefly introduced in Chapter 3.1.5 for completeness.

2.1 Convolution Formulation

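In Chapter 1.2 the dynamics of the CNN was introduced by equation (1.1), which is repeated
here for convenience:

\[ \frac{d}{dt} x_{i,j}(t) = -x_{i,j}(t) + \sum_{k,l \in N} A_{k,l}\, y_{i+k,j+l}(t) + \sum_{k,l \in N} B_{k,l}\, u_{i+k,j+l} + I \tag{2.1} \]

If all $|x_{i,j}(t)| < 1$ then, because of the unity gain output function (Fig. 1.2), $y_{i,j} = x_{i,j}$
and the whole system behaves according to the linear system

\[ \frac{d}{dt} x_{i,j}(t) = -x_{i,j}(t) + \sum_{k,l \in N} A_{k,l}\, x_{i+k,j+l}(t) + \sum_{k,l \in N} B_{k,l}\, u_{i+k,j+l} + I \tag{2.2} \]

which we now assume to operate over all state space. For simplicity, define the linearized
template masks as follows:

\[
a(n_1, n_2) =
\begin{cases}
A_{0,0} - 1 & (n_1, n_2) = (0, 0) \\
A_{-n_1,-n_2} & \text{for } (-n_1, -n_2) \in N \\
0 & \text{otherwise}
\end{cases}
\qquad
b(n_1, n_2) =
\begin{cases}
B_{-n_1,-n_2} & \text{for } (-n_1, -n_2) \in N \\
0 & \text{otherwise}
\end{cases}
\tag{2.3}
\]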

¹ Please see Chapter 1 for an introduction to CNNs.

² Linear filtering is filtering in which the value of an output pixel is a linear combination of the values of
the pixels in the input pixel's neighborhood. For example, an algorithm that computes a weighted average of
the neighborhood pixels is one type of linear filtering operation.



For now, assume that the sequences $x_0(n_1, n_2)$ and $u_0(n_1, n_2)$ are defined to be zero for
all integers $n_1$ and $n_2$ where the supplied data are not defined. Then the dynamics can be
written in convolution form:

\[ \frac{d}{dt} x_t(n_1, n_2) = a(n_1, n_2) * x_t(n_1, n_2) + b(n_1, n_2) * u(n_1, n_2) + I. \tag{2.4} \]
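As a small illustration of (2.3), the linearized masks can be obtained from the templates by reversing both index directions and subtracting 1 at the center of the A mask. The function name `linearized_masks` and the list-of-rows representation are assumptions made for this sketch.

```python
def linearized_masks(A, B):
    """Build a(n1, n2) and b(n1, n2) of (2.3) from the templates A and B.

    Index [n1 + r][n2 + r] of each result holds the mask value at (n1, n2).
    """
    # A_{-n1,-n2}: reverse the row order and the column order
    a = [[float(v) for v in row[::-1]] for row in A[::-1]]
    b = [[float(v) for v in row[::-1]] for row in B[::-1]]
    r = len(A) // 2
    a[r][r] -= 1.0  # the -x term of (2.2) is absorbed into the center element
    return a, b
```

For the symmetric templates used later in this chapter the reversal has no effect, and only the center element of the A mask changes.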

2.2 Spatial Frequency Formulation

We will now make use of the two-dimensional Discrete Space Fourier Transform (DSFT),
which gives the representation of a sequence on the basis of complex exponentials. Assuming
all the DSFTs exist, the dynamics can be written in the new basis by transforming (2.4)
into

\[ \frac{d}{dt} X_t(\omega_1, \omega_2) = A(\omega_1, \omega_2) X_t(\omega_1, \omega_2) + B(\omega_1, \omega_2) U(\omega_1, \omega_2) + I \delta(\omega_1, \omega_2) \tag{2.5} \]

which is uncoupled in spatial frequency, i.e., for each pair $(\omega_1, \omega_2)$
this is a single linear first-order ordinary differential equation. This equation describes the
manner in which the coefficient of each basis function changes over time.
The dynamics of the modes evolve independently for each spatial frequency.

If, for all $(\omega_1, \omega_2)$, we have $A(\omega_1, \omega_2) < 0$, then the central linear system is stable, and all the
exponential terms in the time solution will tend to zero as time goes to infinity. The stable
equilibrium, which is independent of the initial conditions, can be found easily by taking
the limit of the time solution of (2.5):

\[ X_\infty(\omega_1, \omega_2) = H(\omega_1, \omega_2) U(\omega_1, \omega_2) \tag{2.6} \]

\[ H(\omega_1, \omega_2) = \frac{-1}{A(\omega_1, \omega_2)} B(\omega_1, \omega_2). \tag{2.7} \]

The spatial transfer function $H(\omega_1, \omega_2)$ can be shown to have two important properties:
"zero phase" and "infinite impulse response" (IIR). Because the A and B templates are
real and symmetric, their DSFTs are as well. Therefore, the phase of a spatial sinusoid in
the input image is not modified by the transfer functions $A(\omega_1, \omega_2)$ and $B(\omega_1, \omega_2)$. Since
$H(\omega_1, \omega_2)$ is a simple function of these transfer characteristics, it will inherit the zero-phase
property. And, because the transfer function is made by finding the inverse of the FIR
filter $a(n_1, n_2)$, it is typically spatially infinite in extent. That is, due to the feedback in the
dynamics, the local connections of the A template can be used to perform non-local filtering.
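The equilibrium transfer function (2.7) can be evaluated numerically for any pair of real, symmetric masks. The sketch below is only an illustration: the function names are hypothetical, and the small diffusion-like mask pair at the end is an assumed example, not a template from this work.

```python
import math


def dsft_real_symmetric(mask, w1, w2):
    """DSFT of a real mask that is symmetric about its center.

    mask[n1 + r][n2 + r] holds the value at (n1, n2). For such masks
    the sine terms cancel, so the transform is real.
    """
    r = len(mask) // 2
    total = 0.0
    for n1 in range(-r, r + 1):
        for n2 in range(-r, r + 1):
            total += mask[n1 + r][n2 + r] * math.cos(w1 * n1 + w2 * n2)
    return total


def transfer_function(a, b, w1, w2):
    """Equilibrium transfer function H = -B/A of (2.7)."""
    A = dsft_real_symmetric(a, w1, w2)
    B = dsft_real_symmetric(b, w1, w2)
    if A >= 0.0:
        raise ValueError("A(w1, w2) must be negative for stability")
    return -B / A


# Assumed example masks: weak nearest-neighbor coupling, unit input mask
a = [[0.0, 0.25, 0.0], [0.25, -2.0, 0.25], [0.0, 0.25, 0.0]]
b = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
h_dc = transfer_function(a, b, 0.0, 0.0)
h_high = transfer_function(a, b, math.pi, math.pi)
```

For these assumed masks the gain is 1 at zero frequency and 1/3 at the highest spatial frequency, i.e., a mild low-pass behavior obtained from purely local connections.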


2.3 Template Design Examples


Linear spatial filtering is the workhorse of image processing algorithms, and the many
applications that use linear filtering, such as interpolation, visual modeling, and image
compression, could benefit from a CNN implementation. We now give some examples of possible
approaches to such templates.

2.3.1 Low-Pass Filter Design

Good-quality low-pass filters have many important uses, such as image interpolation, and
therefore provide a speed/performance benchmark for any image processing hardware. It
is interesting to see how close a single CNN template pair can come to performing ideal
low-pass filtering of the input by using the equilibrium approach described above.

There are two important system constraints on the design process in both the template
domain and the frequency domain. The most important point may be that the gains
$A(\omega_1, \omega_2)$ must be strictly negative. Other concerns are the available range and the accuracy
of the template elements of a particular CNN implementation, and slow convergence
or stability sensitivity if the eigenvalues $A(\omega_1, \omega_2)$ are too small in magnitude.

As we would like to perform the parameter minimization process in the frequency domain
while retaining control over the template size, we make use of a transformation method similar
to that used in FIR filter design. In addition, the method reduces the number of parameters
to be minimized by imposing circular symmetry.

Let $C(\omega_1, \omega_2)$ be a filter with the desired contours. Then, if we specify

\[ A(\omega_1, \omega_2) = \sum_{r=0}^{R} \alpha_r C^r(\omega_1, \omega_2) \tag{2.8} \]

\[ B(\omega_1, \omega_2) = \sum_{r=0}^{R} \beta_r C^r(\omega_1, \omega_2) \tag{2.9} \]

both $A(\omega_1, \omega_2)$ and $B(\omega_1, \omega_2)$ are simple continuous functions of $C(\omega_1, \omega_2)$, and they will
both have the same shaped constant contours as $C(\omega_1, \omega_2)$ and therefore $H(\omega_1, \omega_2)$ will as
well. Also of importance, by choosing $c(n_1, n_2)$ to be nonzero only on a finite support, the
size of the support of the nonzero parts of $a(n_1, n_2)$ and $b(n_1, n_2)$ can be controlled.

This method will now be used to design 5×5 A and B templates with the goal of a circularly
symmetric low-pass filter at equilibrium, with the passband extending to $0.4\pi$ and the stopband
starting at $0.5\pi$.

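The sequence with the nonzero elements

\[ c(n_1, n_2) = \begin{bmatrix} 0.25 & 0.50 & 0.25 \\ 0.50 & 1.0 & 0.50 \\ 0.25 & 0.50 & 0.25 \end{bmatrix} \tag{2.10} \]

with the center element 1.0 is known to have contours with good circular symmetry and
radial monotonicity. Because we want our templates to be 5×5, we choose $R = 2$ (the term
$C^2$ corresponds to the 5×5 sequence $c * c$) and have to perform a
frequency-weighted minimization with respect to the parameters $\alpha_0, \alpha_1, \alpha_2, \beta_0, \beta_1, \beta_2$:

\[ A(\omega_1, \omega_2) = \alpha_0 + \alpha_1 C(\omega_1, \omega_2) + \alpha_2 C^2(\omega_1, \omega_2) \tag{2.11} \]

\[ B(\omega_1, \omega_2) = \beta_0 + \beta_1 C(\omega_1, \omega_2) + \beta_2 C^2(\omega_1, \omega_2). \tag{2.12} \]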

We eventually used genetic algorithms (GA's) to perform the minimization task.
There are a number of parameters in a genetic algorithm which have to be specified. The
following parameters were used in the simulation:

• population size = 50
• bit mutation rate = 0.05
• non-overlapping populations
• two-point crossover
• direct mapping as fitness technique

For an introduction to genetic algorithms see Chapter 3.1.

For the design of the fitness function we used two filter design techniques that are very
common in image processing, the frequency transformation method and the frequency sampling
method.

The frequency transformation method transforms a one-dimensional filter into a two-dimensional
filter. It preserves most characteristics of the one-dimensional filter, particularly the
transition bandwidth and ripple characteristic. This method uses a transformation matrix (2.10),
i.e., a set of elements that define the frequency transformation.

The frequency sampling method creates a filter based on a desired frequency response, given
a matrix of points that defines its shape. This method creates a filter whose frequency
characteristic passes through those points. Frequency sampling places no constraints on the
behaviour of the frequency response between the given points; usually, the response ripples
in these areas.

For the low-pass filter design we sampled the spectrum every $0.05\pi$, as shown in Fig. 2.1.
There you can also see a possible filter characteristic of a high-order low-pass filter.

However, another important property of the filter design is not visible in Fig. 2.1: the
weights of the individual samples. We will refer to this problem later and will now go on to
the design of the fitness function.

The fitness function, in fact, has to perform a least-squares minimization with respect to the
weights of the individual samples in the frequency domain. Now consider the transformation
matrix (2.10), which is to be transformed into the frequency domain as $C(\omega_1, \omega_2)$:

\[
c(n_1, n_2) = \begin{bmatrix} 0.25 & 0.50 & 0.25 \\ 0.50 & 1.0 & 0.50 \\ 0.25 & 0.50 & 0.25 \end{bmatrix}
\quad \longleftrightarrow \quad
C(\omega_1, \omega_2) = 1 + \cos(\omega_1) + \cos(\omega_2) + \cos(\omega_1) \cos(\omega_2).
\tag{2.13}
\]

Then, if we set the frequency $\omega_2 = 0$, we can easily find the magnitude of the
one-dimensional transfer function $|H(\omega)|$:



Figure 2.1: Sampled low-pass filter characteristic. [Plot: magnitude versus spatial frequency.]

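\[ |H(\omega)| = \frac{\left| \beta_0 + 2\beta_1 (1 + \cos\omega) + 4\beta_2 (1 + 2\cos\omega + \cos^2\omega) \right|}{\left| \alpha_0 + 2\alpha_1 (1 + \cos\omega) + 4\alpha_2 (1 + 2\cos\omega + \cos^2\omega) \right|}. \tag{2.14} \]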

Now we evaluate this transfer function at every frequency sample and formulate the
results as a matrix $Z_i$ multiplied by the two coefficient column vectors that represent the
parameters to be minimized, in order to form the frequency sample error vector $E_i$. For the zero
magnitude case (stopband) the fitness function turns out to be

\[ \left( \frac{\left| Z_0 \cdot (\beta_2, \beta_1, \beta_0)^T \right|}{\left| Z_0 \cdot (\alpha_2, \alpha_1, \alpha_0)^T \right|} \right)^2 = E_0 \tag{2.15} \]

with the error vector $E_0$. The unity magnitude case (passband) with the error vector $E_1$
is straightforward:

\[ \left( \frac{\left| Z_1 \cdot (\beta_2, \beta_1, \beta_0)^T \right|}{\left| Z_1 \cdot (\alpha_2, \alpha_1, \alpha_0)^T \right|} - 1 \right)^2 = E_1. \tag{2.16} \]

It is important to mention that in the two formulas above the division and the squaring
are element-by-element operations.

The two matrices $Z_0$ and $Z_1$, evaluated numerically, are

\[
Z_0 = \begin{pmatrix}
4.0000 & 2.0000 & 1.0000 \\
2.8464 & 1.6871 & 1.0000 \\
1.9098 & 1.3820 & 1.0000 \\
1.1925 & 1.0920 & 1.0000 \\
0.6797 & 0.8244 & 1.0000 \\
0.3431 & 0.5858 & 1.0000 \\
0.1459 & 0.3820 & 1.0000 \\
0.0475 & 0.2180 & 1.0000 \\
0.0096 & 0.0979 & 1.0000 \\
0.0006 & 0.0246 & 1.0000 \\
0 & 0 & 1.0000
\end{pmatrix}
\tag{2.17}
\]

\[
Z_1 = \begin{pmatrix}
16.0000 & 4.0000 & 1.0000 \\
15.8036 & 3.9754 & 1.0000 \\
15.2265 & 3.9021 & 1.0000 \\
14.3036 & 3.7820 & 1.0000 \\
13.0902 & 3.6180 & 1.0000 \\
11.6569 & 3.4142 & 1.0000 \\
10.0842 & 3.1756 & 1.0000 \\
8.4564 & 2.9080 & 1.0000 \\
6.8541 & 2.6180 & 1.0000
\end{pmatrix}
\tag{2.18}
\]

where each row represents one sample in the frequency domain. If we combine the two error
vectors $E_0$ and $E_1$ so that the first element of the combined vector corresponds to the frequency
$\omega = 0$ and the last element to $\omega = \pi$, forming the vector $E(\omega = [0, \pi])$, we can easily apply a
frequency weight by using the vector dot product. For example, consider the weight vector
$W(\omega = [0, \pi]) = (1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1)^T$. As all frequency weights
are now equal to one, the dot product

\[ \varepsilon = E \cdot W \tag{2.19} \]

simply sums up the elements of the vector $E(\omega)$, where $\varepsilon$ is the final fitness function that has to be
minimized by a suitable algorithm³. The "equal-weight" approach leads to the low-pass
filter characteristic shown in Fig. 2.2.

Figure 2.2: Low-pass filter. [Plot "System Transfer Function": magnitude versus spatial frequency.]

³ See Chapter 3 for further information about optimization techniques.

Although this result is already very close to our desired filter properties, we were still
able to improve the characteristic by applying the weight vector

\[ W = (1, 1.1, 1.2, 1.3, 1.1, 0.9, 0.7, 1, 1.3, 1.3, 1, 0.8, 1, 1.2, 1.1, 1, 0.8, 0.8, 0.8, 0.8)^T \tag{2.20} \]

to our search algorithm. This weight vector emphasizes the transition region and allows the
response to ripple near the transition region⁴. Performing the minimization with respect to
$\alpha_0, \alpha_1, \alpha_2, \beta_0, \beta_1, \beta_2$ returned

\[ A(\omega_1, \omega_2) = -4.60 + 3.47\, C(\omega_1, \omega_2) - 0.74\, C^2(\omega_1, \omega_2) \tag{2.21} \]

\[ B(\omega_1, \omega_2) = 0.30 - 0.75\, C(\omega_1, \omega_2) + 0.32\, C^2(\omega_1, \omega_2). \tag{2.22} \]
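The fitness evaluation of (2.15), (2.16) and (2.19) can be sketched in a few lines. The sketch rebuilds the rows of $Z_0$ and $Z_1$ from $C(\omega, 0) = 2(1 + \cos\omega)$ and evaluates the weighted error for the rounded coefficients of (2.21) and (2.22). The function names are hypothetical, and the assignment of the first coefficient triple to A and the second to B follows the ordering of (2.11) and (2.12); this is an illustration, not the original Matlab implementation.

```python
import math


def z_row(w):
    """One row (C^2, C, 1) of Z, with C(w, 0) = 2 (1 + cos w), cf. (2.13)."""
    c = 2.0 * (1.0 + math.cos(w))
    return (c * c, c, 1.0)


# Passband samples 0 ... 0.4*pi and stopband samples 0.5*pi ... pi,
# both on the 0.05*pi sampling grid
Z1 = [z_row(0.05 * math.pi * n) for n in range(0, 9)]
Z0 = [z_row(0.05 * math.pi * n) for n in range(10, 21)]


def fitness(alpha, beta, weights=None):
    """Weighted squared error of (2.15), (2.16) and (2.19).

    alpha, beta: coefficient triples ordered (x2, x1, x0) like the columns of Z
    """
    def mag(Z, coeff):
        return [abs(sum(z * c for z, c in zip(row, coeff))) for row in Z]

    e1 = [(nb / na - 1.0) ** 2 for nb, na in zip(mag(Z1, beta), mag(Z1, alpha))]
    e0 = [(nb / na) ** 2 for nb, na in zip(mag(Z0, beta), mag(Z0, alpha))]
    errors = e1 + e0  # ordered from w = 0 to w = pi
    if weights is None:
        weights = [1.0] * len(errors)
    return sum(e * w for e, w in zip(errors, weights))


alpha = (-0.74, 3.47, -4.60)  # coefficients of (2.21)
beta = (0.32, -0.75, 0.30)    # coefficients of (2.22)
W = [1, 1.1, 1.2, 1.3, 1.1, 0.9, 0.7, 1, 1.3, 1.3,
     1, 0.8, 1, 1.2, 1.1, 1, 0.8, 0.8, 0.8, 0.8]  # weights of (2.20)
eps_equal = fitness(alpha, beta)
eps_weighted = fitness(alpha, beta, W)
```

A search algorithm would call `fitness` with candidate coefficient triples and keep those with the smallest value.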

By taking the inverse DSFT of (2.21) and (2.22), the sequences $a(n_1, n_2)$ and $b(n_1, n_2)$,
and therefore the A and B templates, can easily be found:

\[
A = \begin{pmatrix}
-0.0462 & -0.1850 & -0.2775 & -0.1850 & -0.0462 \\
-0.1850 & 0.1275 & 0.6250 & 0.1275 & -0.1850 \\
-0.2775 & 0.6250 & -2.7950 & 0.6250 & -0.2775 \\
-0.1850 & 0.1275 & 0.6250 & 0.1275 & -0.1850 \\
-0.0462 & -0.1850 & -0.2775 & -0.1850 & -0.0462
\end{pmatrix}
\tag{2.23}
\]

\[
B = \begin{pmatrix}
0.0200 & 0.0800 & 0.1200 & 0.0800 & 0.0200 \\
0.0800 & 0.1325 & 0.1050 & 0.1325 & 0.0800 \\
0.1200 & 0.1050 & 0.2700 & 0.1050 & 0.1200 \\
0.0800 & 0.1325 & 0.1050 & 0.1325 & 0.0800 \\
0.0200 & 0.0800 & 0.1200 & 0.0800 & 0.0200
\end{pmatrix}.
\tag{2.24}
\]

Fig. 2.3 shows the characteristic of the improved low-pass filter with its very sharp cutoff
and little ripple. The final two-dimensional filter characteristic with good circular symmetry
is shown in Fig. 2.4.

⁴ A good way to find this weight vector is by trial and error. It is also possible to use frequency masks, which
are often used in classical filter designs, even though they cannot easily be applied to genetic algorithms.



Figure 2.3: Improved low-pass filter. [Plot "System Transfer Function": magnitude versus spatial frequency.]

Figure 2.4: 2-D low-pass filter characteristic. [Plot: gain versus the two spatial frequencies.]


3 CNN Optimization Techniques

In order to perform a frequency-weighted minimization for CNN template design as described
in Chapter 2, we were faced with the problem of finding a suitable search method. The current
literature identifies three main types of search methods: calculus-based, enumerative, and
random.

We first tried to apply calculus-based and enumerative methods to our problem, but our
functions turned out to have multiple peaks that are very close to each other, and the
search space was far too big, as we have to minimize at least six parameters with a reasonable
accuracy.

The combination of random and calculus-based methods led to quite good but still
insufficiently accurate results. That is why we applied a genetic algorithm to our search problem,
an approach which has been shown to be very effective for CNN template learning [13]. In the
following section we show how genetic algorithms work and how they differ from other search methods.

3.1 Genetic Algorithms

Genetic algorithms (GA's) are stochastic optimization algorithms that were originally
motivated by the mechanisms of natural selection and genetics, and have proven to be effective
in a number of applications. Genetic algorithms were developed by John Holland, his
colleagues, and his students at the University of Michigan. The central theme of research on
genetic algorithms has been robustness, the balance between efficiency and efficacy necessary
for survival in many different environments.

3.1.1 How Genetic Algorithms are Different from Traditional Methods

In order for genetic algorithms to surpass their more traditional cousins in the quest for
robustness, GA's must differ in some very fundamental ways. Genetic algorithms are different
from more conventional optimization and search procedures in four ways:

• GA's work with a coding of the parameter set, not the parameters themselves.
• GA's search from a population of points, not a single point.
• GA's use payoff information, not derivatives or other auxiliary knowledge.
• GA's use probabilistic transition rules, not deterministic rules.

What might make a genetic algorithm attractive is its simplicity and the fact that its
applicability is not limited by restrictive assumptions about the search space (continuity,
unimodality, existence of derivatives, etc.). Despite their relative simplicity, GA's outperform
any random search because they can exploit information accumulated during the evolution of
the search. Calculus-based methods are inevitably superior in the problem domains where
they can be used, but GA's provide a robust search in discontinuous, multimodal, and noisy
search spaces.

3.1.2 Genetic Search Mechanism

Because genetic algorithms are rooted in both natural genetics and computer science, their
terminology mixes natural and artificial expressions. The scope of GA's is global, as they
use a population of binary strings, called chromosomes, to explore the search space. Each
chromosome encodes a point in the parameter space, i.e., a possible solution for the problem
to be solved. These binary strings are evaluated through a fitness function which contains
all the information about the problem. Evaluation means that the fitness value of the
corresponding chromosome is calculated. The better the solution encoded by
a chromosome, the higher the fitness. The genetic algorithm then tries to improve the fitness
of the population by combining information contained in high-fitness chromosomes.

A common GA implementation consists of the following four steps, which are also illustrated
in Figure 3.2:

1. Determine the Initial Population
A population of binary strings is randomly determined. We have chosen relatively
small population sizes with respect to the whole set of possible binary strings. Due to
the constraints coming from the VLSI technology, we coded each parameter with 10 bits
in the range [-5, 5].

2. Reproduction or Selection
Reproduction is a process in which individual strings are copied according to their
objective function values (biologists call this function the fitness function). Copying
strings according to their fitness values means that strings with a higher value have a
higher probability of contributing one or more offspring to the next generation.
Reproduction can be implemented in algorithmic form in a number of ways. Perhaps
the easiest is to create a roulette wheel where each current string in the population has
a roulette wheel slot sized in proportion to its fitness. Figure 3.2 shows this approach
graphically.

3. Crossover
Crossover means the exchange of substrings between two parent chromosomes, combining
valuable information of the parents. We used the two-point crossover operator in
our GA implementation: two crossing sites are selected, and the substrings
between the crossing sites are exchanged, as shown by the example in Fig. 3.1.

4. Mutation
Mutation maintains diversity in the string population by flipping an arbitrary bit in
the chromosomes with a given probability that is generally low.
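The four steps above can be sketched as follows. This is a minimal illustration and not the Matlab implementation used in this work: all function names are hypothetical, each bit is mutated independently with the given rate, and the fitness passed to `run_ga` is assumed to be positive and to be maximised (an error to be minimized would first have to be mapped, e.g. to 1/(1 + error)).

```python
import random

BITS = 10            # bits per parameter
LO, HI = -5.0, 5.0   # coded parameter range


def decode(chromosome, n_params):
    """Map a bit string to n_params real values in [LO, HI]."""
    values = []
    for p in range(n_params):
        gene = chromosome[p * BITS:(p + 1) * BITS]
        values.append(LO + (HI - LO) * int(gene, 2) / (2 ** BITS - 1))
    return values


def roulette_select(population, fitnesses, rng):
    """Pick one string with probability proportional to its fitness."""
    pick = rng.uniform(0.0, sum(fitnesses))
    running = 0.0
    for chrom, fit in zip(population, fitnesses):
        running += fit
        if running >= pick:
            return chrom
    return population[-1]


def two_point_crossover(p1, p2, s1, s2):
    """Exchange the substrings between the crossing sites s1 < s2."""
    return (p1[:s1] + p2[s1:s2] + p1[s2:],
            p2[:s1] + p1[s1:s2] + p2[s2:])


def mutate(chromosome, rate, rng):
    """Flip each bit independently with probability `rate`."""
    return "".join(
        ("1" if bit == "0" else "0") if rng.random() < rate else bit
        for bit in chromosome
    )


def run_ga(fitness, n_params, pop_size=50, rate=0.05, generations=100, seed=0):
    """Maximise `fitness(values)`; returns the best decoded parameters seen."""
    rng = random.Random(seed)
    length = n_params * BITS
    pop = ["".join(rng.choice("01") for _ in range(length))
           for _ in range(pop_size)]
    best = max(pop, key=lambda ch: fitness(decode(ch, n_params)))
    for _ in range(generations):
        fits = [fitness(decode(ch, n_params)) for ch in pop]
        nxt = []
        while len(nxt) < pop_size:  # non-overlapping populations
            p1 = roulette_select(pop, fits, rng)
            p2 = roulette_select(pop, fits, rng)
            s1, s2 = sorted(rng.sample(range(1, length), 2))
            for child in two_point_crossover(p1, p2, s1, s2):
                nxt.append(mutate(child, rate, rng))
        pop = nxt[:pop_size]
        cand = max(pop, key=lambda ch: fitness(decode(ch, n_params)))
        if fitness(decode(cand, n_params)) > fitness(decode(best, n_params)):
            best = cand
    return decode(best, n_params)
```

The defaults for population size and mutation rate are the values listed in Chapter 2.3.1; the number of generations and the seed are arbitrary choices for the sketch.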

Our GA has been implemented in Matlab¹ and compiled to C code. This has proven
to be an effective way to implement genetic algorithms, because the fitness function to
be evaluated during the reproduction task would be difficult and therefore time-consuming
to implement in C.

parents    1: 0011 | 101101 | 1000
           2: 1001 | 110010 | 0101

offspring  1: 0011 | 110010 | 1000
           2: 1001 | 101101 | 0101

Figure 3.1: The two-point crossover operator.

Figure 3.2: Graphical representation of a genetic algorithm.

¹ Matlab handles a range of computing tasks in engineering and science, from data acquisition and analysis
to application development. The Matlab environment integrates mathematical computing, visualization,
and a powerful technical language. Built-in interfaces let you quickly access and import data from
instruments, files, and external databases and programs. In addition, Matlab lets you integrate external
routines written in C, C++, Fortran, and Java with your Matlab applications.

3.1.3 Some Mathematical Foundations

The operation of genetic algorithms is remarkably straightforward. After all, we start with a
random population of n strings, copy strings with some bias toward the best, mate and
partially swap substrings, and mutate an occasional bit value for good measure. In this section
we briefly give an overview of the mathematical background of GA's in order
to give the reader a better basis for understanding GA's and therefore our Matlab source
code. However, this description will be very short, and for further reading we recommend [7].

Let us consider a schema H taken from the three-letter alphabet {0, 1, *}. The asterisk
or star * is a "don't care" symbol which matches either a 0 or a 1 at a particular position.
For example, consider the length l = 7 schema H = *11*0**.

Not all schemata are created equal. Some are more specific than others. For example, the
schema 011*1** is a more definite statement about important similarity than the schema
0******. Furthermore, certain schemata span more of the total string length than
others. For example, the schema 1****1* spans a larger portion of the string than the schema
1*1****. To quantify these ideas, we introduce two schema properties: schema order and
defining length.

The order of a schema H, denoted by $o(H)$, is simply the number of fixed positions (in a
binary alphabet, the number of 1's and 0's) present in the template. In the example above,
the order of the schema 011*1** is 4, whereas the order of the schema 0****** is 1.

The defining length of a schema H, denoted by $\delta(H)$, is the distance between the first and
last specific string position. For example, the schema 011*1** has defining length $\delta = 4$.

Schemata and their properties are interesting notational devices for discussing and
classifying string similarities rigorously. More than this, they provide the basic means for analyzing
the net effect of reproduction and genetic operators on building blocks contained within the
population.
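The two schema properties are easy to compute, as the following sketch shows. The function names are hypothetical; schemata are written as strings over the alphabet {0, 1, *}.

```python
def schema_order(schema):
    """o(H): number of fixed positions (0 or 1) in the schema."""
    return sum(1 for s in schema if s in "01")


def defining_length(schema):
    """delta(H): distance between the first and last fixed position."""
    fixed = [i for i, s in enumerate(schema) if s in "01"]
    return fixed[-1] - fixed[0] if fixed else 0


def matches(schema, string):
    """True if the binary string is an instance of the schema."""
    return len(schema) == len(string) and all(
        s == "*" or s == b for s, b in zip(schema, string)
    )
```

Applied to the examples of the text, `schema_order("011*1**")` and `defining_length("011*1**")` both evaluate to 4.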

The effect of reproduction on the expected number of schemata in the population is
particularly easy to determine. Suppose at a given time step t there are $m = m(H, t)$ examples of a particular
schema H contained within the population A(t). If we recognize that the average fitness of
the entire population may be written as $\bar{f} = \sum f_j / n$, then we may write the reproductive
schema growth equation as follows:

\[ m(H, t+1) = m(H, t) \frac{f(H)}{\bar{f}} \tag{3.1} \]

In words, a particular schema grows as the ratio of the average fitness $f(H)$ of the schema
to the average fitness of the population. Schemata with fitness values above the population
average will receive an increasing number of samples in the next generation and vice versa.

Now suppose we assume that a particular schema H remains above average by an amount $c\bar{f}$
with c a constant. Under this assumption we can rewrite the schema difference equation as
follows:

\[ m(H, t+1) = m(H, t) \frac{\bar{f} + c\bar{f}}{\bar{f}} = (1 + c)\, m(H, t) \tag{3.2} \]

Starting at t = 0 and assuming a stationary value of c, we obtain the equation:

\[ m(H, t) = m(H, 0) (1 + c)^t \tag{3.3} \]

We recognize a geometric progression, the discrete analog of an exponential form.
Reproduction allocates exponentially increasing numbers of trials to above-average schemata.

In [7] it has been shown that if crossover is itself performed by random choice, say with
probability $p_c$ at a particular mating, the survival probability may be given by the expression:

$$p_s \geq 1 - p_c\,\frac{\delta(H)}{l - 1} \qquad (3.4)$$

The combined effect of reproduction and crossover may now be considered. As when we
considered reproduction alone, we are interested in calculating the number of a particular
schema H expected in the next generation. Assuming independence of the reproduction and
crossover operations, we obtain the estimate:

$$m(H, t+1) \geq m(H, t)\,\frac{f(H)}{\bar{f}} \left[ 1 - p_c\,\frac{\delta(H)}{l - 1} \right] \qquad (3.5)$$

Comparing this to the previous expression for reproduction alone, the combined effect
of crossover and reproduction is obtained by multiplying the expected number of schemata
for reproduction alone by the survival probability under crossover $p_s$. Schema H grows or
decays depending upon a multiplication factor. With both crossover and reproduction, that
factor depends on two things: whether the schema is above or below the population average
and whether the schema has relatively short or long defining length. Clearly, those schemata
with both above-average observed performance and short defining lengths are going to be
sampled at exponentially increasing rates.

The last operator to consider is mutation. Mutation is the random alteration of a single
position with probability $p_m$. It can be shown that a particular schema H receives an expected
number of copies in the next generation under reproduction, crossover, and mutation
as given by the following equation:

$$m(H, t+1) \geq m(H, t)\,\frac{f(H)}{\bar{f}} \left[ 1 - p_c\,\frac{\delta(H)}{l - 1} - o(H)\,p_m \right] \qquad (3.6)$$

The addition of mutation changes our previous conclusion little. The following important
conclusion is named the Schema Theorem, or the Fundamental Theorem of Genetic
Algorithms:

Short, low-order, above-average schemata receive exponentially
increasing trials in subsequent generations.

Although the calculations to prove the schema theorem are not too demanding, the theorem's
implications are far reaching and subtle, as shown in [7].
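To get a feeling for the bound of the schema theorem, it can be evaluated numerically. The following sketch computes the right-hand side of the estimate that combines reproduction, crossover and mutation. The function name and all numerical values (population counts, fitness values, probabilities) are assumptions chosen for illustration only; they are not taken from our experiments.

```python
def expected_schema_count(m, f_schema, f_mean, p_c, p_m, order, delta, length):
    """Lower bound on m(H, t+1) under reproduction, crossover and mutation."""
    survival = 1.0 - p_c * delta / (length - 1) - order * p_m
    return m * (f_schema / f_mean) * survival


# Two schemata of order 2 on strings of length 7, both 50 % above average
# (illustrative values). Only the defining length differs.
common = dict(m=10, f_schema=1.5, f_mean=1.0, p_c=0.6, p_m=0.01, order=2, length=7)
short_schema = expected_schema_count(delta=2, **common)  # e.g. 1*1****
long_schema = expected_schema_count(delta=5, **common)   # e.g. 1****1*
print("short defining length:", short_schema)
print("long defining length: ", long_schema)
```

With these assumed values the short schema is expected to gain copies, whereas the long schema is disrupted by crossover so often that its lower bound falls below the current count, although both are equally fit.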

3.1.4 Design of the Fitness Function

The fitness function which is used during the selection process must be adapted to the
current problem. We used the genetic algorithm to design CNN templates in general. To be
more precise, we performed a frequency weighted minimization of some filter parameters to
achieve a certain filter characteristic. For further details see Chapter 2.3.

3.1.5 GA Based Template Learning

Originally, CNN templates were not designed in the frequency domain. In this section we
give a brief description of the GA based template learning for completeness (see footnote 2).

Operations performed by an asymptotically stable CNN can be described by a triplet
of signal arrays, e.g., images: the input, initial state, and settled output of the network
mapped into gray scale values of pixels. The problem of learning is to find the template
of an operation given by the image triplet. The template to be found should define the
dynamics such that the desired output is a stable equilibrium point in the state space and
the initial state is in its basin of attraction.

Footnote 2: We did not implement this approach. Our design was carried out in the frequency domain. Please see
Chapter 2.3 for further details.



We can meet both requirements by considering the trajectory of the transient. The simplest
way to attain this is to create a cost function which compares the desired output to the
result of the transient defined by a given template and the input and initial state from the
image triplet. The following formula gives such a function:

$$g(p) = \sum_{i=1}^{k} \bigl( y_{d,i} - y_i(\infty) \bigr)^2 \qquad (3.7)$$

where p denotes the parameter vector, i.e., the template, k is the size of the network (the
number of cells), $y_{d,i}$ is the value of the ith pixel of the desired output and $y_i(\infty)$ stands for
the corresponding pixel of the settled output. $g(p) = 0$ if the result of template p is identical
to the desired output and gives a quadratically increasing distance elsewhere. By using $g(\cdot)$
as a cost function, the problem of learning can be formulated as an optimization problem.
When genetic algorithms are applied, $g(\cdot)$ is minimized indirectly: its value is mapped into a fitness
value $f(\cdot)$ which is to be maximized.
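The cost function and its mapping to a fitness value can be sketched as follows. The settled output is assumed to be available as a flat list of pixel values, for example from a CNN simulation that is not shown here. The mapping 1/(1 + g) is only one possible monotonically decreasing choice and is our assumption; the implementations in [13] and [10] may use a different one.

```python
def template_cost(desired, settled):
    """Sum of squared pixel differences between desired and settled output."""
    return sum((d - y) ** 2 for d, y in zip(desired, settled))


def fitness_from_cost(cost):
    """Map a cost >= 0 to a fitness in (0, 1]; assumed mapping, larger is better."""
    return 1.0 / (1.0 + cost)


# Hypothetical 4-cell outputs with pixel values in [-1, 1].
desired = [1.0, -1.0, 1.0, -1.0]
perfect = [1.0, -1.0, 1.0, -1.0]
one_wrong = [1.0, 1.0, 1.0, -1.0]

print(template_cost(desired, perfect), fitness_from_cost(template_cost(desired, perfect)))
print(template_cost(desired, one_wrong), fitness_from_cost(template_cost(desired, one_wrong)))
```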

Implementations of the GA based template learning are discussed in [13] and [10].


4 Hardware Implementation

After all these theoretical and numerical results we would like to implement this network in
hardware. Furthermore, we will later link this system with our imager. In this way we are
able to perform the noise cancellation in real time. This platform could later be used for
further image processing tasks, up to the recognition of objects.

4.1 CNN Universal Machine

In Chapter 1 we described the properties of a CNN and in Chapter 3 we proposed a method
to calculate coefficients for the network, so that a particular spatial filter results. If we want
to implement a CNN in Vlsi, we need a separate hardwired design for each template set.
The invention of the CNN universal machine (Cnn-Um) [20, 19] has overcome this problem.
It is the first stored program array computer with analog nonlinear array dynamics. One
CNN operation, for example solving thousands of nonlinear differential equations [21, 12]
in a microsecond, is just one single instruction. In image processing applications we often
need a sequence of several templates to calculate the output. One key point is that, in order
to exploit the high speed of the CNN chips, intermediate results have to be stored cell by
cell. Therefore local analog memory is needed.

The term 'universal' comes from the fact that there is a theoretical basis for the statement
that virtually any processing task can be somehow solved by a Cnn-Um. In [3] it is indirectly
shown that a Cnn-Um is a so-called Turing machine, a hypothetical computer capable of
solving any problem whose solution can be formulated as an algorithm.

The Cnn-Um consists of a CNN array with additional data flow elements like analog
memory, switches, converters and a superposed control structure. This necessary design
concept was defined in 1993 [20]. It is the basis for further Cnn-Um chip designs. Fig. 4.1
gives a small insight into the analogic (see footnote 1) computer architecture.

In 1999 a chip prototyping system for a Cnn-Um was presented in [19] (see footnote 2). It has all the
software and hardware ingredients of the stored programmable computer (high-level language,
compiler, operating system, assembly and machine code, analogic central processing unit,
analog and digital memory, peripherals, etc.). For our project we made use of this system,
called Aladdin.

Figure 4.1: The CNN universal machine, global architecture.

Footnote 1: Analogic is a contraction for "analog" and "logic" computation, two key features of the CNN universal
chip.

Footnote 2: This system as well as a lot of CNN applications were developed at the University of Budapest in Hungary.
The head of the Analogical & Neural Computing Laboratory is Tamás Roska.
He received the Diploma in Electrical Engineering from the Technical University of Budapest
in 1964 and the Ph.D. and D.Sc. degrees in Hungary in 1973 and 1982, respectively.
His main research areas in electronic circuits and systems and computing have been: active circuits,
computer-aided design, nonlinear circuits and systems, neural circuits and analogic computing systems.
He has published several papers and books. Dr. Roska is a co-inventor of the CNN (Cellular Neural
Network) Universal Machine and Supercomputer (with L.O. Chua) and the analogic CNN Bionic Eye
(with F. Werblin and L.O. Chua).
Professor Roska was awarded several titles.


Part III

Smart Image System


Preface

In our preliminary studies we presented the development of a Cmos image-sensor (Part I)
and image processing by cellular neural networks (Part II). In this part we present how these
systems behave in practice. To get the image data from the chip to a computer we developed
a system called Obscura. After capturing, we process our images on a CNN universal
machine. The properties of this system, named Aladdin, were already mentioned in
Chapter 4.1 in Part II.

We present many measurement results that provide a valuable basis for further developments
and research in the field of smart image systems.


1 Obscura

Image Sensing

The image sensing unit is one of the main parts of a smart image system. Therefore, we built
the imager Obscura, which allows us to capture images and link them to the Matlab (see footnote 1)
platform. Obscura is composed of the single-chip Cmos image-sensor (Part I), a printed
circuit board (PCB), an optical lens system, and a hardware interface card. Figure 1.1 shows
the data flow between the sensor chip and a computer.

Figure 1.1: Data flow of Obscura.

1.1 PCB Hardware

Besides the sensor chip, the PCB contains input and output drivers that protect the
sensor and handle the different digital voltage levels, such as 3.3 V Cmos and 5 V TTL. The
schematic of the board is shown in Fig. 1.2.

1.2 Hardware Interface Card dSpace

The DS1102 interface card from dSpace is a single-board system which is specifically designed
for the development of high-speed multivariable digital controllers and real-time simulations in
various fields. It is also well suited for general digital signal processing tasks.

The DS1102 is based on the Texas Instruments TMS320C31 third-generation floating-point
Digital Signal Processor (DSP), which forms the main processing unit and provides a fast
instruction cycle time for numerically intensive algorithms.

Figure 1.2: Electrical schematic of the PCB for Obscura.

Footnote 1: Matlab handles a range of computing tasks in engineering and science, from data acquisition and analysis
to application development. The Matlab environment integrates mathematical computing, visualization,
and a powerful technical language. Built-in interfaces let you quickly access and import data from
instruments, files, and external databases and programs. In addition, Matlab lets you integrate external
routines written in C, C++, Fortran, and Java with your Matlab applications.

The DSP has been supplemented by a set of on-board peripherals frequently used in digital
control systems. Analog-to-digital and digital-to-analog converters, a DSP-microcontroller
based digital I/O subsystem and incremental sensor interfaces make the DS1102 an ideal
single-board solution for a broad range of digital control tasks, not least for our image
capturing system Obscura.

The TMS320C31 supports a total memory space of 16M 32-bit words including program,
data and I/O space. All off-chip memory and I/O can be accessed by the host even while
the DSP is running, thus allowing easy system setup and monitoring.

We used three analog-to-digital converters (ADCs) for our application in order to read
in the analog sensor data. Channels 1 and 2 provide a resolution of 16 bit at a sampling rate
of 250 kHz, and channel 3 provides 12 bit at 800 kHz. Further, we used one of the
digital-to-analog converters (DACs) to set the VpsRef voltage for the pixel circuit. It has a
precision of 16 bit and achieves a conversion time of 4 µs. Finally, we needed eleven digital
I/O lines to address our image-sensor.

1.3 Optics

The required optics depends mainly on the application. Obscura can be easily adapted to
microscopic or macroscopic requirements. For the measurement of the imager chip, the following
points must be taken into account. Obscura represents a sensor system converting
optical information into electrical information. Therefore, most of the sensor measurements
require a means to stimulate the image sensor optically. For this reason, the measurements
were carried out in an optical laboratory providing different kinds of light sources
like monochromatic laser light or white light from a xenon-arc lamp. To achieve the high
dynamic range necessary for evaluating the logarithmic photoreceptors, a number of neutral
density filters were used. Finally, a spectrograph is needed to select a color out of the white
spectrum.

In order to focus the laser beam on a single pixel, to expand it over the whole chip, or to
get a homogeneous illumination, some lenses are needed. In the focused case we calculated a
minimal beam diameter of 14 µm. At this diameter the Gaussian beam profile has decreased to
$1/e^2$. In reality this diameter is larger because of non-idealities of the lens.

1.4 Software

The main software consists of two modules for our imager. The first basic module is a Simulink
model that includes some I/O blocks and an S-function, coded in C, which performs the
continuous addressing and readout of the image-sensor data as well as the storage of
the current picture on the dSpace level. It is important to mention that this application is
able to run as an independent program on dSpace and can be accessed from Matlab through the
so-called Matlab-DSP interface, which provides basic functions for reading and writing data
on the dSpace processor boards. The functions that provide this access from the Matlab
command line or from Matlab M-files are called Mlib/Mtrace functions and are an
important part of our second software module. This second module consists of several
M-files that read the image data from dSpace and visualize them with the powerful numerical
and graphical tools running under Matlab.

In the first part of our software description we will have a closer look at the syntax and the
conventions for the design of S-functions in general. We will introduce our Simulink model
and describe the state machine that is implemented in our S-function. In the second part of
the description we explain the functionality of our Matlab functions.

S-functions (system-functions) provide a powerful mechanism for extending the capabilities
of Simulink. They allow users to add their own algorithms to Simulink models. S-functions
can either be coded in Matlab or in C; however, to build models that are executable
on dSpace, they must be written in C. The most common use of S-functions is to create
custom Simulink blocks. Each block within a Simulink model has the following general
characteristics: a vector of inputs, u, a vector of outputs, y, and a vector of states, x, as
shown in Fig. 1.3.

Figure 1.3: General Simulink model.

The state vector may consist of continuous states, discrete states, or a combination of
both. The mathematical relationships between the inputs, outputs, and states are expressed
by the following equations:

$$\begin{aligned}
y &= f_0(t, x, u), \\
\dot{x}_c &= f_d(t, x, u), \\
x_{d_{k+1}} &= f_u(t, x, u), \\
x &= x_c + x_d.
\end{aligned} \qquad (1.1)$$

Simulink makes repeated calls during specific stages of simulation to each block in the
model, directing it to perform tasks such as computing its outputs, updating its discrete
states, or computing its derivatives. Additional calls are made at the beginning and end
of a simulation to perform initialization and termination tasks. The so-called S-function
routines that are listed in Table 1.1 are called by Simulink during the simulation task and
must therefore be implemented with exactly the same names in every C MEX S-function (see footnote 2).

| Simulation stage | S-function routine |
|---|---|
| Initialization | mdlInitializeSizes |
| Calculation of next sample hit (optional) | mdlGetTimeOfNextVarHit |
| Calculation of outputs | mdlOutputs |
| Update discrete states | mdlUpdate |
| Calculation of derivatives | mdlDerivatives |
| End of simulation tasks | mdlTerminate |

Table 1.1: Simulink simulation stages

Other important topics are the hardware independent datatypes and the access to variables
from outside the system. The S-function specific C code provides the following three
datatypes: real_T, int_T and boolean_T. These datatypes have a corresponding representation
on the dSpace level, depending on the dSpace board version. In any case, int_T
corresponds to integer, boolean_T to unsigned integer and real_T to float.

To access variables from Matlab via the Mlib interface, a Variable Description file (TRC
file) is needed. TRC files describe the names and types of the variables that can be accessed
by ControlDesk (see footnote 3) and therefore also by an Mlib function. By changing these variables, the
application can easily be adapted to the given requirements. Global, non-static variables
or pointer variables in the compilation unit of an S-function or any other C-coded module
can be included in the Variable Description file, thus making them accessible from outside.

For a better understanding of the topics just mentioned, consider our Simulink model shown
in Fig. 1.4. As already described in Chapter 1.1, the chip provides three analog outputs for
the image data, one digital input for the capturing signal, one analog input for the reference
voltage, and twelve digital inputs to address and enable our image sensor.

Therefore we need analog-to-digital and digital-to-analog converters, which are located on the
dSpace board. The most important part of our model is the S-function. With the RTW (see footnote 4)
from Matlab 5.3 it is possible to generate a Variable Description file (TRC file) for the whole
Simulink model. However, in this TRC file only the input and output ports are specified; to
access variables that are used inside the C code and which are not mapped to the output,
you have to write a user TRC file. The TRC file generated by RTW is named <model>.trc
and the user file must be named <model>_usr.trc. In this file you can specify the variables
that you want to access while the application is running. These variables must be declared
as global and non-static in the C code that describes the S-function. Note that an empty line
has to be inserted at the end of the Variable Description file to avoid an error message
caused by the TRC file parser.

Figure 1.4: Image sensor Simulink model.

We next describe the main functionality of our S-function. We have implemented a state
machine with six different states that control the slower column counter, the faster row
counter, the enabling and disabling of the reference voltage and the read-in of the pixel voltages
into a two-dimensional array. Figure 1.5 shows a graphical representation of the state model.
After the initialization process the first column is selected in the first state and enabled in
the second. The third state ensures that the first pixels in the column have settled before
any readout. The fourth state is a wait state. After this state the image data are in their
specific variables and are ready to be read into the memory. In the fifth state the row counter
is incremented until the whole row is read. The last state disables the current row and a new
program cycle begins. The timing diagram in Fig. 1.5 gives more detailed information about
the handshake principle used in the state machine just mentioned. During each new entry
into the program one state is executed. The sample time of the program, i.e. the length of one
state, can be set as a static variable in the C code. However, the time between two states
still depends on the complexity of the S-function, and we have no control over the time
of initialization and execution in the model's overall execution order. The time overhead for
Level 2 (see footnote 5) S-functions on our dSpace board DS1102 is 2.8 µs.

The second part of our software, as already mentioned above, consists of several M-functions
that are listed in Table 1.2. These functions provide a comfortable software package to
collect and process image data with the numerical Matlab tools and the Mlib/Mtrace
interface. For further information please use help <function> on the Matlab command line.

Footnote 2: The C MEX S-functions are S-functions that are coded in C and then compiled by the MEX compiler.
The MEX compiler can be, e.g., a Borland C compiler and must be installed before the first usage on
the Matlab command line.

Footnote 3: ControlDesk is dSpace's experiment software that provides all the tools for controlling, monitoring, and
automating real-time experiments.

Footnote 4: RTW is the abbreviation for Real-Time Workshop. The Real-Time Workshop, for use with Matlab and
Simulink, produces code directly from Simulink models and automatically builds programs that can be
executed in a variety of environments, including real-time systems and stand-alone simulations.

Footnote 5: Level 2 S-functions are generated by Simulink 2.2 and always have a shorter time overhead than Level 1
S-functions generated by Simulink 2.1.



Figure 1.5: State machine implemented in a Simulink S-function.
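The six-state readout cycle of Fig. 1.5 can be illustrated by a small simulation. The following sketch is written in Python for readability and is not our C S-function. The state names, the callback `sample_pixel` and the assumption that the settle and wait states are passed once per column are ours. As in the S-function, exactly one state is executed per call, which corresponds here to one loop iteration.

```python
from enum import Enum, auto


class State(Enum):
    SELECT = auto()   # select the next column
    ENABLE = auto()   # enable the selected column
    SETTLE = auto()   # let the first pixels settle
    WAIT = auto()     # wait until the ADC values are valid
    READ = auto()     # store one pixel and advance the fast counter
    DISABLE = auto()  # disable the column, then start the next cycle


def read_frame(sample_pixel, size=48):
    """Read a size x size frame; returns the frame and the number of executed states."""
    frame = [[0.0] * size for _ in range(size)]
    state, col, row, steps = State.SELECT, 0, 0, 0
    while col < size:
        steps += 1  # one state per entry into the program
        if state is State.SELECT:
            row = 0
            state = State.ENABLE
        elif state is State.ENABLE:
            state = State.SETTLE
        elif state is State.SETTLE:
            state = State.WAIT
        elif state is State.WAIT:
            state = State.READ
        elif state is State.READ:
            frame[row][col] = sample_pixel(row, col)
            row += 1
            if row == size:
                state = State.DISABLE
        else:  # State.DISABLE
            col += 1
            state = State.SELECT
    return frame, steps


# Hypothetical pixel source that encodes the address in the returned value.
frame, steps = read_frame(lambda r, c: 10 * r + c, size=4)
print(frame[2][3], steps)
```

Under these assumptions one column costs `size + 5` state executions, so the number of sample periods per frame grows as `size * (size + 5)`.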

| Function name | Input parameters | Output | Short description |
|---|---|---|---|
| initImager | reference voltage and switch threshold voltage | - | sets the reference voltage and the switch threshold voltage |
| singleShot | - | image data in a column vector | captures a single image from the sensor which is located on the carrier board |
| startTrackingMode | - | image data in a column vector | captures a single image after the switch on the carrier board has been actuated |
| data2img | image data generated either by singleShot or by startTrackingMode | intensity image data in a double array | converts the input data to an intensity image, calls the function preProcessImage, and displays the image |
| preProcessImage | intensity image data in a double array | corrected intensity image in a double array | limits the image data to the linear operating range of our image-sensor |
| videoMode | recording time | video data in a 48 × 48 × k array, where k is the number of images | captures images during the time specified in the input argument |
| data2video | video data collected by the function videoMode | multiframe image data in a 48 × 48 × 1 × k array, where k is the number of images | displays the frames recorded by the function videoMode in one picture and in a video animation |

Table 1.2: Overview of our Mlib M-functions


2 Obscura

Measurement

2.1 Introduction to Fixed Pattern Noise (FPN)

Many integrated circuits rely on the assumption that devices drawn identically in the layout
also behave identically in reality. If this assumption nearly holds, they are
called well-matched devices. However, transistors, resistors, or capacitors with the same
geometrical dimensions usually differ from each other due to spatial variations of the process
parameters. These mismatches result in inaccurate current mirrors, operational
amplifiers with high offset voltages and, in the case of image sensors, non-uniform output
signals of the individual sensor pixels. Since the pixel variations are fixed but randomly
distributed across the chip, the image shows the so-called fixed pattern noise. For practical
use, the FPN has to be reduced to a value that is below the minimum intensity difference
to be detected.

The distribution of the mismatches between two supposedly identical Cmos devices is primarily
the result of two factors:

1. Variations in the location of the transistor, resistor, or capacitor edges resulting from
the limited imaging quality of the photolithographic process itself. This causes mismatches
in length and width and thus different electrical behaviour.

2. Variations of the process parameters like gate oxide thickness and doping concentration
across the wafer resulting from non-uniform conditions during the redeposition and
diffusion. These variations cause the sheet resistances and the threshold voltages of
the transistors to vary with distance across the die.

Rotating or mirroring Cmos structures causes additional mismatch because some process
parameters depend on the geometrical direction.

2.1.1 Transistor Mismatch in Weak Inversion

Logarithmic photodetectors, as used in our implementation, show a very high fixed pattern
noise due to the weak inversion mismatch. The following equations show the current-voltage
law of the subthreshold region, assuming the bulk-source potential $V_{BS}$ to be zero and
$V_{DS} \gg V_t$.

$$I_D = I_{D0}\,\frac{W}{L}\,e^{\frac{V_{GS} - V_T}{n V_t}}, \qquad (2.1)$$

$$I_{D0} \simeq \beta\,\frac{L}{W}\,\frac{2 (n V_t)^2}{e^2}. \qquad (2.2)$$

Here, $\beta$ is the MOS transconductance and n is a process parameter (subthreshold slope
factor) which is typically between one and two: $1 \leq n \leq 2$. The temperature potential $V_t$ is
equal to $kT/q$ and must not be mistaken for the threshold voltage $V_T$ (see footnote 1).

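Due to the exponential relation between drain current and gate-source voltage, $I_D$ varies
over a large dynamic range of several decades (fA to nA) when $V_{GS}$ varies from 0 V to $V_T$.
Therefore, a high mismatch sensitivity, especially for variations of $V_T$, may be expected. The
standard deviation of the $I_D$ mismatch can be derived from (2.1):

$$\begin{aligned}
\sigma(I_D) &= \sqrt{\left(\frac{\partial I_D}{\partial \beta}\right)^2 \sigma^2(\beta) + \left(\frac{\partial I_D}{\partial V_T}\right)^2 \sigma^2(V_T)} \\
&= \sqrt{\left(\frac{2 (n V_t)^2}{e^2}\,e^{\frac{V_{GS} - V_T}{n V_t}}\right)^2 \sigma^2(\beta) + \left(-\frac{2 \beta n V_t}{e^2}\,e^{\frac{V_{GS} - V_T}{n V_t}}\right)^2 \sigma^2(V_T)}.
\end{aligned} \qquad (2.3)$$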

To obtain the relative error of current mismatch, the absolute error from equation (2.3)
is divided by the expression for $I_D$ in equation (2.1):

$$\frac{\sigma(I_D)}{I_D} = \sqrt{\frac{\sigma^2(\beta)}{\beta^2} + \frac{\sigma^2(V_T)}{(n V_t)^2}}. \qquad (2.4)$$

From the above relation we deduce that $\sigma(I_D)/I_D$ is not dependent on $V_{GS}$. Hence it is
constant in the complete subthreshold region (see footnote 2). On the other hand the contribution of the
$V_T$ mismatch is obvious. Since $\sigma(V_T)$ is only divided by $n V_t$, which is in the order of 25 mV,
common threshold voltage variations of about 20 mV can lead to a drain current mismatch
of nearly 100%. Therefore, subthreshold devices have to be designed very carefully to reduce
mismatch itself or at least its influence on the circuit behaviour.
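The order of magnitude stated above can be verified with the relation for the relative current mismatch. The following sketch evaluates it for a threshold voltage variation of 20 mV. The function name, the value n = 1 and the relative transconductance mismatch of 2 % are assumptions made for illustration; they are not measured values.

```python
import math


def relative_current_mismatch(sigma_beta_rel, sigma_vt, n, v_t=0.025):
    """sigma(I_D)/I_D for a given relative beta mismatch and V_T mismatch in volts."""
    return math.sqrt(sigma_beta_rel ** 2 + (sigma_vt / (n * v_t)) ** 2)


# Threshold mismatch alone (sigma(V_T) = 20 mV, n = 1, V_t = 25 mV).
only_vt = relative_current_mismatch(0.0, 0.020, 1.0)

# Adding an assumed 2 % transconductance mismatch hardly changes the result.
with_beta = relative_current_mismatch(0.02, 0.020, 1.0)

print(f"{only_vt:.3f} {with_beta:.3f}")
```

The threshold voltage term dominates by far, which is why the transconductance mismatch is often neglected in weak inversion.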

Particularly in the case of image sensors, the subthreshold devices are frequently used as
logarithmic compressors. Thus the exponential current-voltage law is reversed into a logarithmic
voltage-current law as used in our pixel circuit. Solving equation (2.1) for the gate
voltage $V_{GS}$ gives

$$V_{GS} = V_T + 2 n V_t - n V_t \ln\frac{2 \beta (n V_t)^2}{I_D}. \qquad (2.5)$$

Since $V_{GS}$ is a function of $I_D$, current ratios are converted into voltage differences. Hence
not the relative but the absolute error $\sigma(V_{GS})$ is the interesting magnitude for mismatch
considerations. Using equation (2.5) and following the derivation in equation (2.3) yields

$$\sigma(V_{GS}) = \sqrt{\sigma^2(V_T) + \left(\frac{n V_t}{\beta}\right)^2 \sigma^2(\beta)}. \qquad (2.6)$$

Here, the mismatch of $V_{GS}$ directly depends on the threshold voltage variations $\sigma(V_T)$.
Besides, it is independent of the drain current $I_D$ and therefore constant in the complete
subthreshold region.

Footnote 1: The gate-source voltage for which the concentration of electrons under the gate is equal to the concentration
of holes in the p- substrate far from the gate is known as the transistor threshold voltage $V_T$.

Footnote 2: This statement is only valid to a first approximation, because the equations and assumptions used are
themselves approximations in many respects. For example, the parameter n also shows a slight mismatch, which destroys
the independence of $V_{GS}$.



2.1.2 Fixed Pattern Noise Correction in Logarithmic Image Sensors

As discussed in the previous section, logarithmic photodetectors show a high FPN due to
weak inversion mismatch. Figure 2.1 shows the distribution of the pixels while our chip is
exposed to homogeneous light, and Figure 2.2 shows a bar plot of the homogeneously illuminated
pixels. The peak-to-peak variations of the pixel voltages are approximately 100 mV, which
corresponds to almost 2 decades of light intensity. It turns out to be more complicated to
correct these variations than in the case of integration-based image sensors (see footnote 3). The reason is the
missing reset state of all continuously working receptors. There is no defined reference state
whose corresponding pixel signal could be subtracted from the intensity-dependent output
signal. Thus we have to apply other calibration concepts.

Figure 2.1: Distribution of the pixels of the homogeneously illuminated image sensor.

Figure 2.2: Bar plot of the pixels of the homogeneously illuminated image sensor.

Footnote 3: A technique commonly used to eliminate fixed pattern noise in integration-based image sensors is the
correlated double sampling (CDS) method.



The most common way to correct the non-uniformities of logarithmic image sensors is to
carry out a digital correction method. Initially, the pixel errors are measured by illuminating
the sensor array homogeneously and stored in a digital memory. During readout operation,
the actual pixel signals are converted into digital values and then digitally corrected according
to the stored pixel errors. A one-point (only offset), two-point (offset and slope) or even
higher order calibration algorithm can be utilised. The digital fixed pattern noise correction
is usually carried out outside the chip, although there is no problem in performing it directly on
the sensor itself. However, an on-chip solution would require a large area for digital memory
and additional control logic, leading to larger chips and worse yield. Therefore, in case of
a digital correction, the off-chip method is preferred. However, another concept, which is
mainly based on a self-calibrating photoreceptor, is given in [16].
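A two-point correction as described above can be sketched in a few lines. The example assumes two calibration frames recorded under homogeneous illumination at two different intensities and maps every pixel onto the mean response of the array. The function names and the three-pixel voltages are hypothetical and only illustrate the principle; they are not taken from our Matlab functions or measurements.

```python
def two_point_calibration(frame_low, frame_high):
    """Per-pixel gain and offset from two homogeneously illuminated frames."""
    target_low = sum(frame_low) / len(frame_low)
    target_high = sum(frame_high) / len(frame_high)
    gains, offsets = [], []
    for low, high in zip(frame_low, frame_high):
        gain = (target_high - target_low) / (high - low)  # corrects the slope
        gains.append(gain)
        offsets.append(target_low - gain * low)           # corrects the offset
    return gains, offsets


def correct(frame, gains, offsets):
    """Apply the stored per-pixel correction to a raw frame."""
    return [g * v + o for v, g, o in zip(frame, gains, offsets)]


# Hypothetical pixel voltages (in volts) of a three-pixel sensor.
frame_low = [1.00, 1.05, 0.95]
frame_high = [1.20, 1.30, 1.10]
gains, offsets = two_point_calibration(frame_low, frame_high)

print(correct(frame_low, gains, offsets))
print(correct(frame_high, gains, offsets))
```

Because the output voltage of a logarithmic receptor is linear in the logarithm of the intensity, the correction is applied directly to the voltages. It is exact only at the two calibration points; in between, its quality depends on how linear and how stable the individual response curves are.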

2.2 Photoreceptor response and fixed pattern noise

The following sections describe the measured behavior of the photoreceptors with respect to
the incident light intensity J. Alternatively, a different quantity is often used, the photometric
quantity lux. It is adapted to the human eye spectral sensitivity. The exact correlation
between physical and photometric quantities is given in [16].

2.2.1 Response curves

The measurement of the photoreceptor response as a function of the light intensity yielded
a dynamic range of 7 decades. One pixel was illuminated with red light of the wavelength
632.8 nm from a helium-neon laser. Figure 2.3 shows the response curve of a single photoreceptor.
The cell works linearly over more than 6 decades, as expected. The slope, i.e. the
voltage increase per intensity decade, amounts to 73 mV when the reference voltage VpsRef
is 1.1 V. Both results, the linear range and the slope, fit very well with the simulation
of Part I, Chapter 2.1.
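The measured slope can be related to the subthreshold slope factor n. If we assume that the slope is determined only by the logarithmic voltage-current law of a single subthreshold transistor, one decade of photocurrent changes the voltage by $n V_t \ln 10$. The following sketch evaluates this relation. The temperature of 300 K and the neglect of any additional gain in the pixel circuit are assumptions, so the result is only a rough plausibility check.

```python
import math

BOLTZMANN = 1.380649e-23             # J/K
ELEMENTARY_CHARGE = 1.602176634e-19  # C


def thermal_voltage(temperature_k=300.0):
    """Temperature potential V_t = kT/q in volts."""
    return BOLTZMANN * temperature_k / ELEMENTARY_CHARGE


def slope_per_decade(n, temperature_k=300.0):
    """Voltage change per decade of current for slope factor n."""
    return n * thermal_voltage(temperature_k) * math.log(10.0)


def slope_factor_from_slope(slope_v, temperature_k=300.0):
    """Slope factor n that would explain a measured slope in volts per decade."""
    return slope_v / (thermal_voltage(temperature_k) * math.log(10.0))


n_estimate = slope_factor_from_slope(0.073)
print(f"V_t = {thermal_voltage() * 1e3:.2f} mV, n = {n_estimate:.2f}")
```

Under these assumptions the measured 73 mV per decade corresponds to a slope factor of roughly 1.2, which lies within the typical range between one and two.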

2.2.2 Remaining fixed pattern noise

We made several attempts to reduce the fixed pattern noise, but without any notable
success. Figure 2.4 visualizes such an FPN correction operation. The correction data are based
on several homogeneously illuminated pictures taken at different light intensities. For each pixel its
own response curve was approximated, and the corresponding correction information (offset
and gain) was calculated. The operations failed because the homogeneous images are not
sufficiently constant over time.

2.2.3 Slope variations

Due to slight variations of the subthreshold factor n introduced in section 2.1.1, the slope
of the photoreceptor response varies from pixel to pixel. These non-uniformities can result
in a considerable contribution to the total fixed pattern noise. Their influence increases
with the distance between the calibrating points and the actual illumination. For detailed
information about the fixed pattern noise correction see section 2.1.2.

Figure 2.5 shows the histogram of the individual slopes averaged over a range of 2 decades.
The values correspond to the 2304 pixels of our image sensor.



Figure 2.3: Response curves of the logarithmic cell.

Figure 2.4: Fixed pattern noise correction.

2.3 Complementing measurements

2.3.1 Crosstalk

The effective resolution of an image sensor depends not only on the number of pixels but
also on the crosstalk between adjacent pixels. In order to examine the crosstalk behavior,
a single pixel of the sensor was stimulated with a bright laser spot. Ideally, the
stimulated cell would show an increased output signal whereas all others would stay at a
constant low level. The results for a laser spot intensity 4 decades higher than the
background intensity are shown in Figure 2.6. Values more than 3 decades below the maximum
were set to zero. The crosstalk effect shows a decrease of 2 decades of light intensity
between the stimulated pixel and its neighbours.

Figure 2.5: Distribution of the individual pixel slopes of our image sensor.

Figure 2.6: The crosstalk effect shows a decrease of 2 decades of light intensity between
the stimulated and the neighboring pixels.

2.3.2 Temporal Noise

So far we have considered only the fixed pattern noise, which refers to the mean output
signals of the individual pixels. However, the pixel signals have an additional noise
component, the temporal noise. To measure this noise we illuminated one pixel with a fixed
intensity and read out 500 frames at a frame rate of about 8 frames per second. The signal
was band-limited to 48 kHz by a simple RC filter.⁴ Figure 2.7 shows the distribution of our
temporal noise with its very low standard deviation of 850 µV.

⁴ With this frequency band we would be able to read out 50 frames per second.

Figure 2.7: Temporal noise distribution.

2.4 Spectral characteristic


Generally, the inner photoelectric effect of a semiconductor describes the absorption of
electromagnetic radiation by transferring the photon's energy to an electron and lifting
the electron to the conduction band. Depending on the material, this interaction occurs in
different wavelength ranges. See section 1.2 in Part I for further details.

Assume that a semiconductor is illuminated by a light source with the intensity $J_0$. The
number of photons $dJ$ absorbed along the distance $dx$ is proportional to the local
intensity $J(x)$:

\[ \frac{dJ}{dx} = -\alpha J(x). \tag{2.7} \]

The proportionality constant $\alpha$ is defined as the absorption coefficient. According to
Lambert's law of absorption, the light intensity decreases exponentially along the
distance $x$:

\[ J(x) = J_0 e^{-\alpha x}. \tag{2.8} \]

The absorption coefficient $\alpha$ depends on the photon energy and therefore on the
wavelength $\lambda$, because $E = hc/\lambda$. For energies below the bandgap energy $E_g$,
which is the difference between the energy levels of the valence band and the conduction
band, photons do not possess enough energy to lift electrons to the conduction band.
Silicon has a very low $\alpha$ and is nearly transparent for wavelengths larger than
1100 nm, corresponding to the silicon bandgap of $E_g = 1.12$ eV. Nevertheless, weak
absorption occurs even in the region above 1100 nm. The reasons are states in the bandgap
due to crystal defects and absorption by free electrons brought to the conduction band by
thermal excitation.



The spectral variation of $\alpha$ has consequences for the spectral sensitivity of silicon
photodetectors. Due to weak absorption, infrared and red light penetrate deeply into the
semiconductor crystal, whereas ultraviolet radiation is absorbed directly below the
surface. Therefore, infrared receptors have to extend far into the substrate. Ultraviolet
detectors, which are very inefficient in silicon, must be located near the surface.
Table 2.1 gives three examples of the absorption length⁵ at a temperature of 300 K [22].

wavelength   photon energy   absorption length
(nm)         (eV)            (µm)
400          3.10            0.76
700          1.77            4.5
1000         1.24            110

Table 2.1: Absorption length in silicon at three different wavelengths.
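The relation between Table 2.1 and equations (2.7) and (2.8) can be illustrated with a few lines of Python. The sketch recomputes the photon energy from $E = hc/\lambda$ and evaluates Lambert's law with $\alpha$ taken as the reciprocal of the tabulated absorption length. The depth of 5 µm is an arbitrary example value and the function names are hypothetical.

```python
import math

H = 6.62607015e-34      # Planck constant, J s
C = 299792458.0         # speed of light, m/s
Q = 1.602176634e-19     # elementary charge, C (joule per electronvolt)


def photon_energy_ev(wavelength_nm):
    """Photon energy E = h c / lambda in electronvolts."""
    return H * C / (wavelength_nm * 1e-9) / Q


def remaining_fraction(depth_um, absorption_length_um):
    """J(x)/J0 from equation (2.8) with alpha = 1 / absorption length."""
    return math.exp(-depth_um / absorption_length_um)


# (wavelength in nm, absorption length in um), values taken from Table 2.1
TABLE_2_1 = [(400, 0.76), (700, 4.5), (1000, 110.0)]

for wavelength_nm, length_um in TABLE_2_1:
    energy = photon_energy_ev(wavelength_nm)
    left = remaining_fraction(5.0, length_um)   # 5 um is an arbitrary example depth
    print(f"{wavelength_nm} nm: {energy:.2f} eV, {left:.3f} of J0 left at 5 um")
```

At a depth equal to the absorption length, `remaining_fraction` returns 1/e, which matches the definition of the absorption length given in the footnote.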

Figure 2.8 shows the overall spectral characteristic of our Cmos image-sensor. The best
responsivity, as expected, is around 600 nm. The spectral characteristic is limited to
wavelengths between 500 nm and 1100 nm because of the effects discussed earlier in this
section.

Figure 2.8: Spectral characteristic of our image-sensor.

However, another important property of Cmos image-sensors can be derived from the
measured spectrum: the presence of interference phenomena due to the passivation and oxide
layers⁶ shown in Figure 2.10.

⁵ After covering the absorption length, the incident intensity has decreased by the factor of e.
⁶ The passivation layer is a protection layer, usually made of nitride, which is located on top of the chip's
layers. The oxide layers are used as isolation, e.g. between the metal1 and metal2 layers. See also Figure 2.10.



Due to the very low extinction coefficient $k$ of SiO2 and Si3N4 in the visible spectrum,
the absorption can be neglected. The extinction coefficient is defined as

\[ k = \frac{\alpha\lambda}{4\pi}, \tag{2.9} \]

where $\alpha$ is the absorption coefficient used in equation (2.8). However, we cannot
ignore the influence of reflection and interference. To calculate the influence of
multilayer filters we used the quantitative approach from [9], which is called the method
of resultant waves or the $E^{+}$, $E^{-}$ matrix method. The boundary conditions associated
with Maxwell's equations are placed into a matrix equation format. To accomplish the
reformulation, the boundary conditions are manipulated so that the information about the
angle of incidence and the polarization is placed into an effective index of refraction:⁷

\[ r_{i,j} = \frac{n_j - n_i}{n_j + n_i}, \tag{2.10} \]

\[ t_{i,j} = \frac{2 n_i}{n_j + n_i}. \tag{2.11} \]

Here, $r_{i,j}$ is the reflection coefficient and $t_{i,j}$ is the transmission coefficient
between the media $i$ and $j$. The fields on each side of the boundary can then be
represented by plane waves incident normal to the interface. Figure 2.9 shows the geometry
for waves in a dielectric film.

Figure 2.9: Geometry for waves in a dielectric film.

Each dielectric layer has two interfaces, which can be represented by two interface
matrices, e.g. $I_{0,1}$ and $I_{1,2}$. Formula (2.12) shows the interface matrix equation

\[
\begin{pmatrix} E_i^{+} \\ E_i^{-} \end{pmatrix}
=
\begin{pmatrix}
\dfrac{1}{t_{i,j}} & \dfrac{r_{i,j}}{t_{i,j}} \\[2ex]
\dfrac{r_{i,j}}{t_{i,j}} & \dfrac{1}{t_{i,j}}
\end{pmatrix}
\begin{pmatrix} E_j^{+} \\ E_j^{-} \end{pmatrix}.
\tag{2.12}
\]

Normally, this formula is written in a more compact notation

\[ E_i = I_{i,j} E_j, \tag{2.13} \]

where $I_{i,j}$ is the interface matrix

\[
I_{i,j} =
\begin{pmatrix}
\dfrac{1}{t_{i,j}} & \dfrac{r_{i,j}}{t_{i,j}} \\[2ex]
\dfrac{r_{i,j}}{t_{i,j}} & \dfrac{1}{t_{i,j}}
\end{pmatrix},
\tag{2.14}
\]

and $E_i$ and $E_j$ are the waves in the first medium and the second medium.

⁷ To simplify the following derivation we assume that all waves are incident orthogonally to the chip's
surface, which is also the case in reality.

The problem of finding the values of $A^{+}$ and $A^{-}$ in Figure 2.9 is a simple
propagation problem. The fields $A^{+}$ and $E_1^{-}$ must be modified by the phase shift
they experience after propagating through the dielectric layer, here labeled 1:

\[ A^{+} = e^{i\delta_1} E_1^{+}, \tag{2.15} \]

\[ E_1^{-} = e^{i\delta_1} A^{-}, \tag{2.16} \]

where $\delta_i$ is given by

\[ \delta_i = \frac{2\pi n_i d_i}{\lambda}. \tag{2.17} \]

Equations (2.15) and (2.16) can be combined into a matrix equation

\[ A = T_1 E_1, \tag{2.18} \]

which can be generalized for the $i$th dielectric layer by defining a transmission matrix
of the form

\[
T_i =
\begin{pmatrix}
e^{i\delta_i} & 0 \\
0 & e^{-i\delta_i}
\end{pmatrix}.
\tag{2.19}
\]

The effect of an $m$ layer dielectric film can be described by the matrix equation

\[
\begin{pmatrix} E_0^{+} \\ E_0^{-} \end{pmatrix}
= M
\begin{pmatrix} E_f^{+} \\ E_f^{-} \end{pmatrix},
\tag{2.20}
\]

where

\[ M = I_{0,1} \cdot T_1 \cdot I_{1,2} \cdot T_2 \cdots I_{m-2,m-1} \cdot T_{m-1} \cdot I_{m-1,m}, \tag{2.21} \]

and the $E_f$ are the fields in the final medium. If we assume that $E_f^{-} = 0$, then the
reflection of the stack is

\[ R = \left| \frac{M_{1,0}}{M_{0,0}} \right|^2, \tag{2.22} \]

and from energy conservation it follows that the transmission of the stack
$I_{OUT}/I_{IN}$ is

\[ \tau = 1 - R. \tag{2.23} \]

Now we apply this approach to our layer configuration. Figure 2.10 shows a simplified
cross section of our diode. The important layers are the nitride passivation and the
dielectric layers.⁸

Figure 2.10: Simplified cross section of our n+-substrate-diode showing the layers that can
potentially cause interference.

We are now interested in the overall transmission $\tau$ at one wavelength. Figure 2.8
shows that the minimum transmission loss is at the wavelength $\lambda \approx 580$ nm.
Table 2.2 shows the refraction coefficients we used for the calculation.

parameter (λ = 580 nm)   value (dimensionless)
n0, n3                   1
n1 = n(Si3N4)            2.028
n2 = n(SiO2)             1.544

Table 2.2: Refraction coefficients for the overall transmission calculation [18].

The distances⁹ $d_1$ and $d_2$ from Figure 2.10 are $d_1 = 1.1$ µm and $d_2 = 2.8$ µm,
respectively. The results of the calculation are presented in Table 2.3.

⁸ The dielectric layers are used as isolation layers, e.g. between metal3 and metal2, which are used for wiring.
⁹ The thicknesses of the oxide and protection layers are process dependent and could also be calculated
from the wavelengths of two neighbouring sensitivity maxima at $\lambda_1$ and $\lambda_2$.



layer configuration      reflection R   transmission τ
Si3N4 and SiO2 layer     36%            64%
SiO2 layer only          2%             98%

Table 2.3: Results of the overall transmission calculation.

The calculation shows that the transmission depends strongly on the presence of the Si3N4
nitride layer. For future implementations we should either omit the nitride layer or
choose another process with SiO2 passivation only.
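The matrix method of equations (2.10) to (2.23) can be sketched in a few lines of Python with NumPy. This is an illustration, not the script used to produce Table 2.3. The refraction coefficients and thicknesses are those of Table 2.2 and Figure 2.10; the function names are hypothetical, and the thickness of 2.8 µm for the oxide-only case is an assumption. Because the result oscillates quickly with the layer thicknesses, the printed values need not coincide with Table 2.3.

```python
import numpy as np


def interface_matrix(n_i, n_j):
    """Interface matrix I_ij of equation (2.14) for normal incidence."""
    r = (n_j - n_i) / (n_j + n_i)      # equation (2.10)
    t = 2.0 * n_i / (n_j + n_i)        # equation (2.11)
    return np.array([[1.0, r], [r, 1.0]], dtype=complex) / t


def transmission_matrix(n, d, wavelength):
    """Propagation matrix T_i of equation (2.19)."""
    delta = 2.0 * np.pi * n * d / wavelength   # equation (2.17)
    return np.array([[np.exp(1j * delta), 0.0], [0.0, np.exp(-1j * delta)]])


def stack_reflection(indices, thicknesses, wavelength):
    """Reflection R of equation (2.22).

    indices:     n_0 ... n_m, the first and last entry are the outer media.
    thicknesses: d_1 ... d_(m-1) of the inner layers, same unit as wavelength.
    """
    if len(indices) != len(thicknesses) + 2:
        raise ValueError("need exactly one thickness per inner layer")
    m_total = interface_matrix(indices[0], indices[1])
    for k, d in enumerate(thicknesses, start=1):
        m_total = m_total @ transmission_matrix(indices[k], d, wavelength)
        m_total = m_total @ interface_matrix(indices[k], indices[k + 1])
    return abs(m_total[1, 0] / m_total[0, 0]) ** 2


wavelength = 580e-9   # metres
r_both = stack_reflection([1.0, 2.028, 1.544, 1.0], [1.1e-6, 2.8e-6], wavelength)
r_oxide = stack_reflection([1.0, 1.544, 1.0], [2.8e-6], wavelength)   # assumed thickness
print(f"nitride + oxide: R = {r_both:.3f}, tau = {1.0 - r_both:.3f}")
print(f"oxide only:      R = {r_oxide:.3f}, tau = {1.0 - r_oxide:.3f}")
```

Two limiting cases are easy to check by hand: a single interface between n = 1 and n = 1.5 reflects 4%, and a layer whose optical thickness is half a wavelength leaves the reflection of the surrounding media unchanged.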


3 Aladdin

Image Processing

We have presented the concept of a cellular neural network universal machine (Cnn-Um) in
Part II, Chapter 4.1. For our smart image system we used such a machine, called Aladdin.¹
Aladdin allows a Cnn-Um to be programmed at a high level of abstraction and is very useful
during the prototyping phase. Figure 3.1 shows an example of a CNN operation. The template
used is presented in Part II, Chapter 2.3.1.

Figure 3.1: Linear spatial filter template processing on a captured picture. The filter has
a circularly symmetric low-pass characteristic. It is defined by A and B templates with the
passband extending to 0.4π and the stopband starting at 0.5π.

¹ We bought Aladdin from the University of Budapest, Hungary.


4 Conclusion

4.1 Current state of the project

Within the scope of this thesis, the concept of a smart image system using a Cmos image
sensor (Part I) and a Cnn-Um (Part II) has been tested. A first image-sensor with 48x48
pixels has been developed and examined with regard to its optical properties. Furthermore,
we presented a complete method to design CNN templates in the frequency domain using
genetic algorithms. Finally, we could present some results of the two systems Aladdin and
Obscura working together.

Unfortunately, the working range of our sensor is shifted by 4 decades of illumination
compared with several other implementations [16, 23]. That means that our imager is less
sensitive than expected; nevertheless, we were able to capture images.

With this work we provide a valuable basis for further research and development in the
field of smart image systems.

4.2 Post-script

The first part of our project included a lot of interesting work. For the first time we
had the chance to bring our ideas to silicon. The chip makes technology very tangible by
generating pictures of the world.

Thanks to the second preliminary study we gained good insight into the theory of cellular
neural networks and their applications. We analyzed a lot of reports and got in contact
with the Universities of Budapest and Berkeley.

Finally, the bachelor's thesis gave us the possibility to capture our own pictures with

Obscura and to process them on Aladdin.

Matthias Bettler & Samuel Zahnd

Biel/Bienne, School of Engineering and Architecture


Appendices


A The Layout Structure

Figure A.1: This picture shows the hierarchical decomposition of the 48 × 48 pixel image
sensor layout (analog part), in which the Analog Core represents the top level design.


B Analysis of the pixel circuit

The function of the pixel circuit (Figure B.1) is described here in mathematical terms.
In the following description,

$I_{D1} \ldots I_{D3}$ denote the drain-source currents,
$I_D$ denotes the photocurrent,
$V_D$ is the reverse photodiode voltage,
$V_{PS}$ is the RowPreselect voltage.

Figure B.1: Schematic of the pixel circuit.

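Q                            basic equation
Q2 at saturation             $I_{D2} = \frac{\beta_p}{2} (V_{DD} - V_{PS} - V_{tp})^2$
Q3 at saturation             $I_{D3} = \frac{\beta_n}{2} (V_D - V_{tn})^2$
Q1 at weak inversion         $I_{D1} = \frac{W_1}{L_1} I_{D0} \, e^{(V_{out} - V_1) \cdot \frac{1}{n \cdot V_t}}$
Photodiode at reverse bias   $-I_D = I_{ph} \left( e^{-\frac{V_D}{n \cdot V_t}} - 1 \right) \approx -I_{ph} \propto -\text{intensity}$

Setting the currents of Q2 and Q3 equal gives the diode voltage:

\[
I_{D2} = I_{D3} \;\Rightarrow\;
\frac{\beta_p}{2} (V_{DD} - V_{PS} - V_{tp})^2 = \frac{\beta_n}{2} (V_D - V_{tn})^2
\]

\[
V_D = \sqrt{\frac{\beta_p}{\beta_n}} \, (V_{DD} - V_{PS} - V_{tp}) + V_{tn}
\]

Setting the current of Q1 equal to the photocurrent gives the output voltage, where the
step to the term $+V_D$ takes $V_1 = V_D$:

\[
I_{D1} = I_D \;\Rightarrow\;
\frac{W_1}{L_1} I_{D0} \, e^{(V_{out} - V_1) \cdot \frac{1}{n \cdot V_t}} = I_{ph}
\]

\[
\begin{aligned}
V_{out} &= n \cdot V_t \cdot \ln\!\left( \frac{I_{ph}}{\frac{W_1}{L_1} I_{D0}} \right) + V_D \\
        &= n \cdot V_t \cdot \ln(I_{ph}) - n \cdot V_t \cdot \ln\!\left( \frac{W_1}{L_1} I_{D0} \right)
           + \sqrt{\frac{\beta_p}{\beta_n}} \, (V_{DD} - V_{PS} - V_{tp}) + V_{tn} \\
        &= \underbrace{n \cdot V_t \cdot \frac{1}{\log(e)}}_{89\,\mathrm{mV/dec}\ \text{with}\ n = 1.5}
           \cdot \log(I_{ph}) + V_A
\end{aligned}
\]

Here $V_A$ collects all terms that do not depend on $I_{ph}$.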


C Content of the CD-ROM

All files were split across two CD-ROMs and arranged in the following structure.

disk<1> data/
    preliminary study 1/    Image sensing related documents.
    preliminary study 2/    Image processing related documents.
    bachelor's thesis/      Smart Image System related documents.
disk<2> over head/
    documentation/          Report, poster and abstract.
    presentation/           Presentation slides used in Biel and Istanbul.
    web page/               Web page of the Smart Image Sensor project.


List of Figures

0.1 Smart Image System.
0.2 Project time schedule.
0.3 Job scheduling.
1.1 Cross-section of a primate retina.
1.2 Operation of the p-n photodiode.
1.3 Photocurrent versus Irradiance.
2.1 Schematic of the circuit for one pixel.
2.2 Simulated pixel circuit.
2.3 Layout of the pixel circuit.
2.4 Block diagram of the I/O system on chip.
2.5 The whole chip.
2.6 Bonding plan.
3.1 Design-flow of the Digital Part.
3.2 Design-flow for place and route of our design.
1.1 Network structure.
1.2 The unity gain output function.
1.3 A block diagram showing the standard CNN.
1.4 The CNN cell schematic.
2.1 Sampled low-pass filter characteristic.
2.2 Low-pass filter.
2.3 Improved low-pass filter.
2.4 2-D low-pass filter characteristic.
3.1 The two-point crossover operator.
3.2 Graphical representation of a genetic algorithm.
4.1 The CNN universal machine: global architecture.
1.1 Data-flow of Obscura.
1.2 Electrical schematic of the PCB for Obscura.
1.3 General Simulink model.
1.4 Image sensor Simulink model.
1.5 State machine implemented in a Simulink S-function.
2.1 Distribution of the pixels of the homogeneously illuminated image sensor.
2.2 Bar plot of the pixels of the homogeneously illuminated image sensor.
2.3 Response curves of the logarithmic cell.
2.4 Fixed pattern noise correction.
2.5 Distribution of the individual pixel slopes of our image sensor.
2.6 Crosstalk.
2.7 Temporal noise distribution.
2.8 Spectral characteristic of our image-sensor.
2.9 Geometry for waves in a dielectric film.
2.10 Cross section of our photo-diode.
3.1 Spatial filter template processing.
A.1 Hierarchical tree of layout designs.
B.1 Schematic of the pixel circuit.


List of Tables

2.1 Optimal sizes of the Mos-Fet for the pixel circuit.
2.2 Chip features and description.
1.1 Simulink simulation stages.
1.2 Overview of our Mlib M-functions.
2.1 Absorption length in silicon at three different wavelengths.
2.2 Refraction coefficients for the overall transmission calculation.
2.3 Results of the overall transmission calculation.


Bibliography

[1] Leon O. Chua. Global unfolding of Chua's circuit. IEICE Trans. on Fundamentals
Electron. Commun., Comp. Sci., Vol. E76-A, May 1993.

[2] Leon O. Chua and Tamás Roska. The CNN paradigm. IEEE Transactions on Circuits
and Systems-I, Vol. 40(3):147-156, March 1993.

[3] Leon O. Chua, Tamás Roska, and Péter Venetianer. The CNN is as universal as the
Turing machine. IEEE Transactions on Circuits and Systems-I, Vol. 40(4):289-291,
April 1993.

[4] Leon O. Chua and Lin Yang. Cellular neural networks: Applications. IEEE
Transactions on Circuits and Systems-I, Vol. 35(10):1273-1290, October 1988.

[5] Leon O. Chua and Lin Yang. Cellular neural networks: Theory. IEEE Transactions on
Circuits and Systems-I, Vol. 35(10):1257-1272, October 1988.

[6] Kenneth R. Crounse and Leon O. Chua. Methods for image processing and pattern
formation in cellular neural networks: A tutorial. IEEE Transactions on Circuits and
Systems-I, Vol. 42(10):583-601, October 1995.

[7] David E. Goldberg. Genetic Algorithms in Search, Optimization and Machine
Learning. Addison-Wesley Publishing Company, 1989.

[8] Michel Goossens, Frank Mittelbach, and Alexander Samarin. The LaTeX Companion.
Addison-Wesley Publishing Co., Reading, Mass., 1994.

[9] Robert Guenther. Modern Optics. John Wiley & Sons, Inc., 1990.

[10] Martin Hänggi. Analysis, Design, and Optimization of Cellular Neural Networks. ETH
Zürich, 1999.

[11] Donald E. Knuth. The TeXbook, volume A of Computers and Typesetting.
Addison-Wesley Publishing Co., Reading, Mass., second edition, 1984.

[12] Tibor Kozek, Leon O. Chua, Tamás Roska, Dietrich Wolf, Ronald Tetzlaff, Frank
Puffer, and Károly Lotz. Simulating nonlinear waves and partial differential equations via
CNN, part II: Basic techniques. IEEE Transactions on Circuits and Systems-I, Vol.
42(10):816-820, October 1995.

[13] Tibor Kozek, Tamás Roska, and Leon O. Chua. Genetic algorithm for CNN template
learning. IEEE Transactions on Circuits and Systems-I, Vol. 40(6):392-402, June 1993.

[14] Leslie Lamport. LaTeX: A Document Preparation System. Addison-Wesley Publishing
Co., Reading, Mass., second edition, 1994.



[15] Drahoslav Lím. Implementation of a Programmable, Modularly Extendable
Cellular-Neural-Network Signal Processor. ETH Zürich, 1999.

[16] Markus Loose. A self-calibrating CMOS image sensor with logarithmic response.
Technical report, Institut für Hochenergiephysik, Universität Heidelberg, 1999.

[17] Carver Mead. Analog VLSI and Neural Systems. Addison-Wesley Publishing
Company, 1989.

[18] Edward D. Palik. Handbook of Optical Constants of Solids I. Academic Press, Inc.

[19] Tamás Roska, Ákos Zarándy, Sándor Zöld, Péter Földesy, and Péter Szolgay. The
computational infrastructure of analogic CNN computing, part I: The CNN-UM chip
prototyping system. IEEE Transactions on Circuits and Systems-I, Vol. 46(2):261-268,
February 1999.

[20] Tamás Roska and Leon O. Chua. The CNN universal machine: An analogic array
computer. IEEE Transactions on Circuits and Systems-II, Vol. 40(3):163-173, March
1993.

[21] Tamás Roska, Leon O. Chua, Dietrich Wolf, Tibor Kozek, Ronald Tetzlaff, and Frank
Puffer. Simulating nonlinear waves and partial differential equations via CNN, part I:
Basic techniques. IEEE Transactions on Circuits and Systems-I, Vol. 42(10):807-815,
October 1995.

[22] Peter Schneider. Simulation und Visualisierung elektrischer und optischer
Eigenschaften von Halbleiterbauelementen. Technical report, Institut für Physik und
Astronomie, Universität Heidelberg, 1998.

[23] Markus Schwarz, Ralf Hauschild, Bedrich J. Hosticka, Jürgen Huppertz, Thorsten
Kneip, Stephan Kolnsberg, Lutz Ewe, and Hoc Kheim Trieu. Single-chip CMOS image
sensors for a retina implant system. IEEE Transactions on Circuits and Systems-I, Vol.
46(7):870-877, July 1999.

[24] John M. Senior. Optical Fiber Communications: Principles and Practice. Prentice
Hall Europe, 1992.

[25] Olivier Vietze. Active pixel image sensors with application specific performance based
on standard silicon CMOS processes. ETH Zürich, 1997.



