80386 and Pentium Microprocessors

7/31/2019 80386 and Pentium Microprocessors


80386 AND PENTIUM MICROPROCESSORS

4.1 Introduction

The 80386 microprocessor is a full 32-bit version of the earlier 16-bit processors. This microprocessor represents a major advancement in the architecture of microprocessors and microcomputers. The 80386 microprocessor features multitasking, memory management, virtual memory with or without paging, software protection and a very large memory system. In this unit, you will study the architecture of the 80386 in detail. The Pentium microprocessor is the next major milestone in the development of microprocessors after the 80386. The major features of the Pentium are an improved cache structure, a wider data bus, a faster numeric coprocessor, dual integer processors and branch prediction logic. This unit also introduces the basics of Pentium processors.

4.2 Learning Objectives

To learn the basics of the 80386 microprocessor

To understand the programming model of the 80386

To explain the memory organization and segmentation of the 80386

To study the interrupts and exceptions of the 80386

To learn the basics of the Pentium microprocessor

To understand the architecture of the Pentium microprocessor

4.3 Evolutionary Offspring of 8086/8088 Microprocessors

The 8086/8088 is the simplest member of the 80x86 family. However, there are many more powerful offspring of the 8086 microprocessor that are heavily used in industry. The 80186 is basically an 8086 with an on-chip priority interrupt controller, programmable timer, DMA controller and address decoding circuitry. This processor has been mostly used in industrial control applications. The 80286, another 16-bit enhancement of the 8086, has features like virtual memory management circuitry, protection circuitry and a 16-Mbyte addressing capability. The 80286 was the first family member designed specifically for use as the CPU in a multiuser microcomputer. The needs of a multitasking/multiuser operating system include environment preservation during task switches, operating system and user protection, and a virtual memory management system. The 80286 is the first 80x86 family microprocessor designed to make these features relatively easy to implement. Moreover, the 80286 was the microprocessor used as the CPU in the IBM PC/AT and its clones. The 80286 can operate in one of two memory address modes: real address mode or protected virtual address mode. In the real address mode, the address unit computes addresses using a segment base and an offset just as the 8086 does. In the protected virtual address mode (protected mode), the 80286 uses all 24 address lines to access up to 16 Mbytes of physical memory. In protected mode it also provides up to a gigabyte of virtual memory.
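The real address mode computation mentioned above can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and it assumes the usual 8086 rule that the 16-bit segment value is shifted left by four bits before the 16-bit offset is added.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Physical address formed from a 16-bit segment value and a 16-bit offset."""
    # Assumption: the segment base is the segment value multiplied by 16.
    return ((segment & 0xFFFF) << 4) + (offset & 0xFFFF)


# Bytes reachable with the 24 address lines used in 80286 protected mode.
PROTECTED_MODE_BYTES = 2 ** 24
```

For example, `real_mode_address(0x1234, 0x0010)` combines the base 12340H with the offset 0010H, and `PROTECTED_MODE_BYTES` equals 16 Mbytes.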


Some of the limitations of the 80286 microprocessor are that it has only a 16-bit ALU, its maximum segment size is 64 Kbytes and it cannot easily be switched back and forth between real and protected modes. These drawbacks are eliminated in 32-bit microprocessors. These microprocessors are not merely more of the same, only bigger and faster. They offer some unique features not available in earlier 16-bit processors. They satisfy some major requirements of multitasking/multiuser systems, such as higher speed of execution, the ability to handle different types of tasks efficiently, a large memory space that can be shared by multiple users, appropriate memory allocation and management of memory access, data security and data access. Some of these requirements must be managed by a multiuser operating system, and some should be facilitated by the architectural design of the microprocessor. 32-bit and 64-bit microprocessors have been designed and implemented to meet these requirements.

Have you understood?

1. What features should a microprocessor possess to support a multiuser and multitasking environment?

4.4 Introduction to the 80386 Microprocessor

The 80386 is an advanced 32-bit microprocessor optimized for multitasking operating systems and designed for applications needing very high performance. The 80386 maintains software compatibility with the 80286. The 32-bit registers and data paths support 32-bit addresses and data types. The processor can address up to four gigabytes of physical memory and 64 terabytes (2^(46) bytes) of virtual memory. 80386 segments can be as large as 4 gigabytes and a program can use as many as 16,384 segments. The virtual address space is then 16,384 segments * 4 Gbytes, or 64 Tbytes. The 80386 has a virtual 8086 mode which allows it to switch easily back and forth between 80386 protected mode tasks and 8086 real mode tasks. The on-chip memory-management facilities of the 80386 include address translation registers, advanced multitasking hardware, a protection mechanism, and paged virtual memory. Special debugging registers provide data and code breakpoints even in ROM-based software.
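The virtual memory figure quoted above follows directly from the segment count and the maximum segment size. A minimal check of that arithmetic, with hypothetical constant and function names:

```python
SEGMENTS = 16_384          # 2**14 segments available to a program
SEGMENT_BYTES = 2 ** 32    # 4 gigabytes per segment


def virtual_space_bytes(segments: int = SEGMENTS,
                        segment_bytes: int = SEGMENT_BYTES) -> int:
    """Total virtual address space: number of segments times segment size."""
    return segments * segment_bytes
```

The product is 2^(14) * 2^(32) = 2^(46) bytes, which is 64 terabytes.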

The processing mode of the 80386 also determines the features that are accessible. The 80386 has three processing modes:

1. Protected Mode.

2. Real-Address Mode.

3. Virtual 8086 Mode.


Protected mode is the natural 32-bit environment of the 80386 processor. In this mode all instructions and features are available. Real-address mode (often called just "real mode") is the mode of the processor immediately after RESET. In real mode the 80386 appears to programmers as a fast 8086 with some new instructions. Most applications of the 80386 will use real mode for initialization only. Virtual 8086 mode (also called V86 mode) is a dynamic mode in the sense that the processor can switch repeatedly and rapidly between V86 mode and protected mode. The CPU enters V86 mode from protected mode to execute an 8086 program, then leaves V86 mode and enters protected mode to continue executing a native 80386 program.


Have you understood?

1. What is the word length of the 80386 microprocessor?

2. What is the size of the physical memory that can be addressed by the 80386?

3. What is the size of the virtual memory that can be addressed by the 80386?

4. Mention the operating modes of the 80386.

5. Which mode of the 80386 allows rapid switching to and from protected mode?

4.5 80386 Pins and Signals

The 80386DX microprocessor is packaged in a 132-pin PGA (pin grid array). Two versions of the 80386 are commonly available: the 80386DX is the full version, and the 80386SX is a reduced-bus version of the 80386. A newer version of the 80386, the 80386EX, incorporates the AT bus system, dynamic RAM controller, programmable chip selection logic, 26 address pins, 16 data pins, and 24 I/O pins. The 80386DX addresses 4G bytes of memory through its 32-bit data bus and 32-bit address bus. The 80386SX, more like the 80286, addresses 16M bytes of memory with its 24-bit address bus via its 16-bit data bus. The 80386SX was developed after the 80386DX for applications that did not require the full 32-bit bus version. The 80386SX is found in many personal computers that use the same motherboard design as the 80286. At this time, most applications, including Windows, require less than 16M bytes of memory, so the 80386SX is a fairly popular and less costly version of the 80386 microprocessor. Even though the 80486 has become a less expensive upgrade path for newer systems, the 80386 can still be used for many applications.

As with earlier versions of the Intel family of microprocessors, the 80386 requires a single +5.0V power supply for operation. The power supply current averages 550 mA for the 25 MHz version of the 80386, 500 mA for the 20 MHz version and 450 mA for the 16 MHz version. Also available is a 33 MHz version that requires 600 mA of power supply current. Note that during some modes of normal operation, power supply current can surge to over 1.0A. This means that the power supply and the power distribution network must be capable of supplying these current surges. This device contains multiple VCC and VSS connections that must all be connected to +5.0V and ground, respectively, for proper operation. Some of the pins are labeled N/C (no connection) and must not be connected. Additional versions of the 80386SX are available with a +3.3V power supply. These are often found in portable notebook or laptop computers and are usually packaged in a surface mount device.

Each 80386 output pin is capable of providing 4.0 mA (address and data connections) or 5.0 mA (other connections). This represents an increase in drive current compared to the 2.0 mA available on earlier 8086, 8088, and 80286 output pins. Each input pin represents a small load requiring only ±10 µA of current. In some systems, except the smallest, these current levels require bus buffers.

The function of each 80386DX group of pins follows:

A31-A2 Address bus connections address any of the 1G x 32 memory locations found in the 80386 memory system. Note that A0 and A1 are encoded in the bank enable signals (BE3-BE0) to select any or all of the four bytes in a 32-bit wide memory location. Also note that because the 80386SX contains a 16-bit data bus in place of the 32-bit data bus found on the 80386DX, A1 is present on the 80386SX and the bank selection signals are replaced with BHE and BLE. The BHE signal enables the upper data bus half, and the BLE signal the lower.

D31-D0 Data bus connections transfer data between the microprocessor and its memory and I/O system. Note that the 80386SX contains D15-D0.

BE3-BE0 Bank enable signals select the access of a byte, word, or double word of data. These signals are generated internally by the microprocessor from address bits A1 and A0. On the 80386SX, these pins are replaced by BHE(Active Low), BLE(Active Low) and A1.
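The way A1 and A0 are folded into the bank enables can be illustrated with a small sketch. The function name is hypothetical, and the sketch assumes a transfer that fits inside one 32-bit memory location; it returns the logic levels of the active-low signals, so 0 means the bank is enabled.

```python
def bank_enables(address: int, size: int) -> tuple[int, int, int, int]:
    """Return the levels of (BE3, BE2, BE1, BE0); 0 means enabled (active low)."""
    start = address & 0b11                      # A1 and A0 pick the first byte lane
    if size not in (1, 2, 4) or start + size > 4:
        raise ValueError("transfer must fit inside one 32-bit location")
    # A lane is driven low when it lies inside the transfer.
    be0, be1, be2, be3 = (0 if start <= lane < start + size else 1
                          for lane in range(4))
    return (be3, be2, be1, be0)
```

A double word at an address divisible by four enables all four banks, while a byte at an address ending in binary 01 enables only BE1.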

M/IO(Active Low) Memory/IO selects a memory device when a logic 1 or an I/O device when a logic 0. During an I/O operation, the address bus contains a 16-bit I/O address on address connections A15-A2.

W/R(Active Low) Write/read indicates that the current bus cycle is a write when a logic 1 or a read when a logic 0.

ADS(Active Low) The address data strobe becomes active whenever the 80386 has issued a valid memory or I/O address. This signal is combined with the W/R (active low) signal to generate the separate read and write signals present in the earlier 8086-80286 microprocessor-based systems.
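As a rough illustration of how external logic might classify a bus cycle from these pins, the sketch below decodes only the M/IO and W/R levels described above. The function name is hypothetical, and the sketch deliberately ignores ADS timing and the D/C signal.

```python
def bus_command(m_io: int, w_r: int) -> str:
    """Classify a bus cycle from the M/IO and W/R pin levels (1 or 0)."""
    if m_io == 1:                       # logic 1 selects a memory device
        return "memory write" if w_r == 1 else "memory read"
    return "I/O write" if w_r == 1 else "I/O read"   # logic 0 selects I/O
```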


RESET Reset initializes the 80386, causing it to begin executing software at memory location FFFFFFF0H. The 80386 is reset to the real mode and the leftmost 12 address connections remain logic 1s (FFFH) until a far jump or far call is executed. This allows compatibility with earlier microprocessors.
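The reset address can be reproduced with a simplified model of the description above. The function name is hypothetical, and the reset values CS = F000H and IP = FFF0H are an assumption not stated in the text; the model forms a 20-bit real mode address and then forces the upper 12 address lines to logic 1 until a far jump or far call has been executed.

```python
def real_mode_bus_address(segment: int, offset: int,
                          far_transfer_done: bool) -> int:
    """32-bit address on the bus for a real mode code fetch (simplified)."""
    low20 = ((segment << 4) + offset) & 0xFFFFF   # 8086-style 20-bit address
    high12 = 0x000 if far_transfer_done else 0xFFF
    return (high12 << 20) | low20


# Assumed register values immediately after RESET.
RESET_CS, RESET_IP = 0xF000, 0xFFF0
```

With the assumed reset values, the first fetch lands at FFFFFFF0H; after a far jump the same segment and offset would map to 000FFFF0H.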

CLK2 Clock times 2 is driven by a clock signal that is twice the operating frequency of the 80386. For example, to operate the 80386 at 16 MHz, we apply a 32 MHz clock to this pin.

READY(Active Low) Ready controls the number of wait states inserted into the timing to lengthen the memory accesses.

LOCK(Active Low) Lock becomes a logic 0 whenever an instruction is prefixed with the LOCK: prefix. This is used most often during DMA accesses.

D/C(Active Low) Data/control indicates that the data bus contains data for or from memory or I/O when a logic 1. If D/C (Active Low) is a logic 0, the microprocessor is halted or executes an interrupt acknowledge.

BS16(Active Low) Bus size 16 selects either a 32-bit data bus (BS16(Active Low)=1) or a 16-bit data bus (BS16(Active Low)=0). In most cases, a system that requires a 16-bit data bus uses the 80386SX, which has a 16-bit data bus.

NA(Active Low) Next Address causes the 80386 to output the address of the next instruction or data in the current bus cycle. The pin is often used for pipelining the address.

HOLD Hold requests a DMA action.

HLDA Hold acknowledge indicates that the 80386 is currently in a hold condition.

PEREQ The coprocessor request asks the 80386 to relinquish control and is a direct connection to the 80387 arithmetic coprocessor.

BUSY(Active Low) Busy is an input used by the WAIT or FWAIT instruction that waits for the coprocessor to become not busy. This is also a direct connection to the 80387 from the 80386.

ERROR(Active Low) Error indicates to the microprocessor that an error is detected by the coprocessor.

INTR An interrupt request is used by external circuitry to request an interrupt.

NMI A non-maskable interrupt requests a non-maskable interrupt as it did on the earlier versions of the microprocessor.



Have you understood?

1. What is the difference between the 80386SX and 80386DX versions?

2. What type of package is used for the 80386DX?

3. What features are supported by the 80386EX version?

4.6 Programming Model

The basic programming model of the 80386 consists of these aspects:

Memory organization and segmentation

Data types

Registers

Instruction format

Operand selection

Interrupts and exceptions

Input/output is not usually included as part of the basic programming model.

4.6.1 Memory Organization and Segmentation

The physical memory of an 80386 system is organized as a sequence of 8-bit bytes. Each byte is assigned a unique address that ranges from zero to a maximum of 2^(32) -1 (4 gigabytes). 80386 programs, however, are independent of the physical address space. This means that programs can be written without knowledge of how much physical memory is available and without knowledge of exactly where in physical memory the instructions and data are located. The model of memory organization seen by applications programmers is determined by systems-software designers. The architecture of the 80386 gives designers the freedom to choose a model for each task. The model of memory organization can range between the following extremes:

A "flat" address space consisting of a single array of up to 4 gigabytes.

A segmented address space consisting of a collection of up to 16,383 linear address spaces of up to 4 gigabytes each.

Both models can provide memory protection. Different tasks may employ different models of memory organization.

The "Flat" Model

In a "flat" model of memory organization, the applications programmer sees a single array of up to 2^(32) bytes (4 gigabytes). While the physical memory can contain up to 4 gigabytes, it is usually much smaller; the processor maps the 4 gigabyte flat space onto the physical address space by the address translation mechanisms. Applications programmers do not need to know the details of the mapping. A pointer into this flat address space is a 32-bit ordinal number that may range from 0 to 2^(32) -1. Relocation of separately-compiled modules in this space must be performed by systems software (e.g., linkers, locators, binders, loaders).

The Segmented Model

In a segmented model of memory organization, the address space as viewed by an applications program (called the logical address space) is a much larger space of up to 2^(46) bytes (64 terabytes). The processor maps the 64 terabyte logical address space onto the physical address space (up to 4 gigabytes) by the address translation mechanisms. Applications programmers do not need to know the details of this mapping.

Applications programmers view the logical address space of the 80386 as a collection of up to 16,383 one-dimensional subspaces, each with a specified length. Each of these linear subspaces is called a segment. A segment is a unit of contiguous address space. Segment sizes may range from one byte up to a maximum of 2^(32) bytes (4 gigabytes).

A complete pointer in this address space consists of two parts, as shown in Figure 4.1:

1. A segment selector, which is a 16-bit field that identifies a segment.

2. An offset, which is a 32-bit ordinal that addresses to the byte level within a segment.

                                          |               |
                                          |---------------|-+
   32            0                        |               | |
  +-------+-------+      +---+            |---------------| |
  |    OFFSET     |----->| + |----------->|    OPERAND    | |
  +-------+-------+      +---+            |---------------| |- SELECTED SEGMENT
                           ^              |               | |
   16    0                 |              |               | |
  +-------+                |              |               | |
  |SEGMENT|----------------o------------->|---------------|-+
  +-------+                               |               |
                                          |               |

Figure 4.1 Two-Component Pointer

During execution of a program, the processor associates with a segment selector the physical address of the beginning of the segment. Separately compiled modules can be relocated at run time by changing the base address of their segments. The size of a segment is variable; therefore, a segment can be exactly the size of the module it contains.
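The two-component pointer can be modelled with a short sketch. Everything named here is hypothetical: the `Segment` record is only a stand-in for whatever the processor keeps for each segment, and the plain dictionary lookup replaces the real selector decoding. The sketch shows the two ideas from the text: the offset is added to the segment's base, and a module is relocated by changing that base alone.

```python
from dataclasses import dataclass


@dataclass
class Segment:
    """Hypothetical record of what the processor knows about one segment."""
    base: int    # address of the beginning of the segment
    limit: int   # largest valid offset within the segment


def translate(segments: dict[int, Segment], selector: int, offset: int) -> int:
    """Turn a (selector, offset) pointer into a 32-bit address."""
    segment = segments[selector]
    if not 0 <= offset <= segment.limit:
        raise ValueError("offset lies outside the selected segment")
    return (segment.base + offset) & 0xFFFFFFFF
```

Relocating a module only requires updating `base` in its `Segment` entry; the pointers held by the program stay the same.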

4.6.2 Data Types


Bytes, words, and doublewords are the fundamental data types as shown in Figure 4.2. A byte is eight contiguous bits starting at any logical address. The bits are numbered 0 through 7; bit zero is the least significant bit.

 7               0
+---------------+
|     BYTE      |  BYTE
+---------------+

 15              7               0
+---------------+---------------+
|   HIGH BYTE   |   LOW BYTE    |  WORD
+---------------+---------------+
   address n+1      address n

 31              23              15              7               0
+---------------+---------------+---------------+---------------+
|           HIGH WORD           |           LOW WORD            |  DOUBLEWORD
+---------------+---------------+---------------+---------------+
   address n+3     address n+2     address n+1      address n

Figure 4.2 Fundamental Data Types

A word is two contiguous bytes starting at any byte address. A word thus contains 16 bits. The bits of a word are numbered from 0 through 15; bit 0 is the least significant bit. The byte containing bit 0 of the word is called the low byte; the byte containing bit 15 is called the high byte. Each byte within a word has its own address, and the smaller of the addresses is the address of the word. The byte at this lower address contains the eight least significant bits of the word, while the byte at the higher address contains the eight most significant bits.

A doubleword is two contiguous words starting at any byte address. A doubleword thus contains 32 bits. The bits of a doubleword are numbered from 0 through 31; bit 0 is the least significant bit. The word containing bit 0 of the doubleword is called the low word; the word containing bit 31 is called the high word. Each byte within a doubleword has its own address, and the smallest of the addresses is the address of the doubleword. The byte at this lowest address contains the eight least significant bits of the doubleword, while the byte at the highest address contains the eight most significant bits. Figure 4.3 illustrates the arrangement of bytes within words and doublewords.

MEMORY BYTE VALUES

All values in hexadecimal

ADDRESS   BYTE
   E
   D       7A
   C       FE
   B       06
   A       36
   9       1F
   8
   7       23
   6       0B
   5
   4
   3       74
   2       CB
   1       31
   0

DOUBLEWORD AT ADDRESS A (bytes A through D) CONTAINS 7AFE0636
WORD AT ADDRESS B (bytes B and C) CONTAINS FE06
BYTE AT ADDRESS 9 CONTAINS 1F
WORD AT ADDRESS 6 (bytes 6 and 7) CONTAINS 230B
WORD AT ADDRESS 2 (bytes 2 and 3) CONTAINS 74CB
WORD AT ADDRESS 1 (bytes 1 and 2) CONTAINS CB31

Figure 4.3 Bytes, Words and Double Words in Memory
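The byte ordering in Figure 4.3 can be checked with a short sketch that loads the figure's byte values into a dictionary and reads words and doublewords from it. The names are hypothetical, and the addresses the figure leaves blank are treated as zero purely for convenience.

```python
# Byte values from Figure 4.3, keyed by address (blank addresses omitted).
MEMORY = {
    0x1: 0x31, 0x2: 0xCB, 0x3: 0x74,
    0x6: 0x0B, 0x7: 0x23,
    0x9: 0x1F,
    0xA: 0x36, 0xB: 0x06, 0xC: 0xFE, 0xD: 0x7A,
}


def read(address: int, size: int) -> int:
    """Read `size` bytes starting at `address`, lowest address first."""
    data = bytes(MEMORY.get(address + i, 0) for i in range(size))
    # The byte at the lowest address holds the least significant bits.
    return int.from_bytes(data, "little")
```

Reading four bytes at address A gives 7AFE0636 and reading two bytes at address B gives FE06, matching the figure.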

Note that words need not be aligned at even-numbered addresses and doublewords need not be aligned at addresses evenly divisible by four. This allows maximum flexibility in data structures (e.g., records containing mixed byte, word, and doubleword items) and efficiency in memory utilization. When used in a configuration with a 32-bit bus, actual transfers of data between processor and memory take place in units of doublewords beginning at addresses evenly divisible by four; however, the processor converts requests for misaligned words or doublewords into the appropriate sequences of requests acceptable to the memory interface. Such misaligned data transfers reduce performance by requiring extra memory cycles. For maximum performance, data structures (including stacks) should be designed in such a way that, whenever possible, word operands are aligned at even addresses and doubleword operands are aligned at addresses evenly divisible by four. Due to instruction prefetching and queuing within the CPU, there is no requirement for instructions to be aligned on word or doubleword boundaries. (However, a slight increase in speed results if the target addresses of control transfers are evenly divisible by four.)
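The cost of misalignment can be estimated by counting how many aligned doublewords an operand touches. This is a simplified sketch with a hypothetical function name; it assumes one memory cycle per aligned doubleword on a 32-bit bus and ignores everything else about bus timing.

```python
def bus_cycles(address: int, size: int, bus_bytes: int = 4) -> int:
    """Number of aligned bus-width units touched by an operand."""
    first = address // bus_bytes                # unit holding the first byte
    last = (address + size - 1) // bus_bytes    # unit holding the last byte
    return last - first + 1
```

A doubleword at an address divisible by four needs one cycle under this model, while the same doubleword two bytes further on straddles two units and needs two.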

Although bytes, words, and doublewords are the fundamental types of operands, the processor also supports additional interpretations of these operands. Depending on the instruction referring to the operand, the following additional data types are recognized:

Integer:

A signed binary numeric value contained in a 32-bit doubleword, 16-bit word, or 8-bit byte. All operations assume a 2's complement representation. The sign bit is located in bit 7 in a byte, bit 15 in a word, and bit 31 in a doubleword. The sign bit has the value zero for positive integers and one for negative. Since the high-order bit is used for a sign, the range of an 8-bit integer is -128 through +127; 16-bit integers may range from -32,768 through +32,767; 32-bit integers may range from -2^(31) through +2^(31) -1. The value zero has a positive sign.

Ordinal:

An unsigned binary numeric value contained in a 32-bit doubleword, 16-bit word, or 8-bit byte. All bits are considered in determining the magnitude of the number. The value range of an 8-bit ordinal number is 0-255; 16 bits can represent values from 0 through 65,535; 32 bits can represent values from 0 through 2^(32) -1.
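The integer and ordinal ranges above follow from the operand width, and the same bit pattern can be read either way. A minimal sketch, with hypothetical function names:

```python
def integer_range(bits: int) -> tuple[int, int]:
    """Smallest and largest 2's complement value for the given width."""
    return -(2 ** (bits - 1)), 2 ** (bits - 1) - 1


def ordinal_range(bits: int) -> tuple[int, int]:
    """Smallest and largest unsigned value for the given width."""
    return 0, 2 ** bits - 1


def as_integer(ordinal: int, bits: int) -> int:
    """Reinterpret an unsigned bit pattern as a 2's complement integer."""
    sign_bit = 1 << (bits - 1)
    return (ordinal ^ sign_bit) - sign_bit
```

For example, the byte FFH is 255 as an ordinal but -1 as an integer, because bit 7 is treated as the sign bit.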

Near Pointer:

A 32-bit logical address. A near pointer is an offset within a segment. Near pointers are used in either a flat or a segmented model of memory organization.

Far Pointer:

A 48-bit logical address of two components: a 16-bit segment selector component and a 32-bit offset component. Far pointers are used by applications programmers only when systems designers choose a segmented memory organization.
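The layout of a far pointer in memory can be sketched with Python's `struct` module. The function names are hypothetical, and the sketch assumes the arrangement in which the 32-bit offset occupies the four lowest-addressed bytes and the 16-bit selector the two bytes above it.

```python
import struct

# "<" = little-endian with no padding, "I" = 32-bit offset, "H" = 16-bit selector.
FAR_POINTER = struct.Struct("<IH")


def pack_far_pointer(selector: int, offset: int) -> bytes:
    """Six-byte memory image of a far pointer."""
    return FAR_POINTER.pack(offset, selector)


def unpack_far_pointer(image: bytes) -> tuple[int, int]:
    """Recover (selector, offset) from a six-byte memory image."""
    offset, selector = FAR_POINTER.unpack(image)
    return selector, offset
```

A near pointer, by contrast, is just the four offset bytes on their own.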

String:

A contiguous sequence of bytes, words, or doublewords. A string may contain from zero bytes to 2^(32) -1 bytes (4 gigabytes).

Bit field:

A contiguous sequence of bits. A bit field may begin at any bit position of any byte and may contain up to 32 bits.

Bit string:

A contiguous sequence of bits. A bit string may begin at any bit position of any byte and may contain up to 2^(32) -1 bits.
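Extracting a bit field that starts at an arbitrary bit position can be sketched as follows. The function name is hypothetical, and the sketch assumes that bit 0 is the least significant bit of the byte at the lowest address, in line with the byte ordering described earlier.

```python
def extract_bit_field(memory: bytes, bit_offset: int, width: int) -> int:
    """Return `width` bits starting `bit_offset` bits above bit 0 of memory[0]."""
    if not 1 <= width <= 32:
        raise ValueError("a bit field holds from 1 to 32 bits")
    if bit_offset < 0 or bit_offset + width > 8 * len(memory):
        raise ValueError("bit field lies outside the supplied bytes")
    value = int.from_bytes(memory, "little")     # lowest address = lowest bits
    return (value >> bit_offset) & ((1 << width) - 1)
```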

BCD:

A byte (unpacked) representation of a decimal digit in the range 0 through 9. Unpacked decimal numbers are stored as unsigned byte quantities. One digit is stored in each byte. The magnitude of the number is determined from the low-order half-byte; hexadecimal values 0-9 are valid and are interpreted as decimal numbers. The high-order half-byte must be zero for multiplication and division; it may contain any value for addition and subtraction.

Packed BCD:

A byte (packed) representation of two decimal digits, each in the range 0 through 9. One digit is stored in each half-byte. The digit in the high-order half-byte is the most significant. Values 0-9 are valid in each half-byte. The range of a packed decimal byte is 0-99.
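The two decimal formats can be illustrated with a few helper functions. The names are hypothetical, and storing the least significant unpacked digit at the lowest address is an assumption made for this sketch.

```python
def pack_bcd(value: int) -> int:
    """Encode 0-99 as one packed BCD byte (tens digit in the high half-byte)."""
    if not 0 <= value <= 99:
        raise ValueError("a packed BCD byte holds 0 through 99")
    tens, ones = divmod(value, 10)
    return (tens << 4) | ones


def unpack_bcd(byte: int) -> int:
    """Decode one packed BCD byte back to 0-99."""
    high, low = byte >> 4, byte & 0x0F
    if high > 9 or low > 9:
        raise ValueError("each half-byte must hold a digit 0 through 9")
    return high * 10 + low


def to_unpacked_bcd(value: int) -> bytes:
    """One digit per byte, least significant digit first (assumed order)."""
    return bytes(int(digit) for digit in reversed(str(value)))
```

The decimal value 47 becomes the single byte 47H in packed form, whereas in unpacked form it occupies two bytes, 07H and 04H.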

Figure 4.4 graphically summarizes the data types supported by the 80386.

[Figure 4.4: diagrams of the byte, word, and doubleword integer formats (sign bit in the most significant bit, magnitude in the remaining bits), the byte, word, and doubleword ordinal formats (magnitude in all bits), BCD digits (one digit per byte), packed BCD (two digits per byte), byte string, bit string, 32-bit near pointer (offset), 48-bit far pointer (selector and offset), and bit field of 1 to 32 bits.]

Figure 4.4 80386 Data Types

4.6.3 Registers

The 80386 contains a total of sixteen registers that are of interest to the applications programmer. As Figure 4.5 shows, these registers may be grouped into these basic categories:

General registers. These eight 32-bit general-purpose registers are used primarily to contain operands for arithmetic and logical operations.

Segment registers. These special-purpose registers permit systems software designers to choose either a flat or segmented model of memory organization. These six registers determine, at any given time, which segments of memory are currently addressable.

Status and instruction registers. These special-purpose registers are used to record and alter certain aspects of the 80386 processor state.

                        GENERAL REGISTERS

 31              23              15              7               0
+---------------+---------------+---------------+---------------+
|              EAX              |      AH      AX      AL       |
|---------------+---------------+---------------+---------------|
|              EDX              |      DH      DX      DL       |
|---------------+---------------+---------------+---------------|
|              ECX              |      CH      CX      CL       |
|---------------+---------------+---------------+---------------|
|              EBX              |      BH      BX      BL       |
|---------------+---------------+---------------+---------------|
|              EBP              |              BP               |
|---------------+---------------+---------------+---------------|
|              ESI              |              SI               |
|---------------+---------------+---------------+---------------|
|              EDI              |              DI               |
|---------------+---------------+---------------+---------------|
|              ESP              |              SP               |
+---------------+---------------+---------------+---------------+

                        SEGMENT REGISTERS

                 15              7               0
                +---------------+---------------+
                |       CS (CODE SEGMENT)       |
                |---------------+---------------|
                |       SS (STACK SEGMENT)      |
                |---------------+---------------|
                |       DS (DATA SEGMENT)       |
                |---------------+---------------|
                |       ES (DATA SEGMENT)       |
                |---------------+---------------|
                |       FS (DATA SEGMENT)       |
                |---------------+---------------|
                |       GS (DATA SEGMENT)       |
                +---------------+---------------+

                STATUS AND INSTRUCTION REGISTERS

 31              23              15              7               0
+---------------+---------------+---------------+---------------+
|                            EFLAGS                             |
|---------------------------------------------------------------|
|                   EIP (INSTRUCTION POINTER)                   |
+---------------+---------------+---------------+---------------+

Figure 4.5 80386 Applications Register Set

General Registers

The general registers of the 80386 are the 32-bit registers EAX, EBX, ECX, EDX, EBP, ESP, ESI, and EDI. These registers are used interchangeably to contain the operands of logical and arithmetic operations. They may also be used interchangeably for operands of address computations (except that ESP cannot be used as an index operand).

As Figure 4.5 shows, the low-order word of each of these eight registers has a separate name and can be treated as a unit. This feature is useful for handling 16-bit data items and for compatibility with the 8086 and 80286 processors. The word registers are named AX, BX, CX, DX, BP, SP, SI, and DI. Figure 4.5 also illustrates that each byte of the 16-bit registers AX, BX, CX, and DX has a separate name and can be treated as a unit. This feature is useful for handling characters and other 8-bit data items. The byte registers are named AH, BH, CH, and DH (high bytes); and AL, BL, CL, and DL (low bytes).
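The relationship between a 32-bit register and its word and byte names can be sketched with plain bit masking. The helper names are hypothetical; EAX, AX, AH, and AL are used as the example, but the same masks apply to the other registers that have byte names.

```python
def low_word(reg32: int) -> int:
    """AX: bits 15-0 of EAX."""
    return reg32 & 0xFFFF


def high_byte(reg32: int) -> int:
    """AH: bits 15-8 of EAX."""
    return (reg32 >> 8) & 0xFF


def low_byte(reg32: int) -> int:
    """AL: bits 7-0 of EAX."""
    return reg32 & 0xFF


def write_low_byte(reg32: int, value: int) -> int:
    """Writing AL replaces bits 7-0 and leaves the rest of EAX unchanged."""
    return (reg32 & 0xFFFFFF00) | (value & 0xFF)
```

With EAX holding 12345678H, AX reads as 5678H, AH as 56H, and AL as 78H.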

All of the general-purpose registers are available for addressing calculations and for the results of most arithmetic and logical calculations; however, a few functions are dedicated to certain registers. By implicitly choosing registers for these functions, the 80386 architecture can encode instructions more compactly. The instructions that use specific registers include: double-precision multiply and divide, I/O, string instructions, translate, loop, variable shift and rotate, and stack operations.

Segment Registers

The segment registers of the 80386 give systems software designers the flexibility to choose among various models of memory organization.

Complete programs generally consist of many different modules, each consisting of instructions and data. However, at any given time during program execution, only a small subset of a program's modules is actually in use. The 80386 architecture takes advantage of this by providing mechanisms to support direct access to the instructions and data of the current module's environment, with access to additional segments on demand.

At any given instant, six segments of memory may be immediately accessible to an executing 80386 program. The segment registers CS, DS, SS, ES, FS, and GS are used to identify these six current segments. Each of these registers specifies a particular kind of segment, as characterized by the associated mnemonics ("code," "data," or "stack") shown in Figure 4.6. Each register uniquely determines one particular segment, from among the segments that make up the program, that is to be immediately accessible at highest speed.

     +----------------+                                +--------------+
     |     MODULE     |                                |    MODULE    |
     |       A        |<-+                          +->|      A       |
     |      CODE      |  |                          |  |     DATA     |
     +----------------+  |   +------------------+   |  +--------------+
                         +---|    CS (CODE)     |   |
                             |------------------|   |
     +----------------+   +--|    SS (STACK)    |   |  +--------------+
     |                |   |  |------------------|   |  |     DATA     |
     |     STACK      |<--+  |    DS (DATA)     |---+  |  STRUCTURE   |
     |                |      |------------------|   +->|      1       |
     +----------------+      |    ES (DATA)     |---+  +--------------+
                             |------------------|
                          +--|    FS (DATA)     |
     +----------------+   |  |------------------|      +--------------+
     |      DATA      |   |  |    GS (DATA)     |--+   |     DATA     |
     |   STRUCTURE    |<--+  +------------------+  +-->|  STRUCTURE   |
     |       2        |                                |      3       |
     +----------------+                                +--------------+

Figure 4.6 Memory Segmentation

The segment containing the currently executing sequence of instructions is known as the current code segment; it is specified by means of the CS register. The 80386 fetches all instructions from this code segment, using as an offset the contents of the instruction pointer. CS is changed implicitly as the result of intersegment control-transfer instructions (for example, CALL and JMP), interrupts, and exceptions.

Subroutine calls, parameters, and procedure activation records usually require that a region of memory be allocated for a stack. All stack operations use the SS register to locate the stack. Unlike CS, the SS register can be loaded explicitly, thereby permitting programmers to define stacks dynamically.

The DS, ES, FS, and GS registers allow the specification of four data segments, each addressable by the currently executing program. Accessibility to four separate data areas helps programs efficiently access different types of data structures; for example, one data segment register can point to the data structures of the current module, another to the exported data of a higher-level module, another to a dynamically created data structure, and another to data shared with another task. An operand within a data segment is addressed by specifying its offset either directly in an instruction or indirectly via general registers.

Depending on the structure of data (e.g., the way data is parceled into one ormore segments), a program may require access to more than four datasegments. To access additional segments, the DS, ES, FS, and GS registers canbe changed under program control during the course of a program's execution.This simply requires that the program execute an instruction to load theappropriate segment register prior to executing instructions that access the data.

The processor associates a base address with each segment selected by a segment register. To address an element within a segment, a 32-bit offset is added to the segment's base address. Once a segment is selected (by loading the segment selector into a segment register), a data manipulation instruction only needs to specify the offset. Simple rules define which segment register is used to form an address when only an offset is specified.
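The base-plus-offset calculation described above can be sketched in Python. This is a minimal illustration only: the segment base values are hypothetical, and the sketch assumes the sum is kept to 32 bits and ignores segment limit and protection checks.

```python
MASK32 = 0xFFFFFFFF  # addresses are 32 bits wide

# Hypothetical base addresses, chosen only for illustration.
SEGMENT_BASES = {"DS": 0x00100000, "SS": 0x00200000}


def linear_address(segment_register: str, offset: int) -> int:
    """Add a 32-bit offset to the base of the segment selected by a register."""
    base = SEGMENT_BASES[segment_register]
    return (base + (offset & MASK32)) & MASK32
```

Once DS has been loaded, an instruction only supplies the offset; `linear_address("DS", 0x1234)` then yields the address inside that segment.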

Stack Implementation

Stack operations are facilitated by three registers:

The stack segment (SS) register. Stacks are implemented in memory. A system may have a number of stacks that is limited only by the maximum number of segments. A stack may be up to 4 gigabytes long, the maximum length of a segment. Only one stack is directly addressable at a time: the one located by SS. This is the current stack, often referred to simply as "the" stack. SS is used automatically by the processor for all stack operations.

The stack pointer (ESP) register. ESP points to the top of the push-down stack (TOS). It is referenced implicitly by PUSH and POP operations, subroutine calls and returns, and interrupt operations. When an item is pushed onto the stack, the processor decrements ESP, then writes the item at the new TOS as shown in Figure 4.7. When an item is popped off the stack, the processor copies it from TOS, then increments ESP. In other words, the stack grows down in memory toward lesser addresses.

The stack-frame base pointer (EBP) register. The EBP is the best choice of register for accessing data structures, variables and dynamically allocated work space within the stack. EBP is often used to access elements on the stack relative to a fixed point on the stack rather than relative to the current TOS. It typically identifies the base address of the current stack frame established for the current procedure. When EBP is used as the base register in an offset calculation, the offset is calculated automatically in the current stack segment (i.e., the segment currently selected by SS). Because SS does not have to be explicitly specified, instruction encoding in such cases is more efficient. EBP can also be used to index into segments addressable via other segment registers.
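The push and pop behaviour described for ESP can be sketched in Python. This is a minimal model, assuming a small byte array in place of the stack segment and 32-bit items only; the class name `Stack386` and its default size are illustrative and do not come from the 80386 documentation.

```python
class Stack386:
    """Toy model of a stack that grows toward lesser addresses."""

    def __init__(self, size: int = 64):
        self.mem = bytearray(size)  # stands in for the segment located by SS
        self.esp = size             # empty stack: ESP is just past the highest byte

    def push(self, value: int) -> None:
        self.esp -= 4               # decrement ESP first ...
        self.mem[self.esp:self.esp + 4] = (value & 0xFFFFFFFF).to_bytes(4, "little")
                                    # ... then write the item at the new TOS

    def pop(self) -> int:
        value = int.from_bytes(self.mem[self.esp:self.esp + 4], "little")
        self.esp += 4               # copy from TOS first, then increment ESP
        return value
```

Pushing two doublewords onto a 64-byte stack moves ESP from 64 to 56; popping them returns the values in reverse order and restores ESP to 64.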

Figure 4.7 (stack diagram, bits 31 to 0)

Bit(s)   Flag   Name                   Type
17       VM     Virtual 8086 mode      X
16       RF     Resume flag            X
14       NT     Nested task flag       X
13-12    IOPL   I/O privilege level    X
11       OF     Overflow               S
10       DF     Direction flag         C
9        IF     Interrupt enable       X
8        TF     Trap flag              S
7        SF     Sign flag              S
6        ZF     Zero flag              S
4        AF     Auxiliary carry        S
2        PF     Parity flag            S
0        CF     Carry flag             S

S = STATUS FLAG, C = CONTROL FLAG, X = SYSTEM FLAG

NOTE: ALL OTHER BITS (SHOWN AS 0 OR 1 IN THE REGISTER) ARE INTEL RESERVED. DO NOT DEFINE.

Figure 4.8 EFLAGS Register

The flags register is a 32-bit register named EFLAGS. Figure 4.8 defines the bits within this register. The flags control certain operations and indicate the status of the 80386. The low-order 16 bits of EFLAGS are named FLAGS and can be treated as a unit. This feature is useful when executing 8086 and 80286 code, because this part of EFLAGS is identical to the FLAGS register of the 8086 and the 80286.

The flags may be considered in three groups: the status flags, the control flags, and the system flags.

The status flags used by application programmers are CF, PF, AF, ZF, SF, and OF. These flags hold the results of various instructions that are then used by later instructions. The arithmetic instructions use OF, SF, ZF, AF, PF, and CF. The SCAS (Scan String), CMPS (Compare String), and LOOP instructions use ZF to signal that their operations are complete. There are instructions to set, clear, and complement CF before execution of an arithmetic instruction. The meaning of the various flags is described as follows.

Carry Flag: Set by an arithmetic instruction when the operation produces a carry out of, or a borrow into, the high-order bit of the result. It is cleared otherwise.

Parity Flag: Set if the low-order 8 bits of a result contain an even number of bits set to 1; cleared if that number is odd.

Adjust Flag: Set when an operation produces a carry out of, or a borrow into, the low-order 4 bits of AL; cleared otherwise. It is used by the decimal arithmetic instructions.

Zero Flag: Set if the result of an arithmetic instruction is zero; cleared otherwise. It is also used by string and loop instructions to indicate completion of the instruction.

Sign Flag: Set equal to the high-order bit of the result of an arithmetic instruction. If set, the result is negative; if cleared, the result is non-negative.

Overflow Flag: Set if the result placed in the destination operand overflowed, that is, if it is too large a positive number or too small a negative number to fit. If no overflow occurred, the flag is cleared.
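The flag definitions above can be made concrete with a short Python sketch that derives the six status flags for an 8-bit addition. The function name `add8_flags` and the choice of an 8-bit ADD are assumptions made for illustration; the 80386 applies the same rules to 16-bit and 32-bit results.

```python
def add8_flags(a: int, b: int) -> dict:
    """Return the 8-bit sum of a and b together with the status flags."""
    a &= 0xFF
    b &= 0xFF
    total = a + b
    result = total & 0xFF
    return {
        "result": result,
        "CF": int(total > 0xFF),                          # carry out of bit 7
        "PF": int(bin(result).count("1") % 2 == 0),       # even number of 1 bits
        "AF": int((a & 0x0F) + (b & 0x0F) > 0x0F),        # carry out of bit 3
        "ZF": int(result == 0),
        "SF": result >> 7,                                # copy of the high-order bit
        # signed overflow: both operands differ in sign from the result
        "OF": int(((a ^ result) & (b ^ result) & 0x80) != 0),
    }
```

For example, adding 7FH and 01H gives 80H with SF, AF and OF set, while adding FFH and 01H gives 00H with CF, ZF, AF and PF set.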

The only control flag at this time is the Direction Flag. It is used by string instructions to determine whether to process strings from the end of the string (auto-decrement), or from the beginning of the string (auto-increment). Setting DF causes string instructions to auto-decrement; that is, to process strings from high addresses to low addresses. Clearing DF causes string instructions to auto-increment, or to process strings from low addresses to high addresses.
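The effect of DF on a string instruction can be sketched as follows. This is a simplified model of a repeated byte move, assuming a flat byte array and ignoring segment registers; the function name `rep_movsb` is only a label for the sketch.

```python
MASK32 = 0xFFFFFFFF


def rep_movsb(mem: bytearray, esi: int, edi: int, ecx: int, df: int):
    """Copy ecx bytes from mem[esi] to mem[edi], stepping according to DF."""
    step = -1 if df else 1              # DF set: auto-decrement; DF clear: auto-increment
    while ecx:
        mem[edi] = mem[esi]
        esi = (esi + step) & MASK32     # registers are 32 bits wide
        edi = (edi + step) & MASK32
        ecx -= 1
    return esi, edi
```

With DF clear the copy starts at the lowest addresses of both strings; with DF set the caller passes the addresses of the last bytes and the copy runs toward lower addresses.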

The other flags are system flags.

Instruction Pointer

The instruction pointer register (EIP) contains the offset address, relative to the start of the current code segment, of the next sequential instruction to be executed. The instruction pointer is not directly visible to the programmer; it is controlled implicitly by control-transfer instructions, interrupts, and exceptions.

As Figure 4.9 shows, the low-order 16 bits of EIP are named IP and can be used by the processor as a unit. This feature is useful when executing instructions designed for the 8086 and 80286 processors.

                                 16-BIT IP REGISTER
                            +---------------------------+
31            23            15            7             0
+-------------+-------------+-------------+-------------+
|               EIP (INSTRUCTION POINTER)               |
+-------------+-------------+-------------+-------------+

Figure 4.9 Instruction Pointer Register

4.6.4 Instruction Format

The information encoded in an 80386 instruction includes a specification of the operation to be performed, the type of the operands to be manipulated, and the location of these operands. If an operand is located in memory, the instruction must also select, explicitly or implicitly, which of the currently addressable segments contains the operand.

80386 instructions are composed of various elements and have various formats. Of these instruction elements, only one, the opcode, is always present. The other elements may or may not be present, depending on the particular operation involved and on the location and type of the operands. The elements of an instruction, in order of occurrence, are as follows:

Prefixes -- one or more bytes preceding an instruction that modify the operation of the instruction. The following types of prefixes can be used by applications programs:

1. Segment override -- explicitly specifies which segment register an instruction should use, thereby overriding the default segment-register selection used by the 80386 for that instruction.

2. Address size -- switches between 32-bit and 16-bit address generation.

3. Operand size -- switches between 32-bit and 16-bit operands.

4. Repeat -- used with a string instruction to cause the instruction to act on each element of the string.

Opcode -- specifies the operation performed by the instruction. Some operations have several different opcodes, each specifying a different variant of the operation.

Register specifier -- an instruction may specify one or two register operands. Register specifiers may occur either in the same byte as the opcode or in the same byte as the addressing-mode specifier.

Addressing-mode specifier -- when present, specifies whether an operand is a register or memory location; if in memory, specifies whether a displacement, a base register, an index register, and scaling are to be used.

SIB (scale, index, base) byte -- when the addressing-mode specifier indicates that an index register will be used to compute the address of an operand, an SIB byte is included in the instruction to encode the base register, the index register, and a scaling factor.

Displacement -- when the addressing-mode specifier indicates that a displacement will be used to compute the address of an operand, the displacement is encoded in the instruction. A displacement is a signed integer of 32, 16, or eight bits. The eight-bit form is used in the common case when the displacement is sufficiently small. The processor extends an eight-bit displacement to 16 or 32 bits, taking into account the sign.

Immediate operand -- when present, directly provides the value of an operand of the instruction. Immediate operands may be 8, 16, or 32 bits wide. In cases where an eight-bit immediate operand is combined in some way with a 16- or 32-bit operand, the processor automatically extends the size of the eight-bit operand, taking into account the sign.
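The sign extension applied to eight-bit displacements and immediate operands can be illustrated with a small Python helper. The name `sign_extend` is hypothetical; the sketch only shows the arithmetic of the extension, not how the processor implements it.

```python
def sign_extend(value: int, from_bits: int, to_bits: int) -> int:
    """Extend a from_bits-wide two's-complement value to to_bits, keeping its sign."""
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):      # sign bit set: the value is negative
        value -= 1 << from_bits
    return value & ((1 << to_bits) - 1)     # re-encode in the wider width
```

An eight-bit displacement of FCH (that is, -4) becomes FFFFFFFCH when extended to 32 bits, whereas 7FH stays 007FH when extended to 16 bits.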

4.6.5 Operand Selection

An instruction can act on zero or more operands, which are the data manipulated by the instruction. An example of a zero-operand instruction is NOP (no operation). An operand can be in any of these locations:

In the instruction itself (an immediate operand)

In a register (EAX, EBX, ECX, EDX, ESI, EDI, ESP, or EBP in the case of 32-bit operands; AX, BX, CX, DX, SI, DI, SP, or BP in the case of 16-bit operands; AH, AL, BH, BL, CH, CL, DH, or DL in the case of 8-bit operands; the segment registers; or the EFLAGS register for flag operations)

In memory

At an I/O port

Immediate operands and operands in registers can be accessed more rapidly than operands in memory since memory operands must be fetched from memory. Register operands are available in the CPU. Immediate operands are also available in the CPU, because they are prefetched as part of the instruction.

Of the instructions that have operands, some specify operands implicitly; others specify operands explicitly; still others use a combination of implicit and explicit specification; for example:

Implicit operand: AAM

By definition, AAM (ASCII adjust for multiplication) operates on the contents of the AX register.

Explicit operand: XCHG EAX, EBX

The operands to be exchanged are encoded in the instruction after the opcode.


Implicit and explicit operands: PUSH COUNTER

The memory variable COUNTER (the explicit operand) is copied to the top of the stack (the implicit operand).

Note that most instructions have implicit operands. All arithmetic instructions, for example, update the EFLAGS register.

An 80386 instruction can explicitly reference one or two operands. Two-operand instructions, such as MOV, ADD, XOR, etc., generally overwrite one of the two participating operands with the result. A distinction can thus be made between the source operand (the one unaffected by the operation) and the destination operand (the one overwritten by the result).

For most instructions, one of the two explicitly specified operands, either the source or the destination, can be either in a register or in memory. The other operand must be in a register or be an immediate source operand. Thus, the explicit two-operand instructions of the 80386 permit operations of the following kinds:

Register-to-register

Register-to-memory

Memory-to-register

Immediate-to-register

Immediate-to-memory

Certain string instructions and stack manipulation instructions, however, transfer data from memory to memory. Both operands of some string instructions are in memory and are implicitly specified. Push and pop stack operations allow transfer between memory operands and the memory-based stack.

Immediate Operands

Certain instructions use data from the instruction itself as one (and sometimes two) of the operands. Such an operand is called an immediate operand. The operand may be 32, 16, or 8 bits long. For example:

SHR PATTERN, 2

One byte of the instruction holds the value 2, the number of bits by which to shift the variable PATTERN.

TEST PATTERN, 0FFFF00FFH


A doubleword of the instruction holds the mask that is used to test the variable PATTERN.

Register Operands

Operands may be located in one of the 32-bit general registers (EAX, EBX, ECX, EDX, ESI, EDI, ESP, or EBP), in one of the 16-bit general registers (AX, BX, CX, DX, SI, DI, SP, or BP), or in one of the 8-bit general registers (AH, BH, CH, DH, AL, BL, CL, or DL).

The 80386 has instructions for referencing the segment registers (CS, DS, ES, SS, FS, GS). These instructions are used by applications programs only if systems designers have chosen a segmented memory model.

The 80386 also has instructions for referring to the flag register. The flags may be stored on the stack and restored from the stack. Certain instructions change the commonly modified flags directly in the EFLAGS register. Other flags that are seldom modified can be modified indirectly via the flags image in the stack.

Memory Operands

Data-manipulation instructions that address operands in memory must specify (either directly or indirectly) the segment that contains the operand and the offset of the operand within the segment. However, for speed and compact instruction encoding, segment selectors are stored in the high-speed segment registers. Therefore, data-manipulation instructions need to specify only the desired segment register and an offset in order to address a memory operand.

An 80386 data-manipulation instruction that accesses memory uses one of the following methods for specifying the offset of a memory operand within its segment:

1. Most data-manipulation instructions that access memory contain a byte that explicitly specifies the addressing method for the operand. A byte, known as the modR/M byte, follows the opcode and specifies whether the operand is in a register or in memory. If the operand is in memory, the address is computed from a segment register and any of the following values: a base register, an index register, a scaling factor, a displacement. When an index register is used, the modR/M byte is also followed by another byte that identifies the index register and scaling factor. This addressing method is the most flexible.

2. A few data-manipulation instructions implicitly use specialized addressing methods:

For a few short forms of MOV that implicitly use the EAX register, the offset of the operand is coded as a doubleword in the instruction. No base register, index register, or scaling factor is used.

String operations implicitly address memory via DS:ESI (MOVS, CMPS, OUTS, LODS) or via ES:EDI (MOVS, CMPS, INS, STOS, SCAS).

Stack operations implicitly address operands via the SS:ESP registers; e.g., PUSH, POP, PUSHA, PUSHAD, POPA, POPAD, PUSHF, PUSHFD, POPF, POPFD, CALL, RET, IRET, IRETD, exceptions, and interrupts.

Segment Selection

Data-manipulation instructions need not explicitly specify which segment register is used. For all of these instructions, specification of a segment register is optional. For all memory accesses, if a segment is not explicitly specified by the instruction, the processor automatically chooses a segment register according to the rules of Table 4.1. (If systems designers have chosen a flat model of memory organization, the segment registers and the rules that the processor uses in choosing them are not apparent to applications programs.)

There is a close connection between the kind of memory reference and the segment in which that operand resides. As a rule, a memory reference implies the current data segment (i.e., the implicit segment selector is in DS).

However, ESP and EBP are used to access items on the stack; therefore, when the ESP or EBP register is used as a base register, the current stack segment is implied (i.e., SS contains the selector).

Special instruction prefix elements may be used to override the default segment selection. Segment-override prefixes allow an explicit segment selection. The 80386 has a segment-override prefix for each of the segment registers. Only in the following special cases is there an implied segment selection that a segment prefix cannot override:

The use of ES for destination strings in string instructions.

The use of SS in stack instructions.

The use of CS for instruction fetches.


Table 4.1 Default Segment Register Selection Rules

Memory Reference Needed   Segment Register Used   Implicit Segment Selection Rule
Instructions              Code (CS)               Automatic with instruction prefetch.
Stack                     Stack (SS)              All stack pushes and pops. Any memory reference that uses ESP or EBP as a base register.
Local Data                Data (DS)               All data references except when relative to stack or string destination.
Destination Strings       Extra (ES)              Destination of string instructions.
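The rules of Table 4.1 can be restated as a small decision function. This is only a sketch: the reference kinds ("instruction_fetch", "stack", "string_destination", "data") are labels invented for the example, and segment-override prefixes are not modelled.

```python
def default_segment(kind: str, base_register: str = None) -> str:
    """Return the segment register the processor selects when none is specified."""
    if kind == "instruction_fetch":
        return "CS"                       # automatic with instruction prefetch
    if kind == "stack":
        return "SS"                       # all stack pushes and pops
    if kind == "string_destination":
        return "ES"                       # destination of string instructions
    if base_register in ("ESP", "EBP"):
        return "SS"                       # data reference relative to the stack
    return "DS"                           # all other data references
```

A data reference based on EBX therefore defaults to DS, while the same reference based on EBP defaults to SS.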

Effective-Address Computation

The modR/M byte provides the most flexible of the addressing methods, and instructions that require a modR/M byte as the second byte of the instruction are the most common in the 80386 instruction set. For memory operands defined by modR/M, the offset within the desired segment is calculated by taking the sum of up to three components:

A displacement element in the instruction.

A base register.

An index register. The index register may be automatically multiplied by a scaling factor of 2, 4, or 8.

The offset that results from adding these components is called an effective address. Each of these components of an effective address may have either a positive or negative value. If the sum of all the components exceeds 2^(32), the effective address is truncated to 32 bits. Figure 4.10 illustrates the full set of possibilities for modR/M addressing.
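The sum of the three components, including the truncation to 32 bits, can be sketched in Python. The function name and keyword parameters are illustrative; a scale of 1 stands for an unscaled index.

```python
MASK32 = 0xFFFFFFFF


def effective_address(base: int = 0, index: int = 0, scale: int = 1,
                      displacement: int = 0) -> int:
    """Compute base + index * scale + displacement, truncated to 32 bits."""
    if scale not in (1, 2, 4, 8):
        raise ValueError("scale must be 1, 2, 4, or 8")
    return (base + index * scale + displacement) & MASK32
```

A negative displacement simply subtracts from the sum, and a sum that passes 2^(32) wraps around, as in `effective_address(base=0xFFFFFFF0, displacement=0x20)`.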


The displacement component, because it is encoded in the instruction, is useful for fixed aspects of addressing; for example:

Location of simple scalar operands.

Beginning of a statically allocated array.

Offset of an item within a record.

The base and index components have similar functions. Both utilize the same set of general registers. Both can be used for aspects of addressing that are determined dynamically; for example:

Location of procedure parameters and local variables in stack.

The beginning of one record among several occurrences of the same record type or in an array of records.

The beginning of one dimension of a multiple-dimension array.

The beginning of a dynamically allocated array.

The uses of general registers as base or index components differ in the following respects:

ESP cannot be used as an index register.

When ESP or EBP is used as the base register, the default segment is the one selected by SS. In all other cases the default segment is DS.

The scaling factor permits efficient indexing into an array in the common cases when array elements are 2, 4, or 8 bytes wide. The shifting of the index register is done by the processor at the time the address is evaluated with no performance loss. This eliminates the need for a separate shift or multiply instruction.

The base, index, and displacement components may be used in any combination; any of these components may be null. A scale factor can be used only when an index is also used. Each possible combination is useful for data structures commonly used by programmers in high-level languages and assembly languages. Following are possible uses for some of the various combinations of address components.

DISPLACEMENT


The displacement alone indicates the offset of the operand. This combination is used to directly address a statically allocated scalar operand. An 8-bit, 16-bit, or 32-bit displacement can be used.

BASE

The offset of the operand is specified indirectly in one of the general registers, as for "based" variables.

BASE + DISPLACEMENT

A register and a displacement can be used together for two distinct purposes:

1. Index into a static array when the element size is not 2, 4, or 8 bytes. The displacement component encodes the offset of the beginning of the array. The register holds the result of a calculation to determine the offset of a specific element within the array.

2. Access an item of a record. The displacement component locates an item within the record. The base register selects one of several occurrences of the record, thereby providing a compact encoding for this common function.

An important special case of this combination is to access parameters in the procedure activation record in the stack. In this case, EBP is the best choice for the base register, because when EBP is used as a base register, the processor automatically uses the stack segment register (SS) to locate the operand, thereby providing a compact encoding for this common function.

(INDEX * SCALE) + DISPLACEMENT

This combination provides efficient indexing into a static array when the element size is 2, 4, or 8 bytes. The displacement addresses the beginning of the array, the index register holds the subscript of the desired array element, and the processor automatically converts the subscript into an index by applying the scaling factor.

BASE + INDEX + DISPLACEMENT

Two registers used together support either a two-dimensional array (the displacement determining the beginning of the array) or one of several instances of an array of records (the displacement indicating an item in the record).

BASE + (INDEX * SCALE) + DISPLACEMENT


This combination provides efficient indexing of a two-dimensional array when the elements of the array are 2, 4, or 8 bytes wide.
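As a worked illustration of BASE + (INDEX * SCALE) + DISPLACEMENT, the following Python sketch locates an element of a two-dimensional array. The mapping of row offset to the base register and column subscript to the index register is one plausible assignment, assumed here for the example; the function and parameter names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def element_address(array_start: int, row: int, column: int,
                    columns_per_row: int, element_size: int) -> int:
    """Address of array[row][column] for elements of 1, 2, 4, or 8 bytes."""
    if element_size not in (1, 2, 4, 8):
        raise ValueError("element size must match a supported scale factor")
    displacement = array_start                    # fixed: beginning of the array
    base = row * columns_per_row * element_size   # byte offset of the row, computed by the program
    index = column                                # subscript, scaled by the processor
    return (base + index * element_size + displacement) & MASK32
```

For an array of 4-byte elements with 10 columns per row starting at 4000H, element [2][3] lies at 4000H + 80 + 12 = 405CH.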

SEGMENT + BASE + (INDEX * SCALE) + DISPLACEMENT

          +-----+
          | --- |     +-----+     +---+
+----+    | EAX |     | EAX |     | 1 |
| CS |    | ECX |     | ECX |     |   |     +---------------------+
| SS |    | EDX |     | EDX |     | 2 |     | NO DISPLACEMENT     |
| DS | +  | EBX |  +  | EBX |  *  |   |  +  | 8-BIT DISPLACEMENT  |
| ES |    | ESP |     | --- |     | 4 |     | 32-BIT DISPLACEMENT |
| FS |    | EBP |     | EBP |     |   |     +---------------------+
| GS |    | ESI |     | ESI |     | 8 |
+----+    | EDI |     | EDI |     +---+
          +-----+     +-----+

Figure 4.10 Effective Address Computation

4.6.6 Interrupts and Exceptions

The 80386 has two mechanisms for interrupting program execution:

1. Exceptions are synchronous events that are the responses of the CPU to certain conditions detected during the execution of an instruction.

2. Interrupts are asynchronous events typically triggered by external devices needing attention.

Interrupts and exceptions are alike in that both cause the processor to temporarily suspend its present program execution in order to execute a program of higher priority. The major distinction between these two kinds of interrupts is their origin. An exception is always reproducible by re-executing with the program and data that caused the exception, whereas an interrupt is generally independent of the currently executing program.

Application programmers are not normally concerned with servicing interrupts. Certain exceptions, however, are of interest to applications programmers, and many operating systems give applications programs the opportunity to service these exceptions. However, the operating system itself defines the interface between the applications programs and the exception mechanism of the 80386.

Table 4.2 highlights the exceptions that may be of interest to applications programmers.

Table 4.2 80386 Reserved Exceptions and Interrupts

Vector Number   Description
0               Divide Error
1               Debug Exceptions
2               NMI Interrupt
3               Breakpoint
4               INTO Detected Overflow
5               BOUND Range Exceeded
6               Invalid Opcode
7               Coprocessor Not Available
8               Double Exception
9               Coprocessor Segment Overrun
10              Invalid Task State Segment
11              Segment Not Present
12              Stack Fault
13              General Protection
14              Page Fault
15              (reserved)
16              Coprocessor Error
17-32           (reserved)

A divide error exception results when the instruction DIV or IDIV is executed with a zero denominator or when the quotient is too large for the destination operand.

The debug exception may be reflected back to an applications program if it results from the trap flag (TF).

A breakpoint exception results when the instruction INT 3 is executed. This instruction is used by some debuggers to stop program execution at specific points.

An overflow exception results when the INTO instruction is executed and the OF (overflow) flag is set (after an arithmetic operation that set the OF flag).

A bounds check exception results when the BOUND instruction is executed and the array index it checks falls outside the bounds of the array.

Invalid opcodes may be used by some applications to extend the instruction set. In such a case, the invalid opcode exception presents an opportunity to emulate the opcode.

The "coprocessor not available" exception occurs if the program contains instructions for a coprocessor, but no coprocessor is present in the system.

A coprocessor error is generated when a coprocessor detects an illegal operation.
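The two conditions that raise the divide error exception (vector 0) can be sketched for the byte form of DIV, which divides AX by an 8-bit operand and leaves the quotient in AL and the remainder in AH. The Python exception class `DivideError` and the function `div8` are stand-ins invented for the example; they model the conditions only, not the exception mechanism itself.

```python
class DivideError(Exception):
    """Stands in for the divide error exception (vector 0)."""


def div8(ax: int, divisor: int):
    """Model unsigned DIV with an 8-bit divisor: returns (AL, AH)."""
    ax &= 0xFFFF
    divisor &= 0xFF
    if divisor == 0:
        raise DivideError("zero denominator")
    quotient, remainder = divmod(ax, divisor)
    if quotient > 0xFF:                   # quotient does not fit in AL
        raise DivideError("quotient too large for destination")
    return quotient, remainder
```

Dividing 100 by 7 succeeds with quotient 14 and remainder 2, whereas dividing 1000H by 1 raises the error because the quotient needs more than 8 bits.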

The instruction INT generates an interrupt whenever it is executed; the processor treats this interrupt as an exception. The effects of this interrupt (and the effects of all other exceptions) are determined by exception handler routines provided by the application program or as part of the systems software (provided by systems programmers).


1. What is the size of the memory in flat addressing space?
2. What is the size of the memory in segmented address space?
3. How many bytes are occupied by a double word?
4. Differentiate between near pointers and far pointers.
5. Mention the various general registers present.
6. State the purpose of the various segment registers present.
7. What are the three basic groups of flags?
8. Mention the various ways to find the effective address of the operands.
9. What does a displacement indicate in the effective address computation?
10. What are the two mechanisms supported to interrupt the program execution?

4.7 I/O Addressing

The 80386 allows input/output to be performed in either of two ways:

By means of a separate I/O address space (using specific I/O instructions)

By means of memory-mapped I/O (using general-purpose operand-manipulation instructions).

I/O Address Space

The 80386 provides a separate I/O address space, distinct from physical memory, that can be used to address the input/output ports that are used for external devices. The I/O address space consists of 2^(16) (64K) individually addressable 8-bit ports; any two consecutive 8-bit ports can be treated as a 16-bit port; and four consecutive 8-bit ports can be treated as a 32-bit port. Thus, the I/O address space can accommodate up to 64K 8-bit ports, up to 32K 16-bit ports, or up to 16K 32-bit ports.


The program can specify the address of the port in two ways. Using an immediate byte constant, the program can specify:

256 8-bit ports numbered 0 through 255.

128 16-bit ports numbered 0, 2, 4, . . . , 252, 254.

64 32-bit ports numbered 0, 4, 8, . . . , 248, 252.

Using a value in DX, the program can specify:

8-bit ports numbered 0 through 65535.

16-bit ports numbered 0, 2, 4, . . . , 65532, 65534.

32-bit ports numbered 0, 4, 8, . . . , 65528, 65532.

The 80386 can transfer 32, 16, or 8 bits at a time to a device located in the I/O space. Like doublewords in memory, 32-bit ports should be aligned at addresses evenly divisible by four so that the 32 bits can be transferred in a single bus access. Like words in memory, 16-bit ports should be aligned at even-numbered addresses so that the 16 bits can be transferred in a single bus access. An 8-bit port may be located at either an even or odd address. The instructions IN and OUT move data between a register and a port in the I/O address space. The instructions INS and OUTS move strings of data between the memory address space and ports in the I/O address space.
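The port counts and alignment rules above follow from simple arithmetic, which the Python sketch below reproduces. The helper names are illustrative; `space` is the number of 8-bit ports reachable (65536 via DX, 256 via an immediate byte).

```python
IO_SPACE = 1 << 16        # 64K individually addressable 8-bit ports
IMMEDIATE_SPACE = 1 << 8  # ports reachable with an immediate byte constant


def aligned_ports(width_bits: int, space: int = IO_SPACE) -> range:
    """Port numbers at which a port of the given width is naturally aligned."""
    return range(0, space, width_bits // 8)


def is_aligned(port: int, width_bits: int) -> bool:
    """True if the port can be transferred in a single bus access."""
    return port % (width_bits // 8) == 0
```

`len(aligned_ports(32))` gives the 16K aligned 32-bit ports of the full space, and `len(aligned_ports(16, IMMEDIATE_SPACE))` gives the 128 aligned 16-bit ports reachable with an immediate byte.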

Memory-Mapped I/O

I/O devices also may be placed in the 80386 memory address space. As long as the devices respond like memory components, they are indistinguishable to the processor. Memory-mapped I/O provides additional programming flexibility. Any instruction that references memory may be used to access an I/O port located in the memory space. For example, the MOV instruction can transfer data between any register and a port; and the AND, OR, and TEST instructions may be used to manipulate bits in the internal registers of a device as shown in Figure 4.11. Memory-mapped I/O performed via the full instruction set maintains the full complement of addressing modes for selecting the desired I/O device (e.g., direct address, indirect address, base register, index register, scaling).

     MEMORY
  ADDRESS SPACE                     I/O DEVICE 1
+---------------+               +-------------------+
|               |               | INTERNAL REGISTER |
|---------------| - - - - - - - | +---------------+ |
|               |               | |               | |
|---------------| - - - - - - - | +---------------+ |
|               |               +-------------------+
|               |
|               |
|               |                   I/O DEVICE 2
|               |               +-------------------+
|               |               | INTERNAL REGISTER |
|---------------| - - - - - - - | +---------------+ |
|               |               | |               | |
|---------------| - - - - - - - | +---------------+ |
|               |               +-------------------+
+---------------+

Figure 4.11 Memory Mapped I/O

Memory-mapped I/O, like any other memory reference, is subject to access protection and control when executing in protected mode.
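The use of OR, AND, and TEST on a memory-mapped device register can be mimicked in Python with a byte array standing in for the memory address space. The register address and bit masks below are hypothetical and do not describe any real device.

```python
memory = bytearray(256)   # stands in for part of the memory address space

CONTROL_REG = 0x40        # hypothetical address of a device's internal register
ENABLE = 0x80             # hypothetical control bit
READY = 0x01              # hypothetical status bit


def set_bits(mem: bytearray, addr: int, mask: int) -> None:
    mem[addr] |= mask                     # like OR  mem, mask


def clear_bits(mem: bytearray, addr: int, mask: int) -> None:
    mem[addr] &= ~mask & 0xFF             # like AND mem, NOT mask


def bits_set(mem: bytearray, addr: int, mask: int) -> bool:
    return (mem[addr] & mask) != 0        # like TEST mem, mask (result in ZF)
```

Because the register is just a memory location, any addressing mode available to ordinary data is also available for selecting the device.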


1. What are the two types of I/O spaces supported by 80386?

4.8 Pentium Processors

The Pentium family of processors, which has its roots in the Intel 486(TM) processor, uses the basic Intel 486 instruction set with a few additional instructions. The Pentium microprocessor improves on the architecture of the 80486 microprocessor. The changes include an improved cache structure, a wider data bus, a faster numeric coprocessor, a dual integer processor and branch prediction logic. The cache has been reorganized to form two caches that are each 8K bytes in size, one for caching data, the other for instructions. The data bus width has been increased from 32 bits to 64 bits. The numeric coprocessor operates about five times faster than the 80486 numeric coprocessor. The dual integer processor often allows two instructions per clock. Finally, the branch prediction logic allows programs that branch to execute more efficiently. The Pentium Pro is a still faster version of the Pentium and contains a modified internal architecture that can schedule up to five instructions for execution and an even faster floating-point unit. The Pentium Pro also contains a 256K-byte or 512K-byte level two cache in addition to the 16K-byte (8K for data and 8K for instructions) level one cache. Also added are four additional address lines, giving the Pentium Pro access to 64 GBytes of directly addressable memory space.

The term "Pentium processor" refers to a family of microprocessors that share a common architecture and instruction set. The first Pentium processors (the P5 variety) were introduced in 1993. This 5.0-V processor was fabricated in 0.8-micron bipolar complementary metal oxide semiconductor (BiCMOS) technology. The P5 processor runs at a clock frequency of either 60 or 66 MHz and has 3.1 million transistors. The next version of the Pentium processor family, the P54C processor, was introduced in 1994. The P54C processors are fabricated in 3.3-V, 0.6-micron BiCMOS technology. The P54C processor also has System Management Mode (SMM) for advanced power management.


The Intel Pentium processor, like its predecessor the Intel 486 microprocessor, isfully software compatible with the installed base of over 100 million compatibleIntel architecture systems. In addition, the Intel Pentium processor provides newlevels of performance to new and existing software through a reimplementationof the Intel 32-bit instruction set architecture using the latest, most advanced,

design techniques. Optimized, dual execution units provide one-clock executionfor "core" instructions, while advanced technology, such as superscalararchitecture, branch prediction, and execution pipelining, enables multipleinstructions to execute in parallel with high efficiency. Separate code and datacaches combined with wide 128-bit and 256-bit internal data paths and a 64-bit,burstable, external bus allow these performance levels to be sustained in cost-effective systems. The application of this advanced technology in the IntelPentium processor brings "state of the art" performance and capability to existingIntel architecture software as well as new and advanced applications.

The Pentium processor has two primary operating modes and a "system management mode." The operating mode determines which instructions and architectural features are accessible.

These modes are:

Protected Mode

This is the native state of the microprocessor. In this mode all instructions and architectural features are available, providing the highest performance and capability. This is the recommended mode that all new applications and operating systems should target. Among the capabilities of protected mode is the ability to directly execute "real-address mode" 8086 software in a protected, multitasking environment. This feature is known as Virtual-8086 "mode" (or "V86 mode"). Virtual-8086 "mode," however, is not actually a processor "mode"; it is in fact an attribute that can be enabled for any task (with appropriate software) while in protected mode.

Real-Address Mode (also called "real mode")

This mode provides the programming environment of the Intel 8086 processor, with a few extensions (such as the ability to break out of this mode). Reset initialization places the processor in real mode where, with a single instruction, it can switch to protected mode.

System Management Mode

The Pentium microprocessor also provides support for System Management Mode (SMM). SMM is a standard architectural feature unique to all new Intel microprocessors, beginning with the Intel386 SL processor. It provides an operating-system-independent, application-independent, and transparent mechanism to implement system power management and OEM differentiation features. SMM is entered through activation of an external interrupt pin (SMI#), which switches the CPU to a separate address space while saving the entire context of the CPU. SMM-specific code may then be executed transparently. The operation is reversed upon returning.
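Of the modes above, real-address mode is the easiest to illustrate numerically, because it forms addresses the way the 8086 does: the 16-bit segment value is shifted left by four bits and added to the 16-bit offset. The following is a minimal sketch of that calculation; the function name is hypothetical.

```python
def real_mode_address(segment: int, offset: int) -> int:
    """Physical address formed from a 16-bit segment and a 16-bit offset."""
    if not (0 <= segment <= 0xFFFF and 0 <= offset <= 0xFFFF):
        raise ValueError("segment and offset must be 16-bit values")
    # Segment is multiplied by 16 (shifted left 4 bits), then the offset is added.
    return (segment << 4) + offset

print(hex(real_mode_address(0x1234, 0x0010)))  # 0x12350
```

Different segment:offset pairs can name the same physical location; for example, 1234:0010 and 1235:0000 both yield 12350H.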

Advanced Features

The Pentium P54C processor is the product of a marriage between the Pentium processor's architecture and Intel's 0.6-micron, 3.3-V BiCMOS process. The Pentium processor achieves higher performance than the fastest Intel486 processor by making use of the following advanced technologies.

Superscalar Execution: The Intel486 processor can execute only one instruction at a time. With superscalar execution, the Pentium processor can sometimes execute two instructions simultaneously.

Pipeline Architecture: Like the Intel486 processor, the Pentium processor executes instructions in five stages. This staging, or pipelining, allows the processor to overlap multiple instructions so that it takes less time to execute two instructions in a row. Because of its superscalar architecture, the Pentium processor has two independent processor pipelines.

Branch Target Buffer: Using the branch target buffer, the Pentium processor predicts the outcome of a branch and fetches the branch target instruction before it executes the branch instruction.

Dual 8-KB On-Chip Caches: The Pentium processor has two separate 8-kilobyte (KB) caches on chip, one for instructions and one for data, which allows the Pentium processor to fetch data and instructions from the cache simultaneously.

Write-Back Cache: When data is modified, only the data in the cache is changed. Memory data is changed only when the Pentium processor replaces the modified data in the cache with a different set of data.

64-Bit Bus: With its 64-bit-wide external data bus (in contrast to the Intel486 processor's 32-bit-wide external bus), the Pentium processor can handle up to twice the data load of the Intel486 processor at the same clock frequency.

Instruction Optimization: The Pentium processor has been optimized to run critical instructions in fewer clock cycles than the Intel486 processor.

Floating-Point Optimization: The Pentium processor executes individual instructions faster through execution pipelining, which allows multiple floating-point instructions to be executed at the same time.


Pentium Extensions: The Pentium processor adds a few instruction set extensions beyond those of the Intel486 processors. The Pentium processor also has a set of extensions for multiprocessor (MP) operation. This makes a computer with multiple Pentium processors possible.
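As a rough illustration of the Superscalar Execution and Pipeline Architecture items above, the sketch below counts clock cycles under idealized assumptions: one clock per stage, no stalls, and, for the dual-pipeline case, every pair of instructions able to issue together. The function names are hypothetical, and real code pairs far less perfectly than this model suggests.

```python
import math

STAGES = 5  # five pipeline stages, as stated in the text

def cycles_unpipelined(n: int) -> int:
    """Each instruction passes through all stages before the next one starts."""
    return n * STAGES

def cycles_pipelined(n: int, pipes: int = 1) -> int:
    """Idealized pipelined count: no stalls, `pipes` instructions issue per clock."""
    if n == 0:
        return 0
    issue_groups = math.ceil(n / pipes)
    # The first group needs STAGES clocks; each later group completes one clock after.
    return STAGES + issue_groups - 1

n = 10
print(cycles_unpipelined(n))         # 50
print(cycles_pipelined(n))           # 14
print(cycles_pipelined(n, pipes=2))  # 9
```

The gap between the last two figures is the best-case benefit of the second pipeline; it shrinks whenever two adjacent instructions cannot be issued together.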

A Pentium system, with its wide, fast buses, advanced write-back cache/memory subsystem, and powerful processor, will deliver more power for today's software applications, and also optimize the performance of advanced 32-bit operating systems (such as Windows 95) and 32-bit software applications.
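The "twice the data load" claim for the 64-bit bus can be checked with simple arithmetic. This sketch assumes one full-width transfer per bus clock (a best-case burst figure) and uses a 66-MHz clock for both bus widths purely for comparison; the function name is illustrative.

```python
def peak_bandwidth_mb_per_s(bus_width_bits: int, clock_mhz: float) -> float:
    """Peak rate in millions of bytes per second, one transfer per bus clock."""
    return (bus_width_bits / 8) * clock_mhz

print(peak_bandwidth_mb_per_s(32, 66))  # 264.0
print(peak_bandwidth_mb_per_s(64, 66))  # 528.0
```

Sustained rates are lower in practice, since not every bus clock carries data.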

Architecture of Pentium processors

The block diagram of the Pentium processor is shown in Figure 4.12. The most important enhancements over the 486 are the separate instruction and data caches, the dual integer pipelines (the U-pipeline and the V-pipeline, as Intel calls them), branch prediction using the branch target buffer (BTB), the pipelined floating-point unit, and the 64-bit external data bus. Even-parity checking is implemented for the data bus and the internal RAM arrays (caches and TLBs).
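Even parity means that the parity bit is chosen so that the data bits plus the parity bit together contain an even number of 1s. The sketch below shows the calculation; it assumes one parity bit per byte of the 64-bit data bus, which is an assumption of this example rather than something stated above, and the function names are hypothetical.

```python
def even_parity_bit(byte: int) -> int:
    """Parity bit that makes the count of 1 bits (data plus parity) even."""
    return bin(byte & 0xFF).count("1") % 2

def parity_ok(byte: int, parity_bit: int) -> bool:
    """True if data plus parity bit contain an even number of 1 bits."""
    return (bin(byte & 0xFF).count("1") + parity_bit) % 2 == 0

def bus_parity_bits(qword: int) -> list:
    """Parity bits for the eight bytes of a 64-bit value, least significant byte first."""
    return [even_parity_bit((qword >> (8 * i)) & 0xFF) for i in range(8)]

print(even_parity_bit(0b10110010))  # 0 (already four 1 bits)
print(even_parity_bit(0b10110011))  # 1 (five 1 bits, so one more is needed)
```

A single flipped bit changes the count of 1s from even to odd, which is what the check detects; two flipped bits in the same byte go unnoticed.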

Figure 4.12 Architecture of Pentium Processor

As for new functions, there are only a few; nearly all the enhancements in Pentium are included to improve performance, and there are only a handful of new instructions. Pentium is the first high-performance microprocessor to include a system management mode like those found on power-miserly processors for notebooks and other battery-based applications; Intel is holding to its promise to include SMM on all new CPUs.

Pentium uses about 3 million transistors on a huge 294 mm² (456k mils²) die. The caches plus TLBs use only about 30% of the die. At about 17 mm on a side, Pentium is one of the largest microprocessors ever fabricated and probably pushes Intel's production equipment to its limits. The integer data path is in the middle, while the floating-point data path is on the side opposite the data cache. In contrast to other superscalar designs, such as SuperSPARC, Pentium's integer data path is actually bigger than its FP data path. This is an indication of the extra logic associated with complex instruction support. Intel estimates about 30% of the transistors were devoted to compatibility with the x86 architecture. Much of this overhead is probably in the microcode ROM, instruction decode and control unit, and the adders in the two address generators, but there are other effects of the complex instruction set. For example, the higher frequency of memory references in x86 programs compared to RISC code led to the implementation of the dual-ac.


Register set

The purpose of the registers is to hold temporary results and to control the execution of the program. The general-purpose registers in the Pentium are EAX, ECX, EDX, EBX, ESP, EBP, ESI, and EDI.

The 32-bit registers are named with the prefix E (EAX, etc.), and the least significant 16 bits (0-15) of these registers can be accessed with names such as AX and SI. Similarly, for EAX, EBX, ECX, and EDX, the lower eight bits (0-7) can be accessed with names such as AL and BL, and the higher eight bits (8-15) with names such as AH and BH. The instruction pointer EIP, known as the program counter (PC) in 8-bit microprocessors, is a 32-bit register that handles 32-bit memory addresses; its lower 16-bit portion, IP, is used for 16-bit memory addresses.
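The register naming described above amounts to viewing different bit ranges of one 32-bit value. The following sketch extracts the AX, AL, and AH views from an EAX value using masks and shifts; the function name is hypothetical.

```python
def sub_registers(eax: int) -> dict:
    """Return the AX, AL and AH views of a 32-bit EAX value."""
    eax &= 0xFFFFFFFF        # keep 32 bits
    ax = eax & 0xFFFF        # bits 0-15
    al = ax & 0xFF           # bits 0-7
    ah = (ax >> 8) & 0xFF    # bits 8-15
    return {"EAX": eax, "AX": ax, "AL": al, "AH": ah}

regs = sub_registers(0x12345678)
print({name: hex(value) for name, value in regs.items()})
# {'EAX': '0x12345678', 'AX': '0x5678', 'AL': '0x78', 'AH': '0x56'}
```

The upper 16 bits of EAX (0x1234 here) have no separate register name and are reachable only through the full 32-bit register.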

The flag register is a 32-bit register; however, only 14 bits are used at present, for 13 different flags. These flags are upward compatible with those of the 8086 and 80286. A comparison of the flags available in the 16-bit and 32-bit microprocessors may provide some clues to the capabilities of these processors. The 8086 has 9 flags, the 80286 has 11 flags, and the 80386 has 13 flags. All of these flag registers include six flags related to data conditions (sign, zero, carry, auxiliary carry, overflow, and parity) and three flags related to machine operations (interrupt enable, single-step, and string direction). The 80286 has two additional flags: I/O Privilege and Nested Task. The I/O Privilege flag uses two bits in protected mode to determine which I/O instructions can be used, and the Nested Task flag is used to show a link between two tasks. The processor also includes control registers, system address registers, and debug and test registers for system and debugging operations.
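To make the flag layout concrete, the sketch below decodes the nine 8086 flags plus the two 80286 additions from a flag-register value. The bit positions follow the standard x86 FLAGS layout; the dictionary and function names are hypothetical, and only the flags named in the paragraph above are decoded.

```python
# Bit positions of the single-bit flags in the standard x86 FLAGS layout.
FLAG_BITS = {
    "CF": 0,   # carry
    "PF": 2,   # parity
    "AF": 4,   # auxiliary carry
    "ZF": 6,   # zero
    "SF": 7,   # sign
    "TF": 8,   # single-step (trap)
    "IF": 9,   # interrupt enable
    "DF": 10,  # string direction
    "OF": 11,  # overflow
    "NT": 14,  # nested task (80286 and later)
}
IOPL_SHIFT = 12  # I/O privilege level occupies two bits: 12 and 13

def decode_flags(value: int) -> dict:
    """Split a flag-register value into named fields."""
    decoded = {name: (value >> bit) & 1 for name, bit in FLAG_BITS.items()}
    decoded["IOPL"] = (value >> IOPL_SHIFT) & 0b11
    return decoded

flags = decode_flags(0x3241)
print(flags["CF"], flags["ZF"], flags["IF"], flags["IOPL"])  # 1 1 1 3
```

Because I/O Privilege takes two bits, the eleven 80286 flags occupy twelve bit positions, which is the same reason 13 flags need 14 bits.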
