
IT 12 035

Degree project (Examensarbete) 30 hp, July 2012

Automatic Verification of Microprocessor designs using Random Simulation

Anantha Ganesh Karikar Kamath

Institutionen för informationsteknologi (Department of Information Technology)


Faculty of Science and Technology (Teknisk-naturvetenskaplig fakultet), UTH unit
Visiting address: Ångströmlaboratoriet, Lägerhyddsvägen 1, Hus 4, Plan 0
Postal address: Box 536, 751 21 Uppsala
Phone: 018 – 471 30 03
Fax: 018 – 471 30 00
Website: http://www.teknat.uu.se/student

Abstract

Automatic Verification of Microprocessor designs using Random Simulation

Anantha Ganesh Karikar Kamath

Verification of microprocessor cores has always been a major challenge and a crucial phase in the development of a microprocessor. Increasing chip complexity and decreasing time-to-market windows have led to a steady increase in verification costs. Automatic verification of microprocessor designs based on random simulation helps to quickly capture inconceivable corner cases that would not have been found by manual testing.

This thesis work focuses on the design and implementation of a Co-Simulation testbench platform together with a framework for generating random assembly programs for the functional verification of the OpenRISC processor, OR1200. A Random Program Generator based on configurable instruction weights is developed to generate large test volumes. These random test programs are used to verify the functional correctness of the Register Transfer Logic model of the processor against a behavioral Instruction Set C Simulator. The simulation results show the effectiveness of this approach. Histograms are used to graphically illustrate the instruction and register coverage statistics.

Printed by: Reprocentralen ITC

Sponsor: ORSoC AB
IT 12 035
Examiner (Examinator): Philipp Rümmer
Subject reviewer (Ämnesgranskare): Leif Gustafsson
Supervisors (Handledare): Philipp Rümmer, Marcus Erlandsson


ACKNOWLEDGMENTS

This is a Master's thesis submitted in partial fulfillment of the requirements for the degree of Master of Science in Embedded Systems to the Department of Information Technology, Uppsala University, Uppsala, Sweden.

My deepest gratitude goes to my supervisor Dr. Philipp Rümmer, Programme Director of the Master's programme in Embedded Systems at Uppsala University, for the guidance, encouragement and patience he demonstrated during my thesis work.

Further, I would like to thank my reviewer Dr. Leif Gustafsson, Senior Fellow at the Department of Physics and Astronomy, Uppsala University, for undertaking the review of my Master's thesis.

I would also like to thank my thesis coordinator, Prof. Olle Eriksson, for organizing my thesis presentation and for helping me to register my thesis work.

I am also deeply indebted to Mr. Marcus Erlandsson, CTO at ORSoC AB, Stockholm, for offering me this thesis opportunity. I would also like to thank the other employees at ORSoC AB, especially Mr. Krister Karlsson and Mr. Yann Vernier, for their guidance.

Sincere thanks to my friends and family who gave me courage and support throughout this thesis work.



CONTENTS

Acknowledgments

Contents

List of Figures

Listings

1  Introduction

2  Background
   2.1  Introduction
   2.2  Formal and Simulation based Verification Methodologies
   2.3  Directed and Random verification techniques

3  OpenRISC Processor and Tools
   3.1  Introduction
   3.2  OpenRISC 1200 Soft Processor
   3.3  Or1ksim Instruction Set Simulator
   3.4  ORPSoC and Icarus RTL Simulator
   3.5  Cycle Accurate C++ Model
   3.6  GTKWave Waveform Viewer

4  Co-Simulation Testbench
   4.1  Introduction
   4.2  Random Program Generator
   4.3  Testbench Controller

5  Implementation Details
   5.1  Introduction
   5.2  Random Program Generator
   5.3  Testbench Controller
   5.4  Sniffer Modules
   5.5  Checker Module
   5.6  Coverage Monitor

6  Results and Discussion
   6.1  Introduction
   6.2  Discrepancies between OR1200 and Or1ksim
   6.3  The OpenRISC 1200 verification coverage results

7  Conclusion and Future Work
   7.1  Conclusion
   7.2  Future Work

Bibliography

Appendices
A  Source Code
B  Example test programs


LIST OF FIGURES

1.1  Overview of the Co-Simulation testbench
3.1  Overview of the OpenRISC 1200 IP Core
4.1  Main components of the Co-Simulation Testbench
4.2  Design of Testbench Controller
4.3  Register queues
5.2  Steps to generate random test program
5.3  Conceptual representation of code memory
5.4  Markov model for instruction class selection
5.5  Random distribution for unequal weights
5.6  Random distribution containing equal weights
5.7  Constraints in branching and function calls
5.8  Start-up sequence of Testbench Controller
5.9  Termination due to watchdog timer
5.10 Termination due to resource constraints or simulation errors
5.11 Normal termination of TBC
6.1  Testbench waveform for l.rori instruction
6.2  Testbench waveform for illegal load/store instruction
6.3  Instruction coverage for a test suite
6.4  Instruction histogram for test program 1
6.5  Instruction histogram for a test program
6.6  Register coverage for a test suite
6.7  Register coverage for store instruction class
6.8  Register coverage for branch and call instructions


LISTINGS

Example program illustrating resource collision
Configurations for the Random Program Generator
Configuration file to specify instruction format
Configuration file to specify instruction weights
Algorithm to implement weighted random distribution
Configurations for the Testbench Controller
Data structure implementing the Register queue
Code snippet illustrating use of l.mfspr instruction
Simulation results for l.mtspr instruction
Code snippet for l.rori instruction
Simulation results for l.rori instruction
Simulation results for l.ror instruction
Simulation results for l.ld or l.sd instruction
Source code containing l.sub instruction
Simulation results for l.sub instruction
Run-time instruction traces for test program containing l.sub instruction
Code snippet for disabling caches
test1.s
test2.s
run1
run2


1 INTRODUCTION

System on Chips (SOCs) of today are very large systems, typically consisting of processor clusters (master CPUs, DSPs, hardware accelerators etc.), multiple-level data memory and program memory systems (L1, L2, L3 memories and caches), on-chip interconnection systems (buses, crossbars, bridges), DMA¹ controllers, IOs and various system peripherals. The design of such a SOC is challenging. Typically, it is not the actual implementation, but instead the verification, where most of the project time is spent. It is estimated that roughly 70 percent of the total effort for hardware designs is invested in functional verification (Bin, 2012). As a consequence of this, several different verification methods have been developed over time, such as simulation, hardware emulation, formal verification, and assertion-based verification.

1. DMA, Direct Memory Access; IO, Input/Output.

A microprocessor is a central SOC component, in the shape of a control CPU, a DSP, an ASIP², or a co-processor. Its design may be verified by executing (lots of) test programs in a simulation testbench. Typically the costs incurred for the verification of a processor design are very high. For example, the verification of a RISC³ System/6000 processor involved approximately fifteen billion simulation cycles, large engineering teams and hundreds of computers during a year (Aharon et al., 1995). A substantial amount of money is also spent during design re-spins. For example, the Pentium FDIV bug found in the Intel P5 Pentium floating point unit (FPU) and the bug in Intel Sandy Bridge accounted for losses of millions of dollars (Lagoon, 2012).

2. CPU, Central Processing Unit; DSP, Digital Signal Processor; ASIP, Application Specific Instruction set Processor.
3. RISC stands for Reduced Instruction Set Computing.

This thesis work outlines the verification plan for the functional verification of an OpenRISC processor core, the OR1200 from OpenCores⁴. The verification methodology suggested here is to compare the actual resulting behavior to the expected behavior, for a certain test program and stimuli input data. The goal of the thesis is to design and implement a Co-Simulation testbench platform together with a framework for generating random assembly programs for the functional verification of an OpenRISC processor. A Random Program Generator based on configurable instruction weights is developed to generate large test volumes. The Co-Simulation testbench platform interfaces the two existing models of the processor, i.e. the RTL-model of the OR1200 core⁵ and the ISS, the Or1ksim C simulator⁶. The testbench controller, a module of the Co-Simulation testbench, executes the test program simultaneously on the two processor models and compares the state of the processor on a cycle-by-cycle basis. The verification results are quantified by implementing a coverage monitor, which measures the verification coverage and builds coverage statistics. Figure 1.1 provides an overview of the Co-Simulation testbench.

4. OpenCores is the world's largest site/community for development of hardware IP cores as open source. The source code for different digital hardware projects can be found at http://OpenCores.org.
5. The RTL-model is simulated using Icarus Verilog, an open-source Verilog simulation and synthesis tool.
6. Or1ksim is an OpenRISC 1000 architectural simulator, providing instruction level simulation of the processor and common peripherals. http://opencores.org/or1k/Or1ksim

Figure 1.1: Overview of the Co-Simulation testbench

A similar study was performed in (Ahmed, 2010), which applies an OVM⁷ framework to provide an extensive comparison between the two models of the processor. Although (Ahmed, 2010) provides a good verification platform for the functional verification of the OR1200 core, it would be possible to achieve better results using random verification techniques. In this thesis, we investigate the verification results obtained from test programs generated by a random program generator. A random program generator generates assembly programs based on the concept of constrained randomness, i.e. a combination of randomness and constrained programming rules. The proposed method is advantageous as the random generator can, in a very short time, create a large number of test programs of arbitrary (however reasonable) size, generating enormous test volumes. Another advantage of random testing is that it can, over time, produce odd and strange programs with unusual instruction sequences that few human programmers would ever think of. Very often these are the tests that hit corner and extreme cases, and they can reveal bugs that would not have been found by manual testing.

7. OVM is an acronym for Open Verification Methodology.

The next chapter discusses different hardware verification techniques covered during the literature review. The subsequent chapters focus on the design and implementation of the random program generator and the testbench controller. The report is concluded by analyzing the verification and coverage results.


2 BACKGROUND

2.1 Introduction

Design verification is the process of determining whether the implemented design accurately represents the intended specification. New process technologies, higher chip complexity and increasing clock frequencies drive up the cost of verifying hardware designs. This has motivated researchers to investigate new approaches to improve and speed up the hardware verification process. This chapter goes through the different verification methodologies and approaches used for verifying SOC designs. Based on the literature review, SOC hardware verification can be broadly divided into two types: pre- and post-silicon. The thesis work is delimited to the functional verification of SOCs applied to pre-silicon designs. Pre-silicon verification employs two major families of solutions: simulation based verification and formal verification.

2.2 Formal and Simulation based Verification Methodologies

Formal verification is a technique that involves generating mathematical proofs about the behavior of the design. It employs assertion based methodologies for the functional verification of the properties of the system. It aims to prove or disprove a property of a hardware implementation using logical and mathematical models. Equivalence checking and property checking form the two major categories of formal verification techniques (Ahmed, 2010). Although this methodology can in principle verify all possible behaviors of the model, for all time instances and for all input variations, its extensive computational requirements limit its usability to verifying small to moderate designs.

Simulation based techniques, on the contrary, are not limited by such constraints. They can be seen as verification through input space sampling. In this method the functional correctness of a hardware model is checked against higher level references or instruction set simulators. These techniques typically consist of a testbench controller and a DUT¹. The testbench controller feeds input test vectors to the DUT. The next state of the DUT is determined based on its present state and the input from the testbench controller. The computed states of the DUT are then verified against the expected states of the design. Unless all points are sampled, there exists a possibility that an error escapes the verification process. So in simulation based verification techniques it is necessary to implement coverage monitors to generate coverage statistics. This helps to keep track of the covered scenarios, thereby increasing the confidence in device correctness.

1. DUT is an abbreviation for Device Under Test.

Literature review

(Aharon et al., 1995) discusses the design of a model based test generator comprising a formal model of the processor architecture tested against a behavioral simulator. In order to generate better quality tests, it uses a heuristic database capturing the testing expertise of test engineers. The verification platform succeeded in finding bugs in different derivatives of a PowerPC architecture. (Ahmed, 2010) in his MSc dissertation discusses a Co-Simulation technique to verify the RTL model of the OpenRISC OR1200 core against a Golden Reference Model implemented in C. It applies an OVM framework to provide an extensive comparison between the two models of the processor. (Semeria et al., 2002) also discusses a Co-Simulation technique to verify a processor RTL model using a behavioral Golden Reference Model in HDL. An interesting post-silicon verification technique, which can be extended to pre-silicon methods, is proposed in (Wagner and Bertacco, 2008). The idea is to identify an inverse instruction or procedure for every instruction in the processor's ISA². An initial random state of the processor is assumed and known prior to generating a random program. The different instructions and their inverse operations make the system reach a final state equal to its initial state. A bug is discovered if, after the execution of the program, the system does not reach the initial state. Since the same system is used to verify its correctness, this eliminates the need for a Golden Reference Model. But the approach is limited to testing only a subset of the processor's ISA, since not all instructions have an inverse.

2. ISA is an acronym for Instruction Set Architecture.

Simulation based verification techniques are either structural or functional in nature. Structural techniques are developed to target functional components like the ALU³, shifter and multiplier, interconnect components that guide the data flow, and storage components like the register file. (Corno et al., 2003), (Hudec, 2011) and (Wagner and Bertacco, 2008) propose different structural simulation methods. Although structural techniques achieve high fault coverage for the functional components, they yield low fault coverage for pipeline logic and system co-processors (Psarakis et al., 2006). The pipeline logic of a processor typically consists of pipeline registers, hazard detection and resolution logic including bypass hardware for forwarding, and pipeline interlocks. (Psarakis et al., 2006) proposes a methodology which enhances the existing test programs generated for structural verification techniques to achieve high fault coverage for the pipeline logic.

3. ALU is an acronym for Arithmetic Logic Unit.

Along with the different methodologies used for hardware verification it is also important to consider the techniques used to generate the test inputs in the verification process. There are two main approaches to achieve this: directed verification and random verification. In the following section we discuss and compare these techniques.

2.3 Directed and Random verification techniques

Traditionally, simulation-based verification uses a directed verification approach. Directed verification techniques⁴ are considered "white-box" techniques in which deterministic test sequences are generated to test the functionality of a design. Considering the example of verifying the functionality of a microprocessor, directed verification can consist of test programs written to test a certain instruction or a small group of related instructions. A self-checking test program consists of four sections: preparation of the stimuli input data, preparation of the expected result, execution of the test program to generate the actual result, and comparison of the actual and expected results. The advantage of directed testing is that it is a fairly quick and easy method for the very first verification of an instruction. It gives early simulation feedback, so that the most embarrassing bugs can be fixed. The disadvantage of directed testing is that it is a very slow and tedious method for exhaustive verification. For a common instruction such as add, the number of combinations of input arguments (registers and immediates) can be large. The result can hit several different conditions (overflow, negative/positive, zero, carry). Also, the execution of the instruction may depend on additional control information stored in configuration registers.

4. Directed verification is sometimes also referred to as happy testing or smoke testing.
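To make the four sections of a self-checking directed test concrete, here is a minimal conceptual sketch in C. It is only an illustration of the structure; a real directed test for the OR1200 would be written in assembly and exercise the instruction on the design under test rather than host arithmetic.

#include <stdint.h>
#include <stdio.h>

/* Conceptual sketch of a self-checking directed test for an add
 * operation, showing the four sections described above. */
int main(void)
{
    /* 1. Preparation of stimuli input data */
    int32_t a = 5, b = 7;

    /* 2. Preparation of the expected result */
    int32_t expected = 12;

    /* 3. Execution of the test to produce the actual result */
    int32_t actual = a + b;

    /* 4. Comparison of the actual and expected results */
    if (actual != expected) {
        printf("FAIL: add(%d, %d) = %d, expected %d\n",
               (int)a, (int)b, (int)actual, (int)expected);
        return 1;
    }
    printf("PASS\n");
    return 0;
}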

Random verification, sometimes also known as gorilla testing⁵, is a technique of generating random test programs with the purpose of testing any instruction, or any sequence of instructions. It is generally considered a "black-box" testing technique, as one does not need to rely on the internal details of the model. Considering the previous example of verifying the functional design of a microprocessor, the test programs are generated by a random program generator and consist of pseudo-randomly selected instructions. These random test programs are likely to generate test cases that can verify hidden scenarios unlikely to be covered by manual directed verification techniques. Automatic testbenches are typically set up to generate and verify large volumes of random programs. This can result in massive simulations, which helps to increase the confidence in device correctness.

5. A slang term for torture testing a product to see what level of abuse will cause it to break or fail.

Random test generation can be further classified as either static or dynamic. In static random test generation, the system is initially set to a random state to generate random test programs. These random test programs are then used by the testbench controllers or model simulators. Dynamically biased random test generation relies on simulation feedback to generate random programs on-the-fly, yielding higher coverage statistics.

Literature review

Several dynamic random test generation methods have been proposed in the literature. (Corno et al., 2003) and (Hudec, 2011) propose automatic test generation techniques for the functional verification of processor cores using an evolutionary approach that significantly improves the fault coverage. (Corno et al., 2003) exploits MicroGP, an approach which utilizes Directed Acyclic Graphs (DAGs) to represent the sequential flow of a program. It associates the generated test programs with fitness values, which are used to probabilistically select the parents for generating new test programs. The fitness value of a test program measures its efficacy and its potential to excite new faults. (Hudec, 2011) discusses a similar evolutionary technique using genetic algorithms to improve the fault coverage of the generated test programs. Although dynamic techniques can improve the quality of generated test programs, their throughput is generally lower than that of static generators. Significant design and implementation effort is also needed to develop random generators with dynamic techniques. Modifying existing generators is also difficult, as they are often custom made for specific architectures. In this thesis a random generator using static techniques is implemented, together with a coverage monitor to evaluate its performance. The design of the random program generator is separated from the testbench controller, which makes it possible to upgrade or replace it with better design implementations in the future.


3 OPENRISC PROCESSOR AND TOOLS

3.1 Introduction

This chapter briefly discusses the OpenRISC OR1200 processor architecture, the functionality of which we aim to verify in this thesis. Later sections focus on the available simulators and the tools used in this thesis.

3.2 OpenRISC 1200 Soft Processor

The OpenRISC project was started in 1999 by a group of Slovenian university students (Baxter, 2011). The project aimed to define and implement free and open source families of RISC microprocessor architectures. As a result of this, the OpenRISC 1000 architecture, the first set of specifications for a family of 32- and 64-bit RISC¹/DSP processors, was developed. Its open and modular architecture targets medium and high performance applications and allows a spectrum of chip and system implementations. The architecture defines the instruction set, the addressing modes, exception handling mechanisms, cache model and cache coherency, memory model, memory management and also the software ABI² (Lampret, 2000). The OpenRISC 1000 architecture is licensed under the GNU GPL and can be accessed freely at OpenCores.org. A good overview of open source hardware development can be found in (Baxter, 2011), where the author reviews the OpenRISC architecture and its implementations.

1. RISC is an abbreviation for Reduced Instruction Set Computing.
2. ABI is an abbreviation for Application Binary Interface.

The OpenRISC OR1200 is an open source synthesizable CPU implemented and maintained by the developers at OpenCores.org³. The RTL implementation follows a 32-bit scalar RISC with Harvard microarchitecture, a 5 stage single issue integer pipeline, virtual memory support and basic DSP capabilities (Damjan Lampret, 2001). The implementation also specifies a register set, cache operation, a power management unit, a programmable interrupt controller (PIC), a debug unit and a tick timer. Other peripheral systems can be added using the processor's implementation of a standardized 32-bit Wishbone bus interface. An overview of the OpenRISC OR1200 core is illustrated in Fig. 3.1.

3. OpenCores is currently owned and maintained by ORSoC AB, a fabless ASIC design and manufacturing services company headquartered in Stockholm, Sweden.

Figure 3.1: Overview of the OpenRISC 1200 IP Core, retrieved May 20, 2012, from http://opencores.org/openrisc,or1200

The OR1200 is intended to be used for embedded, portable media, networking, home entertainment and automotive applications. The latest application includes a product developed at ORSoC AB to ease the connection between different units in a spacecraft. The RTL implementation of the OR1200 core is written in the Verilog hardware description language and is released under the GNU Lesser General Public License (LGPL). The source code, test software and related scripts can be downloaded from the OpenRISC project subversion (svn) repository.

The verification platform developed in this thesis is implemented using the Ubuntu desktop operating system (version 11.04). In order to compile the test programs for the OpenRISC OR1200 architecture it is necessary to set up the GNU toolchain⁴. The test programs are compiled to the ELF⁵ file format, which can be executed on both the ISS⁶ and the RTL simulator. It is also possible to use a VirtualBox Ubuntu image⁷, which makes it easy to get started with the OpenRISC processor system. The test programs can be executed on bare metal, i.e. without using any operating system support, or as a Linux application. The latest Linux kernels are ported to the OpenRISC architecture, which currently supports statically compiled binaries. This thesis work is carried out by executing the test programs on bare metal.

4. The toolchain contains the packages binutils, GCC and GDB. The instructions to set up the toolchain can be found at http://OpenCores.org.
5. ELF is a standard file format; it is an abbreviation for Executable and Linkable Format.
6. ISS is an acronym for Instruction Set Simulator.
7. The current VirtualBox image contains the Ubuntu 11.10 Linux distribution with the OpenRISC toolchain installed. It contains all the tools, packages and also the simulators, eliminating installation problems. It can be downloaded from http://OpenCores.org.

3.3 Or1ksim Instruction Set Simulator

Or1ksim is an Instruction Set Simulator (ISS) for the OpenRISC 1000 architecture and serves as the Golden Reference Model for the functional verification of the core. It is an open source simulator implemented in the C programming language, and it can be freely downloaded from the OpenCores website⁸. Jeremy Bennett from Embecosm⁹ has led the contributions to Or1ksim and is currently responsible for the maintenance of the tool. Or1ksim models all major OpenCores peripherals and system controller cores. It also enables easy addition of new peripheral models. The tool uses a configuration file to configure different environments, such as memory and peripheral device configurations and processor models. It provides remote debugging facilities via TCP/IP with the GNU Debugger (GDB), using either the GDB Remote Serial Protocol (RSP) or a modeled JTAG interface. These features make Or1ksim an effective OpenRISC emulator, allowing early code analysis and evaluation of system performance.

8. http://opencores.org/or1k/Or1ksim
9. http://www.embecosm.com/

Or1ksim can be used as a standalone simulator or as a library. The interactive command-line interface can be used to single step through the code, insert breakpoints and view register and memory contents at run time. The user manual (Bennett, 2008) describes the installation and configuration of the simulator. It also details the commands needed to work in interactive mode. In this thesis the Or1ksim source code is modified and the capabilities of the interactive command-line interface are extended to interface the ISS with the testbench controller. More implementation details are discussed in section 5.4.

3.4 ORPSoC and Icarus RTL Simulator

Icarus Verilog is a Verilog simulation and synthesis tool written by Stephen Williams¹⁰. It can compile source code written in Verilog (IEEE-1364). The compiler can generate an intermediate format called vvp assembly for batch simulation, which can be executed by the vvp command. For synthesis, the compiler generates netlists in the desired format.

10. The documentation and bug support for Icarus Verilog can be found at http://iverilog.icarus.com/.

The RTL-model of the OpenRISC OR1200 can be simulated using Icarus Verilog. OpenCores.org has created the ORPSoC project, which provides a development platform for the OpenRISC processor and OpenRISC based SOCs targeted at specific hardware. A complete documentation of the ORPSoC project can be found in (Baxter, 2010). The project organization separates the implementation of the reference design from the board specific designs. The implementation includes the RTL description of the processor core and related peripherals, and the software consisting of CPU drivers and related device drivers. The platform also provides custom tools, intended to be run on the host, for manipulation of binary software images. As the thesis work is limited to verifying the functionality of the OpenRISC core, the reference design is used for simulation. The section ReferenceDesign in (Baxter, 2010) provides a good overview of the project, illustrating the structure of the reference design, and details the steps needed to simulate it using the Icarus Verilog simulator.

3.5 Cycle Accurate C++ Model

The simulation rate of Icarus Verilog, the event-driven simulator discussed in section 3.4, is far from the simulation rate of the ISS. The simulation speed can be improved by using a C++ cycle accurate model generated with the Verilator tool¹¹. This model, generated from the OpenRISC RTL-model, is capable of running simulations in the hundreds of kilohertz range, as opposed to the sub-ten kilohertz range experienced with Icarus Verilog (Baxter, 2011). In spite of this performance advantage, the testbench controller is interfaced with the Icarus Verilog simulator to avoid unforeseen bugs introduced during code translation. Thus, assuming that the behavior of the Golden Reference Model (ISS) is always correct, one can be certain that the bugs discovered by the verification process exist in the RTL-model.

11. Verilator is an open source tool to translate Verilog code to an optimized C++/SystemC module. More information can be found at http://www.veripool.org/wiki/verilator/Intro.

3.6 GTKWave Waveform Viewer

It is necessary to log the state of the processor after execution of every single instruction. This can be accomplished by using a standard Verilog dump file format called VCD¹². The ORPSoC platform discussed in section 3.4 provides configurations to enable these logs. Since there is no built-in waveform viewer in Icarus Verilog, we use GTKWave¹³, an open source tool, to examine the waveforms. It makes use of the VCD files to generate the waveforms.

12. VCD is an acronym for Value Change Dump.
13. The tool can be downloaded from http://gtkwave.sourceforge.net/.


4 CO-SIMULATION TESTBENCH

4.1 Introduction

This chapter discusses the design of the Testbench Controller. A Co-Simulation testbench is implemented to interface two different existing models of the processor, i.e. the RTL-model and Or1ksim, the ISS. The ISS is used as the Golden Reference Model and is assumed to produce the expected results. A Random Program Generator is implemented to generate test programs. Each of the test programs is executed simultaneously by both processor models and their states are compared after execution of each instruction. A Checker module is also implemented as part of the Testbench Controller to log the differences in the states of the two models. The Testbench Controller implementation also includes a Coverage Monitor module to measure the verification coverage and generate coverage statistics. An overview of the components is presented in figure 4.1.

Figure 4.1: Main components of the Co-Simulation testbench

In the following sections the design issues involved with each of the components shown in figure 4.1 are discussed. The implementation specific details are covered in the next chapter.

4.2 Random Program Generator

The Random Program Generator is a module that generates test programs¹ by randomly choosing instructions from the OpenRISC ISA². Its design is based on constrained randomness, i.e. it applies certain programming rules to generate compilable code.

1. The test programs are generated in assembly language and are converted to binary using an assembler.
2. ISA is an abbreviation for Instruction Set Architecture.

Some of these rules are:


• The generator must produce existing instructions with valid (legal) arguments.

• The target address of a branch instruction must be a valid program location.

• A number of branches or subroutine calls must not create an infinite loop.

• After a call to a subroutine there must be a return from the subroutine.

• A software loop must have a branch back to the loop start.

• No attempts to pop elements from an empty stack or push elements onto a full stack.

The Random Program Generator is implemented as a standalone component and is not part of the testbench controller. This approach makes it possible to upgrade or replace the generator with better design implementations. The generator can also be configured to iteratively produce large volumes of test programs, which can be executed in the Co-Simulation testbench as a batch. The initial seed value used by the generator decides which instructions are chosen for the test program, so it is important to log the initial seed values in order to reproduce interesting test programs.

The instruction types, their argument values and their locations are largely dependent on the ISA. An instruction configuration file is created consisting of a table with an entry for each instruction, holding the name and format of the instruction. The Random Program Generator uses this configuration file to generate meaningful instructions. The occurrence of an instruction in the test program is controlled by instruction weighting. In this mechanism each instruction is associated with a weight that indicates the probability of being picked by the generator. A high weight means a high probability and therefore many instances of the instruction in a program. A low weight means a low probability and therefore results in few instances of the instruction. A weight of zero forces no instances of the instruction in the test program. Together all weights form a probability distribution function for the instruction set. If a test program containing only a few selected types of instructions is to be generated, the corresponding weights are set to a number larger than zero and the weights of all other instructions are set to zero. The instruction weights are supplied to the Random Program Generator using another configuration file.
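To make the weighting mechanism concrete, the sketch below derives selection probabilities from a small weight table. The weights mirror the example configuration shown later in Listing 5.3, while the data structure and the zero-weight l.div entry are illustrative assumptions, not the generator's actual code.

#include <stdio.h>
#include <stddef.h>

/* Hypothetical weight table: instruction mnemonics and their weights.
 * A weight of zero means the instruction is never generated. */
struct inst_weight { const char *name; unsigned weight; };

int main(void)
{
    struct inst_weight table[] = {
        { "l.addi", 8 }, { "l.andi", 5 }, { "l.ori", 2 },
        { "l.xori", 2 }, { "l.muli", 4 }, { "l.div", 0 },
    };
    size_t n = sizeof table / sizeof table[0];
    unsigned total = 0;

    for (size_t i = 0; i < n; i++)
        total += table[i].weight;

    /* Together the weights form a probability distribution over the
     * instruction set: p(i) = weight(i) / total. */
    for (size_t i = 0; i < n; i++)
        printf("%-8s p = %.3f\n", table[i].name,
               total ? (double)table[i].weight / total : 0.0);
    return 0;
}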


Literature review

(Wagner et al., 2007) describes a Markov-model-driven random test generator. It abstracts the individual instructions into instruction classes, which constitute the vertices of the Markov model. This method can be used to regulate the generation of certain classes of instructions³, which otherwise requires masking of a large number of individual instructions. A template for the design verification of microprocessors and system-on-chips (SOCs) is presented in (Uday Bhaskar et al., 2005). The platform designed in our thesis work closely resembles their proposed template except for the Debugger module, which is not included in our architecture. If inconsistencies are observed between the models, the test programs can be run individually on either model and the existing debugging facilities can be used.

3. Examples of instruction classes are arithmetic, logical, branch, etc.

4.3 Testbench Controller

The Testbench Controller is responsible for the overall simulation progress and control. A detailed design of the Testbench Controller is shown in figure 4.2. The Testbench Controller starts by loading the same test program into both simulators. A software reset causes the processor to load predetermined values into the registers and the data memory. The Testbench Controller controls the simulation progress of the Or1ksim ISS through step commands, while the Icarus RTL simulation advances using its internal clock generator. Sniffer modules are included in the controller to interface the Or1ksim ISS and the Icarus RTL simulator. After execution of each instruction the sniffer modules collect changes in the processor status and register values. The sniffer modules also keep track of the data memory location being addressed during a store operation. The information from the sniffers is passed to the Checker module, which compares and logs the differences in the processor states. If the number of differences exceeds a configured maximum limit, the Checker signals the Testbench Controller to end the simulation. In normal scenarios the simulation is completed after execution of the last instruction in the test program. If the test programs are run in a batch, the simulation process is repeated by loading the next program. The Testbench Controller terminates after executing the last test program. The design also includes a Coverage Monitor, which is run offline to build instruction coverage statistics.
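As a rough illustration of this control flow, the sketch below shows one way the main loop of such a controller could be organized in C. Every interface function here is a hypothetical placeholder standing in for the real interfaces to Or1ksim and the Icarus RTL simulation; this is not the actual tb_controller.c implementation.

#include <stdbool.h>

/* Placeholder interfaces towards the two simulators (assumed names). */
extern void load_program(const char *elf_file);   /* load ELF into both models      */
extern void reset_models(void);                   /* software reset                 */
extern bool last_instruction_reached(void);
extern void iss_step(void);                       /* step Or1ksim by one instruction */
extern void rtl_advance(void);                    /* let the RTL clock run           */
extern void sniffers_collect(void);               /* register/status/store changes   */
extern int  checker_compare_and_log(void);        /* returns newly found differences */

int run_test(const char *elf_file, int max_differences)
{
    int differences = 0;

    load_program(elf_file);
    reset_models();

    while (!last_instruction_reached()) {
        iss_step();
        rtl_advance();
        sniffers_collect();

        differences += checker_compare_and_log();
        if (differences > max_differences)
            break;                 /* checker asks the controller to stop */
    }
    return differences;            /* 0 means the two models agreed        */
}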


Figure 4.2: Design of the Testbench Controller illustrating the internal constituent modules.

4.3.1 Checker Module

As discussed in section 4.3, the Checker module is responsible for comparing the states of the ISS and the RTL-model in order to determine whether the two models of the microprocessor have the same behaviour. Since the two processor models run at different levels of abstraction they might generate different results due to pipeline effects. Typically only the RTL-model implements the processor pipeline, while the ISS does not. Also, as the two models run with a different perception of time⁴, they are not perfectly synchronized with each other. These differences cause two problems, namely skew and collision, which have to be handled by the Checker module.

4. The Or1ksim ISS is instruction accurate, while the Icarus RTL simulation is cycle accurate.


Skew Problem

Even though the two models run the same program in parallel, the result of the same instruction may be produced at different points in time. Also, during certain periods of the simulation, one model may be slightly ahead of the other model, and therefore produce a few more results than the other model. This typically happens after a branch instruction, because pipeline refill latencies occur in the RTL-model but not in the ISS. So, the results from the two models are skewed in simulation time. (John L. Hennessy, 2006) lists different reasons for pipeline hazards that can stall the processor.

Collisions in pipeline

Besides the skew problem, the RTL-model might generate different results due to collisions in the pipeline. If the RTL-model implements an exposed pipeline, then all instructions are executed exactly as they are injected into the pipeline. In the case of an interlocked pipeline, on the other hand, data dependencies and resource collisions are resolved at run time. The simple example program 4.1 illustrates a resource collision in a processor implementing an interlocked pipeline.

mac r0, r1, r2    /* Multiply-and-accumulate: r0 = r0 + r1*r2, 2 execute cycles */

add r0, r11, r12  /* Addition: r0 = r11 + r12, 1 execute cycle */

Source Code Listing 4.1: Example program illustrating resource collision

The mac-instruction (with 2 execute cycles) and the subsequent add instruction (with 1 execute cycle) both attempt to update register r0 in the same cycle. The ISS will typically store two different values for register r0, and therefore the ISS sniffer will receive two values for register r0. In the RTL-model the two results for register r0 collide. The pipeline may decide to keep the result from the add instruction (because add happens after mac) and throw away the result from the mac instruction. In this case the RTL sniffer will receive only one value for register r0 (instead of two). The sequences of instructions causing resource collisions are limited and should be filtered by the Checker module.

Using Register Queues

The skew problem discussed in section 4.3.1 can be solved by using register queues. Usage of register queues enables timing, data-flow and ordering control. The idea is to maintain two queues for each of the processor registers. The ISS sniffer pushes register updates obtained from the Or1ksim ISS into one queue (the ISS register queue), and the RTL sniffer uses the second queue (the RTL register queue) to push the register values obtained from the Icarus RTL simulator. In order to reduce the complexity and the number of register entries to be checked, only changes in the register values are pushed. The depth of the queue is determined by the length of the processor pipeline, but a typical value of 10 to 20 entries can be used. When the queues reach their capacity, the Checker module reads and compares the results of the simulation. After comparison, identical entries are popped from the front of the queues. At the end of the simulation the queue pointers are compared and the residual values in the queues are logged for analysis. A similar approach is used for the comparison of data memory locations. Figure 4.3 shows an example of a register queue.

Figure 4.3: A typical structure of Register queues
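A minimal sketch of such a queue pair for a single register is shown below, assuming a fixed depth of 16 entries. The structure and function names are illustrative only; the actual data structure used in the implementation is shown later in chapter 5.

#include <stdio.h>
#include <stdint.h>
#include <stdbool.h>

#define QUEUE_DEPTH 16

struct reg_queue {
    uint32_t value[QUEUE_DEPTH];
    int head, count;                       /* circular buffer bookkeeping */
};

static bool queue_push(struct reg_queue *q, uint32_t v)
{
    if (q->count == QUEUE_DEPTH)
        return false;                      /* full: compare before pushing more */
    q->value[(q->head + q->count) % QUEUE_DEPTH] = v;
    q->count++;
    return true;
}

/* Pop matching entries from the front of both queues and stop at the
 * first mismatch; residual entries stay queued for later analysis. */
static int queue_compare_and_pop(struct reg_queue *iss, struct reg_queue *rtl)
{
    int popped = 0;
    while (iss->count > 0 && rtl->count > 0 &&
           iss->value[iss->head] == rtl->value[rtl->head]) {
        iss->head = (iss->head + 1) % QUEUE_DEPTH;  iss->count--;
        rtl->head = (rtl->head + 1) % QUEUE_DEPTH;  rtl->count--;
        popped++;
    }
    return popped;
}

int main(void)
{
    struct reg_queue iss = { {0}, 0, 0 }, rtl = { {0}, 0, 0 };

    queue_push(&iss, 0x10); queue_push(&iss, 0x20);   /* the ISS is ahead (skew) */
    queue_push(&rtl, 0x10);

    printf("matched %d entries, ISS residue %d, RTL residue %d\n",
           queue_compare_and_pop(&iss, &rtl), iss.count, rtl.count);
    return 0;
}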

4.3.2 Coverage Monitor

The Coverage Monitor is responsible for measuring the verification coverage and building coverage statistics. The Coverage Monitor basically contains a number of counters that detect and count the occurrence of certain events. Some examples of verification coverage are:


• Instruction coverage, to measure the occurrences of each instruction.

• Sequence coverage, to measure the occurrence of certain sequences of instructions.

• Register coverage, to measure the number of times a register has been used with a particular instruction.

• Branch coverage, to measure which conditional branches are taken.

The number of times an instruction occurs in the program code can be different from the number of times the instruction is executed. The program flow is non-deterministic because of looping, conditional branching, interrupts and other data dependent factors. So the Coverage Monitor has to use the execution trace instead of the static code to generate coverage statistics.
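As a simple illustration of such counters, the sketch below counts instruction occurrences over a fake execution trace. The mnemonic table and the trace are made-up examples; this is not the thesis' coverage.c implementation.

#include <stdio.h>
#include <string.h>

#define NUM_MNEMONICS 5

/* Small illustrative subset of the OpenRISC instruction set. */
static const char *mnemonics[NUM_MNEMONICS] = {
    "l.add", "l.addi", "l.and", "l.bf", "l.sw"
};
static unsigned long counts[NUM_MNEMONICS];

/* Called once per executed instruction taken from the run-time trace. */
static void count_instruction(const char *mnemonic)
{
    for (int i = 0; i < NUM_MNEMONICS; i++) {
        if (strcmp(mnemonic, mnemonics[i]) == 0) {
            counts[i]++;
            return;
        }
    }
}

int main(void)
{
    /* A tiny fake trace standing in for the ISS run-time log. */
    const char *trace[] = { "l.addi", "l.addi", "l.sw", "l.bf", "l.add" };
    for (size_t i = 0; i < sizeof trace / sizeof trace[0]; i++)
        count_instruction(trace[i]);

    for (int i = 0; i < NUM_MNEMONICS; i++)
        printf("%-7s %lu\n", mnemonics[i], counts[i]);
    return 0;
}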


5 IMPLEMENTATION DETAILS

5.1 Introduction

This chapter details the implementation of the Random Program Generator and the Testbench Controller. Later sections illustrate the modifications made to the processor models to interface them with the Testbench Controller.

The Testbench Controller and the Random Program Generator are implemented in the C programming language. The source code can be checked out from the subversion (SVN) repository¹.

trunk
|- random-testing
|  |- tb_controller.c
|  |- registerfile.c
|  |- rtlsniffer.c
|  |- logger.c
|  |- coverage.c
|  |- global.c
|  |- randomgen
|  |  |- randomgen.c
|  |  |- generateInst.c
|  |  |- confutils.c
|  |  |- markov.c
|  |  |- .seed
|  |  |- configweight.cfg
|  |  |- configinst.cfg
|  |- output
|  |  |- test1.elf
|  |  |- test2.elf
|  |  |- test3.elf
|  |- log
|  |  |- instructions
|  |  |- log
|  |  |- test1
|  |  |  |- arith
|  |  |  |- logical
|  |  |  |- branch
|  |  |  |- call
|  |  |  |- load
|  |  |  |- store
|  |  |  |- move
|  |  |  |- other
|  |  |  |- instlog
|  |  |  |- simlog
|  |  |  |- log
|  |- docs
|  |- www
|- orpsocv2
|- or1ksim

Figure 5.1: Project software hierarchy

1. Use the command svn co http://svn.orsoc.se/svn/random-generator to check out the sources.

Figure 5.1 outlines the organization of the project. For the sake of clarity only the .c files under trunk\random-testing are listed.

The trunk\random-testing directory contains the implementation of the Random Program Generator and the Testbench Controller. As discussed in section 4.2, the implementation of the Random Program Generator is separated from the Testbench Controller. The source code for the Random Program Generator can be found under random-testing\randomgen, while the files directly under trunk\random-testing constitute the Testbench Controller. The RTL model of the OR1200 core is saved in the trunk\orpsocv2 directory, while the Or1ksim ISS model is saved under trunk\or1ksim. Information on setting up the simulators can be found in sections 3.3 and 3.4. The make all command, run from trunk\random-testing, creates test programs under the random-testing\randomgen directory. The number of test programs generated is defined by variables in the makefile. Make then compiles the generated test programs. The output of the assembler, i.e. the test programs in ELF binary format, is stored under the trunk\output directory. The Testbench Controller is run using the ./tbc command. Simulation logs are generated under the trunk\log directory. The logs contain the run-time code executed by the ISS and the differences in the states of the simulators. The simulation logs and the ELF output files are cleared when make clean or make all is executed.


The implementation work is documented using Doxygen². The HTML and PDF versions of the source code documentation can be found under the trunk\random-testing\docs directory. The Testbench Controller and the Random Program Generator support different user configurations. The following sections in this chapter discuss these configurations and detail the implementation aspects of each of the modules.

2. Doxygen is an open source documentation system supporting various programming languages. More information can be found at http://www.stack.nl/~dimitri/doxygen/.

5.2 Random Program Generator

The design of the Random Program Generator is discussed in section 4.2. The Random Program Generator can be run as a standalone tool to generate large volumes of test programs. Different user configurations are supported, which can be listed using the command

./rpg -h

Listing 5.1 illustrates these configurations.

-l <file>               : option to specify the instruction log file; this is an
                          optional argument with no default value
-c <file>               : option to specify the instruction configuration file,
                          default file = configinst.s
-w <file>               : option to specify the weight configuration file,
                          default file = configweight.s
-i <No_of_Instructions> : option to specify the number of instructions to be
                          generated in the output program, default value = 100
-o <output file>        : option to specify the name of the output file
                          to be generated, default name = test.s
-f <Max Procedures>     : option to specify the maximum number of
                          procedures, default value = 5
-d <No of Bytes>        : option to specify the maximum number of bytes in the
                          data segment, default value = 100
-r <RANDOM_SEED>        : option to specify the random seed for the generation
                          of the random markov state
-p <pattern>            : option to specify the pattern for initializing memory
                          and registers; possible options are RANDOM, SERIAL,
                          ZERO, default value = SERIAL
-h                      : lists the available help options

Source Code Listing 5.1: Configurations for the Random Program Generator

A test program named test.s, with 500 instructions and containing a maximum of 10 functions, can be generated using the command

./rpg -o test.s -i 500 -f 10


The test program can be compiled to the ELF file format using the assembler ported for the OpenRISC OR1200 architecture:

or32-elf-gcc -o test.o test.s

As discussed in section 4.2, the generator needs two separate configuration files to generate random but meaningful instructions. An example of a configuration file specifying the instruction format is shown in Listing 5.2. Listing 5.3 is an example configuration file to specify instruction weights.

instruction l.add
    type = IT_ARITH
    pattern = OPCODE_R_R_R
end

instruction l.addc
    type = IT_ARITH
    pattern = OPCODE_R_R_R
end

instruction l.sub
    type = IT_ARITH
    pattern = OPCODE_R_R_R
end

instruction l.mul
    type = IT_ARITH
    pattern = OPCODE_R_R_R
end

instruction l.div
    type = IT_ARITH
    pattern = OPCODE_R_R_R
end

Source Code Listing 5.2: Configuration file to specify instruction format
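For illustration, the sketch below shows one plausible way a configuration file in this format could be read in C. The inst_def structure, the field sizes and the parsing approach are assumptions; this is not the actual confutils.c parser.

#include <stdio.h>
#include <string.h>

/* Hypothetical record for one entry of the instruction-format file. */
struct inst_def {
    char name[32];
    char type[32];
    char pattern[32];
};

int read_config(const char *path, struct inst_def *defs, int max_defs)
{
    FILE *f = fopen(path, "r");
    char line[128];
    int n = 0;

    if (!f)
        return -1;
    while (n < max_defs && fgets(line, sizeof line, f)) {
        if (sscanf(line, " instruction %31s", defs[n].name) == 1)
            continue;                      /* start of a new entry */
        if (sscanf(line, " type = %31s", defs[n].type) == 1)
            continue;
        if (sscanf(line, " pattern = %31s", defs[n].pattern) == 1)
            continue;
        if (strncmp(line, "end", 3) == 0)
            n++;                           /* entry complete       */
    }
    fclose(f);
    return n;                              /* number of instructions read */
}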

instruction l.addi
    type = IT_ARITH
    weight = 8
end

instruction l.andi
    type = IT_LOGICAL
    weight = 5
end

instruction l.ori
    type = IT_LOGICAL
    weight = 2
end

instruction l.xori
    type = IT_LOGICAL
    weight = 2
end

instruction l.muli
    type = IT_ARITH
    weight = 4
end

Source Code Listing 5.3: Configuration file to specify instruction weights

The execution of the Random Program Generator starts by parsing the user supplied inputs and reading the configuration files. Figure 5.2 illustrates the sequence of tasks involved in generating a random test program. It begins by initializing the data memory and the registers to a predetermined state. This step is important because the simulators start with a software reset, initializing the register and memory values to zeros. Since most arithmetic and logic operations operating on zero values do not change the processor state, they would fail to produce any interesting results. The current implementation allows the registers and data memory to be initialized with a single constant value, with serial values, with zeros or with random values. An initial seed value is chosen in order to randomize the selection of instructions and register usage. The seed value is either supplied as a user input or selected as a random number. Its value is logged into a file, which enables one to regenerate interesting test programs.
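As a small illustration of the initialization patterns mentioned above (ZERO, SERIAL, RANDOM from the -p option), the hypothetical helper below fills an array of initial register or memory values. It is only a sketch under these assumptions, not the generator's actual code.

#include <stdint.h>
#include <stdlib.h>

enum init_pattern { PATTERN_ZERO, PATTERN_SERIAL, PATTERN_RANDOM };

void init_values(uint32_t *values, int count,
                 enum init_pattern pattern, unsigned seed)
{
    srand(seed);                       /* the logged seed makes runs repeatable */
    for (int i = 0; i < count; i++) {
        switch (pattern) {
        case PATTERN_ZERO:   values[i] = 0;            break;
        case PATTERN_SERIAL: values[i] = (uint32_t)i;  break;
        case PATTERN_RANDOM: values[i] = ((uint32_t)rand() << 16) ^
                                          (uint32_t)rand();
                             break;
        }
    }
}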

The test program is made up of a main function and other randomly generated functions. The functions are of arbitrary length and are placed at random locations to cover as much of the code memory as possible. A conceptual representation is illustrated in Figure 5.3. The implementation prevents recursion and avoids infinite loops caused by random function calls. Care is also taken that the target address of a branch instruction is always a valid program location.

The process of picking random instructions is designed as a twostep process. In the first step the Random Program Generator cat-egorizes the instructions into instruction classes and uses a markovmodel to pick an instruction class at random. Figure 5.4 shows anexample of a Markov model. The instruction classes can compriseof arithmetic, logical, branch, move instructions etc, which formthe vertices of the model. Once an instruction class is selected atrandom, the next step is to randomly pick an instruction under this

22

Page 35: Automatic Verification of Micro- processor designs using ...uu.diva-portal.org › smash › get › diva2:544017 › FULLTEXT01.pdf · Automatic Verification of Microprocessor designs

Figure 5.2: Flow chart illustrating the steps to generate a random test program

Figure 5.3: Conceptual representation of code memory


Figure 5.4: Example Markov model for instruction class selection

class based on the weights specified in the configuration file. After generating an instruction, the next instruction class is probabilistically selected using the Markov model, based on the weights defined over the state transitions. The registers and immediate values used as arguments to the instructions are also randomly selected. The Random Program Generator keeps track of the generated instructions and continues until the user-specified limit is reached.
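
A minimal sketch of how the next instruction class could be drawn from such a Markov model; the class names and transition weights below are illustrative assumptions, not the values used by the generator:

#include <stdlib.h>

enum inst_class { IC_ARITH, IC_LOGICAL, IC_BRANCH, IC_MOVE, IC_NUM };

/* Illustrative transition weights: row = current class, column = next class. */
static const int transition[IC_NUM][IC_NUM] = {
    /* ARITH LOGICAL BRANCH MOVE */
    {    4,     3,     1,    2 },   /* from ARITH   */
    {    3,     4,     1,    2 },   /* from LOGICAL */
    {    5,     2,     1,    2 },   /* from BRANCH  */
    {    3,     3,     2,    2 },   /* from MOVE    */
};

/* Pick the next class with probability proportional to the outgoing weights. */
static enum inst_class next_class(enum inst_class current)
{
    int total = 0;
    for (int i = 0; i < IC_NUM; i++)
        total += transition[current][i];

    int x = rand() % total;                 /* uniform in [0, total) */
    for (int i = 0; i < IC_NUM; i++) {
        if (x < transition[current][i])
            return (enum inst_class) i;
        x -= transition[current][i];
    }
    return current;                         /* not reached */
}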

objects[5] = {A, B, C, D, E};
weights[5] = {1, 2, 3, 4, 5};
Reorder_Weights(weights);            /* sort the weights in decreasing order    */
while (no_of_iterations--) {
    x = RandomGenerate();            /* uniform random value in 1 .. totweight  */
    i = 0;
    while (x > weights[i]) {         /* walk through the cumulative weights     */
        x = x - weights[i];
        i = i + 1;
    }
    pick objects[i];                 /* object i is selected for this iteration */
}

Source Code Listing 5.4: Algorithm to implement weighted random distribution

Listing 5.4 illustrates the algorithm used to randomly pick the instructions based on preconfigured weights. The algorithm randomly picks the objects defined in the objects array based on the weights defined in the weights array. The algorithm runs for no_of_iterations times, which is supplied as an input. The function RandomGenerate uses a discrete uniform distribution to generate random numbers within the range 1 to totweight, where totweight is the sum of the weights defined in the weights array. Before


Figure 5.5: Plot showing random distribution for unequal weights

Figure 5.6: Plot showing random distribution containing equal weights

initiating the algorithm, the weights are sorted in decreasing order by calling the Reorder_Weights function. The algorithm, when run for sufficiently many iterations, implements a weighted random distribution. Figure 5.5 shows the result of the algorithm when run for 20000 iterations and figure 5.6 shows the distribution when some objects are assigned equal weights.
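
As a rough check of the weighting, with the weights {1, 2, 3, 4, 5} from listing 5.4 the total weight is 15, so over 20000 iterations object E (weight 5) should be picked about 20000 · 5/15 ≈ 6667 times and object A (weight 1) about 1333 times; the bars in figure 5.5 are expected to scatter around these values.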

Although the current implementation of the program generator randomizes many aspects involved in creating test programs, it is bound by limitations. To include function calls in the test program the stack has to be set up, which uses architecture-specific registers3.

3. Register r0 is fixed to zero; register r1 is used as stack pointer; register r2 is used as frame pointer; register r9 is used as Link address register.


Figure 5.7: Figure illustrates the constraints implemented for branching and function calls. The red lines are prohibited and are not generated by the random program generator.


Register r13 is reserved to simplify accesses to data memory locations. The algorithm to select instruction arguments does not include these special registers. Some limitations are also introduced to avoid infinite loops during branches and subroutine calls. Figure 5.7 illustrates these constraints. The current implementation avoids recursion. Nested function calls are allowed but a call from the callee back to the caller is prohibited. To simplify the implementation of branch instructions, only forward branches are used. Generation of backward branches is restrained to avoid possible infinite loops or segmentation faults. Appendix B lists some examples of the generated test programs.
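
A minimal sketch of how a forward-only branch target could be chosen, assuming the generator knows the position of the instruction being emitted and the total program length; the helper name and bounds are illustrative and not taken from the implementation:

#include <stdlib.h>

/* Return a forward branch offset (in instructions) for the instruction at
 * position 'current' in a program of 'program_len' instructions.  Only
 * targets after the current position are produced, so no backward branch,
 * and hence no branch-induced infinite loop, can be generated. */
static int forward_branch_offset(int current, int program_len)
{
    int remaining = program_len - current - 1;   /* instructions ahead of us */
    if (remaining <= 0)
        return 1;                                /* fall through to the next slot */
    return 1 + rand() % remaining;               /* somewhere strictly ahead */
}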

5.3 testbench controller

The design of the Testbench Controller is discussed in section 4.3. The Testbench Controller handles many user configurations, which can be listed using the command,

./tbc -h

Listing 5.5 illustrates these configurations.

-o <output Dir>       : option to specify the output directory
-n <No of programs>   : option to specify the number of programs to be generated
-p <rtl sim path>     : option to specify path of icarus rtl sim
-e <exename>          : option to specify name of exe to execute
-l <error log>        : option to specify name of file to log errors
-i <instruction log>  : option to specify name of file to log instructions
-h <help>             : lists the available help options, example:

./tbc -n 1 -o ./output

Source Code Listing 5.5: Configurations for the Testbench Controller

An individual test case can be run by the Testbench Controller by using the command,

./tbc -e test

while a batch of test programs (say 100 programs) can be executed using the command,

./tbc -n 100

The Testbench Controller runs as the main thread of the application. It starts the two processor simulators and creates four threads, each serving a particular functionality. The threads use POSIX message queues4

4. POSIX is an acronym for Portable Operating System Interface. It is IEEE Standard 1003.1-1988, which defines APIs compatible with variants of Unix and other operating systems.

for internal communication; these queues are also created by the main thread. The or1ksim_thread receives messages


Figure 5.8: Sequence diagram illustrating the start-up sequence of the Testbench Controller

from the Or1ksim ISS simulator and stores the register and memory information into the register queues. The rtlsim_thread receives messages from the Icarus RTL simulator and stores them into the register queues. When the message queues are full, the checker_thread accesses the register queues and compares the results of the simulation. To handle unforeseen scenarios like deadlocks in the Testbench Controller or any existing bugs in the processor model, a watchdog_timer thread is used. The sequence diagram 5.8 illustrates the start-up sequence of the Testbench Controller.
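
A minimal sketch of this start-up sequence, assuming a single illustrative POSIX message queue and placeholder thread bodies; the queue name, attributes and function names are not the identifiers used in the actual Testbench Controller:

#include <fcntl.h>
#include <mqueue.h>
#include <pthread.h>
#include <stdio.h>

/* Placeholder thread bodies standing in for the real sniffer, checker and
 * watchdog implementations. */
static void *or1ksim_thread_fn(void *arg) { (void)arg; return NULL; }
static void *rtlsim_thread_fn(void *arg)  { (void)arg; return NULL; }
static void *checker_thread_fn(void *arg) { (void)arg; return NULL; }
static void *watchdog_timer_fn(void *arg) { (void)arg; return NULL; }

int main(void)
{
    /* The queue lives in a virtual file system and must be unlinked afterwards. */
    struct mq_attr attr = { .mq_maxmsg = 16, .mq_msgsize = 256 };
    mqd_t mq = mq_open("/tbc_example", O_CREAT | O_RDWR, 0600, &attr);
    if (mq == (mqd_t)-1) { perror("mq_open"); return 1; }

    pthread_t tid[4];
    pthread_create(&tid[0], NULL, or1ksim_thread_fn, &mq);
    pthread_create(&tid[1], NULL, rtlsim_thread_fn,  &mq);
    pthread_create(&tid[2], NULL, checker_thread_fn, &mq);
    pthread_create(&tid[3], NULL, watchdog_timer_fn, &mq);

    for (int i = 0; i < 4; i++)
        pthread_join(tid[i], NULL);

    mq_close(mq);
    mq_unlink("/tbc_example");
    return 0;
}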

The watchdog_timer thread expects timely updates from the sniffer threads, i.e. or1ksim_thread and rtlsim_thread, and otherwise halts the simulation. Such scenarios are interesting to analyze as they can reveal serious bugs in the processor models or in the co-simulation testbench. For example, the random program can enter an infinite loop without causing any change in the status register. In such scenarios, the sniffer modules will not receive any updates from the simulators. The watchdog_timer thread would then log the simulation results and terminate the co-simulation testbench. The watchdog_timer thread executes the following steps for systematic termination,


Figure 5.9: Sequence diagram illustrating termination due to watchdog timer

· closes running instances of the simulators

· logs differences in the register queues needed to analyze the unexpected program termination

· frees dynamically allocated data, closes the message queues and file handlers

· terminates the threads in the order of creation and exits the application

Since the POSIX message queues are created in a virtual file system, it is important to close and delete them after each simulation run. Also, the existing implementation forbids multiple instances of the testbench, to prevent the message queues from being overwritten.


Figure 5.9 illustrates the termination scenario caused by a watchdog error.

Abnormal terminations can also be caused by system resource constraints or simulation errors. Some examples of this scenario are lack of heap memory, errors in creating message queues, or usage of null pointers. These scenarios cannot be detected by the watchdog timer but have to be handled in a similar way. Figure 5.10 illustrates the termination procedure, which ensures logging of the necessary information for post-simulation analysis.

Figure 5.10: Sequence diagram illustrating termination due to resource constraints or due to simulation errors.

Under normal circumstances the main thread of the Testbench Controller is responsible for logging the register information and exiting the application. Figure 5.11 illustrates the sequence diagram for normal termination. Even in normal scenarios it is important to analyse the generated logs, as the residual values in the register


queues can reveal serious bugs in the processor models.

Figure 5.11: Sequence diagram illustrating normal termination procedure.

5.4 sniffer modules

Sniffer modules, indicated in figure 4.2, act as an interface to the simulators. The Testbench Controller creates POSIX message queues to interact with the simulators. The Or1ksim simulator is modified to send messages to the or1ksim_thread created by the main thread of the Testbench Controller (section 5.3). The values of the Program Counter PC, the Processor Status Register, and the General Purpose Registers (GPRs) are sent after the execution of each instruction. The values of the addressed data memory location are also sent after a store operation. In case of exceptions the Or1ksim ISS simulator also sends the exception vector address and the value of the Exception Effective Address Register, EEAR. The EEAR register


is set to the address of the instruction causing the exception (Lampret, 2000).

The initial implementation of the RTL sniffer used the GDB Remote Serial Protocol, RSP, to communicate with the GDB stub implemented in ORPSoC (Baxter, 2010). This method did not require any modifications in the existing RTL model. The existing implementation of the GDB stub interfaces the Verilog simulator using VPI5

5. VPI, Verilog Procedural Interface, is a part of the IEEE 1364 Programming Language Interface standard. It allows behavioral Verilog code to invoke C functions, and C functions to invoke standard Verilog system tasks.

which in turn accesses the JTAG debug module external to the processor model. To retrieve the processor state information, the debug module stalls the processor after every instruction, flushing the processor pipeline. Since this method is slow6

6. The co-simulation run for a test program consisting of 50 instructions took approximately 8 minutes.

and does not simulate a real working scenario of a processor, an alternative approach is used. The RTL model is modified to write state information into a POSIX pipe whenever the enable and write control signals for the registers/data memory are asserted. The information is read and manipulated by a sniffer module implemented in rtlsniffer.c, shown in figure 5.1. The sniffer module passes the state information using message queues to the rtlsim_thread created by the main thread of the Testbench Controller (section 5.3). Although this approach requires some modifications in the RTL model, it does not alter the functional or timing specification of the RTL simulation. Moreover, since it bypasses the RSP and JTAG communication ports, it significantly improves the simulation time7.

7. The co-simulation run for a test program consisting of 50 instructions took less than a minute.


The source code modifications for the simulators to interface with the Testbench Controller are discussed in appendix A.
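
A minimal sketch of the kind of loop such a sniffer could run, assuming a named pipe fed by the RTL model, a fixed-size record format and a message queue name; all three are illustrative assumptions rather than the actual interface of rtlsniffer.c:

#include <fcntl.h>
#include <mqueue.h>
#include <stdio.h>
#include <unistd.h>

/* Illustrative record written by the RTL model for each register or
 * data memory update. */
struct rtl_record { unsigned int addr; unsigned int value; };

int main(void)
{
    int   pipe_fd = open("/tmp/rtl_state", O_RDONLY);   /* fed by the RTL model  */
    mqd_t mq      = mq_open("/tbc_rtlsim", O_WRONLY);   /* read by rtlsim_thread */
    if (pipe_fd < 0 || mq == (mqd_t)-1) { perror("sniffer setup"); return 1; }

    struct rtl_record rec;
    while (read(pipe_fd, &rec, sizeof rec) == (ssize_t) sizeof rec) {
        /* forward each update to the Testbench Controller */
        if (mq_send(mq, (const char *) &rec, sizeof rec, 0) < 0) {
            perror("mq_send");
            break;
        }
    }
    mq_close(mq);
    close(pipe_fd);
    return 0;
}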

5.5 checker module

The Checker module is responsible for comparing the states of the two processor models after the execution of each instruction and is implemented as a thread, checker_thread. The sequence diagram 5.8 illustrates the creation of the checker_thread. As discussed in section 4.3.1, Register queues are used to handle the skew problem. The sniffer modules update the Register queues and send a message to the Checker module when they reach their capacity. The depth of the Register queue can be determined by the length of the processor pipeline, but a typical value of 10 to 20 entries is sufficient. The current implementation uses a queue length of 16. Listing 5.6 shows the data structure used to implement the Register queue. The Register queue, regQueue, is implemented as a singly linked list of regElem. Each queue is associated with a counting semaphore and a mutex used to block the sniffer threads, i.e. or1ksim_thread and rtlsim_thread, until the Checker module completes the consistency checks.


///This data type is used to implement register file elements
typedef struct reg_elem {
    int value;
    struct reg_elem* nextElem;
} regElem;

/// This data type represents the register file
typedef struct {
    int address;
    int size;
    pthread_mutex_t mutex;
    sem_t semaphore;
    regElem* data;
} regQueue;

Source Code Listing 5.6: Data structure implementing the Register queue

The sniffer threads, when blocked, cannot receive messages from the simulators and are thereby prevented from updating the Register queues. The simulators, though, can update the processor states until the POSIX message queues reach their maximum capacity.8

8. The current implementation limits the message queue size to 16 entries.

There can be scenarios when the two processor models produce many inconsistencies, causing the Register queue to be filled with unequal values. In such cases the Register queues remain filled, blocking the sniffer threads indefinitely. Such scenarios are detected by the Checker module, which then logs the Register queue entries and terminates the simulation. The sequence diagram 5.10 illustrates this use case.
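
A minimal sketch of such a consistency check over the regQueue structure from listing 5.6; the helpers dequeue() and log_mismatch() are hypothetical stand-ins for the queue and logging code of the actual Checker module:

#include <pthread.h>
#include <semaphore.h>

/* Hypothetical helpers: pop the oldest value from a queue and record a
 * difference between the two models. */
int  dequeue(regQueue *q);
void log_mismatch(int address, int iss_val, int rtl_val);

/* Compare the oldest entries of the ISS and RTL Register queues and
 * release the sniffer threads afterwards. */
static int check_queues(regQueue *issQ, regQueue *rtlQ)
{
    int mismatches = 0;

    pthread_mutex_lock(&issQ->mutex);
    pthread_mutex_lock(&rtlQ->mutex);

    while (issQ->data != NULL && rtlQ->data != NULL) {
        int iss_val = dequeue(issQ);        /* value from the ISS model */
        int rtl_val = dequeue(rtlQ);        /* value from the RTL model */
        if (iss_val != rtl_val) {
            log_mismatch(issQ->address, iss_val, rtl_val);
            mismatches++;
        }
    }

    pthread_mutex_unlock(&rtlQ->mutex);
    pthread_mutex_unlock(&issQ->mutex);

    /* wake the sniffer threads so they can refill the queues */
    sem_post(&issQ->semaphore);
    sem_post(&rtlQ->semaphore);
    return mismatches;
}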

5.6 coverage monitor

|-log
|-|-test1
|-|-|-arith
|-|-|-logical
|-|-|-branch
|-|-|-call
|-|-|-load
|-|-|-store
|-|-|-move
|-|-|-other
|-|-|-log
|-|-|-simlog
|-|-|-instlog
|-|-test2

Figure 5.12: Log file hierarchy

As discussed in section 4.3.2, the Coverage Monitor module is required to measure verification coverage, which would consist of building different coverage statistics. The current implementation is limited to covering the number of instructions tested in each of the test programs or in a test suite (consisting of a number of test programs). It is also possible to measure register coverage with respect to a particular instruction used in a test program or in a test suite.

The C file coverage.c, shown in figure 5.1, implements the coverage module. It generates instruction statistics from the run-time code generated by the ISS simulator. These statistics are stored in the log directory, which has a directory structure as shown in figure 5.12. Coverage results for each of the test programs are separated into different sub-folders. Further, the results for each instruction class are saved into different files, as shown in figure 5.12. PHP scripts


are used to parse the coverage files and an open source JavaScript charting library, Highcharts9

9. Highcharts currently supports line, spline, area, areaspline, column, bar, pie and scatter chart types. More information can be found at http://www.highcharts.com/.

, is used to draw interactive instruction histograms. The results generated from the coverage module are detailed in the next chapter under section 6.3.
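
A minimal sketch of the kind of tallying the coverage module performs on the run-time code, assuming an instruction log with one executed instruction per line; the file name "instlog" is taken from figure 5.12, while the line handling and the example mnemonics are illustrative assumptions:

#include <stdio.h>
#include <string.h>

/* Count how often a mnemonic occurs in an instruction log with one
 * executed instruction per line. */
static int count_mnemonic(const char *path, const char *mnemonic)
{
    FILE *f = fopen(path, "r");
    if (f == NULL)
        return 0;

    char line[256];
    int hits = 0;
    while (fgets(line, sizeof line, f) != NULL) {
        /* compare whole tokens so that e.g. "l.add" does not match "l.addi" */
        for (char *tok = strtok(line, " \t\n"); tok != NULL;
             tok = strtok(NULL, " \t\n")) {
            if (strcmp(tok, mnemonic) == 0) {
                hits++;
                break;
            }
        }
    }
    fclose(f);
    return hits;
}

int main(void)
{
    static const char *arith[] = { "l.add", "l.addi", "l.sub", "l.mul", NULL };
    for (int i = 0; arith[i] != NULL; i++)
        printf("arith %-8s %d\n", arith[i], count_mnemonic("instlog", arith[i]));
    return 0;
}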


6

RESULTS AND DISCUSSION

6.1 introduction

This chapter summarizes the functional verification results obtained from the Co-Simulation platform for the OR1200 core developed in this thesis work. It first presents the discrepancies found between the OR1200 and the Or1ksim ISS simulator. The next section details the verification coverage results.

6.2 discrepancies between or1200 and or1ksim

6.2.1 l.mtspr implemented as l.nop in RTL model

The instruction l.mtspr belongs to the ORBIS32-II instruction class of the OpenRISC1000 architecture (Lampret, 2000). The instruction is used to move the contents from a general purpose register to a special purpose register. The instruction requires the processor to be in supervisor mode. An instruction sequence to change the content of the Present Program Counter, PPC, can be written as,

l.mfspr r3,r0,16
l.addi r3,r0,460
l.mtspr r0,r3,16

Source Code Listing 6.1: Code snippet illustrating use of l.mfspr instruction

The simulation results logged by the checker are illustrated in listing 6.2,

The simulation results show the differences in the value of the Program Counter. The l.mtspr instruction causes the Or1ksim ISS simulator to jump to the instruction pointed to by the new value of PC, while the RTL simulator implements the instruction as an l.nop and continues with the program execution. In some scenarios, depending on the new value stored in PC after the l.mtspr instruction, the execution in the ISS simulator could end up in an infinite loop or could cause a segmentation fault. Such scenarios can be easily reproduced by loading random values into the PPC register.


Register r3
or1k sim   rtl sim
901d

PC
or1k sim   rtl sim
1cc 238c 1d0 2390 1d4 1d0 1d4 1d0 1d4

SR
or1k sim   rtl sim
9005

Source Code Listing 6.2: simulation results for l.mtspr instruction

6.2.2 logical error in l.rori

The instruction l.rori belongs to the ORBIS32-II instruction class of the OpenRISC1000 architecture (Lampret, 2000). The instruction is used to rotate the contents of a general purpose register by a number of bit positions specified by a 6-bit immediate value. The instruction sequence illustrated in listing 6.3 is generated in a test program. Listing 6.4 shows the results logged by the checker.

l.addi r3,r0,7
l.rori r4,r3,1

Source Code Listing 6.3: Code snippet for l.rori instruction

Register r4
or1k sim   rtl sim
80000003   3

Source Code Listing 6.4: Simulation results for l.rori instruction

The simulation result logged by the checker module shows residual values generated for register r4. From the logs it is clear that the RTL model implements the l.rori instruction as a shift-right instruction instead of a rotate instruction. After the rotate instruction, register r4 is expected to contain the value 80000003.


The implementation in the ISS is found to be correct as per the specification, while the RTL model contains the value 3 as the result of a shift operation.
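
A small C sketch of the expected rotate-right result versus the observed shift-right result for the operands in listing 6.3; the helper ror32 is illustrative and not part of the testbench:

#include <stdint.h>
#include <stdio.h>

/* Rotate a 32-bit value right by n bit positions. */
static uint32_t ror32(uint32_t x, unsigned n)
{
    n &= 31;
    return (x >> n) | (x << ((32 - n) & 31));
}

int main(void)
{
    uint32_t r3 = 7;
    printf("rotate right by 1: %08x\n", ror32(r3, 1)); /* 80000003, as in the ISS */
    printf("shift  right by 1: %08x\n", r3 >> 1);      /* 00000003, as in the RTL */
    return 0;
}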

Figure 6.1 shows the testbench waveforms1

1. Waveforms are viewed using Gtkwave.

showing the value of register r4 after the shift operation.

Figure 6.1: Testbench waveform showing the value of register r3 being shifted instead of a rotate operation. The result of the operation is stored in register r4

6.2.3 Illegal-instruction exception raised for l.ror instruction

The instruction l.ror belongs to the ORBIS32-II instruction class of the OpenRISC1000 architecture (Lampret, 2000). The instruction is used to rotate the contents of a general purpose register by a number of bits specified by another general purpose register. The RTL model generates an illegal instruction exception when l.ror is generated in the test programs. Listing 6.5 illustrates the result of the simulation.

SR
or1k sim   rtl sim
9005       18001

EEAR
or1k sim   rtl sim
0          2384

E Vector
or1k sim   rtl sim
           700

Source Code Listing 6.5: Simulation results for l.ror instruction

The program counter is set to 0x700 when an illegal instruction is found in the instruction stream (Lampret, 2000). The simulation results show that the exception vector 0x700 is set only for the RTL


model. It can also be verified that the value of EEAR, the Exception Effective Address Register, points to the address of the exception-causing instruction.

6.2.4 Illegal-instruction exception raised for load/store instructions

The instructions l.ld and l.sd belong to the ORBIS32-II instruction class of the OpenRISC1000 architecture (Lampret, 2000) and are used to perform double-word load and store memory operations. The RTL model generates an illegal instruction exception when they are generated in the test programs. The usage of the instruction l.lws produces the simulation results shown in listing 6.6. Figure 6.2 is the testbench waveform showing the contents of the current program counter, the next program counter and the EEAR register after the faulting load/store instructions. These values match the co-simulation results.

SR
or1k sim   rtl sim
9005       18001

EEAR
or1k sim   rtl sim
0          2384

E Vector
or1k sim   rtl sim
           700

Source Code Listing 6.6: Simulation results for l.ld or l.sd instruction

Figure 6.2: Testbench waveform showing the value of the program counter and EEAR register after executing an illegal load/store instruction.


6.2.5 Multiple updates for l.sub instruction

The instruction l.sub belongs to the ORBIS32-II instruction class of the OpenRISC1000 architecture (Lampret, 2000) and is used to perform a subtraction operation. Listing 6.7 shows an example of

l.addi r3, r0, 1
l.addi r4, r0, 2
l.sub r5, r3, r4
l.addic r6, r3, 3

SR
or1k sim   rtl sim
8805       8c05

Source Code Listing 6.7: The left half of the listing shows the code snippet for the l.sub instruction, while the right half illustrates the simulation results

a subtraction operation followed by an add-with-carry operation (l.addic). The simulation results are shown in the second half of the listing. It is also interesting to discuss the run-time traces generated by the testbench controller, which are shown in listing 6.8. According to the architecture specification (Lampret, 2000), the l.sub instruction does not alter the Carry Flag. But the simulation results in listing 6.8 contradict the specification. Moreover, the results show that the RTL model causes multiple updates to the status register (the overflow flag is initially set and then reset), resulting in residual values in the register queue.

6.2.6 Error in cache configuration for ORPSoC

The RTL model in ORPSoC can be configured using the refdesign-or1ksim.cfg configuration file. The current implementation does not configure the caches (data and instruction cache) from the input given in the configuration files. This causes differences in the simulation results for the processor status register. This configuration error can be avoided by programmatically disabling the caches as illustrated in listing 6.9.

6.3 the openrisc 1200 verification coverage results

The coverage monitor is used to generate graphical results, as explained in section 5.6. Coverage results generated for a test suite consisting of 50 test programs can be accessed at http://home.student.uu.se/anka2609/index.php.

Figure 6.3 shows the coverage statistics of a test suite consisting of 200 test programs, with each program consisting of 100 random instructions. The graphs are drawn by grouping the instructions


or1ksim: sprs 0 : 2314 and size : 0
or1ksim: sprs 0 : 2318 and size : 1
or1ksim: reg 3 : 1 and size : 0
or1ksim: sprs 0 : 231c and size : 2
or1ksim: reg 4 : 2 and size : 0
or1ksim: sprs 0 : 2320 and size : 3
or1ksim: sprs 1 : 8405 and size : 0
or1ksim: reg 5 : ffffffff and size : 0
or1ksim: sprs 0 : 2324 and size : 4
or1ksim: sprs 1 : 8005 and size : 1
or1ksim: reg 6 : 5 and size : 0
or1ksim: sprs 0 : 2328 and size : 5

or1ksim_thread: or1ksim_thread exiting
watchdog_timer: received LAST_INST_SIM
checker_thread: received LAST_INST_SIM

rtlsim_thread: sprs 0 : 2314 and size : 0
rtlsim_thread: reg 3 : 1 and size : 0
rtlsim_thread: sprs 0 : 2318 and size : 1
rtlsim_thread: reg 4 : 2 and size : 0
rtlsim_thread: sprs 0 : 231c and size : 2
rtlsim_thread: sprs 1 : 8805 and size : 0
rtlsim_thread: reg 5 : ffffffff and size : 0
rtlsim_thread: sprs 0 : 2320 and size : 3
rtlsim_thread: sprs 1 : 8c05 and size : 1
rtlsim_thread: sprs 1 : 8405 and size : 2
rtlsim_thread: reg 6 : 5 and size : 0
rtlsim_thread: sprs 0 : 2324 and size : 4
rtlsim_thread: sprs 1 : 8005 and size : 3
rtlsim_thread: sprs 0 : 2328 and size : 5

rtlsim_thread: rtlsim_thread exiting
checker_thread: received LAST_INST_RTL
checker_thread: checker thread exiting
watchdog_timer: received LAST_INST_RTL
watchdog_thread: watchdog thread exiting
tbc_main: tbc exiting

Source Code Listing 6.8: run time instruction traces for test program containing l.sub

#disable data and instruction cache
l.mfspr r3,r0,17
l.addi r4,r0,0xffe7
l.and r3,r3,r4
l.mtspr r0,r3,17

Source Code Listing 6.9: Code snippet for disabling caches

into instruction classes. It is possible to generate similar charts for any particular test program. For example, the instruction coverage for test1.s is shown in figures 6.4 and 6.5. The figure on


Figure 6.3: The chart illustrates instruction coverage for a test suite consisting of 200 test programs

Figure 6.4: The chart illustrates instruction coverage grouped in classes for an example test program, test1.s

Figure 6.5: The chart illustrates arithmetic instruction coverage for an example test program, test1.s

the left provides coverage in terms of instruction classes. A more detailed instruction coverage, as shown in the right figure for the logical instruction class, can be viewed by clicking the bar of a particular instruction class.

Figure 6.6 illustrates the register coverage statistics generated for three test programs, test1.s, test2.s and test3.s. An arbitrary number of test programs and instructions can be selected from a drop-down list provided on the webpage. The graphs provide an interactive interface that allows further filtering of the results for a particular program or programs. It is interesting to see and verify register coverage for certain instruction classes. For example, the register coverage obtained for all the store instructions is shown


Figure 6.6: The chart illustrates interactive register coverage for three test programs

Figure 6.7: The chart illustrates coverage statistics for the store instruction class


in figure 6.7. The figure illustrates that r13 and r1 were the only registers used to generate the store instructions. Section 5.2 lists the architecture-specific registers reserved for specific operations. Registers r13 and r1 are dedicated registers, r13 being a pointer to the data section and r1 being reserved as the stack pointer. Thus the store operations, which comprise data and stack memory accesses, are limited to these dedicated registers. Another interesting

Figure 6.8: The chart illustrates coverage statistics for branch and call instructions.

coverage is for branch and call instructions. Figure 6.8 illustrates the coverage obtained for these instruction classes. The high coverage obtained for register r9 is justified as it is dedicated as the Link Address Register. From figures 6.6, 6.7 and 6.8 it can also be inferred that the coverage obtained for register r0 is always zero. This is because the architecture specifies r0 as a zero register2.

2. The value of the register is forced to 0 (Lampret, 2000).

The coverage obtained for registers r1 and r2 in figure 6.6 can likewise be justified by their use in allocating and deallocating the stack.

The current implementation can only generate coverage statistics for the basic instruction set and floating point extensions of the OpenRISC core. OpenRISC DSP/Vector instruction sets are not generated and hence do not produce any coverage statistics.


7

CONCLUSION AND FUTURE WORK

7.1 conclusion

This thesis work was carried out to provide a co-simulation framework for dynamic functional verification of an OpenRISC processor core, the OR1200. To achieve this goal, a Random Assembly Program Generator and a Co-simulation testbench platform were implemented. The verification results discussed in section 6.2 substantiate the correctness of the Testbench Controller. Furthermore, the discrepancies found in the RTL model are verified against the test bench waveforms. The inconsistencies are brought to the notice of ORSoC AB1

1. ORSoC AB is the owner and maintainer of Opencores, http://orsoc.se/

and are reported in bugzilla2

2. Bugzilla is an open source bug tracking tool. The bugs for the OpenRISC OR1200 core can be found at http://bugzilla.opencores.org/, under the ORPSoC category (Bug Id: 95, 96, 97 and 98).

. We expect that the inconsistencies found in this thesis work will be updated in the specification documents and that the design of the OR1200 core will be checked for correctness.

The coverage results discussed in section 6.3 show that large volumes of random test programs can be generated based on the configurable instruction weights. The histograms can easily be used to generate instruction, register and branch coverage statistics, either for individual test programs or for large test volumes.

As discussed in section 4.2, the Random Program Generator runs independently of the Testbench Controller. The provision to control the generation of a particular class of instructions offers more flexibility to the generator. Also, the possibility to regenerate interesting test programs based on the input seed values avoids the need to save large test volumes.

7.2 future work

The Co-Simulation test bench implemented as a part of this thesis work provides basic instruction and register coverage statistics. Extending the coverage monitor module to include sequence coverage will provide interesting statistics on the occurrence of certain sequences of instructions. The configuration files can be supplemented with


register weights to force usage of certain registers that have poor coverage statistics.

The existing implementation of the coverage monitor does not filter the discrepancies resulting from resource collisions in the pipeline. So it is required to manually analyze the simulation logs to exclude these discrepancies. It is also difficult to analyze the simulation logs to identify the fault-causing instruction or instruction sequence. This method is laborious and needs careful examination of the logs. Implementing an automatic fault detection module to locate the faulty instruction, or to minimize the search space, could subsequently reduce the manual effort required to track the bug.

The Random Program Generator can be extended to include a feedback loop from the coverage monitor to increase the coverage statistics after each simulation run. This method would decrease the throughput of the generator, but would produce better programs.

In the current implementation the memory values are compared against their virtual addresses. Though this method sufficiently tests the functional behavior of the processor, the implementation can be extended to include the memory management units and possibly even the processor cache designs. The existing implementation of Or1ksim, the ISS simulator, does not support hardware architecture3

3. Bug id 80 - support for hardware architecture reported in http://bugzilla.opencores.org/

and also, since the cache model in OR1200 implements a physically indexed, physically tagged design (Lampret, 2000), it is difficult to include these modules.

The Test Bench Controller can be redesigned to run efficiently on multiprocessor or multi-core systems. This can help in improving the simulation speed.


BIBLIOGRAPHY

Aharon, A.; Goodman, Dave; Levinger, Moshe; Lichtenstein, Yossi; Malka, Yossi; Metzger, Charlotte; Molcho, Moshe; Shurek, Gil; and Metzger, Yossi Malka Charlotte. 1995. Test Program Generation for Functional Verification of PowerPC Processors in IBM. Cited on pp. 1 and 4.

Ahmed, Waqas. 2010. Implementation and Verification of a CPU Subsystem for Multimode RF Tranceivers. Master’s thesis, Royal Institute of Technology (KTH). Cited on pp. 2, 3, and 4.

Baxter, Julius. 2010. ORPSoC User Guide. OpenCores. Cited on pp. 9, 10, and 31.

———. 2011. Open Source Hardware Development and the OpenRISC Project. Master’s thesis, Royal Institute of Technology (KTH). Cited on pp. 7 and 10.

Bennett, Jeremy. 2008. Or1ksim User Guide. Embecosm Limited. Cited on p. 9.

Bin, Eyal. 2012. Challenges in Constraint Programming for Hardware Verification. Tech. rep., IBM Haifa Research Lab. Cited on p. 1.

Corno, F.; Cumani, G.; Sonza Reorda, M.; and Squillero, G. 2003. Fully automatic test program generation for microprocessor cores. In Design, Automation and Test in Europe Conference and Exhibition, 2003, pp. 1006–1011. ISSN 1530-1591. doi:10.1109/DATE.2003.1253736. Cited on pp. 4 and 6.

Damjan Lampret, Julius Baxter. 2001. OpenRISC 1200 IP Core Specification. OpenCores. Cited on p. 7.

Hudec, J. June 2011. An efficient technique for processor automatic functional test generation based on evolutionary strategies. In Information Technology Interfaces (ITI), Proceedings of the ITI 2011 33rd International Conference on, pp. 527–532. ISSN 1330-1012. Cited on pp. 4 and 6.

John L. Hennessy, David A. Patterson. 2006. Computer Architecture: A Quantitative Approach. No. A2. Morgan Kaufmann, 4th edn. Cited on p. 15.


Lagoon, Vitaly. 2012. Constraint-Based Test Generation. Tech. rep., Cadence. Cited on p. 1.

Lampret, Damjan. 2000. OpenRISC 1000 Architectural Manual. OpenCores. Cited on pp. 7, 31, 35, 36, 37, 38, 39, 43, and 46.

Psarakis, Mihalis; Gizopoulos, Dimitris; Hatzimihail, Miltiadis; Paschalis, Antonis; Raghunathan, Anand; and Ravi, Srivaths. 2006. Systematic Software-Based Self-Test for Pipelined Processors. Cited on pp. 4 and 5.

Semeria, Luc; Seawright, Andrew; Mehra, Renu; Ng, Daniel; Ekanayake, Arjuna; and Pangrle, Barry. June 2002. RTL C-Based Methodology for Designing and Verifying a Multi-Threaded Processor. Cited on p. 4.

Uday Bhaskar, K.; Prasanth, M.; Chandramouli, G.; and Kamakoti, V. Jan. 2005. A universal random test generator for functional verification of microprocessors and system-on-chip. In VLSI Design, 2005. 18th International Conference on, pp. 207–212. ISSN 1063-9667. doi:10.1109/ICVD.2005.37. Cited on p. 13.

Wagner, I. and Bertacco, V. Oct. 2008. Reversi: Post-silicon validation system for modern microprocessors. In Computer Design, 2008. ICCD 2008. IEEE International Conference on, pp. 307–314. ISSN 1063-6404. doi:10.1109/ICCD.2008.4751878. Cited on p. 4.

Wagner, I.; Bertacco, V.; and Austin, T. June 2007. Microprocessor Verification via Feedback-Adjusted Markov Models. In Computer-Aided Design of Integrated Circuits and Systems, IEEE Transactions on, vol. 26, no. 6, pp. 1126–1138. ISSN 0278-0070. doi:10.1109/TCAD.2006.884494. Cited on p. 13.


A

SOURCE CODE

The Test Bench Controller and the Random Program Generator are implemented in C and their source code can be checked out from a Subversion (SVN) repository1

1. Please contact ORSoC AB for access privileges

using the command,

svn co http://svn.orsoc.se/svn/random-generator

Figure 5.1 outlines the organization of the project. For the sake of clarity only the .c files under trunk\random-testing are listed. The implementation work is documented using Doxygen2

2. Doxygen is an open source documentation system supporting various programming languages. More information can be found at http://www.stack.nl/~dimitri/doxygen/

. The HTML and PDF versions of the source code documentation can be found under the trunk\random-testing\docs directory. The coverage statistics are generated using trunk\random-testing\coverage.c. The HTML code and the scripts used to generate the instruction histograms can be accessed under the trunk\random-testing\www directory.

The source code for Or1ksim, the ISS simulator, found in the directory trunk\or1ksim, contains the modifications to interface it with the Test Bench Controller. The source code for the unmodified version of the ISS simulator can be checked out using the command,

svn co http://opencores.org/ocsvn/openrisc/openrisc/trunk/or1ksim

The changes made to the RTL model can be found in the directory trunk\orpsocv2\rtl\verilog. When the write control signals for the register and memory modules are asserted, their values are written onto a POSIX pipe. This value is read using an RTL sniffer module implemented in trunk\random-testing\rtlsniffer.c. The unmodified version of the RTL model can be checked out using the command,

svn co http://opencores.org/ocsvn/openrisc/openrisc/trunk/orpsocv2

The changes made to the processor models can be listed by comparing the unmodified version3

3. The latest version of the simulators can be downloaded from the OpenCores SVN repository at http://opencores.org/websvn,listing,openrisc

with the version present in the directory structure in figure 5.1.


B

EXAMPLE TEST PROGRAMS

Listings B.1 and B.2 refer to the randomly generated test programs test1.s and test2.s. The run-time code generated for each of the test programs is illustrated in listings B.3 and B.4, respectively. From the listings we can see that the run-time code can be much larger than the static code, as in the case of test1.s, or can be much smaller, as in the case of test2.s. The program counter value listed in the run-time code is used for post-simulation analysis.

1 /**2 /* @file test1.s3 *4 * @brief this file was generated from random generator program5 *6 * @date Mon Jun 4 21:28:55 20127 *8 */9

10 #start of data segment11 .global data12 .data13 .align 41415 data:16 .4byte 0, 1, 2, 3, 4, 5, 6, 7, 8, 917 .4byte 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2018 .4byte 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 3119 .4byte 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 4220 .4byte 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 5321 .4byte 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 6422 .4byte 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 7523 .4byte 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 8624 .4byte 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 9725 .4byte 98, 99


26 #start of code segment27 .text28 .align 42930 .space 677382831 PROC0:32 #allocate frame33 l.sw -8(r1), r234 l.addi r2,r1,035 l.sw -4(r1),r936 l.addi r1,r1,-1237 l.msync38 l.mfspr r23,r20,263940 #deallocate frame41 l.ori r1,r2,042 l.lwz r2,-8(r1)43 l.lwz r9,-4(r1)44 l.jr r945 l.nop4647 .space 2706448 PROC1:49 #allocate frame50 l.sw -8(r1), r251 l.addi r2,r1,052 l.sw -4(r1),r953 l.addi r1,r1,-125455 l.jal PROC056 l.bnf 557 l.lhz r6,16(r13)58 l.msync59 l.sw 20(r13),r2060 l.cmov r16,r27,r2361 l.sh 20(r13),r2662 l.mfspr r22,r0,1663 l.addi r22,r22,1264 l.jr r2265 l.lwz r15,16(r13)66 l.cmov r8,r10,r2567 l.psync68 l.sfgeui r30,06970 #deallocate frame71 l.ori r1,r2,072 l.lwz r2,-8(r1)73 l.lwz r9,-4(r1)74 l.jr r975 l.nop


76 .space 16706477 PROC2:78 #allocate frame79 l.sw -8(r1), r280 l.addi r2,r1,081 l.sw -4(r1),r982 l.addi r1,r1,-128384 l.lbs r5,12(r13)85 l.mfspr r29,r25,5486 l.mfspr r27,r14,7487 l.mfspr r15,r27,6988 l.psync8990 #deallocate frame91 l.ori r1,r2,092 l.lwz r2,-8(r1)93 l.lwz r9,-4(r1)94 l.jr r995 l.nop9697 .global main98 .global first99 .global last

100 .space 4384101 main:102103 #allocate frame104 l.sw -8(r1), r2105 l.addi r2,r1,0106 l.sw -4(r1),r9107 l.addi r1,r1,-12108109 #enable interrupts and supervision bit110 l.mfspr r3,r0,17111 l.ori r3,r3,5112 l.mtspr r0,r3,17113 #disable data and instruction cache114 l.mfspr r3,r0,17115 l.addi r4,r0,0xffe7116 l.and r3,r3,r4117 l.mtspr r0,r3,17118119 #make r13 to point to data section120 l.movhi r13,hi(data)121 l.ori r13,r13,lo(data)122 #initialize registers123 l.addi r3,r0,3124 l.addi r4,r0,4125 l.addi r5,r0,5126 l.addi r6,r0,6127 l.addi r7,r0,7128 l.addi r8,r0,8129 l.addi r10,r0,10130 l.addi r11,r0,11131 l.addi r12,r0,12


132 l.addi r14,r0,14133 l.addi r15,r0,15134 l.addi r16,r0,16135 l.addi r17,r0,17136 l.addi r18,r0,18137 l.addi r19,r0,19138 l.addi r20,r0,20139 l.addi r21,r0,21140 l.addi r22,r0,22141 l.addi r23,r0,23142 l.addi r24,r0,24143 l.addi r25,r0,25144 l.addi r26,r0,26145 l.addi r27,r0,27146 l.addi r28,r0,28147 l.addi r29,r0,29148 l.addi r30,r0,30149 l.addi r31,r0,31150151 first:152 l.nop153 l.lbs r27,8(r13)154 l.jal PROC1155 l.sfgts r17,r12156 l.sfgtu r3,r19157 l.bf 68158 l.sb 4(r13),r30159 l.sb 0(r13),r23160 l.lhz r25,8(r13)161 l.movhi r29,hi(PROC1)162 l.ori r29,r29,lo(PROC1)163 l.jalr r29164 l.lwz r19,12(r13)165 l.csync166 l.jal PROC1167 l.sb 12(r13),r12168 l.sfgtui r29,71169 l.addi r26,r29,28170 l.psync171 l.mfspr r8,r8,42172 l.lbs r5,4(r13)173 l.ff1 r6,r27174 l.psync175 l.mac r5,r15176 l.andi r8,r24,24177 l.mfspr r16,r15,89178 l.psync179 l.lbs r17,12(r13)180 l.mfspr r19,r17,60181 l.lhz r14,8(r13)182 l.muli r15,r30,26183 l.cmov r24,r20,r8184 l.lbs r6,12(r13)185 l.sh 20(r13),r17186 l.jal PROC2


187 l.lhs r8,20(r13)188 l.andi r17,r15,14189 l.lhz r18,12(r13)190 l.muli r30,r31,32191 l.addi r18,r25,15192 l.msync193 l.csync194 l.sw 24(r13),r26195 l.movhi r26,hi(PROC2)196 l.ori r26,r26,lo(PROC2)197 l.jalr r26198 l.sfleui r10,3199 l.csync200 l.movhi r3,34201 l.msync202 l.csync203 l.div r16,r10,r14204 l.cmov r24,r21,r10205 l.sfltui r15,52206 l.lhs r22,24(r13)207 l.sfgeui r29,80208 l.cmov r10,r18,r3209 l.jal PROC2210 l.sb 8(r13),r4211 l.lbs r14,24(r13)212 l.lwz r15,4(r13)213 l.mfspr r8,r12,77214 l.jal PROC1215 l.sh 20(r13),r16216 l.sfnei r31,28217 l.sw 16(r13),r4218 l.muli r24,r20,5219 l.ori r12,r19,11220 l.lhs r5,12(r13)221 l.csync222 l.lbs r18,16(r13)223 l.mfspr r10,r15,57224 l.mfspr r18,r20,41225 l.csync226 l.csync227 l.cmov r26,r29,r26228 l.addi r27,r10,92229 l.cmov r30,r21,r23230 l.addi r8,r3,87231 l.sh 12(r13),r6232 last:233 l.nop234235 #deallocate frame236 l.ori r1,r2,0237 l.lwz r2,-8(r1)238 l.lwz r9,-4(r1)239 l.jr r9240 l.nop

Source Code Listing B.1: example test program test1.s


1 /**2 /* @file test2.s3 *4 * @brief this file was generated from random generator program5 *6 * @date Mon Jun 4 21:28:55 20127 *8 */9

10 #start of data segment11 .global data12 .data13 .align 41415 data:16 .4byte 0, 1, 2, 3, 4, 5, 6, 7, 8, 917 .4byte 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 2018 .4byte 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 3119 .4byte 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 4220 .4byte 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 5321 .4byte 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 6422 .4byte 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 7523 .4byte 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 8624 .4byte 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 9725 .4byte 98, 992627 #start of code segment28 .text29 .align 43031 .space 254888032 PROC0:33 #allocate frame34 l.sw -8(r1), r235 l.addi r2,r1,036 l.sw -4(r1),r937 l.addi r1,r1,-1238 l.sw 16(r13),r1939 l.sw 16(r13),r1940 l.macrc r841 l.j 142 l.msync43 l.lbz r28,12(r13)44 l.lhs r20,4(r13)45 l.srli r10,r10,594647 #deallocate frame48 l.ori r1,r2,049 l.lwz r2,-8(r1)50 l.lwz r9,-4(r1)51 l.jr r952 l.nop5354 .global main55 .global first56 .global last


5758 .space 127887659 main:6061 #allocate frame62 l.sw -8(r1), r263 l.addi r2,r1,064 l.sw -4(r1),r965 l.addi r1,r1,-126667 #enable interrupts and supervision bit68 l.mfspr r3,r0,1769 l.ori r3,r3,570 l.mtspr r0,r3,1771 #disable data and instruction cache72 l.mfspr r3,r0,1773 l.addi r4,r0,0xffe774 l.and r3,r3,r475 l.mtspr r0,r3,177677 #make r13 to point to data section78 l.movhi r13,hi(data)79 l.ori r13,r13,lo(data)80 #initialize registers81 l.addi r3,r0,382 l.addi r4,r0,483 l.addi r5,r0,584 l.addi r6,r0,685 l.addi r7,r0,786 l.addi r8,r0,887 l.addi r10,r0,1088 l.addi r11,r0,1189 l.addi r12,r0,1290 l.addi r14,r0,1491 l.addi r15,r0,1592 l.addi r16,r0,1693 l.addi r17,r0,1794 l.addi r18,r0,1895 l.addi r19,r0,1996 l.addi r20,r0,2097 l.addi r21,r0,2198 l.addi r22,r0,2299 l.addi r23,r0,23

100 l.addi r24,r0,24101 l.addi r25,r0,25102 l.addi r26,r0,26103 l.addi r27,r0,27104 l.addi r28,r0,28105 l.addi r29,r0,29106 l.addi r30,r0,30107 l.addi r31,r0,31


109110 first:111 l.nop112 l.add r17,r22,r20113 l.lbs r15,20(r13)114 l.bnf 82115 l.addic r16,r17,88116 l.addic r17,r5,4117 l.jal PROC0118 l.mfspr r23,r18,69119 l.sfgeu r27,r16120 l.sub r16,r10,r5121 l.movhi r18,hi(PROC0)122 l.ori r18,r18,lo(PROC0)123 l.jalr r18124 l.mfspr r3,r17,89125 l.or r22,r18,r30126 l.mfspr r6,r31,86127 l.sh 8(r13),r17128 l.csync129 l.sfne r6,r10130 l.addic r21,r6,18131 l.rori r23,r10,1132 l.sb 12(r13),r10133 l.msync134 l.jal PROC0135 l.muli r10,r3,3136 l.psync137 l.lbs r15,4(r13)138 l.addi r16,r17,87139 l.lbz r12,8(r13)140 l.movhi r28,12141 l.sb 0(r13),r14142 l.sfleui r10,67143 l.addi r4,r26,56144 l.msync145 l.csync146 l.msync147 l.movhi r30,46148 l.ori r25,r15,29149 l.msync150 l.muli r27,r26,71151 l.divu r29,r29,r31152 l.mfspr r19,r4,96153 l.cmov r6,r19,r23154 l.srli r26,r29,50155 l.sub r21,r12,r25156 l.sh 4(r13),r21157 l.sh 8(r13),r22158 l.sub r26,r30,r7159 l.movhi r8,9160 l.sh 24(r13),r5161 l.addc r30,r3,r23162 l.maci r15,7163 l.jal PROC0


164 l.sfgts r15,r23165 l.rori r30,r24,38166 l.sfnei r27,43167 l.sb 8(r13),r4168 l.rori r5,r17,55169 l.lhz r16,20(r13)170 l.lhz r17,16(r13)171 l.sh 16(r13),r31172 l.mfspr r31,r23,80173 l.lhs r3,12(r13)174 l.psync175 l.sw 8(r13),r18176 l.sub r30,r25,r6177 l.or r26,r15,r29178 l.jal PROC0179 l.msync180 l.lhz r16,20(r13)181 l.lhz r6,4(r13)182 l.lbs r15,8(r13)183 l.lbs r22,24(r13)184 l.addc r19,r25,r31185 l.sh 16(r13),r15186 l.rori r14,r26,89187 l.andi r16,r5,30188 l.psync189 l.mfspr r6,r15,19190 l.addi r3,r28,83191 l.msync192 l.lwz r4,0(r13)193 l.csync194 l.sra r6,r12,r5195 l.cmov r28,r14,r22196 l.movhi r15,40197 l.msync198 l.mfspr r19,r0,16199 l.addi r19,r19,12200 l.jr r19201 l.macrc r28202 l.sw 12(r13),r15203 l.lhz r26,24(r13)204 last:205 l.nop206207 #deallocate frame208 l.ori r1,r2,0209 l.lwz r2,-8(r1)210 l.lwz r9,-4(r1)211 l.jr r9212 l.nop

Source Code Listing B.2: example test program test2.s


1 PPC:6a8768 l.lbs r27 0x8(r13)2 PPC:6a876c l.jal -429413 PPC:6a8770 l.sfgts r17 r124 PPC:67e878 l.sw -8(r1) r25 PPC:67e87c l.addi r2 r1 0x06 PPC:67e880 l.sw -4(r1) r97 PPC:67e884 l.addi r1 r1 -128 PPC:67e888 l.jal -67819 PPC:67e88c l.bnf 0x510 PPC:677e94 l.sw -8(r1) r211 PPC:677e98 l.addi r2 r1 0x012 PPC:677e9c l.sw -4(r1) r913 PPC:677ea0 l.addi r1 r1 -1214 PPC:677ea4 l.msync15 PPC:677ea8 l.mfspr r23 r20 0x1a16 PPC:677eac l.ori r1 r2 017 PPC:677eb0 l.lwz r2 -8(r1)18 PPC:677eb4 l.lwz r9 -4(r1)19 PPC:677eb8 l.jr r920 PPC:677ebc l.nop 021 PPC:67e890 l.lhz r6 0x10(r13)22 PPC:67e894 l.msync23 PPC:67e898 l.sw 0x14(r13)r2024 PPC:67e89c l.cmov r16 r27 r2325 PPC:67e8a0 l.sh 0x14(r13)r2626 PPC:67e8a4 l.mfspr r22 r0 0x1027 PPC:67e8a8 l.addi r22 r22 0xc28 PPC:67e8ac l.jr r2229 PPC:67e8b0 l.lwz r15 0x10(r13)30 PPC:67e8b0 l.lwz r15 0x10(r13)31 PPC:67e8b4 l.cmov r8 r10 r2532 PPC:67e8b8 l.psync33 PPC:67e8bc l.sfgeuir300x034 PPC:67e8c0 l.ori r1 r2 035 PPC:67e8c4 l.lwz r2 -8(r1)36 PPC:67e8c8 l.lwz r9 -4(r1)37 PPC:67e8cc l.jr r938 PPC:67e8d0 l.nop 039 PPC:6a8774 l.sfgtu r3 r1940 PPC:6a8778 l.bf 0x4441 PPC:6a877c l.sb 0x4(r13)r3042 PPC:6a8780 l.sb 0x0(r13)r2343 PPC:6a8784 l.lhz r25 0x8(r13)44 PPC:6a8788 l.movhi r29 0x6745 PPC:6a878c l.ori r29 r29 0xe87846 PPC:6a8790 l.jalr r2947 PPC:6a8794 l.lwz r19 0xc(r13)48 PPC:67e878 l.sw -8(r1) r249 PPC:67e87c l.addi r2 r1 0x050 PPC:67e880 l.sw -4(r1) r951 PPC:67e884 l.addi r1 r1 -1252 PPC:67e888 l.jal -678153 PPC:67e88c l.bnf 0x554 PPC:677e94 l.sw -8(r1) r2


56 PPC:67e8a0 l.sh 0x14(r13)r2657 PPC:67e8a4 l.mfspr r22 r0 0x1058 PPC:67e8a8 l.addi r22 r22 0xc59 PPC:67e8ac l.jr r2260 PPC:67e8b0 l.lwz r15 0x10(r13)61 PPC:67e8b0 l.lwz r15 0x10(r13)62 PPC:67e8b4 l.cmov r8 r10 r2563 PPC:67e8b8 l.psync64 PPC:67e8bc l.sfgeuir300x065 PPC:67e8c0 l.ori r1 r2 066 PPC:67e8c4 l.lwz r2 -8(r1)67 PPC:67e8c8 l.lwz r9 -4(r1)68 PPC:67e8cc l.jr r969 PPC:67e8d0 l.nop 070 PPC:6a8798 l.csync71 PPC:6a879c l.jal -4295372 PPC:6a87a0 l.sb 0xc(r13)r1273 PPC:67e878 l.sw -8(r1) r274 PPC:67e87c l.addi r2 r1 0x075 PPC:67e880 l.sw -4(r1) r976 PPC:67e884 l.addi r1 r1 -1277 PPC:67e888 l.jal -678178 PPC:67e88c l.bnf 0x579 PPC:677e94 l.sw -8(r1) r280 PPC:677e98 l.addi r2 r1 0x081 PPC:677e9c l.sw -4(r1) r982 PPC:677ea0 l.addi r1 r1 -1283 PPC:677ea4 l.msync84 PPC:677ea8 l.mfspr r23 r20 0x1a85 PPC:677eac l.ori r1 r2 086 PPC:677eb0 l.lwz r2 -8(r1)87 PPC:677eb4 l.lwz r9 -4(r1)88 PPC:677eb8 l.jr r989 PPC:677ebc l.nop 090 PPC:67e890 l.lhz r6 0x10(r13)91 PPC:67e894 l.msync92 PPC:67e898 l.sw 0x14(r13)r2093 PPC:67e89c l.cmov r16 r27 r2394 PPC:67e8a0 l.sh 0x14(r13)r2695 PPC:67e8a4 l.mfspr r22 r0 0x1096 PPC:67e8a8 l.addi r22 r22 0xc97 PPC:67e8ac l.jr r2298 PPC:67e8b0 l.lwz r15 0x10(r13)99 PPC:67e8b0 l.lwz r15 0x10(r13)

100 PPC:67e8b4 l.cmov r8 r10 r25101 PPC:67e8b8 l.psync102 PPC:67e8bc l.sfgeuir300x0103 PPC:67e8c0 l.ori r1 r2 0104 PPC:67e8c4 l.lwz r2 -8(r1)105 PPC:67e8c8 l.lwz r9 -4(r1)106 PPC:67e8cc l.jr r9107 PPC:67e8d0 l.nop 0108 PPC:6a87a4 l.sfgtuir290x47109 PPC:6a87a8 l.addi r26 r29 0x1c110 PPC:6a87ac l.psync


111 PPC:6a87b0 l.mfspr r8 r8 0x2a112 PPC:6a87b4 l.lbs r5 0x4(r13)113 PPC:6a87b8 l.ff1 r6 r27114 PPC:6a87bc l.psync115 PPC:6a87c0 l.mac r5 r15116 PPC:6a87c4 l.andi r8 r24 0x18117 PPC:6a87c8 l.mfspr r16 r15 0x59118 PPC:6a87cc l.psync119 PPC:6a87d0 l.lbs r17 0xc(r13)120 PPC:6a87d4 l.mfspr r19 r17 0x3c121 PPC:6a87d8 l.lhz r14 0x8(r13)122 PPC:6a87dc l.muli r15 r30 0x1a123 PPC:6a87e0 l.cmov r24 r20 r8124 PPC:6a87e4 l.lbs r6 0xc(r13)125 PPC:6a87e8 l.sh 0x14(r13)r17126 PPC:6a87ec l.jal -1184127 PPC:6a87f0 l.lhs r8 0x14(r13)128 PPC:6a756c l.sw -8(r1) r2129 PPC:6a7570 l.addi r2 r1 0x0130 PPC:6a7574 l.sw -4(r1) r9131 PPC:6a7578 l.addi r1 r1 -12132 PPC:6a757c l.lbs r5 0xc(r13)133 PPC:6a7580 l.mfspr r29 r25 0x36134 PPC:6a7584 l.mfspr r27 r14 0x4a135 PPC:6a7588 l.mfspr r15 r27 0x45136 PPC:6a758c l.psync137 PPC:6a7590 l.ori r1 r2 0138 PPC:6a7594 l.lwz r2 -8(r1)139 PPC:6a7598 l.lwz r9 -4(r1)140 PPC:6a759c l.jr r9141 PPC:6a75a0 l.nop 0142 PPC:6a87f4 l.andi r17 r15 0xe143 PPC:6a87f8 l.lhz r18 0xc(r13)144 PPC:6a87fc l.muli r30 r31 0x20145 PPC:6a8800 l.addi r18 r25 0xf146 PPC:6a8804 l.msync147 PPC:6a8808 l.csync148 PPC:6a880c l.sw 0x18(r13)r26149 PPC:6a8810 l.movhi r26 0x6a150 PPC:6a8814 l.ori r26 r26 0x756c151 PPC:6a8818 l.jalr r26152 PPC:6a881c l.sfleuir100x3153 PPC:6a756c l.sw -8(r1) r2154 PPC:6a7570 l.addi r2 r1 0x0155 PPC:6a7574 l.sw -4(r1) r9156 PPC:6a7578 l.addi r1 r1 -12157 PPC:6a757c l.lbs r5 0xc(r13)158 PPC:6a7580 l.mfspr r29 r25 0x36159 PPC:6a7584 l.mfspr r27 r14 0x4a160 PPC:6a7588 l.mfspr r15 r27 0x45161 PPC:6a758c l.psync162 PPC:6a7590 l.ori r1 r2 0163 PPC:6a7594 l.lwz r2 -8(r1)164 PPC:6a7598 l.lwz r9 -4(r1)165 PPC:6a759c l.jr r9


166 PPC:6a75a0 l.nop 0167 PPC:6a8820 l.csync168 PPC:6a8824 l.movhi r3 0x22169 PPC:6a8828 l.msync170 PPC:6a882c l.csync171 PPC:6a8830 l.div r16 r10 r14172 PPC:6a8834 l.cmov r24 r21 r10173 PPC:6a8838 l.sfltuir150x34174 PPC:6a883c l.lhs r22 0x18(r13)175 PPC:6a8840 l.sfgeuir290x50176 PPC:6a8844 l.cmov r10 r18 r3177 PPC:6a8848 l.jal -1207178 PPC:6a884c l.sb 0x8(r13)r4179 PPC:6a756c l.sw -8(r1) r2180 PPC:6a7570 l.addi r2 r1 0x0181 PPC:6a7574 l.sw -4(r1) r9182 PPC:6a7578 l.addi r1 r1 -12183 PPC:6a757c l.lbs r5 0xc(r13)184 PPC:6a7580 l.mfspr r29 r25 0x36185 PPC:6a7584 l.mfspr r27 r14 0x4a186 PPC:6a7588 l.mfspr r15 r27 0x45187 PPC:6a758c l.psync188 PPC:6a7590 l.ori r1 r2 0189 PPC:6a7594 l.lwz r2 -8(r1)190 PPC:6a7598 l.lwz r9 -4(r1)191 PPC:6a759c l.jr r9192 PPC:6a75a0 l.nop 0193 PPC:6a8850 l.lbs r14 0x18(r13)194 PPC:6a8854 l.lwz r15 0x4(r13)195 PPC:6a8858 l.mfspr r8 r12 0x4d196 PPC:6a885c l.jal -43001197 PPC:6a8860 l.sh 0x14(r13)r16198 PPC:67e878 l.sw -8(r1) r2199 PPC:67e87c l.addi r2 r1 0x0200 PPC:67e880 l.sw -4(r1) r9201 PPC:67e884 l.addi r1 r1 -12202 PPC:67e888 l.jal -6781203 PPC:67e88c l.bnf 0x5204 PPC:677e94 l.sw -8(r1) r2205 PPC:67e8a0 l.sh 0x14(r13)r26206 PPC:67e8a4 l.mfspr r22 r0 0x10207 PPC:67e8a8 l.addi r22 r22 0xc208 PPC:67e8ac l.jr r22209 PPC:67e8b0 l.lwz r15 0x10(r13)210 PPC:67e8b0 l.lwz r15 0x10(r13)211 PPC:67e8b4 l.cmov r8 r10 r25212 PPC:67e8b8 l.psync213 PPC:67e8bc l.sfgeuir300x0214 PPC:67e8c0 l.ori r1 r2 0215 PPC:67e8c4 l.lwz r2 -8(r1)216 PPC:67e8c8 l.lwz r9 -4(r1)217 PPC:67e8cc l.jr r9218 PPC:67e8d0 l.nop 0219 PPC:6a8864 l.sfnei r31 0x1c


220 PPC:6a8868 l.sw 0x10(r13)r4221 PPC:6a886c l.muli r24 r20 0x5222 PPC:6a8870 l.ori r12 r19 0xb223 PPC:6a8874 l.lhs r5 0xc(r13)224 PPC:6a8878 l.csync225 PPC:6a887c l.lbs r18 0x10(r13)226 PPC:6a8880 l.mfspr r10 r15 0x39227 PPC:6a8884 l.mfspr r18 r20 0x29228 PPC:6a8888 l.csync229 PPC:6a888c l.csync230 PPC:6a8890 l.cmov r26 r29 r26231 PPC:6a8894 l.addi r27 r10 0x5c232 PPC:6a8898 l.cmov r30 r21 r23233 PPC:6a889c l.addi r8 r3 0x57234 PPC:6a88a0 l.sh 0xc(r13)r6235 PPC:6a88a4 l.nop 0

Source Code Listing B.3: Run time code for test program test1.s

PPC:3a8b64 l.add r17 r22 r20
PPC:3a8b68 l.lbs r15 0x14(r13)
PPC:3a8b6c l.bnf 0x52
PPC:3a8b70 l.addic r16 r17 0x58
PPC:3a8cb4 l.movhi r15 0x28
PPC:3a8cb8 l.msync
PPC:3a8cbc l.mfspr r19 r0 0x10
PPC:3a8cc0 l.addi r19 r19 0xc
PPC:3a8cc4 l.jr r19
PPC:3a8cc8 l.macrc r28
PPC:3a8cc8 l.macrc r28
PPC:3a8ccc l.sw 0xc(r13)r15
PPC:3a8cd0 l.lhz r26 0x18(r13)
PPC:3a8cd4 l.nop 0

Source Code Listing B.4: Run time code for test program test2.s
