
1 Intro and Pointers


Course Overview Introducing C History of C The C Abstract Model Memory in C Understanding Pointers

High Performance Computing in C/C++
The C Abstract Machine Model

David Chisnall

February 14, 2011


Course Aims

• Introduce C programming

• Describe the differences between C, C++, and Java

• Explain programming styles appropriate to HPC

Author's Note
Comment
This course will explain some trends in HPC, some developments in computer architecture, and how they relate to performance for programmers. By the end, you should know how to optimise your code for a variety of different architectures.

Why Should You Care About... C?

• Widely used in industry

• Many other languages inherit syntax and semantics

• 40 years of legacy code...


Why Should You Care About... HPC?

High-performance computing is a small (well-paid) niche, but ‘High performance’ is relative.

Machine        Performance   Architecture   Note
Cray-1         250MFLOPS     SIMD           First supercomputer
IBM XT         <1MFLOP       SISD           First PC
Cortex A8      10MFLOPS      SISD + SIMD    Smartphone CPU
nVidia Fermi   1GFLOPS       SIMD + MIMD    Modern GPU

Many ‘HPC’ techniques eventually become useful elsewhere.

Author's Note
Comment
In the '80s, SIMD (vector) programming was something that only HPC people cared about. Now, any CPU has a vector unit. A modern GPU uses very similar designs and outperforms early supercomputers by a significant amount.

Assessment

50% Programming:

• 1 easy assignment: 15%

• 1 moderate assignment: 25%

• 1 hard assignment: 10%

50% Essay.

Author's Note
Comment
Some original work is expected for the essay. An essay just repeating things taught in the lecture will not receive good marks!

Why was C Created?

• UNIX was originally written in assembly language

• Porting it to a new architecture meant rewriting it

• C was created to be slightly more abstract than assembly, but almost as fast

Author's Note
Comment
C was intended to be just abstract enough to write code that ran on multiple architectures, while still exposing most of the real architecture to the programmer.

The C Abstract Model

• Very close to how a real machine works.

• Some simple abstractions

• ‘Close to the metal’

Every C language construct maps to a very short sequence of machine instructions.

Author's Note
Comment
A simple C compiler just maps each language construct to a short sequence of instructions. This gives reasonable performance, which is why C has a reputation as a fast language - it's not that the language is intrinsically fast, it's that writing a compiler that generates fast code is easy.

Contrast: The Java Abstract Model

Contains many things that are not in the real hardware model:

• Automatic memory management

• Bounds-checked arrays

• Objects and classes

• Dynamic method lookup

• Typed memory and reflection

Author's Note
Comment
Objects and classes are entirely a language abstraction. Automatic memory management requires a garbage collector to track references. Dynamic method lookup based on the class type requires complex jump tables. The memory model requires each object to be aware of its own type.

Why C is Fast

• A simple compiler can produce good code from C

• Easy for the programmer to understand performance

• CPU features are usually exposed directly, via language extensions

Author's Note
Comment
Toy C compilers can get more than 50% of the performance of good optimising compilers. Because the mapping from C code to machine instructions is relatively simple, it's easy to judge the performance characteristics of various pieces of software.

C Memory Model

• Flat array of bytes

• Each byte is addressable

Contrast with Java:

• Split into objects

• Objects accessed by (opaque) reference

• Access control on object fields

Author's Note
Comment
The memory model in C is very close to the hardware model. The hardware has load and store instructions that work on memory addresses. C has a very slightly more abstract model than this.
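
To make the 'flat array of bytes' idea concrete, here is a minimal sketch that looks at the individual bytes of an int. The function names int_bytes and is_little_endian are invented for this illustration and are not part of the course material.

```c
#include <assert.h>
#include <stddef.h>

/* Copy the raw bytes of an int into out[], one byte at a time.
 * C allows any object to be read through an unsigned char pointer.
 * out must have room for sizeof(int) bytes. */
void int_bytes(const int *value, unsigned char *out)
{
    assert(value != NULL && out != NULL);
    const unsigned char *bytes = (const unsigned char *)value;
    for (size_t i = 0; i < sizeof(int); i++) {
        out[i] = bytes[i];
    }
}

/* Return 1 if the least significant byte is stored at the lowest address. */
int is_little_endian(void)
{
    int one = 1;
    unsigned char raw[sizeof(int)];
    int_bytes(&one, raw);
    return raw[0] == 1;
}
```

Which byte holds the 1 depends on the byte order of the machine, which is exactly the kind of hardware detail that the C memory model leaves visible.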

Pointers Vs Java References

C/C++ Pointers                                          Java References
Numbers indicating an address in memory                 Opaque type with only assign (copy) operation defined.
Can be created as the result of arithmetic              Can only be created from valid objects.
Can point to invalid locations                          Can only point to valid objects or NULL.
Point to memory. User must know type of destination.    Point to objects which know their own type.

Java implements a variant of Smalltalk’s object memory model. C exposes a flat memory model.

Author's Note
Comment
Java provides a high-level model, where objects are isolated and can only be accessed by an opaque reference. Fields in an object can be accessed only via a reference to that object. This means that the VM is free to move objects around in memory and there is absolutely no way in which the programmer can tell that this has happened. In C, you see the real machine address (or, at least, the virtual memory address) of everything you store in memory, and are largely responsible for managing this. It's possible to implement Java's model on top of C's model, but much harder to do it the other way around, which is why C is called a lower-level language than Java.
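
As a small illustration of 'can be created as the result of arithmetic', the sketch below builds a pointer from a base address plus an offset, something a Java reference cannot do. The helper names element_at and distance are made up for this example.

```c
#include <assert.h>
#include <stddef.h>

/* Return a pointer to element `index` of the array starting at `base`.
 * The compiler scales the addition by sizeof(int). */
int *element_at(int *base, size_t index)
{
    assert(base != NULL);
    return base + index;
}

/* Number of int elements between two pointers into the same array. */
ptrdiff_t distance(const int *from, const int *to)
{
    return to - from;
}
```

Nothing checks that the index is inside the array. Computing or dereferencing a pointer outside it is the 'can point to invalid locations' row of the table.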

No Garbage Collection

Memory must be explicitly allocated and deallocated. Potential problems include:

• Memory leaks (don’t free after last use)

• Dangling pointers (use after free)

• Buffer overflows (access out of bounds)

Author's Note
Comment
There are lots of categories of bug that Java programmers are protected against by garbage collection. In Java, you explicitly create objects, but they are implicitly destroyed when you no longer refer to them. In C, you are responsible for manually destroying things (and everything that they refer to) when you are finished with them.
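
A minimal sketch of explicit allocation and deallocation, using the standard malloc and free. The helpers copy_string and release_string are hypothetical names chosen for this example; clearing the pointer after free is one common convention, not the only one.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocate a heap copy of a string. The caller owns the result and
 * must release it exactly once. Returns NULL if allocation fails. */
char *copy_string(const char *src)
{
    assert(src != NULL);
    size_t len = strlen(src) + 1;   /* +1 for the terminating '\0' */
    char *copy = malloc(len);
    if (copy == NULL) {
        return NULL;
    }
    memcpy(copy, src, len);         /* len bytes fit: no buffer overflow */
    return copy;
}

/* Free the string and clear the caller's pointer so that it
 * cannot be used as a dangling pointer afterwards. */
void release_string(char **s)
{
    assert(s != NULL);
    free(*s);
    *s = NULL;
}
```

Forgetting to call release_string would be a memory leak. Using the old pointer value after the call would be a use after free.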

Code is Data

1. C code is compiled to machine code

2. Machine code is loaded into memory

3. Everything in memory is accessible to C code

4. Pointers can point to functions as well as data

Author's Note
Comment
When you run a program in C, it is mapped into memory and you can get pointers to parts of it. In Java, you have references to objects, indirect access to fields via these, but the only way you can access a method is by calling it. In C, you can take a pointer to a function, pass it to another function, call it, and so on. Via GCC extensions, you can even take a pointer to an address inside a function, although this is rarely sensible.
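
The following sketch shows a pointer to a function being passed to another function and called through. The names add_one, square, and apply_twice are invented for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Two ordinary functions that we will refer to by address. */
int add_one(int x)
{
    return x + 1;
}

int square(int x)
{
    return x * x;
}

/* Takes a pointer to any function from int to int and calls it twice. */
int apply_twice(int (*fn)(int), int value)
{
    assert(fn != NULL);
    return fn(fn(value));
}
```

Here apply_twice(add_one, 5) evaluates to 7 and apply_twice(square, 3) to 81. The function name on its own, without parentheses, yields the address of its machine code.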

Example Memory Layout

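[Figure: a 32-bit address space running from 0x00000000 to 0xffffffff, with regions labelled Program Code, Library Code, Static Data, Heap Memory, and Stack Memory. Boundary addresses shown: 0x00001000, 0x0000f000, 0x000fe000, 0x00100000, 0x00140000, 0xf0000000.]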

Author's Note
Comment
This is a typical UNIX memory layout. The stack is at the top, and grows downwards. The heap is near the bottom and grows upwards. Most operating systems map a page at address 0 that has no access permissions, so you get a crash if you try to access a null pointer. The program, any global variables, and any shared libraries are also mapped into the address space, and you can take the address of any of these things.

C Pointers

• Contain a memory address

• Can point anywhere in memory

• Do not store the type of the data pointed to

Author's Note
Comment
Pointers are just numbers, which are annotated in the source code as storing the address of something. In BCPL (the language that C evolved from), there was no distinction between pointers and integers - you could treat any integer as an address and read a value from it. In assembly languages, this is still true. Unlike Java references, there is nothing magic about pointers.

Example: Casting

Listing 1: Incorrect Casting in Java

    class Cast
    {
        public static void main(String[] args)
        {
            String str = "12345";
            Object obj = str;
            Number num = (Number)obj;
        }
    }

    $ javac Cast.java && java Cast
    Exception in thread "main" java.lang.ClassCastException:
    java.lang.String cannot be cast to java.lang.Number
            at Cast.main(Cast.java:7)

Author's Note
Comment
Objects in Java know what their type is. If you cast a String to a Number, you get an exception at run time, because this is not a valid cast.

Example: Casting

Listing 2: Incorrect Casting in C

    int main(void)
    {
        int a = 12;
        int *ptr = &a;
        float *ptr2 = (float *)ptr;
        *ptr2 = 1123e12;
        return a;
    }

    $ c99 cast.c && ./a.out ; echo $?
    87

Wrong result, but no error report!

Author's Note
Comment
In C, memory is just memory. This simple example takes the address of a, an integer, and stores that address in two pointers. ptr2 is indicated as being the address of a float, but really stores the address of this integer. When we store a floating point value in it, then load it as an integer, we get nonsense out. This code generates no errors at compile or run time; it just does the wrong thing. In some cases, this kind of reinterpretation is useful (for example, you might want to byte-swap a floating point value for sending over the network by treating a pointer to it as an array of bytes).
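
The byte-swapping use mentioned in the note can be sketched as follows. This is one possible approach, not code from the course: byte_swap is a hypothetical helper that treats a pointer to any object as an array of unsigned char and reverses it in place.

```c
#include <assert.h>
#include <stddef.h>

/* Reverse the bytes of an object in place, for example to convert a
 * float between little-endian and big-endian byte order.
 * Accessing an object through unsigned char is always permitted. */
void byte_swap(void *object, size_t size)
{
    assert(object != NULL);
    unsigned char *bytes = object;
    for (size_t i = 0; i < size / 2; i++) {
        unsigned char tmp = bytes[i];
        bytes[i] = bytes[size - 1 - i];
        bytes[size - 1 - i] = tmp;
    }
}
```

A call such as byte_swap(&f, sizeof f) on a float f reverses its representation. Applying it a second time restores the original value.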

Example: Pointers are Numbers

    #include <stdio.h>

    int main(void)
    {
        int a = 12;
        /* Address of a stored in an int (assumes 32-bit pointers) */
        int b = (int)&a;
        printf("a: 0x%x, b: 0x%x\n", a, b);
        return 0;
    }

    $ c99 ptr.c && ./a.out
    a: 0xc, b: 0xbffff8fc

Author's Note
Comment
This code prints the value of an int variable and a pointer variable, both as hexadecimal numbers. Both are just numbers.
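
The listing above relies on a pointer fitting in an int, which holds on the 32-bit machine that produced the output. A variant that avoids this assumption might look like the sketch below, using uintptr_t from <stdint.h> (an optional type in C99, though widely available). The names address_of and print_address are invented for this example.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Convert a pointer to an unsigned integer wide enough to hold it. */
uintptr_t address_of(const void *ptr)
{
    return (uintptr_t)ptr;
}

/* Print the same address twice: once with %p, once as a plain number. */
void print_address(const void *ptr)
{
    assert(ptr != NULL);
    printf("pointer: %p, integer: 0x%" PRIxPTR "\n",
           (void *)ptr, address_of(ptr));
}
```

The exact text that %p produces is implementation-defined, and the address printed will differ from run to run on systems that randomise the stack location.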

Pointers are Low Level

• Machine memory is indexed by a numerical address

• CPUs have load and store instructions for accessing memory

• This abstraction is exposed in C

Author's Note
Comment
Pointers are just integer variables that store a machine address. There is nothing magic about them, unlike Java references.

Pointer Example

    // Integer
    int a = 12;
    // Pointer to integer
    int *b = &a;
    // Pointer to pointer to integer
    int **c = &b;

    Address      Variable   Value
    0xbffff8fc   a          0x0000000c   (occupies 0xbffff8fc to 0xbffff8ff)
    0xbffff8f8   b          0xbffff8fc
    0xbffff8f4   c          0xbffff8f8

Author's Note
Comment
This code stores three values on the stack. The first is an integer. The second is a pointer to that integer. The third is a pointer to that pointer. You can have as many layers of indirection as you want in C. From the assembly perspective, you just have three integers stored on the stack; it's only in C that two of them are annotated as being pointers.
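
To show what the layers of indirection allow, here is a short sketch that writes through a pointer to a pointer and then redirects the inner pointer. The function names store_via and retarget are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Follow two levels of indirection, then store: *indirect is the
 * inner pointer, **indirect is the int that it points to. */
void store_via(int **indirect, int value)
{
    assert(indirect != NULL && *indirect != NULL);
    **indirect = value;
}

/* Change the inner pointer so that it refers to a different int. */
void retarget(int **indirect, int *new_target)
{
    assert(indirect != NULL);
    *indirect = new_target;
}
```

With a, b, and c declared as on the slide, store_via(c, 13) changes a to 13, while retarget(c, &d) changes b so that it points at another int d and leaves a untouched.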

What Does int a Really Mean?

• Allocate enough space for an integer (2, 4, or 8 bytes, depending on the platform)

• Keep track of where this memory is

• Replace references to a with this address

a = 12 means store the value 12 in memory at the address reserved for the variable a.

Author's Note
Comment
When you declare a variable in C, it just means 'allocate enough space for that variable and use the address where it was allocated wherever I refer to that variable'.
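
The description above can be spelled out in C itself. In this sketch, store_at performs the 'store a value at an address' step explicitly, and demo_assignment uses it in place of the assignment a = 12. All three function names are made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Number of bytes the compiler reserves for an int on this platform. */
size_t int_size(void)
{
    return sizeof(int);
}

/* Store a value in the memory at the given address. */
void store_at(int *address, int value)
{
    assert(address != NULL);
    *address = value;
}

/* Declaring a reserves space; &a is the address of that space. */
int demo_assignment(void)
{
    int a = 0;
    store_at(&a, 12);   /* same effect as a = 12 */
    return a;
}
```

An optimising compiler may keep a in a register and never give it a memory address unless the program takes &a, but the observable behaviour matches this model.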

What Are Pointers For?

• Connecting data structures

• Aliasing data (two pointers to the same data)

• Anything you use object references for in Java
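
The first two uses in the list can be sketched with a singly linked list. The struct node type and the functions connect_nodes and sum_list are assumptions made for this example rather than anything defined in the course.

```c
#include <assert.h>
#include <stddef.h>

/* A list node: the next pointer connects one structure to another. */
struct node {
    int value;
    struct node *next;
};

/* Make `from` point at `to`. */
void connect_nodes(struct node *from, struct node *to)
{
    assert(from != NULL);
    from->next = to;
}

/* Walk the chain of pointers and add up the values. */
int sum_list(const struct node *head)
{
    int total = 0;
    for (const struct node *n = head; n != NULL; n = n->next) {
        total += n->value;
    }
    return total;
}
```

If a second pointer is set to the address of a node that is already in the list, the two pointers alias the same data: a change made through either one is visible through the other and through sum_list.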