Review for Modern GPU Hardware

Lan-Da Van (范倫達), Ph.D.

Department of Computer Science

National Chiao Tung University, Hsinchu, Taiwan

Spring 2020

The following content is extracted from the materials listed in the references on the last page. If any citation is wrong or any reference is missing, please contact [email protected] and I will correct the error as soon as possible.

This material is for course use only; please do NOT distribute it. Thank you.

Review for Modern GPU Hardwareviplab.cs.nctu.edu.tw/course/VLSIDSP2020_Spring/VLSIDSP_CHAP10… · CPU GPU Application Vertex Processor Rasterize Fragment Processor Video Memory (Textures)

Outline

• GPU Pipeline

• History of GPU Hardware

• GPU Hardware Consideration

• Modern GPU Hardware Architecture

– NVIDIA GeForce

– AMD (ATI) Radeon

– IMG PowerVR

– ARM Mali

• GPU Applications

• Summary

GPU Fundamentals: Graphics Pipeline

• A simplified graphics pipeline

– Note that pipe widths vary

– Many caches, FIFOs, and so on not shown

[Figure: simplified graphics pipeline. CPU: Application. GPU: Transform & Light → Assemble Primitives → Rasterize → Shade → Video Memory (Textures). Data passed between stages: Vertices (3D) → Xformed, Lit Vertices (2D) → Screenspace triangles (2D) → Fragments (pre-pixels) → Final Pixels (Color, Depth). Also shown: Graphics State and Render-to-texture.]
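The stages in the pipeline diagram can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how any real GPU is built: the function names, the orthographic projection, the brute-force edge-function rasterizer, and the flat color are all hypothetical choices made for the example. Lighting and depth are omitted.

```python
def transform_and_light(vertices_3d, width, height):
    """Vertices (3D) -> transformed vertices (2D).
    Assumption: orthographic projection from [-1, 1] to pixel coordinates."""
    return [((x + 1) * 0.5 * width, (y + 1) * 0.5 * height)
            for x, y, _z in vertices_3d]


def assemble_primitives(vertices_2d):
    """Group every three consecutive vertices into one screen-space triangle."""
    return [tuple(vertices_2d[i:i + 3])
            for i in range(0, len(vertices_2d) - 2, 3)]


def edge(a, b, p):
    """Signed area test: which side of edge a->b the point p lies on."""
    return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])


def rasterize(triangle, width, height):
    """Triangle -> fragments: every pixel whose center is inside or on an edge."""
    a, b, c = triangle
    fragments = []
    for y in range(height):
        for x in range(width):
            p = (x + 0.5, y + 0.5)
            w0, w1, w2 = edge(a, b, p), edge(b, c, p), edge(c, a, p)
            inside_ccw = w0 >= 0 and w1 >= 0 and w2 >= 0
            inside_cw = w0 <= 0 and w1 <= 0 and w2 <= 0
            if inside_ccw or inside_cw:
                fragments.append((x, y))
    return fragments


def shade(fragments, color=(255, 0, 0)):
    """Fragments -> final pixels. Assumption: one flat color per fragment."""
    return {fragment: color for fragment in fragments}


def render(vertices_3d, width=8, height=8):
    """Run all stages; the dict stands in for video memory."""
    framebuffer = {}
    vertices_2d = transform_and_light(vertices_3d, width, height)
    for triangle in assemble_primitives(vertices_2d):
        framebuffer.update(shade(rasterize(triangle, width, height)))
    return framebuffer


pixels = render([(-1, -1, 0), (1, -1, 0), (-1, 1, 0)])
```

Each function consumes the output of the previous one, matching the data labels on the arrows in the figure.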

GPU Fundamentals: Modern Graphics Pipeline

• Programmable vertex processor!

• Programmable pixel processor!

[Figure: the same pipeline with the Transform & Light stage replaced by a programmable Vertex Processor and the Shade stage replaced by a programmable Fragment Processor.]

GPU Fundamentals: Modern Graphics Pipeline

• Programmable primitive assembly!

• More flexible memory access!

[Figure: the pipeline with the Assemble Primitives stage replaced by a programmable Geometry Processor: Application → Vertex Processor → Geometry Processor → Rasterize → Fragment Processor → Video Memory (Textures).]

History of Graphics Hardware (1/3)

• … to mid ’90s

– SGI mainframes and workstations

– PC: only 2D graphics hardware

• Mid ’90s

– Consumer 3D graphics hardware (PC): 3dfx, NVIDIA, Matrox, ATI, …

– Triangle rasterization (only)

– Cheap: pushed by game industry

• 1999

– PC card with TnL (Transform and Lighting): NVIDIA GeForce, Graphics Processing Unit (GPU)

– PC card more powerful than specialized workstations

[Figure: 3dfx Voodoo Graphics, 4 MB, 1997]

History of Graphics Hardware (2/3)

https://www.zhihu.com/question/21980949

History of Graphics Hardware (3/3)

• Modern graphics hardware

– Graphics pipeline partly programmable

– Leaders: AMD (ATI) and NVIDIA, e.g., “AMD Radeon HD 6990” and “NVIDIA GeForce GTX 590”

– Game consoles use similar GPUs (e.g., Xbox)

Computational Power (1/2)

• GPUs are fast…

– 3.0 GHz Intel Core2 Duo (Woodcrest Xeon 5160):

• Computation: 48 GFLOPS peak

• Memory bandwidth: 21 GB/s peak

• Price: $874 (chip)

– NVIDIA GeForce 8800 GTX:

• Computation: 330 GFLOPS observed

• Memory bandwidth: 55.2 GB/s observed

• Price: $599 (board)

• GPUs are getting faster, faster

– CPUs: 1.4× annual growth

– GPUs: 1.7× (pixels) to 2.3× (vertices) annual growth
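The figures on this slide can be put side by side with a little arithmetic. The sketch below is illustrative only: the dictionary layout and helper names are hypothetical, and the comparison is rough because the CPU numbers are peak values for a chip while the GPU numbers are observed values for a whole board.

```python
# Numbers taken from the slide; the data layout is an assumption for the example.
cpu = {"gflops": 48.0, "bandwidth_gbs": 21.0, "price_usd": 874.0}
gpu = {"gflops": 330.0, "bandwidth_gbs": 55.2, "price_usd": 599.0}


def ratio(key):
    """GPU value divided by CPU value for one metric."""
    return gpu[key] / cpu[key]


def gflops_per_dollar(part):
    """Throughput per unit price (chip price for the CPU, board price for the GPU)."""
    return part["gflops"] / part["price_usd"]


def compounded(annual_rate, years):
    """Total growth factor after compounding an annual rate for a number of years."""
    return annual_rate ** years


compute_ratio = ratio("gflops")            # roughly 6.9x
bandwidth_ratio = ratio("bandwidth_gbs")   # roughly 2.6x
cpu_growth_5y = compounded(1.4, 5)         # roughly 5.4x
gpu_pixel_growth_5y = compounded(1.7, 5)   # roughly 14x
gpu_vertex_growth_5y = compounded(2.3, 5)  # roughly 64x
```

Compounding is what makes the gap widen: a difference of 1.4× versus 1.7× per year becomes roughly 5× versus 14× after five years.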

Computational Power (2/2)

[Figure: chart comparing GPU and CPU computational power. Courtesy Naga Govindaraju.]

Flops Comparison on GPU and CPU

Memory Bandwidths Comparison of CPU and GPU

Motivation

• Why are GPUs getting faster so fast?

– Arithmetic intensity

• the specialized nature of GPUs makes it easier to use additional transistors for computation

– Economics

• multi-billion dollar video game market is a pressure cooker that drives innovation to exploit this property

Flexible and Precise

• Modern GPUs are deeply programmable

– Programmable pixel, vertex, and geometry engines

– Solid high-level language support

• Modern GPUs support “real” precision

– 32-bit/64-bit floating point throughout the pipeline

• High enough for many applications

– DX10-class GPUs add 32-bit integers

Graphics Hardware Consideration (1/2)

• GPU = Graphics Processing Unit

– Vector processor

– Operates on 4-tuples

• Position (x, y, z, w)

• Color (red, green, blue, alpha)

• Texture coordinates (s, t, r, q)

– 4-tuple ops, 1 clock cycle

• SIMD [Single Instruction, Multiple Data]

– ADD, MUL, SUB, DIV, MADD, …
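The 4-tuple operations listed above can be mimicked in plain Python. This is a sketch of the semantics only: the helper names are hypothetical, and Python evaluates the four lanes one after another, whereas the slide describes hardware that applies one instruction to all four components in a single clock cycle.

```python
from typing import Tuple

Vec4 = Tuple[float, float, float, float]


def add(a: Vec4, b: Vec4) -> Vec4:
    """ADD: component-wise sum of two 4-tuples."""
    return tuple(x + y for x, y in zip(a, b))


def mul(a: Vec4, b: Vec4) -> Vec4:
    """MUL: component-wise product of two 4-tuples."""
    return tuple(x * y for x, y in zip(a, b))


def madd(a: Vec4, b: Vec4, c: Vec4) -> Vec4:
    """MADD: component-wise a * b + c."""
    return tuple(x * y + z for x, y, z in zip(a, b, c))


# Example values are made up: scale a color (red, green, blue, alpha)
# by a light factor and add an ambient term.
color = (1.0, 0.5, 0.25, 1.0)
light = (0.5, 0.5, 0.5, 1.0)
ambient = (0.25, 0.25, 0.25, 0.0)
lit_color = madd(color, light, ambient)
```

The same three functions work unchanged for positions (x, y, z, w) and texture coordinates (s, t, r, q), which is why a 4-wide vector unit suits graphics data.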

Graphics Hardware Consideration (2/2)

• Pipelining

– Number of stages

• Parallelism

– Number of parallel processes

• Parallelism + pipelining

– Number of parallel pipelines

[Figure: block diagrams of a three-stage pipeline (stages 1, 2, 3), three parallel units, and three parallel three-stage pipelines.]
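A simple cycle-count model shows why combining the two techniques pays off. The sketch below is an idealized assumption, not a description of any particular GPU: every stage takes exactly one cycle, there are no stalls, work items are independent, and the function names are hypothetical.

```python
import math


def cycles_sequential(items, stages):
    """One unit, no overlap: each item passes through every stage in turn."""
    return items * stages


def cycles_pipelined(items, stages):
    """One pipeline: fill it once, then one item completes per cycle."""
    return stages + items - 1 if items > 0 else 0


def cycles_parallel(items, stages, units):
    """Several non-pipelined units, each handling a share of the items."""
    return math.ceil(items / units) * stages


def cycles_parallel_pipelines(items, stages, pipelines):
    """Several pipelines side by side, items split evenly across them."""
    if items <= 0:
        return 0
    return stages + math.ceil(items / pipelines) - 1


# Example workload (made-up numbers): 12 items, 3 stages, 3 units or pipelines.
sequential = cycles_sequential(12, 3)
pipelined = cycles_pipelined(12, 3)
parallel = cycles_parallel(12, 3, 3)
combined = cycles_parallel_pipelines(12, 3, 3)
```

Under these assumptions the 12-item workload takes 36 cycles sequentially, 14 with one pipeline, 12 with three parallel units, and 6 with three parallel pipelines.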

Outline

• GPU Pipeline

• History of GPU Hardware

• GPU Hardware Consideration

• Modern GPU Hardware Architecture

– NVIDIA GeForce

– AMD (ATI) Radeon

– IMG PowerVR

– ARM Mali

• Summary

http://5pit.tw/tech/computer/tid_12880

Growth of NVIDIA GPU

• Performance metrics

– Since 2000, the amount of horsepower applied to processing 3D vertices and fragments has been growing at a remarkable rate.

Growth of NVIDIA GPU

NVIDIA GeForce 7900 GTX

NVIDIA Graphics Card Architecture

• GeForce 8 Series

– 12,288 concurrent threads, hardware managed

– 128 Thread Processor cores at 1.35 GHz = 518 GFLOPS peak

[Figure: GeForce 8 block diagram. The Host CPU feeds a Work Distribution unit. Eight clusters each contain two groups of SPs with an IU and Shared Memory, plus a TF unit and a TEX L1 cache. Six L2 cache and Memory partitions sit below the clusters.]
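The peak figure on this slide can be reproduced with one multiplication. The slide gives the core count and clock but not the work per cycle; the value of 3 floating-point operations per core per cycle (a multiply-add plus a multiply) is an assumption chosen here because it reproduces the quoted 518 GFLOPS. The function name is hypothetical.

```python
def peak_gflops(cores, clock_ghz, flops_per_core_per_cycle):
    """Peak throughput = cores x clock (GHz) x floating-point ops per core per cycle."""
    return cores * clock_ghz * flops_per_core_per_cycle


# 128 cores and 1.35 GHz are from the slide; 3 flops/cycle is an assumption.
geforce8_peak = peak_gflops(128, 1.35, 3)

# 12,288 hardware-managed threads spread over 128 cores.
threads_per_core = 12288 // 128
```

The thread count works out to 96 threads per core, which is how the hardware keeps cores busy while other threads wait on memory.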

NVIDIA FERMI

FERMI: Streaming Multiprocessor (SM)

• Each SM contains

– 32 cores

– 16 load/store units

– 32,768 registers

• Newer FP representation

– IEEE 754-2008

• Two units per core

– Floating point

– Integer

FERMI: Results

FERMI: Comparison

Kepler: Core Architecture

http://www.weistang.com/article-941-1.html

Titan vs Tesla Comparison

Maxwell: Core Architecture

http://www.weistang.com/article-941-1.html

http://www.coolaler.com/showthread.php/313295-%E5%8F%B2%E4%B8%8A%E6%9C%80%E9%AB%98%E6%95%88GPU%EF%BC%9ANVIDIA-Maxwell%E6%9E%B6%E6%A7%8B

Kepler vs Maxwell Comparison

[Figure: Kepler (2012) vs. Maxwell (2014)]

http://www.coolaler.com/showthread.php/313295-%E5%8F%B2%E4%B8%8A%E6%9C%80%E9%AB%98%E6%95%88GPU%EF%BC%9ANVIDIA-Maxwell%E6%9E%B6%E6%A7%8B

Mobile Roadmap

http://www.techbang.com/posts/19899-nvidia-shield-rebirths-carrying-kepler-into-the-tablet-market-discarded-palm-machine-changes-to-core-login-table-drawing-tablet?page=2

Pascal: Core Architecture

https://read01.com/zh-tw/oemmE4.html#.Wi5F30qWYps

Volta: Core Architecture

http://technews.tw/2017/05/11/nvidia-gpu-volta/

Pascal vs Volta Comparison

[Figure: Pascal (2016) vs. Volta (2017)]

http://technews.tw/2017/05/11/nvidia-gpu-volta/

https://zh.wikipedia.org/wiki/CUDA

ATI Radeon X1900 XTX

• Features of ATI Radeon X1900 XTX

– Core speed 650 MHz

– 48 pixel shader processors

– 8 vertex shader processors

– 51 GB/s memory bandwidth

– 512 MB memory

http://product.pcpop.com/000024721/Index.html

ATI Radeon X1900 XTX

• High Memory Bandwidth

[Figure: system diagram. Processor chip: CPU (3 GHz) with ½ MB cache, 1 GB main memory, ½ GB AGP memory. Graphics card: GPU (650 MHz, parallel processes) with ½ GB graphics memory and output. Bandwidth labels shown: AGP bus 2 GB/s, 3 GB/s, high bandwidth 51 GB/s (graphics card), high bandwidth 77 GB/s.]
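The bandwidth labels in the diagram translate directly into transfer times. The sketch below is illustrative only: the 512 MB data size is a made-up example equal to the card's graphics memory, the function name is hypothetical, 1 GB is taken as 1024 MB, and the quoted bandwidths are treated as sustained rates.

```python
def transfer_ms(megabytes, gigabytes_per_second):
    """Time in milliseconds to move a block of data at a given bandwidth.
    Assumption: 1 GB = 1024 MB and the link sustains its quoted rate."""
    gigabytes = megabytes / 1024
    return gigabytes / gigabytes_per_second * 1000


# Bandwidths are from the diagram: AGP bus 2 GB/s, graphics memory 51 GB/s.
over_agp_ms = transfer_ms(512, 2.0)
from_graphics_memory_ms = transfer_ms(512, 51.0)
slowdown = over_agp_ms / from_graphics_memory_ms
```

Under these assumptions, data crossing the AGP bus moves about 25 times slower than data already resident in graphics memory, which is why keeping textures and intermediate results on the card matters.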

ATI Radeon 9700

• Parallelism + pipelining: ATI Radeon 9700

[Figure: ATI Radeon 9700 block diagram showing 4 vertex pipelines and 8 pixel pipelines.]

Radeon Comparison

http://www.pcdiy.com.tw/detail/4275

http://wccftech.com/amd-vega-4096-gcn-stream-processors/

http://wccftech.com/amd-vega-4096-gcn-stream-processors/

http://www.anandtech.com/show/9233/amds-2016-gpu-roadmap-finfet-high-bandwidth-memory

http://www.anandtech.com/show/9233/amds-2016-gpu-roadmap-finfet-high-bandwidth-memory

https://www.youtube.com/watch?v=l_f_lIF3A7Q

https://www.cnread.news/content/2536026.html

http://intotech.ir/phone-tablet/proccessor/ -مقایسه-گوشی‌-گرافیکی-پردازنده‌ی powervr- /ای

http://imgtec.eetrend.com/news/7355

IMG PowerVR Series5XT (SGXMP)

IMG PowerVR Series5XT (SGXMP)

• Shader-driven Tile-Based Deferred Rendering (TBDR) architecture

• Fully programmable GPU using unique USSE architecture

• All SGX cores support OpenGL ES 2.0/1.1, OpenVG 1.1, OpenGL 2.0/3.0 and DirectX 9/10.1
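In tile-based deferred rendering, geometry is first sorted into screen tiles, and each tile is then processed on its own. The sketch below shows only that binning step, under stated assumptions: the 16×16 pixel tile size, the function name, and the conservative bounding-box overlap test are hypothetical choices for illustration, and the per-tile hidden-surface removal and shading that follow are not shown. It does not describe the actual SGX hardware.

```python
import math
from collections import defaultdict


def bin_triangles(triangles, width, height, tile=16):
    """Map each tile (tx, ty) to the indices of triangles whose bounding box
    overlaps it. The test is conservative: a triangle may be listed for a
    tile that its bounding box touches but its area does not."""
    bins = defaultdict(list)
    max_tx = (width - 1) // tile
    max_ty = (height - 1) // tile
    for index, triangle in enumerate(triangles):
        xs = [vertex[0] for vertex in triangle]
        ys = [vertex[1] for vertex in triangle]
        # Clamp the bounding box to the tiles that exist on screen.
        tx0 = max(0, math.floor(min(xs)) // tile)
        tx1 = min(max_tx, math.floor(max(xs)) // tile)
        ty0 = max(0, math.floor(min(ys)) // tile)
        ty1 = min(max_ty, math.floor(max(ys)) // tile)
        for ty in range(ty0, ty1 + 1):
            for tx in range(tx0, tx1 + 1):
                bins[(tx, ty)].append(index)
    return dict(bins)


# Example scene (made-up coordinates) on a 64x32 screen, i.e. 4x2 tiles.
scene = [
    ((2, 2), (10, 2), (2, 10)),      # fits inside tile (0, 0)
    ((20, 4), (40, 4), (30, 28)),    # spans tiles in columns 1-2, rows 0-1
]
tile_bins = bin_triangles(scene, width=64, height=32)
```

Once the bins exist, each tile can be rendered using only its own triangle list, so the working set for a tile stays small.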

IMG PowerVR Series6 (Rogue)

IMG PowerVR Series6 (Rogue)

• Supports OpenGL ES 3.0, OpenGL ES 2.0, OpenGL 3.x/4.x, OpenCL 1.x, and DirectX 10 on certain family members.

http://technews.tw/2014/07/19/powervr-rogue-gpu-list/

IMG PowerVR 7XT Plus

http://imgtec.eetrend.com/article/7130

IMG PowerVR 7XT Plus

http://imgtec.eetrend.com/article/7130

IMG PowerVR 7XT Plus

http://www.21ic.com/news/opto/201703/709965.htm

IMG PowerVR 8XE Plus

http://www.anandtech.com/show/11028/powervr-8xe-plus-announced

IMG PowerVR 8XE Plus

http://www.anandtech.com/show/11028/powervr-8xe-plus-announced

IMG PowerVR 8XE Plus

http://www.anandtech.com/show/11028/powervr-8xe-plus-announced

IMG PowerVR 8XE Plus

http://www.anandtech.com/show/11028/powervr-8xe-plus-announced

http://intotech.ir/phone-tablet/proccessor/ -مقایسه-گوشی‌-گرافیکی-پردازنده‌ی powervr- /ای

Features of ARM Mali

http://www.arm.com/products/graphics-and-multimedia/mali-gpu

ARM Mali-200

ARM Mali-300

ARM Mali-400MP

ARM Mali-450MP

ARM Mali-T604

ARM Mali-T604

• GPGPU (supports OpenCL 1.1)

• Tri-pipe architecture

• The first GPU based on the Midgard architecture

• True IEEE double-precision floating-point math in hardware for Full Profile

• The Job Manager within Mali-T600 Series GPUs offloads task management from the CPU to the GPU

• 5x performance improvement over previous Mali graphics processors.

ARM Mali-T678

ARM Mali-T678

• 50% performance improvement compared to the Mali-T658.

ARM Mali-T760

ARM Mali-T880

ARM Mali Comparison

https://zh.wikipedia.org/wiki/Mali_(GPU)

ARM Mali Comparison

https://zh.wikipedia.org/wiki/Mali_(GPU)

Applications (1/4)

• Includes many applications

– Ray tracer

– Image segmentation

– FFT/Linear Algebra

http://graphics.stanford.edu/data/3Dscanrep/stanford-bunny-cebal-ssh.jpg

http://f.fwallpapers.com/images/3d-bunny.jpg

Applications (2/4)

http://www.techbang.com/posts/19899-nvidia-shield-rebirths-carrying-kepler-into-the-tablet-market-discarded-palm-machine-changes-to-core-login-table-drawing-tablet?page=2

Applications (3/4)

http://wechatinchina.com/thread-461154-1-1.html

Applications (4/4)

AR and VR applications

http://wechatinchina.com/thread-461154-1-1.html

Summary

Understand the GPU pipeline in depth

Understand the motivation for GPU hardware

Understand modern GPU hardware architecture and specifications

Understand GPU/GPGPU applications

Reference

GPU Architecture & CG, Mark Colbert, 2006

Introduction to Graphics Hardware and GPUs, Yannick Francken, Tom Mertens

GPU Tutorial, Yiyunjin, 2007

Evolution of GPU and Graphics Pipelining, Weijun Xiao

Commercial product websites (NVIDIA, ATI, IMG, ARM)

SIGGRAPH 2005 Course Notes, David Luebke

Adapted from: David Luebke (University of Virginia) and NVIDIA

Jan Verschelde, MCS 572 Lecture 27, Introduction to Supercomputing, 17 March 2014

Acknowledgement:

Thanks to the TAs for their help in preparing the material.