GPGPU: Bruteforce is NOT DEAD !

How to make a "Bruteforcing" tool using GPGPU

GPGPU: Bruteforce is NOT DEAD ! // NineTiger && COLDMU

Who we are?

NineTiger (정구범)
http://blog.ninetiger.com
[email protected]

COLDMU (양찬무)
http://coldmu.tistory.com
[email protected]

If you have any questions, just send us an e-mail!

Goal of Lecture

Learn about GPGPU, CUDA, and OpenCL, and become able to build a bruteforcing toolkit for any algorithm you want.

+ A gifticon (mobile gift voucher) for every section! Sweet deal!

Content

GPGPU
• Future direction
• GPGPU?
• CPU vs GPU
• CUDA vs OpenCL

CUDA
• Understanding CUDA
• How to install CUDA
• Hello World !

OpenCL
• Understanding OpenCL
• Parallel Programming
• OpenCL Architecture
• Hello World !

TRAINING
• OpenCL Framework
• SHA1
• Demo: 공인인증서 (Korean accredited certificate)

Q & A

Future direction

1995

Future direction: many-core designs are more power-efficient.

One processor running at frequency f (input → processor → output):
capacitance = C, voltage = V, operating frequency = f
$\text{Power} = C V^2 f$

Two processors, each running at f/2, with input and output still at rate f:
capacitance = 2.2C, voltage = 0.6V, operating frequency = 0.5f
$\text{Power} = 2.2C \cdot (0.6V)^2 \cdot 0.5f = 0.396\, C V^2 f$
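The arithmetic on this slide can be checked with a few lines of C. This is a minimal sketch; the function names `dynamic_power` and `dual_core_power_ratio` are made up for illustration, and the inputs are the normalized values from the slide (C = V = f = 1 for the single-processor case).

```c
#include <assert.h>

/* Dynamic power of a processor: P = C * V^2 * f. */
double dynamic_power(double capacitance, double voltage, double frequency)
{
    assert(capacitance >= 0.0 && frequency >= 0.0);
    return capacitance * voltage * voltage * frequency;
}

/* Power of the two-processor design relative to the single processor,
 * using the slide's numbers: 2.2C, 0.6V, 0.5f versus C, V, f. */
double dual_core_power_ratio(void)
{
    double single = dynamic_power(1.0, 1.0, 1.0);
    double dual = dynamic_power(2.2, 0.6, 0.5);
    return dual / single;
}
```

Calling `dual_core_power_ratio()` gives roughly 0.396, so the two slower processors deliver the same throughput for about 40% of the power.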

Future direction: specialized chips are more power-efficient.

[Figure: bar chart of GFLOPS/Watt (axis from 0 to 16) comparing the Intel 80-core tera-scale processor (97W), the NVIDIA GTX 280 (236W), and the Intel Core2 Quad processor Q6700 (95W).]

Many-core + specialized chips = the era of heterogeneous many-core platforms has arrived.

How should software be designed for such platforms?

GPGPU?

General-Purpose computing on Graphics Processing Units

A technique that uses the GPU, which originally handled only computer-graphics calculations, to perform the computations of programs that the CPU used to run.

CUDA and OpenCL are what make this possible.

CPU vs GPU

The GPU has been improved purely for computational efficiency.

Everything else is handled by the CPU.

Matching its performance takes roughly ten i7 CPUs combined.

CPU vs GPU

Number of cores

CPU: 1, 2, 4, 8, …

GPU: 240, 480, …

CPU vs GPU

GPU
• The board, the memory, and the GPU all sit on the graphics card
• Easy for the manufacturer to develop
• Evolves very quickly
• Ex) In contrast to the CPU side, graphics cards already use GDDR5

CPU
• The CPU, the motherboard, and the memory all come from different manufacturers
• Harder to develop
• Evolves slowly
• Ex) Memory took a long time to go from DDR1 to DDR3

CUDA vs OpenCL

CUDA
- Compatibility
+ Simple development

OpenCL
+ Compatibility
- Complex development

Recently, CUDA and OpenCL differ by less than 10 percent.

In other words, whichever you use, you will not see a big difference.

Review: GPGPU

1. GPUs have advanced quickly because they are easy for their manufacturers to develop. Why are they easy to develop?

2. Name any one difference between a CPU and a GPU.

CUDA

Compute Unified Device Architecture

What are the contents?

UNDERSTANDING CUDA

HOW TO INSTALL CUDA

HELLO WORLD

Understanding CUDA

• CUDA?
• Data Processing
• Load Calculation
• TPC/SM/SP
• Thread/Block/Grid
• CUDA Function
• Thread
• Basic Function
• Function Modifier

CUDA?

Compute Unified Device Architecture

NVIDIA built an integrated environment that provides a programming model, a programming language, a compiler, libraries, and a profiler, so that general-purpose programs can be developed on the GPU.

The CUDA integrated development environment was announced in February 2007.

CPU Data Processing

[Diagram: on the motherboard, input data flows from memory (DRAM) into the CPU and output data flows back to memory. Inside the CPU are the FPU (floating-point operations), the ALU (integer operations), the registers, and the LSU (load/store unit).]

GPU Data Processing

[Diagram: input data is copied from the memory (DRAM) on the motherboard to the graphics card memory (DRAM). The GPU on the graphics card processes the data, and the output data is copied back to the motherboard memory. Inside the GPU, the cores are arranged in groups, and each group of cores has its own shared memory.]

Load Calculation

Actual computation time: processing time on the CPU vs processing time on the GPU.

Total program time: processing time on the CPU vs the time to transfer the input data from the PC to the graphics card + processing time on the GPU + the time to transfer the output data from the graphics card to the PC.
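The comparison above boils down to adding the two transfer times to the GPU computation time before comparing against the CPU. A minimal sketch in C follows; the function names and the millisecond figures in the usage note are hypothetical and not taken from the slides.

```c
#include <assert.h>

/* Total time of the GPU path: copy in, compute, copy out (all in ms). */
double gpu_total_ms(double host_to_device_ms, double kernel_ms,
                    double device_to_host_ms)
{
    assert(host_to_device_ms >= 0.0 && kernel_ms >= 0.0 &&
           device_to_host_ms >= 0.0);
    return host_to_device_ms + kernel_ms + device_to_host_ms;
}

/* Returns 1 when the whole GPU path beats the CPU, 0 otherwise. */
int gpu_is_worth_it(double cpu_ms, double host_to_device_ms,
                    double kernel_ms, double device_to_host_ms)
{
    return gpu_total_ms(host_to_device_ms, kernel_ms, device_to_host_ms)
           < cpu_ms;
}
```

For example, with made-up numbers of 4 ms in, 2 ms of kernel time, and 4 ms out, the GPU path costs 10 ms. It loses to a CPU that needs 8 ms even though the kernel alone is four times faster, and wins easily against a CPU that needs 50 ms.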

TPC / SM / SP

TPC (Texture Processor Cluster): contains SMs, texture units, and a controller.

SM (Streaming Multiprocessor): controls the SPs; holds the instruction cache and the data cache.

SP (Streaming Processor): a single core.

[Diagram: one TPC containing the texture units with a texture L1 cache, a geometry controller, an SMC, and the SMs. Each SM contains an I-cache, an MT issue unit, a C-cache, its SPs, two SFUs, and shared memory.]

Thread / Block / Grid

Threads are grouped into a block.

Blocks are grouped into a grid.

Grid: 2-dimensional (on the early CUDA hardware described here; compute capability 2.0 and later also allow 3-dimensional grids)

Block: 3-dimensional

dim3 Dg(3, 4, 1);
dim3 Db(4, 3, 3);
Kernel<<<Dg, Db>>>(a, b, c);

[Diagram: a grid made of Block(0,0), Block(1,0), Block(0,1), and Block(1,1); Block(0,1) is expanded to show Thread(0,0), Thread(1,0), Thread(0,1), and Thread(1,1).]
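To see what the launch configuration Dg(3, 4, 1), Db(4, 3, 3) means in numbers, the index arithmetic can be emulated on the host in plain C. This is a sketch only: `dim3_t`, `dim3_count`, and `flat_thread_id` are hypothetical stand-ins, not CUDA API, and the row-major flattening shown is one common convention rather than something the slides prescribe.

```c
#include <assert.h>

/* Host-side stand-in for CUDA's dim3 (hypothetical type for this sketch). */
typedef struct { unsigned x, y, z; } dim3_t;

/* Number of elements described by a dim3: x * y * z. */
unsigned dim3_count(dim3_t d)
{
    return d.x * d.y * d.z;
}

/* Flattens (block index, thread index) into one global thread number,
 * the way a kernel often combines blockIdx, blockDim, and threadIdx. */
unsigned flat_thread_id(dim3_t grid_dim, dim3_t block_dim,
                        dim3_t block_idx, dim3_t thread_idx)
{
    unsigned block_id, local_id;

    assert(block_idx.x < grid_dim.x && block_idx.y < grid_dim.y &&
           block_idx.z < grid_dim.z);
    assert(thread_idx.x < block_dim.x && thread_idx.y < block_dim.y &&
           thread_idx.z < block_dim.z);

    block_id = block_idx.x + block_idx.y * grid_dim.x +
               block_idx.z * grid_dim.x * grid_dim.y;
    local_id = thread_idx.x + thread_idx.y * block_dim.x +
               thread_idx.z * block_dim.x * block_dim.y;
    return block_id * dim3_count(block_dim) + local_id;
}
```

With these dimensions the grid has 3 × 4 × 1 = 12 blocks of 4 × 3 × 3 = 36 threads each, so the kernel runs 432 threads, numbered 0 to 431 by `flat_thread_id`.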

CUDA Function

__global__ void KernelFunction(int a, int b, int c)
{
    // Function that runs on the GPU.
}

int main()
{
    // Launch syntax: KernelFunction<<<blocks, threads>>>(arguments).
    // One block of one thread is used here as the smallest possible launch.
    KernelFunction<<<1, 1>>>(1, 2, 3);
    return 0;
}

Thread

In a CPU environment, efficiency is best when the number of threads created equals the number of CPU cores. With more threads than cores, time-sliced scheduling kicks in and efficiency drops.

In a CUDA environment, all you need to do is split the work into many small pieces.

Thread

CPU Context Switch
• Limited number of cores
• Context switching takes relatively long
• 8 to 16 registers per core

GPU Context Switch
• Many-core
• The information saved on a context switch is minimized
• 16,384 registers per SM
• Each thread needs 16 registers -> 16,384 / 16 = 1,024 threads per SM
• 30 SMs -> 1,024 * 30 = 30,720 threads
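The register arithmetic above can be written as two small C functions. This is a sketch with hypothetical function names; it models only the register limit from the slide, while real hardware also caps resident threads in other ways.

```c
#include <assert.h>

/* How many threads fit on one SM if registers are the only limit. */
unsigned threads_per_sm(unsigned registers_per_sm,
                        unsigned registers_per_thread)
{
    assert(registers_per_thread > 0);
    return registers_per_sm / registers_per_thread;
}

/* The same limit summed over every SM on the GPU. */
unsigned threads_on_gpu(unsigned sm_count, unsigned registers_per_sm,
                        unsigned registers_per_thread)
{
    return sm_count * threads_per_sm(registers_per_sm, registers_per_thread);
}
```

`threads_per_sm(16384, 16)` returns 1024 and `threads_on_gpu(30, 16384, 16)` returns 30720, matching the slide. Because each thread keeps its own registers on the chip, switching between threads does not require saving and restoring them.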

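[Analogy: two ways of recording the same purchase. The full record keeps the customer's details together with the items; the minimal record keeps only the items and the total.]

Full records
Name: COLDMU, Age: 24, Sex: M. Items: apple, drink. Total: 10,000 won
Name: NineTiger, Age: 24, Sex: M. Items: snacks, bread. Total: 5,000 won

Minimal records
Items: apple, drink. Total: 10,000 won
Items: snacks, bread. Total: 5,000 won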


Basic Function

cudaError_t cudaMalloc(void** devPtr, size_t count)

cudaError_t cudaFree(void* devPtr)

cudaError_t cudaMemcpy(void* dst, const void* src, size_t count, enum cudaMemcpyKind kind)

Values for kind:

cudaMemcpyHostToHost : from PC memory to PC memory

cudaMemcpyHostToDevice : from PC memory to graphics card memory

cudaMemcpyDeviceToHost : from graphics card memory to PC memory

cudaMemcpyDeviceToDevice : from graphics card memory to graphics card memory

Function Modifier

__global__
• Runs on the device
• Called from the host. Cannot be called from the device.

__device__
• Runs on the device
• Called from the device. Cannot be called from the host.

__host__
• Runs on the host
• Called from the host. Cannot be called from the device.

How to install CUDA

• Download
• Install
• Settings

[Screenshots: downloading the CUDA installer (2 slides) and running the installer (3 slides).]

Hello World !

[Screenshots: creating the solution, the Hello World code, the parallel version of the code, and the result of the parallel run.]

Debugging

• 1 GPU
• 2 GPUs
• 2 computers

Source

CUDA 병렬 프로그래밍 (CUDA Parallel Programming) / by 정영훈 / Freelec

https://developer.nvidia.com/cuda-education-training (CUDA Online Training)

Review: CUDA

1. Why is it named CUDA?

2. Why does a CPU program create only as many threads as there are cores, while a GPU program can split the work into many more?

3. What are the device and the host?

OpenCL

Open Computing Language

Before we start

I would like all of you to understand as much as possible.

1. Have you programmed in C?

2. Have you studied the Windows, Linux, or other system APIs?

3. Do you know what a process, a thread, and an event are?

What are the contents?

UNDERSTANDING OPENCL

PARALLEL PROGRAMMING

OpenCL Architecture

Hello World !

Understanding OpenCL

• OpenCL?

OpenCL?

Open Computing Language

An industry-standard framework for programming computers built from a combination of heterogeneous processors such as CPUs and GPUs.

OpenCL does not hide the hardware behind an elegant abstraction. Instead, it exposes the hardware, and that is how it provides a high degree of portability.

This means the programmer must explicitly define the platform, the context, and how work is scheduled onto the different devices.

Parallel Programming

• Parallelism and concurrency
• Parallel Programming Model
• Why does OpenCL use a parallel model?

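Parallelism and concurrency

Concurrency
A program simply consists of two or more active computation streams. If these streams can make progress together, the program is said to be concurrent.

Parallelism
Concurrent software runs on a computer that has several Processing Elements (PEs), the hardware units that are the basic unit of execution, and the streams really do execute at the same time on multiple PEs. In other words, when concurrency is supported by the hardware, it is called parallelism.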

Parallelism and concurrency

Programmers have to think very hard about concurrency.

Steps: define the computation streams -> associate the data with the computations -> satisfy the dependencies between results

Summing 1 to 100: addition -> split into 10 chunks of 10 numbers -> add the partial results together

Quadratic formula: how should it be split? -> how should the results be combined?

Therefore, to make such problems easier to handle, we model them.
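The "summing 1 to 100" example can be written out directly. The following C sketch runs the ten chunks one after another on the host, but each call to `chunk_sum` is independent of the others, which is exactly what would let ten PEs compute them at the same time. The function names are made up for this illustration.

```c
#include <assert.h>

#define CHUNKS 10
#define CHUNK_SIZE 10

/* One independent computation stream: sum `count` integers from `first`. */
int chunk_sum(int first, int count)
{
    int total = 0;
    assert(count >= 0);
    for (int i = 0; i < count; i++)
        total += first + i;
    return total;
}

/* Split 1..100 into 10 chunks of 10, then combine the partial results. */
int sum_in_chunks(void)
{
    int partial[CHUNKS];
    int total = 0;

    for (int c = 0; c < CHUNKS; c++)      /* independent: could run in parallel */
        partial[c] = chunk_sum(1 + c * CHUNK_SIZE, CHUNK_SIZE);

    for (int c = 0; c < CHUNKS; c++)      /* dependency: needs every partial sum */
        total += partial[c];
    return total;
}
```

The second loop is the "satisfy the dependencies" step: it cannot start until every partial sum exists, so it is the only part that forces the streams to meet.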

Parallel programming model

Task-parallel

The programmer defines the concurrent tasks by hand, divides them up, and maps them onto the PEs.

Load balancing can be a harder problem.

[Diagram: 6 independent tasks of different lengths. On 3 PEs, a poorly balanced assignment finishes later than a well balanced one.]
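The effect of load balancing can be sketched in C by comparing two ways of mapping tasks to PEs. Everything here is an illustrative assumption: the slides do not name a scheduling algorithm, the greedy "longest task first" rule is just one common heuristic, and the task durations in the usage note are invented.

```c
#include <assert.h>

#define MAX_PE 16
#define MAX_TASKS 64

/* Finish time of the busiest PE. */
static int max_load(const int *load, int pe_count)
{
    int worst = 0;
    for (int p = 0; p < pe_count; p++)
        if (load[p] > worst)
            worst = load[p];
    return worst;
}

/* Naive mapping: task i always goes to PE (i % pe_count). */
int makespan_round_robin(const int *duration, int n, int pe_count)
{
    int load[MAX_PE] = {0};
    assert(pe_count > 0 && pe_count <= MAX_PE && n >= 0);
    for (int i = 0; i < n; i++)
        load[i % pe_count] += duration[i];
    return max_load(load, pe_count);
}

/* Greedy mapping: longest task first, each onto the least loaded PE. */
int makespan_greedy(const int *duration, int n, int pe_count)
{
    int sorted[MAX_TASKS];
    int load[MAX_PE] = {0};
    assert(pe_count > 0 && pe_count <= MAX_PE && n >= 0 && n <= MAX_TASKS);

    for (int i = 0; i < n; i++) {         /* insertion sort, descending */
        int j = i;
        while (j > 0 && sorted[j - 1] < duration[i]) {
            sorted[j] = sorted[j - 1];
            j--;
        }
        sorted[j] = duration[i];
    }
    for (int i = 0; i < n; i++) {
        int best = 0;
        for (int p = 1; p < pe_count; p++)
            if (load[p] < load[best])
                best = p;
        load[best] += sorted[i];
    }
    return max_load(load, pe_count);
}
```

For six tasks lasting {6, 1, 1, 5, 1, 4} time units on 3 PEs, the round-robin mapping finishes at time 11, while the greedy mapping finishes at time 6, which is the best possible for 18 units of work on 3 PEs.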

Parallel programming model

Data-parallel

The same instruction stream is applied concurrently to every data element.

Load balancing is much less of a problem.

A single task, TASK(A[i]), is applied in parallel:

A_Vector: 6 1 1 0 9 2 4 1 1 9

A_Result: 36 1 1 0 81 4 16 1 1 81
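The A_Vector to A_Result example is an element-wise square. A host-side C sketch of the same idea follows; the loop stands in for the parallel hardware, since each iteration touches only its own element and could be handed to a separate PE. The names `task_square` and `data_parallel_map` are hypothetical.

```c
#include <assert.h>

/* The single task that is applied to every element: TASK(A[i]) = A[i]^2. */
int task_square(int x)
{
    return x * x;
}

/* Applies the task to each element. Iterations are independent, so on a
 * data-parallel device each index i would be one work-item or thread. */
void data_parallel_map(const int *in, int *out, int n)
{
    assert(n >= 0);
    for (int i = 0; i < n; i++)
        out[i] = task_square(in[i]);
}
```

Feeding it the slide's A_Vector {6, 1, 1, 0, 9, 2, 4, 1, 1, 9} produces the A_Result {36, 1, 1, 0, 81, 4, 16, 1, 1, 81}. Every element costs the same amount of work, which is why load balancing is rarely an issue in this model.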

Why does OpenCL use a parallel model?

OpenCL does not encourage a style in which the CPU only handles I/O and hands every other job to the GPGPU.

OpenCL aims to exploit the advantages of many-core and heterogeneous programming.

By allowing access to every device that supports OpenCL, it takes the opposite approach to abstraction.

The goal is efficient programming that puts every OpenCL device to use.

OpenCL Architecture

• Platform Model
• Execution Model
• Memory Model
• Programming Model
• OpenCL limitations
• OpenCL Component

Platform Model

A high-level description of a heterogeneous system.

Host: exactly one. It is connected to all compute devices.

Compute Device: = OpenCL device. This is where the instruction streams (kernels) execute.

Compute Unit: ≈ work-group

PE: where the actual computation takes place

Execution Model

Host Program: the program that interacts with the OpenCL objects. The OpenCL specification does not define in detail how it must behave.

Kernel: a function that transforms input memory objects into output memory objects. It does the actual work.

Command Queue: an in-order or out-of-order queue through which kernels and data are submitted.

Work-item: an instance of the kernel, processed by a PE.

Work-group: a collection of work-items.

Execution Model

1. Select an OpenCL platform and create a context

2. Enumerate the devices and create a command queue

3. Create and build a program object

4. Create a kernel object, and create memory objects for the kernel arguments

5. Run the kernel and read back the result

6. Check for OpenCL errors

Memory Model

Private Memory: per work-item

Local Memory: per work-group. Can be shared among the work-items inside the same work-group.

Global/Constant Memory: not synchronized

Host Memory: CPU, RAM …

Programming Model

Data-parallel Model
• Single Instruction Multiple Data (SIMD): the same instruction, only the data differs
• Single Program Multiple Data (SPMD): the same kernel runs, but the work done varies
• OpenCL is designed with data parallelism as its foundation

Task-parallel Model
• Parallelism inside a kernel
• Parallelism through out-of-order queues: an optional feature that does not work on every platform
• Parallelism through the event model

OpenCL limitations

OpenCL provides no synchronization mechanism between work-items that belong to different work-groups. Algorithms that rely on such synchronization are therefore not guaranteed to run safely.

Because of limits in OpenCL implementations, there may be parallel patterns that cannot be implemented in OpenCL.

More restrictions are listed in the OpenCL Specification: http://www.khronos.org/registry/cl/specs/opencl-1.x-latest.pdf#page=157

OpenCL Component

Platform API: finding devices

Runtime API: every function needed at run time, such as the functions that put commands into a queue

Programming Language: OpenCL C, an extension of ISO C99. Because portability matters, a few features that only CPUs support are left out: recursion, function pointers, bit-fields, and the standard library.

Review: OpenCL

1. Describe, as far as you know, the steps by which an OpenCL program runs.

2. Explain what the Kernel and the Host are in OpenCL.

OpenCL image sources

http://ahnlabsabo.tistory.com/1426

http://amd.com

http://www.khronos.org/assets/uploads/developers/library/overview/opencl_overview.pdf

Goal of Lecture

Learn about GPGPU, CUDA, and OpenCL, and build a bruteforcing toolkit for any algorithm you want!

+ The last and most expensive gifticon is still up for grabs! Sweet deal!

Install

• Favorite C IDE
• OpenCL SDK
• Boost Lib

Hello World !

• Setting up the build environment
• Running Hello World
• Hello World kernel analysis
• Host program analysis

Setting up the build environment

Check OpenCL support

CPU:
• Intel driver: SSE4.1 or later
• AMD driver: SSE2.x or later

GPU:
• Intel driver: HD 4000 or later
• NVIDIA driver: 2009 or later, see https://developer.nvidia.com/cuda-gpus
• AMD driver: 2009 or later, see http://developer.amd.com/tools-and-sdks/opencl-zone/opencl-tools-sdks/amd-accelerated-parallel-processing-app-sdk/system-requirements-driver-compatibility/

Install an OpenCL SDK
• AMD APP SDK or Intel OpenCL SDK
• AMD CodeXL (only on AMD GPUs)

Building OpenCL in Visual Studio
• C/C++ > Additional Include Directories: C:\Program Files (x86)\AMD APP SDK\2.9\include
• Linker > Additional Dependencies: C:\Program Files (x86)\AMD APP SDK\2.9\lib\x86_64\OpenCL.lib
• Project main directory: kernel.cl (the kernel file to run)

Running Hello World

For i = 0 onward, compute i + i*2 and print the result.

Hello World Kernel Analysis

Every leading __ (two underscores) can be omitted.

The kernel is written in OpenCL C and usually has the .cl extension.

__kernel
• The qualifier that marks a kernel function in OpenCL
• Always returns void

__global, __local, __constant, __private (default)
• Qualifiers that state in which memory the pointed-to data is allocated

get_global_id()

[Diagram: the array 6 1 1 0 9 2 4 1 1 9 annotated with the work-item functions: get_work_dim = 1, get_global_size = 10, get_num_groups = 2, get_local_size = 5, get_group_id = 0, get_local_id = 2, get_global_id = 7.]
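The relationship between these index functions can be emulated on the host. The C sketch below assumes a one-dimensional range with no global offset, in which case global id = group id × local size + local id. The struct and function names are hypothetical, and this is plain C, not OpenCL C.

```c
#include <assert.h>

/* Values the OpenCL work-item functions would report for one work-item
 * in a 1-D range (hypothetical host-side emulation). */
typedef struct {
    int global_id;   /* get_global_id(0)  */
    int group_id;    /* get_group_id(0)   */
    int local_id;    /* get_local_id(0)   */
    int num_groups;  /* get_num_groups(0) */
} work_item_ids;

/* Derives the group and local ids from a global id, assuming the global
 * size is a multiple of the local size and the global offset is zero. */
work_item_ids ids_for(int global_id, int global_size, int local_size)
{
    work_item_ids ids;

    assert(local_size > 0 && global_size % local_size == 0);
    assert(global_id >= 0 && global_id < global_size);

    ids.global_id = global_id;
    ids.group_id = global_id / local_size;
    ids.local_id = global_id % local_size;
    ids.num_groups = global_size / local_size;
    return ids;
}
```

With the diagram's sizes (global size 10, local size 5), `ids_for(7, 10, 5)` gives group id 1 and local id 2, and there are 2 groups in total. The first five elements belong to group 0 and the last five to group 1.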

Host Program Analysis

Select the OpenCL platform: CreateContext()

Create the device's command queue: CreateCommandQueue()

Create and build the program object: CreateProgram()

Create the kernel object: clCreateKernel()

Kernel arguments = create the memory objects: clSetKernelArg()

Enqueue the kernel for execution: clEnqueueNDRangeKernel()

Enqueue a read of the result values: clEnqueueReadBuffer()

Host Program Analysis

How to use clGetxxInfo()

cl_device_info flags

CREATE EMPTY CONSOLE PROJECT

Use the multi-byte character set for convenience.

Start with a basic cpp file and a basic cl file.

INCLUDE HEADER & LIBRARY

Compile and test to check that everything is set up correctly.

If it passes, the environment is ready to use OpenCL.

BASIC HOST PROGRAM

Add readKernelFile, try/catch, and a build function.

It is working if you get ERROR: clCreateProgramWithSource(-30). Error code -30 is CL_INVALID_VALUE.

platform = profile
context = device type
device = device
src = source
program = compilable or compiled source

SIMPLE OPENCL KERNEL

program.build(devices); must succeed.

SIMPLE HOST PROGRAM

It works if result[500] == 250000.

GID2KEY ?

1. Is using a queue very slow?

2. Does sending data to the GPGPU take a very long time?

3. Is using a one-dimensional array very inefficient for processing?

[Figures: a sequence of slides illustrating GID2KEY, including a character set A B C D E F G H labeled with the numbers 1 2 3 4 5 6 7 8.]

GID2KEY

Replace the cpp/cl files: copy and paste over the existing files.

The OpenCL BruteForce Tool Framework is now complete.

Add the algorithm you want yourself to finish the tool.
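The slides do not show the GID2KEY code itself, so the following is only a guess at the idea behind the name: turning a work-item's global id into the candidate key that work-item should test. One common way to do this is to read the id as a number in base N, where N is the size of the character set. The function name, signature, and character set below are assumptions made for this sketch, and the code is plain host-side C.

```c
#include <assert.h>
#include <stddef.h>

/* Maps a global id to a fixed-length key by treating the id as a number
 * in base charset_len (assumed scheme, not taken from the slides).
 * Writes key_len characters plus a terminating NUL into out.
 * Returns 1 on success, 0 if gid lies outside the key space. */
int gid_to_key(unsigned long long gid, const char *charset,
               size_t charset_len, size_t key_len, char *out)
{
    assert(charset_len > 0);

    for (size_t pos = key_len; pos-- > 0; ) {   /* fill from the last char */
        out[pos] = charset[gid % charset_len];
        gid /= charset_len;
    }
    out[key_len] = '\0';

    /* Anything left over means gid >= charset_len^key_len. */
    return gid == 0;
}
```

With the character set "ABCDEFGH" and a key length of 3, id 0 maps to "AAA", id 8 maps to "ABA", and id 511 maps to "HHH", so 512 work-items cover the whole key space without any table of keys being sent to the device. Each work-item derives its own candidate from nothing but its id.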

BRUTEFORCE SHA1

Review the C SHA1 source.

Check the exceptions that apply when converting it to OpenCL C99.

Make a few modifications.
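As a starting point for this step, here is a compact SHA-1 in C. It is not the source used in the talk, which the slides do not include. It is a sketch restricted to messages shorter than 56 bytes, so that the message and its padding fit in one 64-byte block, which is enough for short candidate keys. The core routine avoids recursion, function pointers, and standard library calls, the features the deck lists as missing from OpenCL C. The name `sha1_short` is made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Rotate a 32-bit word left by n bits (0 < n < 32). */
static uint32_t rotl32(uint32_t x, int n)
{
    return (x << n) | (x >> (32 - n));
}

/* SHA-1 of a message shorter than 56 bytes (single-block sketch). */
void sha1_short(const uint8_t *msg, size_t len, uint32_t digest[5])
{
    uint8_t block[64] = {0};
    uint32_t w[80];
    uint32_t a, b, c, d, e;
    size_t i;

    assert(len < 56);
    for (i = 0; i < len; i++)
        block[i] = msg[i];
    block[len] = 0x80;                      /* padding starts with a 1 bit */
    block[62] = (uint8_t)((len * 8) >> 8);  /* big-endian length in bits;  */
    block[63] = (uint8_t)(len * 8);         /* under 448, so 2 bytes do    */

    for (i = 0; i < 16; i++)                /* load block as big-endian words */
        w[i] = ((uint32_t)block[4 * i] << 24) |
               ((uint32_t)block[4 * i + 1] << 16) |
               ((uint32_t)block[4 * i + 2] << 8) |
               (uint32_t)block[4 * i + 3];
    for (i = 16; i < 80; i++)               /* message schedule */
        w[i] = rotl32(w[i - 3] ^ w[i - 8] ^ w[i - 14] ^ w[i - 16], 1);

    a = 0x67452301; b = 0xEFCDAB89; c = 0x98BADCFE;
    d = 0x10325476; e = 0xC3D2E1F0;

    for (i = 0; i < 80; i++) {              /* 80 rounds in 4 stages */
        uint32_t f, k, t;
        if (i < 20)      { f = (b & c) | (~b & d);         k = 0x5A827999; }
        else if (i < 40) { f = b ^ c ^ d;                   k = 0x6ED9EBA1; }
        else if (i < 60) { f = (b & c) | (b & d) | (c & d); k = 0x8F1BBCDC; }
        else             { f = b ^ c ^ d;                   k = 0xCA62C1D6; }
        t = rotl32(a, 5) + f + e + k + w[i];
        e = d; d = c; c = rotl32(b, 30); b = a; a = t;
    }

    digest[0] = 0x67452301 + a;
    digest[1] = 0xEFCDAB89 + b;
    digest[2] = 0x98BADCFE + c;
    digest[3] = 0x10325476 + d;
    digest[4] = 0xC3D2E1F0 + e;
}
```

Hashing the three bytes "abc" yields the standard test vector a9993e36 4706816a ba3e2571 7850c26c 9cd0d89d. In a brute-force kernel, each work-item would build its candidate key, hash it with a routine like this, and compare the five words against the target digest.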

SHA1 complete: DEMO

공인인증서 (Korean accredited certificate) BruteForce DEMO

Q & A

THANK YOU