
© 2009 IBM Corporation

Using the High Performance Power System Efficiently

High Performance Power System

Date: 15/10/2009
DongJoon Cho (djcho@kr.ibm.com)
MTS, GTS, IBM Korea


Agenda

• Concerns about Power System

• Summary of the solutions

• Architectures for effective computing
   – H/W Architecture
   – System Architecture
   – S/W Architecture


Concerns about Power System

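• Why can't a high-performance server be used at 100% after it has been purchased?

• The CPU clock is higher, so why doesn't application performance follow?

• The clock is twice as fast, so why isn't performance twice as high?

• Memory was doubled, so why doesn't utilization drop to ½?

• Why does the IBM Power System score a higher tpmC than other systems?

• The IBM Power System has good response time, so why is its utilization high?

Will performance improve just by replacing the system, with no change to the S/W?

What do we know about the system besides its CPU clock?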


Summary of the solutions

• Indirect methods
   – Firmware update
   – AIX update
   – Software update

• Direct methods
   – AIX configuration
   – Plan/Selection Hardware
   – System Architecture
   – Software Architecture

Most software problems are hard to solve by direct methods because of development time and cost.


Hardware Architecture - CPU

• CISC
   – Complex Instruction Set Computer Architecture
   – Designed to provide every instruction that might be needed
   – VAX, x86

• EPIC
   – Explicitly Parallel Instruction Computing Architecture
   – Jointly designed by HP and Intel; provides explicit parallelism
   – IA64

• RISC
   – Reduced Instruction Set Computer Architecture
   – Reduces the instruction set to only the most frequently used instructions, so that execution time is shortened for most workloads
   – SPARC, POWER, PA-RISC


Hardware Architecture - CPU Instructions

• Computation Instructions

   Arithmetic operations      Logical operations
   ADD   Add                  AND    True if A and B true
   SUB   Subtract             OR     True if A or B true
   MUL   Multiply             NOT    True if A is false
   DIV   Divide               XOR    True if only one of A and B is true
   INC   Increment            SHL    Shift bits left
   DEC   Decrement            SHR    Shift bits right
   CMP   Compare              BSWAP  Reverse byte order

• Operands Types

   Stack      Accumulator   Register          Memory
   Push A     Ld A          Ld R1, A          Add C, B, A
   Push B     Add B         Ld R2, B
   Add        St C          Add R3, R2, R1
   Pop C                    St C, R3
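The stack and register sequences in the operand-type table both compute C = A + B. A minimal Python sketch of two toy interpreters can make the difference concrete; the tuple encoding, the function names, and the sample values are assumptions made for illustration and do not model any real instruction set.

```python
def run_stack(program, memory):
    """Run a stack-style sequence: operands are implicit on the stack."""
    stack = []
    for op, *args in program:
        if op == "push":
            stack.append(memory[args[0]])
        elif op == "add":
            stack.append(stack.pop() + stack.pop())
        elif op == "pop":
            memory[args[0]] = stack.pop()
    return memory


def run_load_store(program, memory):
    """Run a register-style sequence: arithmetic only touches registers."""
    regs = {}
    for op, *args in program:
        if op == "ld":
            regs[args[0]] = memory[args[1]]
        elif op == "add":
            dst, left, right = args
            regs[dst] = regs[left] + regs[right]
        elif op == "st":
            memory[args[0]] = regs[args[1]]
    return memory


# The two sequences from the table, encoded as tuples (hypothetical encoding).
STACK_PROGRAM = [("push", "A"), ("push", "B"), ("add",), ("pop", "C")]
REGISTER_PROGRAM = [("ld", "R1", "A"), ("ld", "R2", "B"),
                    ("add", "R3", "R2", "R1"), ("st", "C", "R3")]

if __name__ == "__main__":
    print(run_stack(STACK_PROGRAM, {"A": 2, "B": 3})["C"])
    print(run_load_store(REGISTER_PROGRAM, {"A": 2, "B": 3})["C"])
```

Both programs store the same result in C; the register form needs explicit loads and a store, while the stack form names no operands on the Add itself.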


Hardware Architecture - CPU Instructions

• Data Transfer Instructions

   LD     Load value from memory to a register
   ST     Store value from a register to memory
   MOV    Move value from register to register
   CMOV   Conditionally move value from register to register if a condition is met
   PUSH   Push value onto top of stack
   POP    Pop value from top of stack


Hardware Architecture - CPU Instructions

• Control Flow Instructions

   JMP    Unconditional jump to another instruction
   BR     Branch to instruction if condition is met
   CALL   Call a procedure
   RET    Return from procedure
   INT    Software interrupt

• Control Flow Relative Frequency

   Instruction      Integer programs   Floating-point programs
   Branch           75%                82%
   Jump             6%                 10%
   Call & return    19%                8%


Hardware Architecture - CPU Instructions

• Common Instructions

   Instruction   Instruction type   Percent of instructions executed
   Load          Data transfer      22%
   Branch        Control flow       20%
   Compare       Computation        16%
   Store         Data transfer      12%
   Add           Computation        8%
   And           Computation        6%
   Sub           Computation        5%
   Move          Data transfer      4%
   Call          Control flow       1%
   Return        Control flow       1%
   Total                            95%

   Instruction type   Overall percentage
   Data transfer      38%
   Computation        35%
   Control flow       22%
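The per-type percentages follow directly from the per-instruction figures. As a quick check, a short Python sketch can group the ten instructions by type and sum them; the list and function names are illustrative only, and the numbers are the ones given in the slide.

```python
from collections import defaultdict

# (instruction, type, percent of instructions executed), as listed in the slide.
INSTRUCTION_MIX = [
    ("Load", "Data transfer", 22),
    ("Branch", "Control flow", 20),
    ("Compare", "Computation", 16),
    ("Store", "Data transfer", 12),
    ("Add", "Computation", 8),
    ("And", "Computation", 6),
    ("Sub", "Computation", 5),
    ("Move", "Data transfer", 4),
    ("Call", "Control flow", 1),
    ("Return", "Control flow", 1),
]


def totals_by_type(mix):
    """Sum the execution percentages for each instruction type."""
    totals = defaultdict(int)
    for _name, kind, percent in mix:
        totals[kind] += percent
    return dict(totals)


if __name__ == "__main__":
    for kind, percent in totals_by_type(INSTRUCTION_MIX).items():
        print(f"{kind}: {percent}%")
```

Data transfer is the largest group, which is one reason memory and I/O behaviour matter as much as clock speed.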


Hardware Architecture - CPU and I/O

• CPU Speed versus I/O Speeds

• Several options to overcome I/O limitations
   – Incorporate more I/O buses (parallelism)
   – Extend current I/O technology (increase bandwidth, enhance operating modes)
   – Develop new I/O technology

I/O is slower than the CPU.

Various techniques are needed to reduce the waits caused by I/O.


Hardware Architecture - CPU and I/O

• CPU Efficiency and CPU Access Costs

Performance degradation caused by I/O


Hardware Architecture - I/O

• The elements of an I/O system


Hardware Architecture - I/O : InfiniBand

• Comparing InfiniBand to Existing Technology
   – Differences and Benefits

   Change from                 Change to                     Benefit
   Memory mapped               Channel based                 CPU efficiency, scalability, isolation, recovery.
   Parallel bus                Switched fabric               Scalability, isolation, redundancy, reduced pin-out, modularity, higher cross-sectional bandwidth.
   Shared bus access           Point to point                Greater distance, higher speeds.
   Load/store                  DMA scheduling                Improved CPU efficiency.
   Single open address space   Independent address domains   Protection, isolation, recovery, reliability.


Hardware Architecture - I/O : InfiniBand

[Figure: InfiniBand Architecture. Traditional: shared bus topology (shared bus architecture). InfiniBand: switched fabric topology (InfiniBand switched architecture).]


Hardware Architecture - I/O : InfiniBand

Accessing InfiniBand Services - The Channel Interface : Work / Completion Queue Architecture


Hardware Architecture - I/O : InfiniBand

InfiniBand Queue Operations – Operations on the send queue fall into three subclasses

Queues minimize waiting and allow asynchronous processing.


Hardware Architecture - I/O : InfiniBand

• VIA (Virtual Interface Architecture)
   – Messages Model
   – Direct, protected access by user level software to the communications hardware; the protection is effected by means of the virtual memory system.

[Figure: Comparison of VIA and traditional communications]

Send and receive packet descriptors that specify scatter-gather operations—specifying where data must be distributed to and collected up from—when sending and receiving

A send message queue and a receive message queue, comprising linked lists of packet descriptors

A means of notifying the network interface that packets have been placed on a queue

An asynchronous notification process for the status of the operations requested (completion of a send or receive operation is signaled by writing state information into a packet descriptor)

Registration of memory areas used for communications: before communications are started, the memory areas for each hardware unit are identified and registered, so that expensive operations such as locking the pages and translating virtual to real addresses are done once, outside performance-critical data transfers
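The descriptor-queue idea above (post a descriptor, keep working, learn about completion from state written back into the descriptor) can be sketched as an analogy in Python. This is not a VIA or InfiniBand API: the worker thread stands in for the network interface, and every name here (start_adapter, post_send, the dictionary fields) is hypothetical.

```python
import queue
import threading


def start_adapter(send_queue, completion_queue):
    """Hypothetical stand-in for the network interface: drains posted descriptors."""
    def worker():
        while True:
            descriptor = send_queue.get()
            if descriptor is None:          # shutdown marker
                break
            # "Transmit" the gather list, then write status into the descriptor.
            descriptor["bytes_sent"] = sum(len(buf) for buf in descriptor["buffers"])
            descriptor["status"] = "done"
            completion_queue.put(descriptor)

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return thread


def post_send(send_queue, buffers):
    """Place a packet descriptor on the send queue and return immediately."""
    descriptor = {"buffers": buffers, "status": "posted", "bytes_sent": 0}
    send_queue.put(descriptor)
    return descriptor


if __name__ == "__main__":
    sends, completions = queue.Queue(), queue.Queue()
    adapter = start_adapter(sends, completions)
    post_send(sends, [b"header", b"payload"])
    print(completions.get(timeout=5)["status"])
    sends.put(None)
    adapter.join(timeout=5)
```

The caller never blocks in post_send; it picks up finished descriptors from the completion queue when convenient, which is the asynchronous behaviour the slide describes.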


Hardware Architecture - I/O : InfiniBand

Logical processing steps in TCP/IP

White indicates per-message processing: it is the processing load imposed by the system call on the sockets interface, and is independent of the size of the message

Light gray indicates per-fragment processing (a long message is broken up into several fragments): this covers TCP, IP, media access and interrupt handling

Dark grey indicates per-byte processing (actually, per fragment plus per byte in fragment): this covers the data-copying overhead along with computation of the checksum

Checksum computation and memory management also add overhead.
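The three cost classes above (per message, per fragment, per byte) combine into a simple additive model. The sketch below is only a way to reason about that breakdown; the function name, the parameters, and any numbers passed in are assumptions, not measurements from the presentation.

```python
import math


def tcp_send_cost(message_bytes, fragment_bytes, per_message, per_fragment, per_byte):
    """Estimate processing cost as per-message + per-fragment + per-byte work."""
    fragments = max(1, math.ceil(message_bytes / fragment_bytes))
    return per_message + fragments * per_fragment + message_bytes * per_byte


if __name__ == "__main__":
    # Hypothetical unit costs: the per-byte term dominates for large messages.
    for size in (100, 1500, 64000):
        print(size, tcp_send_cost(size, 1500, per_message=50, per_fragment=20, per_byte=1))
```

Under such a model, small messages are dominated by the fixed system-call cost, while large ones are dominated by copying and checksumming, which is the part that DMA and checksum offload aim to remove.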


Hardware Architecture - I/O : InfiniBand

Operation: Simple DMA versus Improved DMA

Send (Simple DMA)
• set up the DMA registers (with buffer address and size)
• lock the page containing the buffers and purge corresponding addresses in the data cache
• activate the send command
• wait until the end of the operation
• interrupt upon completion of the operation, and free (unlock) the page

Send (Improved DMA)
• refill the free buffers with data to be sent
• lock the buffer page(s) and purge corresponding addresses in the data cache
• refill a descriptor with the addresses and sizes of the buffers just set up
• change the descriptor status indicator to "DMA"
• if the DMA was inactive, wake it up

Receive (Simple DMA)
• DMA interrupts processor
• allocate a page and purge the cache of its addresses
• set up the DMA registers (with buffer address and size)
• when the operation completes the DMA will raise an interrupt

Receive (Improved DMA)
• refill descriptor(s) for receiving
• purge corresponding addresses in the data cache
• when a receive operation completes, DMA sets the descriptor indicator to System; the OS can test the status of different descriptors
• if there are no free buffers, the DMA raises an interrupt

• Mechanisms to reduce the number of interrupts

The improved DMA scheme lowers overhead by reducing the number of interrupts.


System Architecture (Hardware)

• LPAR / DLPAR


System Architecture (Hardware)

• LPAR / DLPAR
   – Hypervisor


System Architecture (Hardware)

• Micro Partitioning
   – Up to 10 partitions can be created per processor
   – Resources are shared among multiple partitions


System Architecture (Hardware)

• Micro Partitioning


System Architecture (Hardware)

• VIO
   – Part of the Advanced POWER Virtualization feature
   – Allows for sharing of physical devices, including storage and network
   – Implemented as a customized AIX-based appliance
   – Requires careful planning to maintain VIO Server with minimal impact to VIO Clients
   – Provides command line tools for maintenance or can be maintained with NIM


System Architecture (System Software)

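• SMT (Simultaneous Multi-Threading)
   – An enhanced hardware design introduced with POWER5 that lets one processor execute instructions from two separate threads at the same time
   – By prioritizing hardware and software threads, it raises the utilization of hardware resources without hurting application performance

• WLM (Workload Manager)
   – Dynamically allocates system resources among running workloads without partitioning the system
   – Manages CPU time rather than whole CPU processors, which gives finer-grained control of CPU resources
   – Controls CPU time, memory, and I/O volume individually, so that several kinds of applications with different characteristics can be managed on a single server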


System Architecture (System Software)

• WPARs (Workload Partitions)
   – A workload partition (WPAR), new with the IBM® AIX® 6.1 operating system, expands on the traditional IBM AIX logical partitioning (LPAR) technology by further allowing AIX to be virtualized within a single operating-system image.
   – A simple definition of a WPAR is that it is a virtualized AIX instance that runs within a single AIX operating-system image.


Software Architecture - OS

• Relationship between the OS and a network program
   – Components of a network program
      • Socket API
      • I/O
      • Processes or threads for handling multiple connections
      • IPC (Inter Process Communication) for synchronizing processes or threads

[Figure: Relationship between the OS and a network program. Layers shown: H/W (Disk, NIC …); OS (File System, Memory); Socket API, I/O, Process, Thread, IPC.]


Software Architecture – File on the Unix

• What is the File on the Unix?
• Checking the files that a process has open
   – Files opened by default when a process is created
      • 0 : standard input
      • 1 : standard output
      • 2 : standard error

   office2@root/proc/9804/fd>ls -al
   total 120
   dr-x------ 1 root system 0 Sep 27 03:22 .
   dr-xr-xr-x 1 root system 0 Sep 27 03:22 ..
   lr-xr-xr-x 24 root system 1024 Sep 22 18:48 0 -> /
   lr-xr-xr-x 24 root system 1024 Sep 22 18:48 1 -> /
   lr-xr-xr-x 24 root system 1024 Sep 22 18:48 2 -> /
   --w--w---- 1 root system 12506 Sep 15 18:13 7
   --w--w---- 1 root system 12506 Sep 15 18:13 8
   --w--w---- 1 root system 12506 Sep 15 18:13 9
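The same information that the /proc listing shows can be obtained from inside a process. The following Python sketch probes descriptor numbers with os.fstat, which fails for descriptors that are not open; the function name open_fds and the probing limit of 256 are assumptions chosen for the example.

```python
import os


def open_fds(limit=256):
    """Return the descriptor numbers below `limit` that this process has open."""
    fds = []
    for fd in range(limit):
        try:
            os.fstat(fd)          # raises OSError (EBADF) if fd is not open
        except OSError:
            continue
        fds.append(fd)
    return fds


if __name__ == "__main__":
    print("before:", open_fds())
    read_fd, write_fd = os.pipe()  # opening anything adds new descriptors
    print("after: ", open_fds())
    os.close(read_fd)
    os.close(write_fd)
```

In a process started from a terminal the first list normally begins with 0, 1, and 2, and new files, pipes, and sockets take the lowest free numbers after them.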


Software Architecture

• Application Programs and OS
   – Type of Software (Conceptual Model)


Software Architecture

• Application Programs and OS
   – Application Programs


Software Architecture

• Application Programs and OS
   – Operating Systems


Software Architecture

• Application Programs and OS
   – Device Drivers


Software Architecture

• Application Programs and OS
   – AIX 5L Structure


Application Architecture

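• Multi-Process Model
   – Process: the flow of control that represents a program, created when the program is executed, together with its system resources (memory, file, IPC…)
   – Process creation and control
      • fork()
         – Creates a copy of the process
         – Creates a child process that shares its code with the parent
      • exec()
         – Replaces the execution image of the current process with that of a program
         – Loads and runs a new program

[Figure: the Init process calls fork() to create copies (Process’, Process’’), and each copy calls exec() to become Process A or Process B.]

The IPC that multi-processing requires increases kernel overhead.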
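The fork() then exec() sequence described above can be sketched with Python's os module, which exposes the same system calls. The helper name spawn is an assumption for this example, and the child command in the usage section is arbitrary.

```python
import os
import sys


def spawn(argv):
    """fork() a child, exec() argv in it, and return the child's exit code."""
    pid = os.fork()
    if pid == 0:
        # Child: a copy of the parent until exec replaces the process image.
        try:
            os.execvp(argv[0], argv)
        finally:
            os._exit(127)             # reached only if exec failed
    # Parent: wait for the child so that it does not linger as a zombie.
    _, status = os.waitpid(pid, 0)
    return os.WEXITSTATUS(status) if os.WIFEXITED(status) else -1


if __name__ == "__main__":
    code = spawn([sys.executable, "-c", "print('hello from the new program')"])
    print("child exit code:", code)
```

Between fork() and exec() the child still runs the parent's code; after exec() only the process ID and inherited descriptors remain from the original image.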


Application Architecture

• Multi-Processing Model
   – Socket Program

   Server                                    Client
   socket                                    socket
   bind
   listen
   accept     <-- connection request --      connect
   fork()
   read       <-- data request --            write
   write      -- data received -->           read
                                             close


Application Architecture

• Multi-Processing Model

[Figure: two multi-process server designs, each showing several Client App instances connecting to a group of Server App processes.]

fork() per request: ① the client connects to the server, ② the server calls fork().
A fork() occurs for every request.

Process pool: ① the server calls fork(), ② the client connects to the server.
Because fork() takes a long time, child processes are created in advance with fork() and kept in a pool.
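The process-pool design, in which children are forked before any client connects, can be sketched as a small pre-fork server in Python. The function names, the loopback address, and the reply format (the worker's PID) are assumptions made for illustration.

```python
import os
import signal
import socket


def start_prefork_server(workers=3):
    """Fork `workers` children up front; each accepts on the shared listener."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))        # port 0: let the OS pick a free port
    listener.listen(16)
    pids = []
    for _ in range(workers):
        pid = os.fork()
        if pid == 0:                       # child: serve until killed
            try:
                while True:
                    conn, _ = listener.accept()
                    with conn:
                        conn.sendall(str(os.getpid()).encode())
            finally:
                os._exit(0)
        pids.append(pid)
    port = listener.getsockname()[1]
    listener.close()                       # the parent does not accept itself
    return port, pids


def stop_server(pids):
    """Terminate and reap every pooled child."""
    for pid in pids:
        os.kill(pid, signal.SIGKILL)
        os.waitpid(pid, 0)


def ask(port):
    """Connect once and return the PID of the worker that answered."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as conn:
        return int(conn.recv(64))


if __name__ == "__main__":
    port, pids = start_prefork_server()
    try:
        print("workers:", pids, "answered by:", [ask(port) for _ in range(5)])
    finally:
        stop_server(pids)
```

No fork() happens on the request path: every connection is answered by one of the already running children, which is the saving the pool is meant to provide.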


Application Architecture

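• IPC (Inter Process Communication)
   – What is IPC?
      • The methods used to share data between processes and to synchronize them
   – Kinds of IPC
      • Semaphore
         – Synchronizes and protects data shared between processes
      • Shared Memory
         – Multiple processes share virtual memory; the fastest way to share data
      • Message Queues
         – A queue is a data structure in which the first item in is the first item out
         – As IPC, message queues are very intuitive and simple to use compared with the other sharing methods
         – They are, however, quite tricky to control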


Application Architecture

• IPC
   – Kinds of IPC
      • Pipe
         – Used to pass data from one process to another. Data flows in one direction only (each end is read only or write only; one end cannot both read and write), and a pipe can be used only between related processes, such as those with the same parent (the same PPID)
      • FIFO (Named Pipe)
         – A first-in, first-out sequential I/O stream similar to a pipe; unlike a pipe it has a name, so unrelated processes can use it
         – Created with mknod
      • UDS (Unix Domain Socket)
         – Uses the socket API without modification; unlike port-based Internet domain sockets, it uses the local file system and is intended for communication between processes on the same system
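The one-directional behaviour of a pipe between related processes can be sketched in Python with os.pipe and os.fork. The function name child_to_parent is an assumption for this example; the child keeps only the write end and the parent keeps only the read end.

```python
import os


def child_to_parent(message):
    """Send `message` (bytes) from a forked child to its parent through a pipe."""
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: uses the write end only.
        try:
            os.close(read_fd)
            os.write(write_fd, message)
        finally:
            os._exit(0)
    # Parent: uses the read end only.
    os.close(write_fd)
    chunks = []
    while True:
        chunk = os.read(read_fd, 4096)
        if not chunk:              # EOF: every write end has been closed
            break
        chunks.append(chunk)
    os.close(read_fd)
    os.waitpid(pid, 0)
    return b"".join(chunks)


if __name__ == "__main__":
    print(child_to_parent(b"data passed through a pipe"))
```

Closing the unused end in each process matters: the parent sees end-of-file only once no process holds the write end open.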


Application Architecture

• IPC
   – IPC commands
   – ipcs command
      • ipcs -m ( shared memory )
      • ipcs -q ( message queues )
      • ipcs -s ( semaphore )
   – ipcrm command
      • Removes semaphores, message queues, and shared memory segments from the system

   Function                              Message queue     Semaphore   Shared memory
   1. IPC allocation                     msgget            semget      shmget
   2. IPC control (change state, free)   msgctl            semctl      shmctl
   3. IPC operation (send/receive)       msgsnd, msgrcv    semop       shmat, shmdt


Application Architecture

• IPC
   – IPC Limits

   Semaphores                                          4.3.0   4.3.1   4.3.2    5.1      5.2      5.3
   Maximum number of semaphore IDs for 32-bit kernel   4096    4096    131072   131072   131072   131072
   Maximum number of semaphore IDs for 64-bit kernel   4096    4096    131072   131072   131072   1048576
   Maximum semaphores per semaphore ID                 65535   65535   65535    65535    65535    65535
   Maximum operations per semop call                   1024    1024    1024     1024     1024     1024
   Maximum undo entries per process                    1024    1024    1024     1024     1024     1024
   Size in bytes of undo structure                     8208    8208    8208     8208     8208     8208
   Semaphore maximum value                             32767   32767   32767    32767    32767    32767
   Adjust on exit maximum value                        16384   16384   16384    16384    16384    16384


Application Architecture

• IPC
   – IPC Limits

   Message Queue                                           4.3.0    4.3.1    4.3.2    5.1      5.2      5.3
   Maximum message size                                    4 MB     4 MB     4 MB     4 MB     4 MB     4 MB
   Maximum bytes on queue                                  4 MB     4 MB     4 MB     4 MB     4 MB     4 MB
   Maximum number of message queue IDs for 32-bit kernel   4096     4096     131072   131072   131072   131072
   Maximum number of message queue IDs for 64-bit kernel   4096     4096     131072   131072   131072   1048576
   Maximum messages per queue ID                           524288   524288   524288   524288   524288   524288


Application Architecture

• IPC
   – IPC Limits

   Shared Memory                                              4.3.0       4.3.1       4.3.2       5.1         5.2         5.3
   Maximum segment size (32-bit process)                      256 MB      2 GB        2 GB        2 GB        2 GB        2 GB
   Maximum segment size (64-bit process) for 32-bit kernel    256 MB      2 GB        2 GB        64 GB       1 TB        1 TB
   Maximum segment size (64-bit process) for 64-bit kernel    256 MB      2 GB        2 GB        64 GB       1 TB        32 TB
   Minimum segment size                                       1           1           1           1           1           1
   Maximum number of shared memory IDs (32-bit kernel)        4096        4096        131072      131072      131072      131072
   Maximum number of shared memory IDs (64-bit kernel)        4096        4096        131072      131072      131072      1048576
   Maximum number of segments per process (32-bit process)    11          11          11          11          11          11
   Maximum number of segments per process (64-bit process)    268435456   268435456   268435456   268435456   268435456   268435456


Application Architecture

• IPC
   – IPC tunable parameters

msgmax
   Purpose:    Specifies maximum message size.
   Values:     Dynamic with maximum value of 4 MB
   Display:    N/A
   Change:     N/A
   Diagnosis:  N/A
   Tuning:     Does not require tuning because it is dynamically adjusted as needed by the kernel.

msgmnb
   Purpose:    Specifies maximum number of bytes on queue.
   Values:     Dynamic with maximum value of 4 MB
   Display:    N/A
   Change:     N/A
   Diagnosis:  N/A
   Tuning:     Does not require tuning because it is dynamically adjusted as needed by the kernel.


Application Architecture

• IPC
   – IPC tunable parameters

msgmni
   Purpose:    Specifies maximum number of message queue IDs.
   Values:     Dynamic with maximum value of 131072
   Display:    N/A
   Change:     N/A
   Diagnosis:  N/A
   Tuning:     Does not require tuning because it is dynamically adjusted as needed by the kernel.

msgmnm
   Purpose:    Specifies maximum number of messages per queue.
   Values:     Dynamic with maximum value of 524288
   Display:    N/A
   Change:     N/A
   Diagnosis:  N/A
   Tuning:     Does not require tuning because it is dynamically adjusted as needed by the kernel.


Application Architecture

• IPC
   – IPC tunable parameters

semaem
   Purpose:    Specifies maximum value for adjustment on exit.
   Values:     Dynamic with maximum value of 16384
   Display:    N/A
   Change:     N/A
   Diagnosis:  N/A
   Tuning:     Does not require tuning because it is dynamically adjusted as needed by the kernel.

semmni
   Purpose:    Specifies maximum number of semaphore IDs.
   Values:     Dynamic with maximum value of 131072
   Display:    N/A
   Change:     N/A
   Diagnosis:  N/A
   Tuning:     Does not require tuning because it is dynamically adjusted as needed by the kernel.


Application Architecture

• IPC
   – IPC tunable parameters

semmsl
   Purpose:    Specifies maximum number of semaphores per ID.
   Values:     Dynamic with maximum value of 65535
   Display:    N/A
   Change:     N/A
   Diagnosis:  N/A
   Tuning:     Does not require tuning because it is dynamically adjusted as needed by the kernel.

semopm
   Purpose:    Specifies maximum number of operations per semop() call.
   Values:     Dynamic with maximum value of 1024
   Display:    N/A
   Change:     N/A
   Diagnosis:  N/A
   Tuning:     Does not require tuning because it is dynamically adjusted as needed by the kernel.


Application Architecture

• IPC
   – IPC tunable parameters

semume
   Purpose:    Specifies maximum number of undo entries per process.
   Values:     Dynamic with maximum value of 1024
   Display:    N/A
   Change:     N/A
   Diagnosis:  N/A
   Tuning:     Does not require tuning because it is dynamically adjusted as needed by the kernel.

semvmx
   Purpose:    Specifies maximum value of a semaphore.
   Values:     Dynamic with maximum value of 32767
   Display:    N/A
   Change:     N/A
   Diagnosis:  N/A
   Tuning:     Does not require tuning because it is dynamically adjusted as needed by the kernel.


Application Architecture

• IPC
   – IPC tunable parameters

shmmax
   Purpose:    Specifies maximum shared memory segment size.
   Values:     Dynamic with maximum value of 256 MB for 32-bit processes and 0x80000000u for 64-bit
   Display:    N/A
   Change:     N/A
   Diagnosis:  N/A
   Tuning:     Does not require tuning because it is dynamically adjusted as needed by the kernel.

shmmin
   Purpose:    Specifies minimum shared-memory-segment size.
   Values:     Dynamic with minimum value of 1
   Display:    N/A
   Change:     N/A
   Diagnosis:  N/A
   Tuning:     Does not require tuning because it is dynamically adjusted as needed by the kernel.


Application Architecture

• IPC
   – IPC tunable parameters

shmmni
   Purpose:    Specifies maximum number of shared memory IDs.
   Values:     Dynamic with maximum value of 131072
   Display:    N/A
   Change:     N/A
   Diagnosis:  N/A
   Tuning:     Does not require tuning because it is dynamically adjusted as needed by the kernel.


Application Architecture

• Multi-Thread Model
   – Thread: a flow of control that exists within a process
   – Socket Program

   Server                                    Client
   socket                                    socket
   bind
   listen
   accept     <-- connection request --      connect
   pthread_create()
   read       <-- data request --            write
   write      -- data received -->           read
                                             close


Application Architecture

• Multi-Thread Model

[Figure: two multi-thread server designs, each showing several Client App instances connecting to threads inside a single Server App.]

pthread_create() per request: ① the client connects to the server, ② the server calls pthread_create().
pthread_create() is called for every request, but it is much lighter than fork().

Thread pool: ① the server calls pthread_create(), ② the client connects to the server.
Although threads are lighter than fork(), a pool is used to save even the thread creation time.
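The thread-pool design can be sketched in Python with concurrent.futures.ThreadPoolExecutor. This is an analogy to the pthread_create() pool rather than the same API: the executor creates workers on demand up to max_workers and then reuses them. The function names, the loopback address, and the upper-casing reply are assumptions for the example.

```python
import socket
import threading
from concurrent.futures import ThreadPoolExecutor


def handle(conn):
    """Serve one client on a pooled worker thread."""
    with conn:
        data = conn.recv(1024)
        conn.sendall(data.upper())


def serve(listener, pool, stop):
    """Accept connections and hand each one to the pool."""
    listener.settimeout(0.2)               # wake up regularly to check `stop`
    while not stop.is_set():
        try:
            conn, _ = listener.accept()
        except socket.timeout:
            continue
        pool.submit(handle, conn)


def start(workers=4):
    """Start the server; return its port and a shutdown function."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    stop = threading.Event()
    pool = ThreadPoolExecutor(max_workers=workers)
    acceptor = threading.Thread(target=serve, args=(listener, pool, stop), daemon=True)
    acceptor.start()

    def shutdown():
        stop.set()
        acceptor.join()
        pool.shutdown(wait=True)
        listener.close()

    return listener.getsockname()[1], shutdown


if __name__ == "__main__":
    port, shutdown = start()
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(b"ping")
        print(client.recv(1024))
    shutdown()
```

Because the workers are reused, the cost of creating a thread is paid at most max_workers times instead of once per request.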


Application Architecture

• N:N DB Connection (Multi-Process Model)

[Figure: a Process calls fork() to create many child processes; each opens its own connection to Oracle, and Oracle calls fork() to create a matching child process for each one. Connection n:n.]

DB connections are made n:n, but the fork() that Oracle performs for each connection wastes system resources.

The largest loads in a DB query:
1. DB connect (from network)
2. DB query parsing


Application Architecture

• 1:1 DB Connection (Multi-Process Model)

[Figure: a Process calls fork() to create many child processes, which share one connection to a single Oracle child process. Connection 1:1.]

With a 1:1 DB connection, Oracle's fork() is limited to one, so few system resources are wasted, but client connections may not be served smoothly.

The largest loads in a DB query:
1. DB connect (from network)
2. DB query parsing


Application Architecture

• DB Connection Pool (Multi-Process Model)
   – Thread Pool or Process Pool

[Figure: child processes created by fork() from the Process borrow connections from a Connection Pool of threads, which holds n:n connections to Oracle child processes.]

Requests are served over connections that were established in advance inside the pool. The pool lends out its resources, and more can be allocated flexibly when it runs short.

The largest loads in a DB query:
1. DB connect (from network)
2. DB query parsing
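The idea of borrowing a pre-established connection and returning it afterwards can be sketched with a thread-safe queue. The class below is a minimal illustration, not a database driver: ConnectionPool, borrow, and the connect factory are hypothetical names, and the usage section substitutes plain objects for real connections.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Open `size` connections once and lend them out on request."""

    def __init__(self, connect, size):
        # `connect` is any zero-argument callable returning a new connection
        # (a hypothetical factory; a real one would call the DB client library).
        self._idle = queue.Queue()
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def borrow(self, timeout=None):
        """Lend one idle connection; blocks (up to `timeout`) if none is free."""
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            self._idle.put(conn)       # always hand the connection back


if __name__ == "__main__":
    opened = []

    def fake_connect():
        opened.append(object())        # stand-in for an expensive DB connect
        return opened[-1]

    pool = ConnectionPool(fake_connect, size=2)
    for _ in range(10):
        with pool.borrow() as conn:
            pass                       # run a query on `conn` here
    print("connections opened:", len(opened))
```

Ten requests are served while the expensive connect step runs only twice. This sketch has a fixed size; growing the pool when it runs short, as the slide mentions, would need extra logic.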


Application Architecture

• DB Connection Pool (Multi-Thread Model)

• Multi-Threading Model (①, ⑤)
• Thread Pool Model for DB Connection (①, ②, ③, ⑥)

[Figure: numbered paths ① to ⑥ link the Client Apps, the threads of the Server App, a pool of threads, and the Oracle child processes. Labels: Pre-Process Model (Process Pool), Pre-Thread Model (Thread Pool).]


Application Architecture

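• I/O Multiplexing Model
   – Instead of each socket communicating through its own socket I/O, all sockets are handled through a single I/O path: each socket is registered in a file descriptor table, and the I/O on that table is monitored to handle multiple connections
   – select / poll

   Server <-> Client: connection request    File descriptor is registered
   Server <-> Client: data send/receive     File descriptor is monitored
   Server <-> Client: connection closed     File descriptor is released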
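The register, monitor, and release cycle above can be sketched as one pass of a select() loop in Python. The function name serve_once, the echo behaviour, and the one-second timeout are assumptions for this example.

```python
import select
import socket


def serve_once(listener, clients, timeout=1.0):
    """One pass of a select() loop over the listener and all client sockets."""
    readable, _, _ = select.select([listener] + clients, [], [], timeout)
    for sock in readable:
        if sock is listener:
            conn, _ = listener.accept()    # connection request: register it
            clients.append(conn)
        else:
            data = sock.recv(1024)
            if data:
                sock.sendall(data)         # data on a monitored descriptor: echo
            else:
                clients.remove(sock)       # peer closed: release the descriptor
                sock.close()


if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    clients = []
    client = socket.create_connection(listener.getsockname(), timeout=5)
    serve_once(listener, clients)          # accepts the client
    client.sendall(b"hello")
    serve_once(listener, clients)          # echoes the data
    print(client.recv(1024))
    client.close()
    serve_once(listener, clients)          # notices the close
    listener.close()
```

A single thread serves every connection, but each call hands the whole descriptor list to select(), so the work per pass grows with the number of registered sockets.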


Application Architecture

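• I/O Multiplexing Model
  – Drawback
    • I/O multiplexing relies on select / poll, so to find out which file descriptor raised an event the program has to loop over a potentially wide file descriptor array and check the entries one by one.

[Diagram: drawback of the I/O Multiplexing Model. To reach the registered file descriptor in the file descriptor table, every file descriptor has to be checked.]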
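The cost of that scan can be made visible with select(). In the sketch below, `select_ready` is a hypothetical helper, and the `checked` counter exists only to show how many slots are tested after select() returns; the one-second timeout is arbitrary.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait with select() and find which descriptors are readable.
   `*checked` reports how many registered descriptors had to be tested. */
int select_ready(const int *fds, int nfds, int *ready, int *checked) {
    fd_set rset;
    int maxfd = -1;

    FD_ZERO(&rset);
    for (int i = 0; i < nfds; i++) {
        assert(fds[i] >= 0 && fds[i] < FD_SETSIZE);
        FD_SET(fds[i], &rset);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    struct timeval tv = { 1, 0 };        /* 1 s timeout, arbitrary */
    int n = select(maxfd + 1, &rset, NULL, NULL, &tv);
    *checked = 0;
    if (n <= 0)
        return n;                        /* 0 = timeout, -1 = error */

    /* select() only reports that n descriptors are ready; to learn
       which ones, every registered descriptor is tested in turn. */
    int found = 0;
    for (int i = 0; i < nfds; i++) {
        (*checked)++;
        if (FD_ISSET(fds[i], &rset))
            ready[found++] = fds[i];
    }
    return found;
}
```

With eight registered descriptors and one of them readable, the loop still performs eight tests, and that work grows with the number of connections, not with the number of events.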

Page 58:

Application Architecture

• Event based I/O Model through Real-time Signal

  – Event-based socket handling

    • UNIX/Linux : POSIX Real-time Signal, epoll

    • Windows, AIX, iSeries OS : IOCP

    • FreeBSD : kqueue (kernel queue)

Page 59:

Application Architecture

• Event based I/O Model through Real-time Signal
  – Real-time Signal
    • Makes up for two drawbacks of ordinary signals: they are not queued, and as a result they deliver no information.
    • A real-time signal is queued, and as many events as the queue can hold are stored, so signals are not lost.
    • A real-time signal can also carry information such as the descriptor of the socket that raised it, so additional data can be stored with the event.
    • There is no need to search the descriptor array of the file descriptor table as with select / poll.

[Diagram: Client1, Client2, Client3 connect to Socket1, Socket2, Socket3, which notify Thread1, Thread2, Thread3 through SIGRTMIN+1, SIGRTMIN+2, SIGRTMIN+3. Real-time signals are used together with threads through a thread pool.]
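The queueing and payload properties listed above can be sketched with sigqueue() and sigwaitinfo(). This is an illustration under assumptions: `rt_block`, `rt_post` and `rt_next` are hypothetical helpers, the process is single-threaded, and the event is posted by the process to itself, whereas a real server would have the kernel raise the signal for a socket.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <signal.h>
#include <unistd.h>

/* Block a real-time signal so that it is queued instead of
   interrupting the program. */
int rt_block(int signo) {
    sigset_t set;
    assert(signo >= SIGRTMIN && signo <= SIGRTMAX);
    sigemptyset(&set);
    sigaddset(&set, signo);
    return sigprocmask(SIG_BLOCK, &set, NULL);
}

/* Queue one event for this process, tagged with a descriptor number. */
int rt_post(int signo, int fd) {
    union sigval v;
    v.sival_int = fd;
    return sigqueue(getpid(), signo, v);
}

/* Take the next queued event; returns the descriptor it carried,
   or -1 on error. */
int rt_next(int signo) {
    sigset_t set;
    siginfo_t info;
    sigemptyset(&set);
    sigaddset(&set, signo);
    if (sigwaitinfo(&set, &info) != signo)
        return -1;
    return info.si_value.sival_int;
}
```

Two events posted back to back are both kept in the queue and come out in order, each with its own descriptor value, so the receiver knows which socket to service without scanning a descriptor array.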

Page 60:

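Application Architecture

• epoll
  – epoll : event poll
  – About 10% to 20% faster than Real-time Signal
  – Supported on HP-UX and Red Hat; not supported on AIX
  – Sockets are placed in the event poll and managed there, so when a read/write event occurs the related information is returned. The returned information includes the descriptor, so there is no need to check through a loop as with poll.

[Diagram: Socket1, Socket2, Socket3 are registered in the Event poll, which returns the file descriptor.]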
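A minimal sketch of this behaviour, assuming a Linux host (epoll is provided by `<sys/epoll.h>` there). `ep_add` and `ep_wait_ready` are hypothetical helper names, and the 64-entry event buffer is an arbitrary choice.

```c
#include <assert.h>
#include <sys/epoll.h>
#include <unistd.h>

/* Register: hand the descriptor to the kernel's event poll once. */
int ep_add(int epfd, int fd) {
    struct epoll_event ev;
    ev.events = EPOLLIN;
    ev.data.fd = fd;             /* handed back when fd becomes ready */
    return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev);
}

/* Wait: the kernel fills in only the descriptors that have events,
   so the idle ones are never examined by the caller. */
int ep_wait_ready(int epfd, int *ready, int max_ready, int timeout_ms) {
    struct epoll_event events[64];
    assert(max_ready > 0);
    if (max_ready > 64)
        max_ready = 64;
    int n = epoll_wait(epfd, events, max_ready, timeout_ms);
    for (int i = 0; i < n; i++)
        ready[i] = events[i].data.fd;
    return n;                    /* 0 = timeout, -1 = error */
}
```

The descriptor set is created with `epoll_create1(0)`. The loop in `ep_wait_ready()` runs once per ready descriptor, not once per registered descriptor as with select / poll.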

Page 61:

Application Architecture

• epoll
  – httpd test result

[Figure: dphttpd symmetric multiprocessor result]
[Figure: dphttpd uniprocessor result]

Page 62:

Application Architecture

• epoll
  – Pipetest

[Figure: Pipetest symmetric multiprocessor result]
[Figure: Pipetest uniprocessor result]

Page 63:

Application Architecture

• epoll
  – Dead connection test

[Figure: Dead connections test result, 128 bytes context]
[Figure: Dead connections test result, 1024 bytes context]

Page 64:

Application Architecture

• IOCP (I/O Completion Ports)
  – IOCP on iSeries
    • Supported since AS/400. The i platform started as AS/400 in 1988 and evolved through AS/400, OS/400, i5/OS, and i6/OS.
    • AS/400 QMU 5.0.1.02 introduces asynchronous I/O completion ports (IOCP)
  – IOCP on Windows NT
    • Supported since Windows NT Winsock2
  – IOCP on AIX
    • I/O completion port support was first introduced in AIX 4.3 by APAR IY06351. An I/O completion port was originally a Windows NT scheduling construct that has since been implemented in other OS's. Domino uses these constructs to improve the scalability of the server. It allows one thread to handle multiple session requests, so that a Notes client session is no longer bound to a single thread for its duration. The completion port is tied directly to a device handle and any network I/O requests that are made to that handle.

Page 65:

Application Architecture

• Parallel Programming
  – Fundamentals of Parallel Programming
    • Multi-Process/Multi-Thread
    • Asynchronous Procedure Calls
    • Signal, Event
    • Queuing Asynchronous Procedure Calls
    • IOCP

Ex) File Finder Agent

Page 66:

Application Architecture

• Parallel Programming – OpenMP (Open Multi-Processing)
  – An Application Program Interface (API) that may be used to explicitly direct multi-threaded, shared memory parallelism
  – Comprised of three primary API components
    • Compiler Directives
    • Runtime Library Routines
    • Environment Variables
  – Portable
  – Standardized
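As a small illustration of the compiler-directive component, the sketch below parallelizes a loop with one `#pragma`. `sum_squares` is a hypothetical example function. The directive takes effect only when the compiler's OpenMP option is enabled (for example `-fopenmp` with GCC); without it the pragma is ignored and the loop runs serially with the same result.

```c
#include <assert.h>

/* Sum of the squares 1..n, with the loop iterations shared among
   threads when OpenMP is enabled. */
long long sum_squares(int n) {
    long long sum = 0;
    assert(n >= 0);
    /* Compiler directive: split the iterations across threads and
       combine each thread's private partial sum with a + reduction. */
    #pragma omp parallel for reduction(+:sum)
    for (int i = 1; i <= n; i++)
        sum += (long long)i * i;
    return sum;
}
```

The other two components appear at run time: a library routine such as `omp_get_max_threads()` reports the thread count, and the `OMP_NUM_THREADS` environment variable sets it without recompiling.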

Page 67:

Application Architecture

• Parallel Programming - MPI
  – MPI (Message Passing Interface)
    • A standard data communication library for message-passing parallel programming
    • References
      – http://www.mcs.anl.gov/mpi/index.html
      – http://www.mpi-forum.org/docs/docs.html
  – MPI goals
    • Portability
    • Efficiency
    • Functionality

Page 68:

Application Architecture

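• Parallel Programming
  – MPI basic concepts
    • Work is assigned per process
    • Processor : Process = 1:1 or 1:N
    • Message = data + envelope
      – Which process sends?
      – Where is the data to be sent located?
      – What kind of data is sent?
      – How much is sent?
      – Which process receives?
      – Where should the data be stored?
      – How much should the receiver be prepared to receive?
    • Tag
      – Used to match and distinguish messages
      – Allows message arrivals to be handled in order
      – Wildcards can be used
    • Communicator
      – The set of processes that are allowed to communicate with one another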

Page 69:

Application Architecture

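• Parallel Programming
  – MPI basic concepts
    • Process Rank
      – An identifier that distinguishes the processes within the same communicator
    • Point to Point Communication
      – Communication between two processes
      – One sending process corresponds to one receiving process
    • Collective communication
      – Several processes take part at the same time
      – 1:N, N:1, and N:N correspondences are possible
      – Replaces several P2P Communication calls with a single Collective Communication
        » Less error-prone, and faster thanks to optimization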

Page 70:

Application Architecture

• Java
  – Development and execution of Java applications

Page 71:

Application Architecture

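• Java
  – How to use the system efficiently with Java applications
    • NIO (New I/O)
    • NIO pollset
    • Leave the garbage collector to collect automatically
    • Unless there is a specific reason not to, update the JRE to the latest version
    • Keep source code current during development (where possible, replace APIs marked Deprecated with other APIs)
    • If a framework is used, keep the framework up to date

[Callout: always brings an improvement]
[Callout: may bring no improvement, depending on the JRE or framework]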

Page 72:

Application Architecture

• Java
  – pollset
    • Java source code
      – DatagramChannel channel = DatagramChannel.open();
      – channel.configureBlocking(false);
      – Selector selector = Selector.open();
      – channel.register(selector, SelectionKey.OP_READ);
      – Traditional poll() interface (C): int poll(struct pollfd fds[], nfds_t nfds, int timeout);
    • Native pollset interface C source code
      – pollset_t ps = pollset_create(int maxfd);
      – int rc = pollset_destroy(pollset_t ps);
      – int rc = pollset_ctl(pollset_t ps, struct poll_ctl *pollctl_array, int array_length);
      – int nfound = pollset_poll(pollset_t ps, struct pollfd *polldata_array, int array_length, int timeout);

Page 73:

Application Architecture

• Java
  – pollset
    • Traditional poll method

Page 74:

Application Architecture

• Java
  – pollset
    • pollset method

Page 75:

Application Architecture

• Java
  – pollset
    • pollcache internal
      – pollcache control block

Page 76:

Application Architecture

• Java
  – pollset
    • pollset() – bulky update

Page 77:

Application Architecture

• Java
  – pollset
    • Throughput performance of the two drivers (with poll() and with pollset())
      – The pollset driver performs 13.3% better than the poll driver

Page 78:

Application Architecture

• Java
  – pollset
    • Time spent on CPU

Page 79:

AIX I/O Model

• select / poll

• pollset

• event

• Real-time Signal

• AIO

• IOCP

Page 80:

AIX IOCP

• IOCP

– I/O completion port support was first introduced in AIX 4.3 by APAR IY06351. An I/O completion port was originally a Windows NT scheduling construct that has since been implemented in other OS's.

Page 81:

AIX IOCP

• IOCP
  – Synchronous I/O versus asynchronous I/O

Page 82:

AIX IOCP

• IOCP
  – IOCP Operation

Page 83:

AIX IOCP

• IOCP
  – CreateIoCompletionPort Function

< IOCP on AIX >

#include <iocp.h>
int CreateIoCompletionPort (FileDescriptor, CompletionPort, CompletionKey, ConcurrentThreads)
HANDLE FileDescriptor, CompletionPort;
DWORD CompletionKey, ConcurrentThreads;

< IOCP on Windows >

HANDLE CreateIoCompletionPort (
    HANDLE FileHandle,                // handle to file (socket)
    HANDLE ExistingCompletionPort,    // handle to I/O completion port
    ULONG_PTR CompletionKey,          // completion key
    DWORD NumberOfConcurrentThreads   // number of threads to execute concurrently
);

Page 84:

AIX IOCP

• IOCP
  – How to configure IOCP on AIX
    • fileset : bos.iocp.rte

$ lslpp -l bos.iocp.rte
The output from the lslpp command should be similar to the following :
  Fileset                      Level  State      Description
  ----------------------------------------------------------------------------
Path: /usr/lib/objrepos
  bos.iocp.rte               5.3.9.0  APPLIED    I/O Completion Ports API

Path: /etc/objrepos
  bos.iocp.rte              5.3.0.50  COMMITTED  I/O Completion Ports API

office2@root/>lsdev -Cc iocp
iocp0 Available  I/O Completion Ports

office2@root/>lsattr -El iocp0
autoconfig available STATE to be configured at system restart True

Page 85:

AIX IOCP

• IOCP
  – How to configure IOCP on AIX

Page 86:

AIX IOCP

• IOCP
  – How to configure IOCP on AIX

Page 87:

AIX IOCP

• IOCP
  – How to configure IOCP on AIX

Page 88:

AIX IOCP

• IOCP
  – How to configure IOCP on AIX

Page 89:

AIX IOCP

• IOCP
  – How to configure IOCP on AIX

Page 90:

AIX IOCP

• IOCP API
  – CreateIoCompletionPort
  – GetMultipleCompletionStatus
  – GetQueuedCompletionStatus
  – PostQueuedCompletionStatus
  – ReadFile
  – WriteFile

Page 91:

iSeries IOCP

• IOCP API
  – QsoStartAccept
  – QsoCreateIOCompletionPort
  – QsoDestroyIOCompletionPort
  – QsoPostIOCompletion
  – QsoStartRecv
  – QsoStartSend
  – QsoCancelOperation
  – QsoWaitForIOCompletion

Page 92:

Windows IOCP

• IOCP API
  – CreateIoCompletionPort
  – GetQueuedCompletionStatus
  – GetQueuedCompletionStatusEx
  – PostQueuedCompletionStatus
  – ReadFileEx
  – WriteFileEx
  – Kernel Functions
    • NtCreateIoCompletion, NtRemoveIoCompletion
    • KeInitializeQueue, KeRemoveQueue
    • KeInsertQueue
    • KeWaitForSingleObject
    • KeDelayExecutionThread
    • KiActivateWaiterQueue
    • KiUnwaitThread
    • NtSetIoCompletion

Page 93:

Q & A