OpenCL
Designed as an API and language specification
Standards maintained by the Khronos Group (currently 1.0, 1.1, and 1.2)
Manufacturers release their own SDKs and drivers
Major backers: Apple, AMD/ATI, Intel
OpenCL
Alternative to CUDA
Not limited to ATI GPUs
Designed for “heterogeneous computing”
Executable on many devices, including CPUs, GPUs, DSPs, and FPGAs
OpenCL
Similar structure to CUDA: host programs and kernels
A 'context' groups a set of compute devices, along with their memory objects and command queues
Kernels executed by 'processing elements'
Kernels can be compiled at run-time or build-time
OpenCL
Task Parallelism – many kernels running at once
OpenCL 1.2 – a device can be partitioned down to a single compute unit
Built-in kernels for device-specific functionality
Advantages
Same code can be run on different devices (including NVIDIA GPUs!)
AMD/ATI is attempting to integrate compute elements into other platforms (Accelerated Processing Units)
Limited library of portable math routines (the most common BLAS and FFT routines)
Disadvantages
No “official” implementation
Vendors may meet the spec or add restrictions (e.g., Apple adds restrictions on work-group size)
Devices need appropriate settings to perform well: different capabilities → different performance
Solution: a tuning/load-balancing framework
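The tuning and load-balancing idea above can be sketched in plain Python, with no OpenCL runtime involved. This is only an illustration under stated assumptions: the helper names `round_up` and `split_work`, the device labels, and the throughput figures are all hypothetical, and a real framework would measure throughput on the actual devices.

```python
def round_up(n: int, multiple: int) -> int:
    """Pad a problem size up to the next multiple of a work-group size."""
    if multiple <= 0:
        raise ValueError("multiple must be positive")
    return ((n + multiple - 1) // multiple) * multiple


def split_work(total_items: int, throughputs: dict) -> dict:
    """Divide work among devices in proportion to measured throughput.

    `throughputs` maps a (hypothetical) device label to items per second.
    The last device receives the remainder so the shares always sum to
    `total_items`.
    """
    total_rate = sum(throughputs.values())
    if total_rate <= 0:
        raise ValueError("at least one device must have positive throughput")
    names = list(throughputs)
    shares = {}
    assigned = 0
    for name in names[:-1]:
        share = int(total_items * throughputs[name] / total_rate)
        shares[name] = share
        assigned += share
    shares[names[-1]] = total_items - assigned
    return shares


# Hypothetical measurements: the GPU is three times faster than the CPU.
shares = split_work(1000, {"gpu": 3.0, "cpu": 1.0})
# Each device then pads its share to its own work-group size (64 is assumed).
padded = {name: round_up(count, 64) for name, count in shares.items()}
```

Padding means a kernel launched this way would need a bounds check so that the extra work-items do nothing.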
Restrictions
No recursion, variadics, or function pointers
Cannot dynamically allocate memory from device code
No native variable-length arrays or double precision
Some restrictions can be worked around by extensions (e.g., cl_khr_fp64 for double precision)
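One common way to live with the first two restrictions is to rewrite a recursive algorithm as a loop over an explicit, fixed-capacity stack. The sketch below shows the pattern in plain Python rather than OpenCL C so that it can be run anywhere; the function name `tree_sum_iterative`, the array-based tree layout (children of node i at 2i+1 and 2i+2), and the `max_depth` bound are assumptions made for illustration.

```python
def tree_sum_iterative(values, max_depth=32):
    """Sum every node of a binary tree stored as a flat array.

    No recursion and no growing containers: the stack is preallocated
    with a fixed capacity, as a kernel would have to do.
    """
    if not values:
        return 0
    stack = [0] * (max_depth + 1)  # fixed-size storage, never resized
    top = 0
    stack[top] = 0                 # push the root index
    top += 1
    total = 0
    while top > 0:
        top -= 1                   # pop
        node = stack[top]
        total += values[node]
        for child in (2 * node + 2, 2 * node + 1):
            if child < len(values):
                if top >= len(stack):
                    raise OverflowError("tree deeper than max_depth")
                stack[top] = child  # push
                top += 1
    return total


def tree_sum_recursive(values, node=0):
    """Reference version using recursion, which a kernel could not use."""
    if node >= len(values):
        return 0
    return (values[node]
            + tree_sum_recursive(values, 2 * node + 1)
            + tree_sum_recursive(values, 2 * node + 2))
```

In a kernel the same stack would typically be a fixed-size array in private memory, with its capacity chosen at compile time.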
Terminology
CUDA                        OpenCL
Scalar Core                 Stream Core
Streaming Multiprocessor    Compute Unit
Warp                        Wavefront
PTX                         Intermediate Language
Terminology
CUDA                    OpenCL
Host Memory             Host Memory
Global/Device Memory    Global Memory
Local Memory            Global Memory
Constant Memory         Constant Memory
Shared Memory           Local Memory
Registers               Private Memory
Terminology
CUDA            OpenCL
Grid            NDRange
Block           Work-group
Thread          Work-item
Thread ID       Global ID
Block Index     Group ID
Thread Index    Local ID
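The relationship between the three kinds of ID can be modeled in a few lines. The sketch below is plain Python, not OpenCL code, and assumes one dimension and a zero global offset; the function names `global_id` and `enumerate_ndrange` are illustrative only. It mirrors the identity get_global_id(0) = get_group_id(0) * get_local_size(0) + get_local_id(0), which corresponds to the familiar CUDA expression blockIdx.x * blockDim.x + threadIdx.x.

```python
def global_id(group_id: int, local_size: int, local_id: int) -> int:
    """Global ID of a work-item, given its group ID and local ID."""
    if not 0 <= local_id < local_size:
        raise ValueError("local_id must lie within the work-group")
    return group_id * local_size + local_id


def enumerate_ndrange(global_size: int, local_size: int):
    """List (group ID, local ID, global ID) for every work-item.

    OpenCL 1.x requires the global size to be a multiple of the
    work-group size, so the same rule is enforced here.
    """
    if global_size % local_size != 0:
        raise ValueError("global_size must be a multiple of local_size")
    num_groups = global_size // local_size
    return [
        (g, l, global_id(g, local_size, l))
        for g in range(num_groups)
        for l in range(local_size)
    ]


# Eight work-items in work-groups of four: two groups, global IDs 0..7.
items = enumerate_ndrange(8, 4)
```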