Chapel: User-Defined Distributions
Brad Chamberlain, Cray Inc.
CSEP 524, May 20, 2010
Chapel Distributions
Distributions: “Recipes for parallel, distributed arrays”
• help the compiler map from the computation’s global view…
  …down to the fragmented, per-processor implementation
[Figure: a global-view computation of the form “= … + α · …”, shown split into per-processor fragments, each in its own MEMORY]
Distributions
Data Parallelism
Task Parallelism
Locality Control
Target Machine
Base Language
Domain Distribution
Domains may be distributed across locales
var D: domain(2) dmapped Block(CompGrid, …) = …;
A distribution defines…
• …ownership of the domain’s indices (and its arrays’ elements)
• …default work ownership for operations on the domains/arrays, e.g., forall loops or promoted operations
• …memory layout/representation of array elements/domain indices
• …implementation of operations on its domains and arrays, e.g., accessors, iterators, communication patterns, …
[Figure: domain D and arrays A and B distributed over CompGrid, a grid of locales: L0 L1 L2 L3 / L4 L5 L6 L7]
Domain Distributions
Any domain type may be distributed
Distributions do not affect program semantics
• only implementation details and therefore performance
[Figure: an example index set of strings: “steve”, “lee”, “sung”, “david”, “jacob”, “albert”, “brad”]
Distributions: Goals & Research
Advanced users can write their own distributions
• specified in Chapel using lower-level language features
Chapel will provide a standard library of distributions
• written using the same user-defined distribution mechanism
(Draft paper describing user-defined distribution strategy available by request)
A Simple Distribution: Block1D
Intent: block a 1D index space across a set of locales
Use a bounding box to compute the blocking
[Figure: a 1D number line (…, -1, 0, 1, …, 8, …) blocked across locales L0, L1, L2]
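As a rough illustration of bounding-box blocking, the following Python model maps a 1D index to a locale number. It is a sketch under stated assumptions, not Chapel's implementation: the function name `block1d_owner`, the floor-based split, and the clamping of out-of-box indices to the end locales are all assumptions, and for uneven splits the block boundaries may differ from those drawn in the slides' figures.

```python
def block1d_owner(idx, bbox_lo, bbox_hi, num_locales):
    """Return the locale number that owns idx (illustrative model only)."""
    size = bbox_hi - bbox_lo + 1
    # Scale the offset within the bounding box to a locale number
    # (floor-based split: an assumption, not taken from the slides).
    raw = (idx - bbox_lo) * num_locales // size
    # Indices outside the bounding box are clamped to the first or last locale.
    return max(0, min(num_locales - 1, raw))

# Bounding box 1..8 over three locales; -5 and 100 lie outside the box.
for idx in (-5, 1, 8, 100):
    print(idx, "-> L%d" % block1d_owner(idx, 1, 8, 3))
```

Because the bounding box only steers the blocking, every index in the index space still has an owner, including those outside the box.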
Distributions vs. Domains
Q1: Why distinguish between distributions and domains?
Q2: Why do distributions map an index space rather than a fixed index set?
A: To permit several domains to share a single distribution
• amortizes the overheads of storing a distribution
• supports trivial domain/array alignment and compiler optimizations
const D : …dmapped B1 = [1..8],
      outerD: …dmapped B1 = [0..9],
      innerD: subdomain(D) = [2..7],
      slideD: subdomain(D) = [4..6];
[Figure: locales L0, L1, L2]
Sharing a distribution supports trivial alignment of these domains
const D : …dmapped B1 = [1..8],
      outerD: …dmapped B2 = [0..9],
      innerD: …dmapped B3 = [2..7],
      slideD: …dmapped B4 = [4..6];
[Figure: locales L0, L1, L2]
When each domain is given its own distribution, the compiler cannot reason about alignment of indices.
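The alignment argument can be made concrete with a small Python model. This is a sketch, not Chapel code: `block_owner` uses an assumed floor-based split, and because the slides elide how B2, B3 and B4 are defined, the "separate" case hypothetically gives each distribution a bounding box equal to its own domain's bounds.

```python
def block_owner(idx, bbox_lo, bbox_hi, num_locales):
    """Locale owning idx under an assumed floor-based 1D block split."""
    size = bbox_hi - bbox_lo + 1
    raw = (idx - bbox_lo) * num_locales // size
    return max(0, min(num_locales - 1, raw))

NUM_LOCALES = 3
# Index sets from the slide, as inclusive (low, high) pairs.
DOMAINS = {"D": (1, 8), "outerD": (0, 9), "innerD": (2, 7), "slideD": (4, 6)}

def owners_shared(idx):
    """All domains use one distribution with bounding box 1..8 (B1)."""
    return {name: block_owner(idx, 1, 8, NUM_LOCALES)
            for name, (lo, hi) in DOMAINS.items() if lo <= idx <= hi}

def owners_separate(idx):
    """Hypothetical: each domain's distribution uses the domain's own bounds."""
    return {name: block_owner(idx, lo, hi, NUM_LOCALES)
            for name, (lo, hi) in DOMAINS.items() if lo <= idx <= hi}

print(owners_shared(4))
print(owners_separate(4))
```

Under the shared distribution, index 4 has the same owner in every domain that contains it, so an operation touching element 4 of arrays over several of these domains stays on one locale. Under the hypothetical separate distributions, slideD places index 4 on a different locale than D does, and nothing in the program text reveals that.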
The Block Distribution
The Block distribution maps the indices of a domain in a dense fashion across the target Locales according to the boundingBox argument.
const Dist = new dmap(new Block(boundingBox=[1..4, 1..8]));
var Dom: domain(2) dmapped Dist = [1..4, 1..8];
[Figure: Dom distributed over the locales L0 L1 L2 L3 / L4 L5 L6 L7]
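To see what "dense" means here, the following Python model prints the owning locale of every index in [1..4, 1..8]. It is a sketch, not Chapel's implementation: the 2×4 shape and row-major numbering of the locale grid are read off the figure and are assumptions, as are the helper names and the floor-based per-dimension split.

```python
GRID_ROWS, GRID_COLS = 2, 4          # assumed shape of the locale grid (L0..L7)
BBOX = ((1, 4), (1, 8))              # boundingBox=[1..4, 1..8] from the slide

def block_coord(idx, lo, hi, num_locales):
    """Grid coordinate along one dimension (assumed floor-based split)."""
    size = hi - lo + 1
    return max(0, min(num_locales - 1, (idx - lo) * num_locales // size))

def block2d_owner(i, j):
    """Locale number owning index (i, j), numbering the grid row by row."""
    row_lo, row_hi = BBOX[0]
    col_lo, col_hi = BBOX[1]
    row = block_coord(i, row_lo, row_hi, GRID_ROWS)
    col = block_coord(j, col_lo, col_hi, GRID_COLS)
    return row * GRID_COLS + col

# One output line per row of the domain; each locale owns a 2x2 tile.
for i in range(1, 5):
    print(" ".join("L%d" % block2d_owner(i, j) for j in range(1, 9)))
```

Each locale ends up with one contiguous 2×2 tile, so neighbouring indices usually share a locale.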
The Cyclic Distribution
The Cyclic distribution maps the indices of a domain in a round-robin fashion across the target Locales according to the startIdx argument.
const Dist = new dmap(new Cyclic(startIdx=(1,1)));
var Dom: domain(2) dmapped Dist = [1..4, 1..8];
[Figure: Dom distributed over the locales L0 L1 L2 L3 / L4 L5 L6 L7]
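The round-robin mapping can be modelled in a few lines of Python. This is a sketch under assumptions rather than Chapel's implementation: the 2×4 row-major locale grid is read off the figure, and `cyclic2d_owner` is a hypothetical name.

```python
GRID_ROWS, GRID_COLS = 2, 4          # assumed shape of the locale grid (L0..L7)
START_IDX = (1, 1)                   # startIdx=(1,1) from the slide

def cyclic2d_owner(i, j):
    """Locale number owning index (i, j) under a per-dimension round robin."""
    row = (i - START_IDX[0]) % GRID_ROWS
    col = (j - START_IDX[1]) % GRID_COLS
    return row * GRID_COLS + col     # row-major locale numbering (assumed)

# One output line per row of the domain [1..4, 1..8].
for i in range(1, 5):
    print(" ".join("L%d" % cyclic2d_owner(i, j) for j in range(1, 9)))
```

In this model startIdx is the index that lands on the first locale, and consecutive indices in each dimension move to the next locale in that dimension, wrapping around at the edge of the grid.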
Domain Maps: Distributions and Layouts
Domain Map: The general concept that indicates how to implement a domain and its arrays
Two flavors:
• layout: a domain map targeting a single locale
  row-major order, column-major order, Morton order, hierarchically tiled, linked data structure, compressed sparse row, open hashing w/ quadratic probing, etc.
• distribution: a domain map targeting multiple locales
  block, cyclic, block-cyclic, recursive bisection, hashing across locales, graph partitioning, …
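For a feel of what a layout decides, the following Python sketch computes where element (i, j) of a dense 2D array sits in linear memory under three of the layouts named above. The function names and zero-based indexing are assumptions made for illustration; none of this is taken from Chapel's layout implementations.

```python
def row_major(i, j, nrows, ncols):
    """Offset of (i, j) when rows are stored one after another."""
    return i * ncols + j

def column_major(i, j, nrows, ncols):
    """Offset of (i, j) when columns are stored one after another."""
    return j * nrows + i

def morton(i, j, bits=16):
    """Offset of (i, j) in Morton (Z) order: interleave the bits of i and j."""
    code = 0
    for b in range(bits):
        code |= ((j >> b) & 1) << (2 * b)        # bits of j fill even positions
        code |= ((i >> b) & 1) << (2 * b + 1)    # bits of i fill odd positions
    return code

# The same element of a 4x8 array lands at different offsets.
print(row_major(2, 3, 4, 8), column_major(2, 3, 4, 8), morton(2, 3))
```

The program indexes the array the same way in all three cases; only the traversal order that is cache-friendly changes.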
Domain Map Framework: Layouts
three descriptors:
• distribution. Responsibility: How to generate new domains
• domain. Responsibility: How to store, iterate over domain indices
• array. Responsibility: How to store, access, iterate over array elements
Domain Map Framework: Distributions
global descriptors (one global instance or replicated per locale):
• distribution. Responsibility: How to generate new domains and map indices to locales
• domain. Responsibility: How to store, iterate over domain indices
• array. Responsibility: How to store, access, iterate over array elements
local descriptors (one instance per locale):
• distribution. Responsibility: How to store a single locale’s portion of the index space
• domain. Responsibility: How to store, iterate over local domain indices
• array. Responsibility: How to store, access, iterate over local array elements
Domain Map Framework: Distributions
global descriptors (one global instance or replicated per locale):
• distribution descriptor: target locale set, distribution args, map index to locale, …
• domain descriptor: global index info, index type, index iteration, allocate array, …
• array descriptor: element type, global value iteration, random access, …
local descriptors (one instance per locale):
• local distribution descriptor
• local domain descriptor: store local indices, local index iteration, add new indices, …
• local array descriptor: store local values, local value iteration, local random access, …
legend: the original diagram marks each entry as either descriptor state/type or a descriptor method
1D Block Distribution Classes
code:
const B1 = new dmap(new Block(bbox=[1..8]));
const D: domain(1) dmapped B1 = [1..8];
var A, B: [D] real;
global descriptors:
• distribution: boundingBox = 1..8, targetLocales = L0 L1 L2 (LocaleSpace = [0..2])
• domain: whole = 1..8
local descriptors (listed for L0, L1, L2):
• distribution: myChunk = min(idxType)..3, 4..5, 6..max(idxType)
• domain: myBlock = 1..3, 4..5, 6..8
• array: myElems over 1..3, 4..5, 6..8
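The split between global and local array descriptors can be sketched in Python as follows. This is an illustrative model only: the class and method names are hypothetical, the per-locale blocks 1..3, 4..5 and 6..8 are taken from the figure, and a real implementation would compute the owner arithmetically rather than search for it.

```python
class LocalBlockArr:
    """Local array descriptor: stores one locale's elements."""
    def __init__(self, my_block):
        self.my_block = my_block                     # (low, high), inclusive
        self.my_elems = [0.0] * (my_block[1] - my_block[0] + 1)

class BlockArr:
    """Global array descriptor: routes each access to the owning locale."""
    def __init__(self, my_blocks):
        self.local = [LocalBlockArr(b) for b in my_blocks]

    def _locate(self, i):
        """Return (locale number, offset into that locale's my_elems)."""
        for loc_id, loc in enumerate(self.local):
            lo, hi = loc.my_block
            if lo <= i <= hi:
                return loc_id, i - lo
        raise IndexError(i)

    def __getitem__(self, i):
        loc_id, off = self._locate(i)
        return self.local[loc_id].my_elems[off]

    def __setitem__(self, i, value):
        loc_id, off = self._locate(i)
        self.local[loc_id].my_elems[off] = value

# Array A over D = [1..8], with the per-locale blocks shown in the figure.
A = BlockArr([(1, 3), (4, 5), (6, 8)])
A[5] = 2.5
print(A[5], [len(loc.my_elems) for loc in A.local])
```

The global descriptor holds no element data of its own; random access is a lookup of the owning locale followed by a local access.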
1D Block Distribution Classes
code:
const B1 = new dmap(new Block(bbox=[1..8]));
const sliceD: domain(1) dmapped B1 = [4..6];
var A2, B2: [sliceD] real;
global descriptors:
• distribution: boundingBox = 1..8, targetLocales = L0 L1 L2 (LocaleSpace = [0..2])
• domain: whole = 4..6
local descriptors (listed for L0, L1, L2):
• distribution: myChunk = min(idxType)..3, 4..5, 6..max(idxType)
• domain: myBlock = (empty), 4..5, 6..6
• array: myElems over (empty), 4..5, 6..6
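One way to read the two figures is that each locale's myBlock is its myChunk clipped to the domain's whole range. The Python sketch below works through that reading; it is an interpretation of the figures, not Chapel's code. The chunk values are taken from the figure, `None` stands in for min(idxType) and max(idxType), and the function names are hypothetical.

```python
def my_block(my_chunk, whole):
    """Clip one locale's chunk of the index space to the domain's bounds."""
    chunk_lo, chunk_hi = my_chunk
    lo = whole[0] if chunk_lo is None else max(chunk_lo, whole[0])
    hi = whole[1] if chunk_hi is None else min(chunk_hi, whole[1])
    return (lo, hi) if lo <= hi else None        # None marks an empty block

# Per-locale chunks of the distribution, as shown in the figure.
MY_CHUNKS = [(None, 3), (4, 5), (6, None)]

def blocks_for(whole):
    """myBlock for L0, L1, L2 of a domain with the given whole range."""
    return [my_block(chunk, whole) for chunk in MY_CHUNKS]

print(blocks_for((1, 8)))    # a domain covering the whole bounding box
print(blocks_for((4, 6)))    # the slice: L0 owns nothing
```

Both domains reuse the same three chunks, so any index the two domains share is owned by the same locale in each.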