Upload
elias
View
26
Download
0
Embed Size (px)
DESCRIPTION
Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles (SOSP), 2003. Performance Debugging for Distributed Systems of Black Boxes. Marcos K. Aguilera, Jeffrey C. Mogul, Janet L. Wiener, Patrick Reynolds and Athicha Muthitacharoen Jihye Yun and Seohyun Kwak. Contents. - PowerPoint PPT Presentation
Citation preview
Performance Debugging for Distributed Systems of Black
BoxesMarcos K. Aguilera, Jeffrey C. Mogul, Janet L. Wiener, Patrick Reynolds and Athicha Muthi-
tacharoen
Jihye Yun and Seohyun Kwak
Proceedings of the Nineteenth ACM Symposium on Operating Systems Principles (SOSP), 2003
Contents• Introduction• Overview of Approach• Algorithms
– Nesting Algorithm (RPC)– Convolution Algorithm (free-form)
• Experimental Results• Conclusions
Contents• Introduction• Overview of Approach• Algorithms
– Nesting Algorithm (RPC)– Convolution Algorithm (free-form)
• Experimental Results• Conclusions
By Jihye Yun
Introduction• Distributed systems that are composed of “black-
box” components are hard to debug
• Goal of this paper is to design tools enabling to isolate performance bottlenecks in black-box dis-tributed systems
• Approach based on application-independent pas-sive tracing of communication between nodes in distributed system, and combined offline analysis of these traces
Problem• An external request to
system causes activities in graph along a causal path
• Assume that all laten-cies in such a system can be ascribed to node traversals
client client client client client
web server
web server
web server
authenticationserver
applicationserver
applicationserver
data-base
server
data-base
server
data-base
server
data-base
server
A series of node traversals where each traversal is caused by some message from a prior node on path
Tools and methodologies to understand sources of latency in distributed system
Goals• Isolate performance bottlenecks
– Find high-impact causal path patterns• High-impact: occur frequently, and contribute significantly to overall latency
– Identify high-latency nodes on high-impact patterns• Add significant latency to these patterns
• Make tools broadly applicable to systems of black-box com-ponents– Require minimal knowledge and no modification of applications,
middleware, messages or workloads– Not significantly perturb system performance
Overview of Approach• Rely on tracing messages between nodes
– Challenges• Real trace contain interleaved messages from many causal paths• Want only statistically significant causal paths
Overview of Approach• Exposing and tracing communication (online)
– Gather complete trace of all inter-node messages
• Inferring causal paths and patterns (offline)
– Post-process trace using one of several algorithms
• Visualization– Provide appropriate ways to visualize results
How obtain individual messages without perturbing sys-temHow convert messages to concise traceHow manage and store large amounts of trace data
Algorithms• Two distinctly different algorithms for inferring
causal path patterns and latencies from relatively simple message traces
– Nesting algorithm• Depend on use of RPC-style communication• Operate on individual messages in trace
– Convolution algorithm• Handle free-form message-based communication• Use signal-processing technique to extract causal in-
formation from traces
Nesting Algorithm• Infer causality from nesting relationships by
message timestamps• Roughly linear (in time and space) in size of trace• However, require RPC-style communication
(Messages are calls and returns)
Nesting Algorithm1. Find call pairs in trace
– (A, B, 1, 11), (B, C, 3, 5), (B, D, 7, 9)
2. Find and score all nesting relationships– (B, C, 3, 5) nested in (A, B, 1, 11)– (B, D, 7, 9) also nested in (A, B, 1, 11)
3. Pick best parents
4. Derive call paths– ABC;D
(call)
(return)
Nesting Algorithm• Find call pairs in trace
Based on packet header information
Match call and return trace entries into call pairs using hash table
Identify all possible parents for each call pair
procedure FindCallPairsfor each trace entry (t1, CALL/RET, sender A, receiver B, callid id) case CALL: store (t1, CALL, A, B, id) in Topencalls
case RETURN: find matching entry (t2, CALL, B, A, id) in Topencalls if match is found then remove entry from Topencalls update entry with return message timestamp t2 add entry to Tcallpairs
entry.parents := {all callpairs (t3, CALL, X, A, id2) in Topencalls with t3 < t2}
Nesting Algorithm• Find and score all nesting relationships
– Number of histograms is equal to number of (A, B, C) triples– This number, which is independent of trace length, is at
most n3, for n nodes
procedure ScoreNestingsfor each child (B, C, t2, t3) in Tcallpairs
for each parent(A, B, t1, t4) in child.parents scoreboard[A, B, C, t2- t1] += (1/|child.parents|)
Histogram of delays between two call messages in potentially-causal nesting
(call)
(call)
(return)
(return)
t1t2
t3
t4
node A node B node C
Nesting Algorithm• Find and score all nesting relationships
– Example of parallel calls
12
345
678
Possible sets of parent-child pairings1357 : t3-t11467 : t4-t12358 : t3-t22468 : t4-t2
short delay medium delay long delay
t3-t1t4-t2
t4-t1t3-t2
Nesting Algorithm• Pick best parents
procedure FindNestedPairsfor each child (B, C, t2, t3) in Tcallpairs
maxscore := 0 for each p (A, B, t1, t4) in child.parents score[p] = scoreboard[A, B, C, t2- t1]∙penalty if(score[p] > maxscore) then maxscore := score[p] parent := p parent.children := parent.children ∪ {child}
Using # of children already as-signed to given parent
Overlapping-child penaltyGeneric-child penalty Same-child penalty
Nesting Algorithm• Derive call paths
procedure FindCallPathsfor each callpair (A, B, t1, t2) if callpair.parent = ф then root := new path starting at A root.edges := { CreatePathNode(callpair, t1) } if root is in Tpaths then update its latencies else add root to Tpaths
procedure CreatePathNode(callpair (A, B, t1, t4), tp) node := new node with name B node.latency := t4 – t1 node.call_delay := t1 – tp for each child in callpair.children node.edges := node.edges ∪ { CreatePathNode(child, t1) } return node
Nesting Algorithm• Infer causality from nesting relationships by
message timestamps
• Time complexity: O(mp)– m = # of messages– p = mean of per-node parallelism
• However, require RPC-style communication
Contents• Introduction• Overview of Approach• Algorithms
– Nesting Algorithm (RPC)– Convolution Algorithm (free-form)
• Experimental Results• Conclusions
By Seohyun Kwak
Convolution Algorithm• Consider the aggregation of multiple messages• Convert traces into time signals and use signal
processing techniques to find correlation btw mes-sages
• Can be used on traces of free-form message-based communication, not just RPC-style traces
Convolution Algorithm• A sent message to B at times 1,2,5,6,7
S1(t)= AB messages 1 2 3 4 5 6 7 time
Convolution Algorithm• Look for time-shifted similarities
– Compute convolution X(t) = S2(t) S1(t)– Use Fast Fourier TransformsS1(t)
(AB)
S2(t)(BC)
X(t)
Peaks in X(t) suggest causality between
AB and BC
Time shift of a peak indicates delay
Convolution Algorithm• Spike
– Result has a spike at position d if and only if s2(t) contains a copy of s1(t) time-shifted by d.
– X-axis represents the time shift
– Y-axis roughly estimates the number of messages match-ing a given shift
• Time complexity: O(em+eSlogS)– m = # message– e = # edge in output graph– S = # time steps in trace
• Need to choose time step size – Must be shorter than delays of interest– Too coarse: poor accuracy– Too fine: long running time
• Robust to noise in trace• Run-time is dependent on the trace duration and
time-quantum, not the trace length
Convolution Algorithm
• Nesting– Looks at individual paths and then aggregates– Finds rare paths– Requires call/return style communication– Fast enough for real-time analysis
• Convolution– Applicable to a broader class of systems– Slower: more work with less information– May need to try different time steps to get good re-
sults– Reasonable for off-line analysis
Algorithm comparison
Communicationstyle
Rare events
Level ofTrace detail
Time and spacecomplexity
Visualization
Nesting Algorithm Convolution Algorithm
RPC only RPC or free-form messages
Yes, but hard No
<timestamp, sender, receiver><timestamp, sender, receiver>
+call/return tag
Linear spaceLinear time
Linear spacePolynomial time
RPC call and return combinedMore compact
Less compact
Summarize
Visualization of results• Nesting algorithm
• Convolution algorithm
Experiments : generated traces• Used a trace generator to simulate a distributed
system– Nesting algorithm
Infer causal path and delay of each edges and nodes
Experiments : generated traces• Used a trace generator to simulate a distributed
system
– Convolution algorithm
Accuracy• Trace parallelism
• Delay variation
• Message drop rate
ResultsTrace Length
(messages)Duration
(sec)Memory
(MB)CPU time
(sec)
Nesting
Multi-tier (short) 20,164 50 1.5 0.23
Multi-tier 202,520 500 13.8 2.27
Multi-tier (long) 2,026,658 5,000 136.8 23.97
PetStore 234,036 2,000 18.4 2.92
Convolution (20 ms time step)
PetStore 234,036 2,000 25.0 6,301.00
• Cannot detect rare events– Solved by employing sliding window– Detect frequent only in a short time interval
• Analyzing with real system is needed• Merge similar paths from nesting algo-
rithm to from a single visualization of the system
Future work
Thank you