Upload
buithuy
View
216
Download
3
Embed Size (px)
Citation preview
Probabilistic Detection and Sampling
of Concurrency Bugs
Yan Cai (蔡彦)[email protected]
State Key Lab. of Computer Science,
Institute of Software, Chinese Academy of Sciences中科院软件所·计算机科学国家重点实验室
Radius-aware
Probabilistic Deadlock detection
ASE’16
Yan Cai and Zijiang Yang
Locks and Deadlocks
Read
Thread 1
Write
Thread 2
Data1
Read
Thread 1
Write
Write
Thread 2
Read
Data2
Deadlock
3
Thread t1 Thread t2acq(m)
acq(n)
acq(n)
acq(m)
Deadlock Testing
• Random testing
– OS scheduling + random manipulation
– Stress testing
– Heuristic directed random testing
– Systematic scheduling
4
No Guarantee to find a
concurrency bug (e.g., Deadlock)
• PCT Algorithm
– Mathematical randomness with Probabilistic Guarantees
5
Thread t1 Thread t2
s01 acq(m)
s02 acq(n)
s03 rel(n)
s04 rel(m)
s05 acq(n)
s06 acq(m)
s07 rel(m)
s08 rel(n)
k =8, n =2, d =2
1
2 × 82−1= 1/16
1
𝑛 × 𝑘𝑑−1n: #threads, k: #events, d: bug depth
PCT – Probabilistic Concurrency Testing
PCT – Probabilistic Concurrency Testing
• PCT :
– Intuition of guaranteed probability:
1. satisfy the 1st order by assigning the thread a largest priority (1/𝑛)
2. select d – 1 priority change points at the remaining d – 1 order
position (1/𝑘 × 1/k ×…× 1/𝑘 = 1
𝑘𝑑−1) ⇒1
𝑛×𝑘𝑑−1
6
Thread t1 Thread t2
s01 acq(m)
s02 acq(n)
s03 rel(n)
s04 rel(m)
s05 acq(n)
s06 acq(m)
s07 rel(m)
s08 rel(n)
k =8, n =2, d =21
2 × 82−1= 1/16
PCT – Probabilistic Concurrency Testing
• Provide a guarantee (a probability ):
But …
• Theoretical model, not consider
thread interaction:
real executions do not follow designed
executions
• Guaranteed probability decreases
exponentially with increase of bug
depth: due to factor 1
𝑘𝑑−1.
7
Threads t1, t2, … tn
(a) Uniform distribution (b) Centralized distribution
Execu
tio
n
Execution of events
Coordination among threads
Ranges of events to be selected by PCT or RPro
Threads t1, t2, … tn
… …
1
𝑛 × 𝑘𝑑−1n: #threads, k: #events, d: bug depth
RPro- Radius aware
• Our approach: RPro – Radius aware Probabilistic testing
8
Threads t1, t2, … tn
(a) Uniform distribution (b) Centralized distribution
Execu
tio
n
Execution of events
Coordination among threads
Ranges of events to be selected by PCT or RPro
Threads t1, t2, … tn
… …
• Consider thread interaction
• Guaranteed probability
decreases:1
𝑟(not
1
𝑘, r≪ k)
1
𝑛 × 𝑘𝑑−1
1
𝑛 × 𝑘 × 𝑟𝑑−2
PCT v.s. RPro
Threads t1, t2, … tn
…
RPro- Radius aware
• RPro: Theoretical guarantee
9
0
Bug Radius
Probability
1 𝑛 × 𝑘𝑑−1
0
1 𝑛 × 𝑘 × 𝑟𝑑−2
rbugrbug – 1
PCT: Guaranteed probability
RPro: Guaranteed probability
RPro: Probability in practice
r = k
How to find rbug?
Experiment
• Results
10
p=0.0020
r=17, p=0.0439
0.00
0.01
0.02
0.03
0.04
0.05
0 15 30 45 60 75 90 105 120 135 150
PCT
RPro
p=0.0385
r=3, p=0.0632
0.02
0.03
0.04
0.05
0.06
0.07
0 15 30 45 60 75 90 105 120 135 150
p=0.0005
r=11, p=0.0229
0.00
0.01
0.01
0.02
0.02
0.03
0 15 30 45 60 75 90 105 120 135 150
p=0.0680
r=5, p=0.1123
0.06
0.07
0.08
0.09
0.10
0.11
0.12
0 15 30 45 60 75 90 105 120 135 150
p=0.1755
r=2, p=0.453
0.15
0.20
0.25
0.30
0.35
0.40
0.45
0.50
0 15 30 45 60 75 90 105 120 135 150
p=0.4326
r=2, p=0.6863
0.40
0.45
0.50
0.55
0.60
0.65
0.70
0 15 30 45 60 75 90 105 120 135 150
p=0.0004
r=47, p=0.0022
-0.0001
0.0004
0.0009
0.0014
0.0019
0.0024
0 50 100 150 200 250 300
p=0.0088
r=27, p=0.0256
0.0050
0.0100
0.0150
0.0200
0.0250
0.0300
0 15 30 45 60 75 90 105 120 135 150
p=0.0000
r=114, p=0.0039
-0.0001
0.0009
0.0019
0.0029
0.0039
0.0049
0 50 100 150 200 250 300
p=0.0000
r=20, p=0.0062
-0.0001
0.0009
0.0019
0.0029
0.0039
0.0049
0.0059
0.0069
0 15 30 45 60 75 90 105 120 135 150
(a) JDBC-1 (b) JDBC-2
(c) JDBC-3 (d) JDBC-4
(e) Hawknl (f) SQLite
(g) MySQL-1 (h) MySQL-2
(i) MySQL-3 (j) MySQL-4
0
Bug Radius
Probability
1 𝑛 × 𝑘𝑑−1
0
1 𝑛 × 𝑘 × 𝑟𝑑−2
rbugrbug – 1
PCT: Guaranteed probability
RPro: Guaranteed probability
RPro: Probability in practice
r = k
Table 1. The best radiuses (rbest) of each benchmarks.
Benchmark #
events
#
threads
bug
depth 𝒓𝒃𝒆𝒔𝒕*
𝒓𝒃𝒆𝒔𝒕
#𝒆𝒗𝒆𝒏𝒕𝒔 Probability
Hawknl 28 3 3 2 - 0.4530
SQLite 16 3 3 2 - 0.6863
JDBC-2 5,050 3 3 3 0.059% 0.0632
JDBC-4 5,090 3 3 5 0.098% 0.1123
JDBC-3 5,080 3 3 11 0.217% 0.0229
JDBC-1 5,088 3 3 17 0.334% 0.0439
MySQL-4 444,621 19 3 20 0.005% 0.0062
MySQL-2 15,066 17 3 27 0.179% 0.0256
MySQL-1 19,300 16 3 47 0.244% 0.0022
MySQL-3 406,117 22 6 114 0.028% 0.0039
(* All rows are sorted on the data in this column.)
Deployable
Data Race Sampling
FSE’16
Yan Cai, Jian Zhang, Lingwei Cao, and Jian Liu
Concurrency bugs
• Difficult to detect
– Non-determinism (space explosion)
– Inadequate test inputs
– …
• Even after software release,
concurrency bugs may still occur
12
Concurrency bugs
• It is necessary to detect concurrency
bugs in deployed products
• Challenges:
not to disturb normal executions
– light-weighted
– …
Sample user executions
Detector
<5% overhead
13
Existing works
• Data Race
• Happens-before (HB Race)• Access pairs not ordered by happens-before relation (HBR)
Two threads concurrently access the same memory
location and at least one access is a write.
Thread t1
x++;sync(m){}
Thread t2
sync(m){x++;}
Thread t1
x++;
sync(m){}
Thread t2
sync(m){x++;}
Value of x: +2. Value of x: +1 or +2? 14
Existing works
• Happens-before Races
– Track full Happens-before relation
• Incurring many O(n) operations
Insight 1:
Not to track Full Happens-before Relation
0% sampling rate => ~30%
overhead (Pacer, PLDI’10)
~15% in our experiment
15
Existing works
• Hardware based (e.g., DataCollider, OSDI’10)
– Code Breakpoints and Data Breakpoints
(or Watchpoints)
– Collision Races
• A data race: two accesses
– Select a memory address =>
Set a data breakpoint =>
Wait for the breakpoint to be fired
– The waiting time directly increases the sampling overhead
Insight 2: Not to directly delay executions16
Existing works
• …
• See our paper for more insights
17
Our Proposal
• Clock Race
– For data race sampling purpose
• CRSampler
– To detect clock races
18
Thread 1 1
Thread 2
2
1 𝑘 is not changed between time1 and time2.
time1
time2
Tim
e el
ap
se
Clock Race
• Clock Race
– Thread-local clock: an integer for each thread, increased on
synchronization operation.
– Two accesses (with at least a write) form a Clock Race if:
at least one thread-local clock is not changed
in between the two accesses
sync
No clock races
sync
19
Thread 1 1
Thread 2
2
1 𝑘 is not changed between time1 and time2.
time1
time2
Tim
e el
ap
se
Clock Race
• A Quick Demonstration
Thread 1
acquire(l) onSync( );x = 0; sample(x);…
release(l) onSync( );
Thread 2
acquire(k) onSync( );…
release(k) onSync( );x ++;
1 𝑘 2 𝑘10 8
11 9
11 9
11 9
11 10
11 10
12
On this read, t1.clock remains 11,
a clock race on x is reported
Maintain thread-local clocks
Sampled access
20
Clock Race
• Clock Race
– Race checking does not need to delay any thread.
– But: after e1 appears, how much time is required to check two
accesses? • Given a short time, it is not enough to trap the second access.
• Given a long time, all threads’ lock clocks are changed.
Thread 1 1
Thread 2
2
1 𝑘 is not changed between time1 and time2.
time1
time2
Tim
e el
ap
se
One second,
or …
21
Setup
• Implementation
– Jikes RVM
– Sampling: Java class load time
– Memory accesses Linux Kernel
• Benchmarks
– Dacapo benchmark suite
JikesRVM
User-site
Agent
Kernel
SiteCPU
Set breakpoints
On firing
User space Kernel space
Netlink
Com.
Core of
DC/CR
Execu
tio
n
22
Setup
• Comparisons
– Sampling rate: 0.1% to 1.0%
– Pacer (PLDI’10)
– Data Collider (OSDI’10)
– CRSampler
• ThinkPad Workstation
– I7-4710MQ CPU, four cores, 16G memory, 250G SSD
23
15ms, 30msDC15, DC30
CR15, CR30
Experiments
• Overall Results
– Effectiveness
• CR: more data races at
low sampling rates
– Overhead
Bench-
marks
Binary
Size (KB) # of
threads
# of
sync. Pacer*
DC
15
DC
30
CR
15
CR
30
avrora09 2,086 7 3,312,801 3 3 3 5 3
xalan06 1,027 9 35,859,489 5 5 5 87 81
xalan09 4,827 9 12,599,144 0 2 2 84 91
sunflow09 1,017 17 1,590 0 0 2 46 45
pmd09 2,996 9 20,550 4 2 2 110 121
eclipse06 41,822 16 51,131,093 19 2 6 58 63
Sum: 31 14 20 390 404
y = 9.0749x + 0.1474
R² = 0.9397
y = 92.529x + 0.0588
R² = 0.9851
y = 173.46x + 0.0997
R² = 0.9796
0%
40%
80%
120%
160%
200%
0.1% 0.2% 0.3% 0.4% 0.5% 0.6% 0.7% 0.8% 0.9% 1.0%
Avere
age O
verh
ead
Sampling rate
Avg. Overhead of Three + TrendlinesPacer
DC15
DC30
CR15
CR30
y = 9.0749x + 0.1474
R² = 0.9397
y = 2.6312x + 0.0252
R² = 0.9624
y = 2.0859x + 0.0269
R² = 0.98990%
5%
10%
15%
20%
25%
30%
0.1% 0.2% 0.3% 0.4% 0.5% 0.6% 0.7% 0.8% 0.9% 1.0%
Avere
age O
verh
ead
Sampling rate
Avg. Overhead of Pacer and CRSampler + TrendlinesPacer
CR15
CR30
y = 9.0749x + 0.1474
R² = 0.9397
y = 92.529x + 0.0588
R² = 0.9851
y = 173.46x + 0.0997
R² = 0.9796
0%
40%
80%
120%
160%
200%
0.1% 0.2% 0.3% 0.4% 0.5% 0.6% 0.7% 0.8% 0.9% 1.0%
Avere
age O
verh
ead
Sampling rate
Avg. Overhead of Three + TrendlinesPacer
DC15
DC30
CR15
CR30
y = 9.0749x + 0.1474
R² = 0.9397
y = 2.6312x + 0.0252
R² = 0.9624
y = 2.0859x + 0.0269
R² = 0.98990%
5%
10%
15%
20%
25%
30%
0.1% 0.2% 0.3% 0.4% 0.5% 0.6% 0.7% 0.8% 0.9% 1.0%
Avere
age O
verh
ead
Sampling rate
Avg. Overhead of Pacer and CRSampler + TrendlinesPacer
CR15
CR30
DC30
DC15
CR30 , CR15Pacer
5%
24
Experiments
• Discussions
– DataCollider: overhead from its delays.• DC30 has almost 2 times overhead than DC15.
– Pacer: basic overhead ~15%
– CRSampler: ~5% overhead at 1.0% sampling rate.
25
y = 9.0749x + 0.1474
R² = 0.9397
y = 92.529x + 0.0588
R² = 0.9851
y = 173.46x + 0.0997
R² = 0.9796
0%
40%
80%
120%
160%
200%
0.1% 0.2% 0.3% 0.4% 0.5% 0.6% 0.7% 0.8% 0.9% 1.0%
Avere
age O
verh
ead
Sampling rate
Avg. Overhead of Three + TrendlinesPacer
DC15
DC30
CR15
CR30
y = 9.0749x + 0.1474
R² = 0.9397
y = 2.6312x + 0.0252
R² = 0.9624
y = 2.0859x + 0.0269
R² = 0.98990%
5%
10%
15%
20%
25%
30%
0.1% 0.2% 0.3% 0.4% 0.5% 0.6% 0.7% 0.8% 0.9% 1.0%
Avere
age O
verh
ead
Sampling rate
Avg. Overhead of Pacer and CRSampler + TrendlinesPacer
CR15
CR30
y = 9.0749x + 0.1474
R² = 0.9397
y = 92.529x + 0.0588
R² = 0.9851
y = 173.46x + 0.0997
R² = 0.9796
0%
40%
80%
120%
160%
200%
0.1% 0.2% 0.3% 0.4% 0.5% 0.6% 0.7% 0.8% 0.9% 1.0%
Avere
age O
verh
ead
Sampling rate
Avg. Overhead of Three + TrendlinesPacer
DC15
DC30
CR15
CR30
y = 9.0749x + 0.1474
R² = 0.9397
y = 2.6312x + 0.0252
R² = 0.9624
y = 2.0859x + 0.0269
R² = 0.98990%
5%
10%
15%
20%
25%
30%
0.1% 0.2% 0.3% 0.4% 0.5% 0.6% 0.7% 0.8% 0.9% 1.0%
Avere
age O
verh
ead
Sampling rate
Avg. Overhead of Pacer and CRSampler + TrendlinesPacer
CR15
CR30
DC30
DC15
CR30 , CR15Pacer
5%
26
Thanks~