Law of Large Numbers
June 10, 2020
来嶋 秀治 (Shuji Kijima)
Dept. Informatics,
Graduate School of ISEE
Today’s topics
Law of large numbers w/ proof
• Markov’s ineq.
• Chebyshev’s ineq.
Central limit theorem
• w/ affine trans. of r.v.
確率統計特論 (Probability & Statistics)
Lesson 5
Midterm exam
Date/time: June 24 (6/24), 13:00-14:30
Place: on Moodle.
Submit electronic files (photos are acceptable and recommended), 10 MB or less.
Keep your original paper/data at hand (I may ask you to submit them later).
Topics:
Fundamental probability (May 13 – June 17).
Check the course page:
http://tcs.inf.kyushu-u.ac.jp/~kijima/
Books, notes, Google, etc. are allowed.
Communication (e-mail, SNS, BBS) is prohibited.
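Today’s Summary
Thm. (law of large numbers)
Suppose $X_1, \dots, X_n$ are i.i.d. with expectation $\mu$ and variance $\sigma^2$.
Then $\frac{X_1 + \cdots + X_n}{n}$ converges to $\mu$ in probability; i.e.,
\[ \forall \varepsilon > 0, \quad \lim_{n \to \infty} \Pr\left[ \left| \frac{X_1 + \cdots + X_n}{n} - \mu \right| < \varepsilon \right] = 1. \]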
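Thm. (central limit theorem)
Suppose $X_1, \dots, X_n$ are i.i.d. with expectation $\mu$ and variance $\sigma^2$.
Then $Z_n := \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{X_i - \mu}{\sigma}$ converges to $\mathrm{N}(0,1)$ in distribution; i.e.,
\[ \lim_{n \to \infty} \Pr[Z_n < z] = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}} \, dx. \]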
Prove it.
Make sense?
1. Road to Law of Large Numbers
w/ coupon collector
1.1. Markov’s inequality
1.2. Chebyshev’s inequality
1.3. Proof of law of large numbers
Ex. Coupon collector
There are $n$ kinds of coupons.
How many coupons do you need to draw, in expectation,
before having drawn each kind of coupon at least once?
• Bikkuriman stickers
• Pokémon cards
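Suppose you have already drawn $k-1$ kinds of coupons.
Let $X_k$ denote the number of draws needed to go from $k-1$ kinds to $k$ kinds.
Assuming each draw is uniform over the $n$ kinds and independent of the others, a draw yields a new kind with probability
\[ p_k := \frac{n-(k-1)}{n}, \]
so $X_k$ is geometrically distributed and the expected number of draws is
\[ \mathrm{E}[X_k] = \frac{1}{p_k} = \frac{n}{n-k+1}. \]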
Thm.
Let $X := X_1 + \cdots + X_n$ be the total number of draws. Then
\[ n \ln n \le \mathrm{E}[X] \le n (1 + \ln n). \]
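By linearity of expectation,
\[ \mathrm{E}[X] = \mathrm{E}\left[ \sum_{i=1}^{n} X_i \right] = \sum_{i=1}^{n} \mathrm{E}[X_i] = \sum_{i=1}^{n} \frac{n}{n-i+1} = n \sum_{i'=1}^{n} \frac{1}{i'}, \]
where $\sum_{i'=1}^{n} \frac{1}{i'}$ is the harmonic number. It is bounded from below by
\[ \ln n = \int_{1}^{n} \frac{1}{x} \, dx \le \sum_{k=1}^{n} \frac{1}{k} \]
and from above by
\[ \sum_{k=1}^{n} \frac{1}{k} = 1 + \sum_{k=2}^{n} \frac{1}{k} \le 1 + \int_{1}^{n} \frac{1}{x} \, dx = 1 + \ln n. \]
e.g., $n = 100$: then $\ln 100 \simeq 4.605$, and hence
\[ 460 \le \mathrm{E}[X] \le 561. \]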
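The bounds can be compared with the exact value $n \sum_{i=1}^{n} 1/i$. The following is a minimal sketch; the function names `expected_draws` and `bounds` are illustrative choices, not part of the lecture.

```python
import math


def expected_draws(n: int) -> float:
    """Exact E[X] = n * H_n, where H_n is the n-th harmonic number."""
    return n * sum(1.0 / i for i in range(1, n + 1))


def bounds(n: int) -> tuple[float, float]:
    """Lower bound n ln n and upper bound n (1 + ln n)."""
    return n * math.log(n), n * (1.0 + math.log(n))


n = 100
lo, hi = bounds(n)
print(f"{lo:.1f} <= {expected_draws(n):.1f} <= {hi:.1f}")
# prints: 460.5 <= 518.7 <= 560.5
```

For $n = 100$ the exact expectation lies roughly in the middle of the two bounds.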
What is the probability of completion after 𝑚 trials?
1.1. Markov’s inequality
Thm. Markov’s inequality
Let $X$ be a nonnegative random variable. Then
\[ \Pr[X \ge a] \le \frac{\mathrm{E}[X]}{a} \]
holds for any $a > 0$.
Markov’s inequality11
E𝑋
𝑎= න
0
∞ 𝑥
𝑎𝑓(𝑥)d𝑥 = න
0
𝑎 𝑥
𝑎𝑓(𝑥)d𝑥 + න
𝑎
∞ 𝑥
𝑎𝑓(𝑥)d𝑥
≥ න𝑎
∞ 𝑥
𝑎𝑓(𝑥)d𝑥 ≥ න
𝑎
∞
𝑓(𝑥) d𝑥 = Pr[𝑋 ≥ 𝑎]
Pr 𝑋 ≥ 𝑎 ≤ E𝑋
𝑎=E 𝑋
𝑎
Thus,
Proof.
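Markov’s inequality can be checked numerically on an empirical distribution. The sketch below assumes exponentially distributed samples (an arbitrary nonnegative example, not taken from the lecture); `markov_check` is a hypothetical helper name.

```python
import random


def markov_check(samples, a):
    """Return (empirical Pr[X >= a], Markov bound E[X]/a) for nonnegative samples."""
    n = len(samples)
    tail = sum(1 for x in samples if x >= a) / n
    bound = (sum(samples) / n) / a
    return tail, bound


rng = random.Random(0)  # fixed seed, chosen arbitrarily
samples = [rng.expovariate(1.0) for _ in range(10_000)]  # nonnegative values

for a in (0.5, 1.0, 2.0, 5.0):
    tail, bound = markov_check(samples, a)
    print(f"a={a}: Pr[X >= a] ~ {tail:.4f} <= E[X]/a ~ {bound:.4f}")
```

Because the inequality holds for every nonnegative distribution, it also holds for the empirical distribution of the samples, whatever the seed.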
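Using Markov’s inequality,
\[ \Pr[X \ge m] \le \frac{\mathrm{E}[X]}{m} \le \frac{n (1 + \ln n)}{m}. \]
Here $\mathrm{comp}_n(m)$ denotes the event that the collection is complete after $m$ draws, i.e., $X \le m$.
e.g., $n = 100$, $m = 1000$:
\[ \Pr[\mathrm{comp}_{100}(1000)] = 1 - \Pr[X \ge 1001] \ge 1 - \frac{100 \times (1 + \ln 100)}{1001} \simeq 0.44 \]
e.g., $n = 100$, $m = 10000$:
\[ \Pr[\mathrm{comp}_{100}(10000)] = 1 - \Pr[X \ge 10001] \ge 1 - \frac{100 \times (1 + \ln 100)}{10001} \simeq 0.94 \]
Too loose?
rem.
\[ n \ln n \le \mathrm{E}[X] \le n (1 + \ln n) \]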
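To see how loose the Markov bound is, one can compare it with the exact completion probability. The sketch below uses the standard inclusion-exclusion formula over the set of missing kinds, which is not part of the lecture and is brought in here only for comparison; both function names are illustrative.

```python
import math


def markov_lower_bound(n: int, m: int) -> float:
    """Lower bound on Pr[complete within m draws]: 1 - n(1 + ln n)/(m + 1)."""
    return 1.0 - n * (1.0 + math.log(n)) / (m + 1)


def exact_completion(n: int, m: int) -> float:
    """Pr[complete within m draws] by inclusion-exclusion (assumption: uniform draws).

    Evaluated in floating point, so it is only reliable when m is well above n.
    """
    return sum(
        (-1) ** j * math.comb(n, j) * (1.0 - j / n) ** m
        for j in range(n + 1)
    )


for m in (1000, 10000):
    print(m, round(markov_lower_bound(100, m), 3), round(exact_completion(100, m), 5))
```

For $n = 100$ and $m = 1000$ the exact probability is above 0.99, while Markov’s inequality only guarantees about 0.44.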
1.2. Chebyshev’s inequality
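Thm. Chebyshev’s inequality
For any $a > 0$,
\[ \Pr\left[ |X - \mathrm{E}[X]| \ge a \right] \le \frac{\mathrm{Var}[X]}{a^2}. \]
Proof.
Remark that
\[ \Pr\left[ |X - \mathrm{E}[X]| \ge a \right] = \Pr\left[ (X - \mathrm{E}[X])^2 \ge a^2 \right]. \]
Using Markov’s inequality,
\[ \Pr\left[ (X - \mathrm{E}[X])^2 \ge a^2 \right] \le \frac{\mathrm{E}\left[ (X - \mathrm{E}[X])^2 \right]}{a^2} = \frac{\mathrm{Var}[X]}{a^2}. \]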
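Cor. Chebyshev’s inequality
For any $t > 0$ (and $\mathrm{E}[X] > 0$),
\[ \Pr\left[ X \ge (1 + t) \mathrm{E}[X] \right] \le \frac{\mathrm{Var}[X]}{(t \mathrm{E}[X])^2}. \]
Proof.
\[ \Pr\left[ X \ge (1 + t) \mathrm{E}[X] \right] = \Pr\left[ X - \mathrm{E}[X] \ge t \mathrm{E}[X] \right] \le \Pr\left[ |X - \mathrm{E}[X]| \ge t \mathrm{E}[X] \right] \le \frac{\mathrm{Var}[X]}{(t \mathrm{E}[X])^2}. \]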
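As a sanity check, Chebyshev’s inequality can be evaluated on an empirical distribution. This is a minimal sketch assuming uniform samples on $[0, 10]$ (an arbitrary choice); `chebyshev_check` is a hypothetical helper name.

```python
import random


def chebyshev_check(samples, a):
    """Return (empirical Pr[|X - E[X]| >= a], Chebyshev bound Var[X]/a^2)."""
    n = len(samples)
    mean = sum(samples) / n
    var = sum((x - mean) ** 2 for x in samples) / n  # population variance
    tail = sum(1 for x in samples if abs(x - mean) >= a) / n
    return tail, var / a ** 2


rng = random.Random(0)  # fixed seed, chosen arbitrarily
samples = [rng.uniform(0.0, 10.0) for _ in range(10_000)]

for a in (2.0, 3.0, 4.0, 5.0):
    tail, bound = chebyshev_check(samples, a)
    print(f"a={a}: tail ~ {tail:.4f} <= Var/a^2 ~ {bound:.4f}")
```

The one-sided corollary follows the same pattern with $a = t \, \mathrm{E}[X]$ and the tail restricted to $X \ge (1 + t) \mathrm{E}[X]$.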
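Applying Chebyshev’s inequality to the coupon collector,
\[ \Pr\left[ X \ge (1 + t) \mathrm{E}[X] \right] \le \frac{\mathrm{Var}[X]}{(t \mathrm{E}[X])^2}, \]
so it remains to bound $\mathrm{Var}[X]$.
rem.
\[ n \ln n \le \mathrm{E}[X] \le n (1 + \ln n) \]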
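Since $X_1, \dots, X_n$ are independent and each $X_i$ is geometrically distributed with parameter $p_i$,
\[ \mathrm{Var}[X] = \sum_{i=1}^{n} \mathrm{Var}[X_i] = \sum_{i=1}^{n} \frac{1 - p_i}{p_i^2} \le \sum_{i=1}^{n} \frac{1}{p_i^2} = \sum_{i=1}^{n} \left( \frac{n}{n - i + 1} \right)^2 = n^2 \sum_{i=1}^{n} \frac{1}{i^2} \le n^2 \frac{\pi^2}{6}. \]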
Ex. 2.
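The variance bound can be compared with the exact sum $\sum_{i} (1 - p_i)/p_i^2$. A minimal sketch, with `coupon_variance` as an illustrative name:

```python
import math


def coupon_variance(n: int) -> float:
    """Exact Var[X] = sum over i of (1 - p_i) / p_i^2 with p_i = (n - i + 1) / n."""
    total = 0.0
    for i in range(1, n + 1):
        p = (n - i + 1) / n
        total += (1.0 - p) / p ** 2
    return total


n = 100
exact = coupon_variance(n)
bound = n ** 2 * math.pi ** 2 / 6
print(f"Var[X] = {exact:.1f} <= n^2 pi^2 / 6 = {bound:.1f}")
```

For $n = 100$ the bound overshoots the exact variance by only a few percent, so the looseness of the final Chebyshev estimate does not come from this step.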
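Using Chebyshev’s inequality,
\[ \Pr\left[ X \ge (1 + t) \mathrm{E}[X] \right] \le \frac{\mathrm{Var}[X]}{(t \mathrm{E}[X])^2} \le \frac{n^2 \pi^2 / 6}{t^2 (n \ln n)^2} = \frac{\pi^2}{6 t^2 (\ln n)^2}. \]
rem.
\[ n \ln n \le \mathrm{E}[X] \le n (1 + \ln n) \]
e.g., $n = 100$, $m = 1000$ ($t \simeq \frac{m}{n (1 + \ln n)} - 1 \simeq 0.78$, so that $(1 + t) \mathrm{E}[X] \le 1001$):
\[ \Pr[X \ge 1001] \le \Pr\left[ X \ge 1.78 \, \mathrm{E}[X] \right] \le \frac{\pi^2}{6 \times 0.78^2 \times (\ln 100)^2} \simeq 0.127 \]
\[ \Pr[\mathrm{comp}_{100}(1000)] \ge 1 - 0.127 \simeq 0.87 \]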
still loose?
Chernoff’s bound
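The arithmetic of the Chebyshev bound for the coupon collector can be reproduced as follows. This is a sketch under the bounds used above ($n \ln n \le \mathrm{E}[X] \le n(1 + \ln n)$ and $\mathrm{Var}[X] \le n^2 \pi^2 / 6$); the function name is illustrative.

```python
import math


def chebyshev_completion_bound(n: int, m: int) -> tuple[float, float]:
    """Return (t, lower bound on Pr[complete within m draws])."""
    upper_mean = n * (1.0 + math.log(n))  # E[X] <= n (1 + ln n)
    t = (m + 1) / upper_mean - 1.0        # ensures (1 + t) E[X] <= m + 1
    if t <= 0:
        raise ValueError("m is too small for this bound to apply")
    tail = math.pi ** 2 / (6.0 * t ** 2 * math.log(n) ** 2)
    return t, 1.0 - tail


t, lower = chebyshev_completion_bound(100, 1000)
print(f"t ~ {t:.3f}, Pr[comp] >= {lower:.3f}")
```

With the unrounded $t$ the guarantee is slightly better than with $t = 0.78$, but it stays near 0.87.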
1.3. Law of large numbers
Def.
A sequence $\{Y_n\}$ converges to $Y$ in probability if
\[ \forall \varepsilon > 0, \quad \lim_{n \to \infty} \Pr\left[ |Y_n - Y| < \varepsilon \right] = 1. \]
Thm. (law of large numbers)
Suppose $X_1, \dots, X_n$ are i.i.d. (independent and identically distributed) with expectation $\mu$ and variance $\sigma^2$.
Then $\frac{X_1 + \cdots + X_n}{n}$ converges to $\mu$ in probability; i.e.,
\[ \forall \varepsilon > 0, \quad \lim_{n \to \infty} \Pr\left[ \left| \frac{X_1 + \cdots + X_n}{n} - \mu \right| < \varepsilon \right] = 1. \]
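Proof.
Let $Y_n := \frac{X_1 + \cdots + X_n}{n}$, for simplicity. Then
\[ \mathrm{E}[Y_n] = \mathrm{E}\left[ \frac{X_1 + \cdots + X_n}{n} \right] = \frac{\mathrm{E}[X_1] + \cdots + \mathrm{E}[X_n]}{n} = \frac{\mu + \cdots + \mu}{n} = \mu, \]
and, by independence,
\[ \mathrm{Var}[Y_n] = \mathrm{Var}\left[ \frac{X_1 + \cdots + X_n}{n} \right] = \frac{\mathrm{Var}[X_1] + \cdots + \mathrm{Var}[X_n]}{n^2} = \frac{\sigma^2 + \cdots + \sigma^2}{n^2} = \frac{\sigma^2}{n}. \]
By Chebyshev’s inequality,
\[ \forall \varepsilon > 0, \ \forall n > 0, \quad \Pr\left[ \left| \frac{X_1 + \cdots + X_n}{n} - \mu \right| \ge \varepsilon \right] \le \frac{\mathrm{Var}[Y_n]}{\varepsilon^2} = \frac{\sigma^2}{n \varepsilon^2}. \]
Hence
\[ \forall \varepsilon > 0, \quad \Pr\left[ \left| \frac{X_1 + \cdots + X_n}{n} - \mu \right| < \varepsilon \right] \ge 1 - \frac{\sigma^2}{n \varepsilon^2} \xrightarrow[n \to \infty]{} 1. \]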
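The convergence can be observed in a small simulation. The sketch below assumes fair six-sided die rolls ($\mu = 3.5$, $\sigma^2 = 35/12$) as an example distribution; it is an illustration only, and the names are hypothetical.

```python
import random

MU = 3.5           # expectation of one fair die roll
SIGMA2 = 35 / 12   # variance of one fair die roll


def sample_mean(n: int, rng: random.Random) -> float:
    """Average of n independent fair die rolls."""
    return sum(rng.randint(1, 6) for _ in range(n)) / n


def chebyshev_bound(n: int, eps: float) -> float:
    """Upper bound sigma^2 / (n eps^2) on Pr[|Y_n - mu| >= eps], capped at 1."""
    return min(1.0, SIGMA2 / (n * eps ** 2))


rng = random.Random(0)  # fixed seed, chosen arbitrarily
for n in (10, 100, 1_000, 10_000, 100_000):
    y = sample_mean(n, rng)
    print(f"n={n}: Y_n={y:.4f}, |Y_n - mu|={abs(y - MU):.4f}, "
          f"bound for eps=0.1: {chebyshev_bound(n, 0.1):.4f}")
```

The bound $\sigma^2 / (n \varepsilon^2)$ decreases like $1/n$, which is all the proof needs.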
2. Central Limit Theorem
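Def.
A sequence $\{Y_n\}$ with distribution functions $F_n$ converges to $Y$ in distribution if
\[ \lim_{n \to \infty} F_n(y) = F(y) \]
at every point $y$ where $F$ is continuous, where $F$ is the distribution function of $Y$.
Thm. Central limit theorem
Suppose $X_1, \dots, X_n$ are i.i.d. with expectation $\mu$ and variance $\sigma^2$.
Then $Z_n := \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \frac{X_i - \mu}{\sigma}$ converges to $\mathrm{N}(0,1)$ in distribution; i.e.,
\[ \lim_{n \to \infty} \Pr[Z_n < z] = \int_{-\infty}^{z} \frac{1}{\sqrt{2\pi}} e^{-\frac{x^2}{2}} \, dx. \]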
[Figure: pdf of the normal distribution. Source: http://en.wikipedia.org/wiki/Normal_distribution]
[Figure: distribution function of the normal distribution. Source: http://en.wikipedia.org/wiki/Normal_distribution]
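The statement of the central limit theorem can be checked exactly for Bernoulli summands, where $X_1 + \cdots + X_n$ is binomial. The sketch below assumes $X_i \sim \mathrm{Bernoulli}(p)$ with $p = 0.3$ (an arbitrary example) and computes $\Pr[Z_n < z]$ without simulation; all names are illustrative.

```python
import math


def phi(z: float) -> float:
    """Distribution function of N(0, 1)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def binom_pmf(n: int, k: int, p: float) -> float:
    """Binomial pmf, evaluated in log space to avoid overflow."""
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log(1.0 - p))
    return math.exp(log_pmf)


def prob_z_below(n: int, p: float, z: float) -> float:
    """Exact Pr[Z_n < z] for i.i.d. Bernoulli(p) summands."""
    mu, sigma = p, math.sqrt(p * (1.0 - p))
    threshold = n * mu + z * sigma * math.sqrt(n)  # Z_n < z  <=>  sum < threshold
    return sum(binom_pmf(n, k, p) for k in range(n + 1) if k < threshold)


for n in (10, 100, 1000):
    row = ", ".join(f"z={z:+.0f}: {prob_z_below(n, 0.3, z):.4f}" for z in (-1, 0, 1))
    print(f"n={n}: {row}")
print("limit: " + ", ".join(f"z={z:+.0f}: {phi(z):.4f}" for z in (-1, 0, 1)))
```

The remaining gap for finite $n$ comes mostly from the discreteness of the binomial distribution.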
Before the proof...
Corollary
Suppose $X_1, \dots, X_n$ are i.i.d. with expectation $\mu$ and variance $\sigma^2$.
Then, for large $n$, the distribution of $\frac{X_1 + \cdots + X_n}{n}$ is approximated by $\mathrm{N}\left( \mu, \frac{\sigma^2}{n} \right)$.
Affine transform. of a normal distribution
Prop.
Let $a \in \mathbf{R}_{>0}$, $b \in \mathbf{R}$. Suppose that $X \sim \mathrm{N}(\mu, \sigma^2)$, and
let $Y := aX + b$. Then $Y \sim \mathrm{N}(a\mu + b, a^2 \sigma^2)$.
2.1. Affine transform. of a random variable
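Affine transform. of a discrete random variable
Prop.
Let $a \in \mathbf{R}_{>0}$, $b \in \mathbf{R}$. Suppose that $X$ is
a discrete random variable with pmf $f_X(x)$, and
let $Y := aX + b$. Then $Y$ follows the pmf
\[ f_Y(y) = f_X\left( \frac{y - b}{a} \right). \]
Proof.
Since $Y := aX + b$,
\[ [Y = y] \Leftrightarrow [aX + b = y] \Leftrightarrow \left[ X = \frac{y - b}{a} \right], \]
so $\Pr[Y = y] = \Pr\left[ X = \frac{y - b}{a} \right]$, i.e.,
\[ f_Y(y) = f_X\left( \frac{y - b}{a} \right). \]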
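Affine transform. of a continuous random variable
Prop.
Let $a \in \mathbf{R}_{>0}$, $b \in \mathbf{R}$. Suppose that $X$ is
a continuous random variable with pdf $f_X(x)$, and
let $Y := aX + b$. Then $Y$ follows the pdf
\[ f_Y(y) = \frac{1}{a} f_X\left( \frac{y - b}{a} \right). \]
Proof.
Since $Y := aX + b$ and $a > 0$,
\[ [Y \le y] \Leftrightarrow [aX + b \le y] \Leftrightarrow \left[ X \le \frac{y - b}{a} \right], \]
so $\Pr[Y \le y] = \Pr\left[ X \le \frac{y - b}{a} \right]$, i.e.,
\[ F_Y(y) = F_X\left( \frac{y - b}{a} \right). \]
By differentiating both sides with respect to $y$, we obtain
\[ f_Y(y) = \frac{1}{a} f_X\left( \frac{y - b}{a} \right). \]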
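The differentiation step can be checked numerically. The sketch below assumes $X$ is exponential with rate 1 (an arbitrary example with known pdf and distribution function) and compares a central difference of $F_Y$ with $\frac{1}{a} f_X\left( \frac{y - b}{a} \right)$; the function names are illustrative.

```python
import math


def f_x(x: float) -> float:
    """pdf of the example X (exponential with rate 1)."""
    return math.exp(-x) if x >= 0 else 0.0


def F_x(x: float) -> float:
    """Distribution function of the example X."""
    return 1.0 - math.exp(-x) if x >= 0 else 0.0


def f_y(y: float, a: float, b: float) -> float:
    """pdf of Y = aX + b according to the proposition (requires a > 0)."""
    return f_x((y - b) / a) / a


def F_y(y: float, a: float, b: float) -> float:
    """Distribution function of Y = aX + b (requires a > 0)."""
    return F_x((y - b) / a)


a, b, y, h = 2.0, 1.0, 3.0, 1e-5
numeric = (F_y(y + h, a, b) - F_y(y - h, a, b)) / (2 * h)  # central difference
print(f"numeric derivative: {numeric:.6f}, formula: {f_y(y, a, b):.6f}")
```

The factor $\frac{1}{a}$ is what keeps the total area under $f_Y$ equal to 1 after stretching the axis by $a$.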
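Affine transform. of a normal distribution
Prop.
Let $a \in \mathbf{R}_{>0}$, $b \in \mathbf{R}$. Suppose that $X \sim \mathrm{N}(\mu, \sigma^2)$, and
let $Y := aX + b$. Then $Y \sim \mathrm{N}(a\mu + b, (a\sigma)^2)$.
Proof.
Recall that
\[ f_X(x) = \frac{1}{\sqrt{2\pi} \sigma} \exp\left( - \frac{(x - \mu)^2}{2 \sigma^2} \right). \]
By the proposition for continuous random variables, $Y$ follows the pdf
\[ f_Y(y) = \frac{1}{a} f_X\left( \frac{y - b}{a} \right) = \frac{1}{a} \cdot \frac{1}{\sqrt{2\pi} \sigma} \exp\left( - \frac{\left( \frac{y - b}{a} - \mu \right)^2}{2 \sigma^2} \right) = \frac{1}{\sqrt{2\pi} a \sigma} \exp\left( - \frac{(y - (a\mu + b))^2}{2 (a\sigma)^2} \right). \]
The pdf of $\mathrm{N}(a\mu + b, a^2 \sigma^2)$ is given by
\[ f(t) = \frac{1}{\sqrt{2\pi} a \sigma} \exp\left( - \frac{(t - (a\mu + b))^2}{2 (a\sigma)^2} \right). \]
This implies $Y \sim \mathrm{N}(a\mu + b, a^2 \sigma^2)$.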
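The identity between the transformed pdf and the pdf of $\mathrm{N}(a\mu + b, (a\sigma)^2)$ can be spot-checked numerically. A minimal sketch, with arbitrary example parameters and illustrative function names:

```python
import math


def normal_pdf(x: float, mu: float, sigma: float) -> float:
    """pdf of N(mu, sigma^2)."""
    return math.exp(-((x - mu) ** 2) / (2.0 * sigma ** 2)) / (math.sqrt(2.0 * math.pi) * sigma)


def affine_pdf(y: float, a: float, b: float, mu: float, sigma: float) -> float:
    """pdf of Y = aX + b for X ~ N(mu, sigma^2), via (1/a) f_X((y - b)/a)."""
    return normal_pdf((y - b) / a, mu, sigma) / a


a, b, mu, sigma = 2.0, 1.0, 0.5, 1.5  # example values, chosen arbitrarily
for y in (-3.0, 0.0, 2.5, 7.0):
    lhs = affine_pdf(y, a, b, mu, sigma)
    rhs = normal_pdf(y, a * mu + b, a * sigma)
    print(f"y={y:+.1f}: {lhs:.6f} vs {rhs:.6f}")
```

Both columns agree up to floating-point rounding, as the proof predicts.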
Central Limit Theorem
The corollary on the sample mean now follows from the central limit theorem and the proposition: since
\[ \frac{X_1 + \cdots + X_n}{n} = \frac{\sigma}{\sqrt{n}} Z_n + \mu, \]
applying the proposition with $a = \frac{\sigma}{\sqrt{n}}$ and $b = \mu$ to the limit $\mathrm{N}(0,1)$ of $Z_n$ gives $\mathrm{N}\left( \mu, \frac{\sigma^2}{n} \right)$.
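Appendix. Affine transform. of a normal distribution
Prop.
Let $a \in \mathbf{R}_{>0}$, $b \in \mathbf{R}$. Suppose that $X \sim \mathrm{N}(\mu, \sigma^2)$, and
let $Y := aX + b$. Then $Y \sim \mathrm{N}(a\mu + b, a^2 \sigma^2)$.
Another proof. Since $\Pr[Y \le y] = \Pr\left[ X \le \frac{y - b}{a} \right]$,
\[ F_Y(y) = F_X\left( \frac{y - b}{a} \right) = \int_{-\infty}^{\frac{y - b}{a}} \frac{1}{\sqrt{2\pi} \sigma} \exp\left( - \frac{(t - \mu)^2}{2 \sigma^2} \right) dt. \quad (\ast) \]
Let $s = at + b$; then $ds = a \, dt$, and as $t$ runs over $-\infty \to \frac{y - b}{a}$, $s = at + b$ runs over $-\infty \to y$. Hence
\[ (\ast) = \int_{-\infty}^{y} \frac{1}{\sqrt{2\pi} \sigma} \exp\left( - \frac{\left( \frac{s - b}{a} - \mu \right)^2}{2 \sigma^2} \right) \frac{1}{a} \, ds = \int_{-\infty}^{y} \frac{1}{\sqrt{2\pi} a \sigma} \exp\left( - \frac{(s - (a\mu + b))^2}{2 a^2 \sigma^2} \right) ds, \]
where the last integrand is the density function of $\mathrm{N}(a\mu + b, (a\sigma)^2)$.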
Next week:
Sum of random variables
…for a proof of the central limit theorem
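Ex. Normal distr.
Suppose $X \sim \mathrm{N}(\mu_1, \sigma_1^2)$ and $Y \sim \mathrm{N}(\mu_2, \sigma_2^2)$ are independent.
Compute the density function of $Z := X + Y$.
\[ f_Z(x) = \int_{-\infty}^{\infty} f_X(t) f_Y(x - t) \, dt \]
\[ = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi} \sigma_1} \exp\left( - \frac{(t - \mu_1)^2}{2 \sigma_1^2} \right) \frac{1}{\sqrt{2\pi} \sigma_2} \exp\left( - \frac{(x - t - \mu_2)^2}{2 \sigma_2^2} \right) dt \]
\[ = \cdots \]
Hard!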