An Information Complexity Approach to Extended Formulations

Ankur Moitra, Institute for Advanced Study

joint work with Mark Braverman

The Permutahedron

Let t = [1, 2, 3, …, n], and let P = conv{π(t) | π is a permutation}

How many facets does P have? Exponentially many!

e.g. for every S ⊆ [n]: Σ_{i in S} x_i ≥ 1 + 2 + … + |S| = |S|(|S|+1)/2

Let Q = {A | A is doubly stochastic}

Then P is the projection of Q: P = {At | A in Q}

Yet Q has only O(n²) facets
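To make the projection concrete, here is a minimal numerical sketch (assuming numpy; the instance size and the check itself are illustrative, not from the talk): it builds a random doubly stochastic A as a convex combination of permutation matrices and verifies that At satisfies the subset inequalities above.

```python
# Illustrative check that Q projects into the permutahedron: for any doubly
# stochastic A, the point At satisfies every subset inequality of P.
import itertools
import numpy as np

n = 4
t = np.arange(1, n + 1, dtype=float)

# A random doubly stochastic matrix, built as a convex combination of
# permutation matrices (Birkhoff-von Neumann says all of Q arises this way).
rng = np.random.default_rng(0)
perms = list(itertools.permutations(range(n)))
weights = rng.dirichlet(np.ones(len(perms)))
A = sum(w * np.eye(n)[list(p)] for w, p in zip(weights, perms))

x = A @ t  # the projected point

# Facet inequalities: for every nonempty S, sum_{i in S} x_i >= |S|(|S|+1)/2,
# with equality for S = [n].
for size in range(1, n + 1):
    for S in itertools.combinations(range(n), size):
        assert x[list(S)].sum() >= size * (size + 1) / 2 - 1e-9
assert abs(x.sum() - n * (n + 1) / 2) < 1e-9
```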

Extended Formulations

The extension complexity xc(P) of a polytope P is the minimum number of facets of a polytope Q such that P = proj(Q)

e.g. xc(P) = Θ(n log n) for the permutahedron

xc(P) = Θ(log n) for a regular n-gon, but Ω(√n) for its perturbation

In general, P = {x | ∃y, (x, y) in Q}

…an analogy with quantifiers in Boolean formulae

Applications of EFs

In general, P = {x | ∃y, (x, y) in Q}

Through EFs, we can reduce the number of facets exponentially!

Hence, we can run standard LP solvers instead of the ellipsoid algorithm

EFs often give, or are based on, new combinatorial insights

e.g. the Birkhoff-von Neumann theorem and the permutahedron

e.g. proving that a low-cost object exists, through its polytope

Explicit, Hard Polytopes?

Can we prove unconditionally that there is no small EF?

[Yannakakis ’90]: Yes, through the nonnegative rank

Definition: the TSP polytope is P = conv{1_F | F is the edge set of a tour of K_n}

(If we could optimize over this polytope, then P = NP)

Caveat: this is unrelated to proving complexity lower bounds

Theorem [Yannakakis ’90]: Any symmetric LP for TSP or matching has size 2^Ω(n)

Theorem [Fiorini et al ’12]: Any LP for TSP has size 2^Ω(√n) (based on a 2^Ω(n) lower bound for clique)

Theorem [Braun et al ’12]: Any LP that approximates clique within n^(1/2−ε) has size exp(n^ε)

Håstad proved an n^(1−o(1)) hardness of approximation for clique; can we prove the analogue for EFs?

Theorem [Braverman, Moitra ’13]: Any LP that approximates clique within n^(1−ε) has size exp(n^ε)

Outline

Part I: Tools for Extended Formulations
  Yannakakis’s Factorization Theorem
  The Rectangle Bound
  A Sampling Argument

Part II: Applications
  Correlation Polytope
  Approximating the Correlation Polytope
  A Better Lower Bound for Disjointness

The Factorization Theorem

How can we prove lower bounds on EFs?

[Yannakakis ’90]: relate a geometric parameter to an algebraic parameter

Definition of the slack matrix…

The Slack Matrix

[Figure: a polytope P with a vertex v_j and a facet-defining constraint ⟨a_i, x⟩ ≤ b_i, next to the slack matrix S(P), whose rows are indexed by faces and whose columns are indexed by vertices]

The entry in row i, column j is how slack the j-th vertex is on the i-th constraint: S(P)_{i,j} = b_i − ⟨a_i, v_j⟩
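As a tiny worked example (a sketch assuming numpy; the choice of the unit square is mine, not from the talk), here is the slack matrix of [0,1]²:

```python
# Slack matrix of the unit square [0,1]^2: rows are the facet constraints
# <a_i, x> <= b_i, columns are the vertices v_j, entries b_i - <a_i, v_j>.
import numpy as np

A = np.array([[-1, 0], [0, -1], [1, 0], [0, 1]], dtype=float)  # x>=0, y>=0, x<=1, y<=1
b = np.array([0, 0, 1, 1], dtype=float)
V = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=float)    # the four vertices

S = b[:, None] - A @ V.T
print(S)                      # nonnegative; S[i, j] = 0 iff v_j lies on facet i
assert (S >= -1e-12).all()
```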

Definition of the nonnegative rank…

Nonnegative Rank

S = M_1 + M_2 + … + M_r, where each M_i is rank one and nonnegative

Definition: rank+(S) is the smallest r such that S can be written as the sum of r rank-one, nonnegative matrices

Note: rank+(S) ≥ rank(S), but it can be much larger too!
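The slack matrix of the square from the previous sketch illustrates the gap: its rank is 3, while its nonnegative rank is 4 (a standard fact; by the factorization theorem below it equals xc of the square). A minimal sketch, assuming numpy:

```python
import numpy as np

# Slack matrix of the unit square (rows: facets x>=0, y>=0, x<=1, y<=1).
S = np.array([[0, 1, 1, 0],
              [0, 0, 1, 1],
              [1, 0, 0, 1],
              [1, 1, 0, 0]], dtype=float)

print(np.linalg.matrix_rank(S))   # 3

# One rank-one nonnegative term per row gives the upper bound rank+(S) <= 4;
# the matching lower bound rank+(S) >= 4 is the nontrivial part.
M = [np.outer(np.eye(4)[i], S[i]) for i in range(4)]
assert np.allclose(sum(M), S) and all((Mi >= 0).all() for Mi in M)
```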

The Factorization Theorem

[Yannakakis ’90]: xc(P) = rank+(S(P))

Intuition: the factorization gives a change of variables that preserves the slack matrix!

We will give a new way to lower bound the nonnegative rank via information theory…

Outline

Part I: Tools for Extended Formulations
  Yannakakis’s Factorization Theorem
  The Rectangle Bound
  A Sampling Argument

Part II: Applications
  Correlation Polytope
  Approximating the Correlation Polytope
  A Better Lower Bound for Disjointness

The Rectangle Bound

S = M_1 + M_2 + … + M_r, each rank one and nonnegative

The support of each M_i is a combinatorial rectangle

rank+(S) is at least the number of rectangles needed to cover the support of S (this covering number is the nondeterministic communication complexity)
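A two-line sanity check (a sketch assuming numpy) of why each M_i's support is a rectangle: the support of an outer product uv^T is supp(u) × supp(v).

```python
import numpy as np

u = np.array([0.5, 0.0, 2.0, 0.0])   # nonnegative with zeros
v = np.array([0.0, 1.0, 3.0])
M = np.outer(u, v)                   # rank one, nonnegative

# supp(M) = supp(u) x supp(v) = {0, 2} x {1, 2}: a combinatorial rectangle.
rows, cols = np.nonzero(M)
assert sorted(set(rows)) == [0, 2] and sorted(set(cols)) == [1, 2]
assert len(rows) == 2 * 2            # every cell of the rectangle is present
```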

Outline

Part I: Tools for Extended Formulations
  Yannakakis’s Factorization Theorem
  The Rectangle Bound
  A Sampling Argument

Part II: Applications
  Correlation Polytope
  Approximating the Correlation Polytope
  A Better Lower Bound for Disjointness

A Sampling Argument

S = M_1 + M_2 + … + M_r

Let T be a set of entries of S that all have the same value

Choose M_i with probability proportional to its total value on T

Choose (a, b) in T with probability proportional to its relative value in M_i

If r is too small, this procedure uses too little entropy!
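The key identity behind the procedure: the two-stage choice lands on entry (a, b) with probability exactly S(a, b)/R, so when S is constant on T the sample is uniform on T. A minimal sketch, assuming numpy (the matrices and the set T are placeholders):

```python
import numpy as np

rng = np.random.default_rng(2)
r = 3
Ms = [np.outer(rng.random(4), rng.random(4)) for _ in range(r)]  # rank one, nonneg.
S = sum(Ms)
T = [(0, 1), (2, 3), (1, 2)]          # stand-in for a set of equal-valued entries

R_i = np.array([sum(Mi[t] for t in T) for Mi in Ms])   # mass of each M_i on T
R = R_i.sum()

# Pr[(a, b)] = sum_i (R_i/R) * (M_i(a,b)/R_i) = S(a,b)/R.
marginal = [sum((R_i[i] / R) * (Ms[i][t] / R_i[i]) for i in range(r)) for t in T]
assert np.allclose(marginal, [S[t] / R for t in T])
```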

Outline

Part I: Tools for Extended Formulations
  Yannakakis’s Factorization Theorem
  The Rectangle Bound
  A Sampling Argument

Part II: Applications
  Correlation Polytope
  Approximating the Correlation Polytope
  A Better Lower Bound for Disjointness

The Construction of [Fiorini et al]

correlation polytope: P_corr = conv{aa^T | a in {0,1}^n}

The matrix S has constraints (rows) indexed by b in {0,1}^n, vertices (columns) indexed by a in {0,1}^n, and entries S_{a,b} = (1 − a^T b)²

UNIQUE DISJOINTNESS: output ‘YES’ if a and b, viewed as sets, are disjoint, and ‘NO’ if a and b have exactly one index in common
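A small sketch (assuming numpy and itertools) that builds this matrix for n = 3 and checks the facts used next: S = 1 on disjoint pairs, and there are 3^n of them.

```python
import itertools
import numpy as np

n = 3
cube = [np.array(a) for a in itertools.product([0, 1], repeat=n)]

# S indexed by (constraint b, vertex a) with entry (1 - a^T b)^2.
S = np.array([[(1 - a @ b) ** 2 for a in cube] for b in cube])

# T = pairs with a^T b = 0: each coordinate pair is (0,0), (0,1) or (1,0),
# so there are exactly 3^n such pairs.
T = [(i, j) for i in range(2 ** n) for j in range(2 ** n) if cube[i] @ cube[j] == 0]
assert len(T) == 3 ** n
assert all(S[j, i] == 1 for i, j in T)   # S = 1 on every pair in T
```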

The Reconstruction Principle

Let T = {(a, b) | a^T b = 0}; then |T| = 3^n

Recall: S_{a,b} = (1 − a^T b)², so S_{a,b} = 1 for all pairs in T

How does the sampling procedure specialize to this case? (Recall that it generates (a, b) uniformly from T)

Sampling Procedure:
  Let R_i be the sum of M_i(a, b) over (a, b) in T, and let R be the sum of the R_i
  Choose i with probability R_i / R
  Choose (a, b) with probability M_i(a, b) / R_i

Entropy Accounting 101

Sampling Procedure:
  Let R_i be the sum of M_i(a, b) over (a, b) in T, and let R be the sum of the R_i
  Choose i with probability R_i / R
  Choose (a, b) with probability M_i(a, b) / R_i

Total entropy: n log₂ 3 = H(a, b) ≤ log₂ r [choose i] + (1 − δ) n log₂ 3 [choose (a, b) conditioned on i] (?)

If the conditional term really is at most (1 − δ) n log₂ 3 for some δ > 0, this forces log₂ r ≥ δ n log₂ 3.

Suppose that a_{-j} and b_{-j} are fixed (with a_{-j}^T b_{-j} = 0)

M_i restricted to (a_{-j}, b_{-j}) is the 2×2 block:

                       (…, b_j = 0, …)    (…, b_j = 1, …)
  (…, a_j = 0, …)        M_i(a, b)          M_i(a, b)
  (…, a_j = 1, …)        M_i(a, b)          M_i(a, b)

If a_j = 1 and b_j = 1, then a^T b = 1, hence S(a, b) = 0 and so M_i(a, b) = 0

But rank(M_i) = 1, hence there must be another zero in either the same row or the same column

H(a_j, b_j | i, a_{-j}, b_{-j}) ≤ 1 < log₂ 3
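A toy check (a sketch assuming numpy) of the zero-propagation step: in a rank-one nonnegative 2×2 block, a zero at the (a_j, b_j) = (1, 1) corner kills an entire row or column, leaving a support of at most two cells and hence at most 1 bit of entropy.

```python
import numpy as np

# Any rank-one nonnegative 2x2 block is an outer product u v^T. If the (1,1)
# corner u[1] * v[1] is zero, then u[1] = 0 (killing a row) or v[1] = 0
# (killing a column).
u = np.array([1.0, 0.0])
v = np.array([2.0, 3.0])
M = np.outer(u, v)
assert M[1, 1] == 0 and np.count_nonzero(M) <= 2

p = M[M > 0] / M.sum()             # conditional distribution of (a_j, b_j)
entropy = -(p * np.log2(p)).sum()
assert entropy <= 1.0 < np.log2(3)
```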

Entropy Accounting 101

Generate a uniformly random (a, b) in T:
  Let R_i be the sum of M_i(a, b) over (a, b) in T, and let R be the sum of the R_i
  Choose i with probability R_i / R
  Choose (a, b) with probability M_i(a, b) / R_i

Total entropy: n log₂ 3 = H(a, b) ≤ log₂ r [choose i] + n [choose (a, b) conditioned on i]

Hence log₂ r ≥ (log₂ 3 − 1) n = n log₂(3/2), i.e. r = 2^Ω(n)

Outline

Part I: Tools for Extended Formulations
  Yannakakis’s Factorization Theorem
  The Rectangle Bound
  A Sampling Argument

Part II: Applications
  Correlation Polytope
  Approximating the Correlation Polytope
  A Better Lower Bound for Disjointness

Approximate EFs [Braun et al]

The matrix S now has constraints (rows) indexed by b in {0,1}^n, vertices (columns) indexed by a in {0,1}^n, and entries S_{a,b} = (1 − a^T b)² + C

Is there a K (with small xc) such that P_corr ⊆ K ⊆ (C+1)·P_corr?

New Goal: output the answer to UDISJ with probability at least ½ + 1/(2(C+1))

Is the correlation polytope hard to approximate for large values of C?

Analogy: is UDISJ hard to compute with probability ½ + 1/(2(C+1)) for large values of C?

There is a natural barrier at C = √n for proving lower bounds:

Claim: If UDISJ can be computed with probability ½ + 1/(2(C+1)) using o(n/C²) bits, then UDISJ can be computed with probability ¾ using o(n) bits

Proof: run the protocol O(C²) times and take the majority vote
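A quick simulation (a sketch assuming numpy; the repetition count is a loose stand-in for the O(C²) of the claim) of the majority-vote amplification; note the total communication is O(C²) · o(n/C²) = o(n) bits.

```python
import numpy as np

rng = np.random.default_rng(3)
C = 5
p = 0.5 + 1 / (2 * (C + 1))        # per-run success probability
runs = 40 * C ** 2                  # O(C^2) independent repetitions

trials = rng.random((2000, runs)) < p
print((trials.sum(axis=1) > runs / 2).mean())   # ~1.0, comfortably above 3/4
```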

Information Complexity

In fact, a more technical barrier is:

[Bar-Yossef et al ’04]:
  # bits exchanged ≥ information revealed
                   ≥ n × information revealed for a one-bit problem (a Direct Sum Theorem)

Problem: AND has a protocol with advantage 1/C that reveals only 1/C² bits…

The protocol: Alice sends her bit with probability ½ + 1/C, else sends the complement; Bob sends his bit with probability ½ + 1/C, else sends the complement

Is there a stricter one-bit problem that we could reduce to instead?

Outline

Part I: Tools for Extended Formulations
  Yannakakis’s Factorization Theorem
  The Rectangle Bound
  A Sampling Argument

Part II: Applications
  Correlation Polytope
  Approximating the Correlation Polytope
  A Better Lower Bound for Disjointness

Definition: the output matrix gives the probability of outputting “one” for each pair of inputs

For the previous protocol:

              B = 0       B = 1
  A = 0     ½ + 5/C     ½ + 1/C
  A = 1     ½ + 1/C     ½ − 3/C

The constraint that a protocol achieves advantage at least 1/C is a set of linear constraints on this matrix

(Using Hellinger): bits revealed ≥ (max − min)² = Ω(1/C²)

(New): bits revealed ≥ |diagonal − anti-diagonal| = 0

What if we also require the output distribution to be the same for the inputs (0,0), (0,1), (1,0)?

For the new protocol:

              B = 0       B = 1
  A = 0     ½ + 1/C     ½ + 1/C
  A = 1     ½ + 1/C     ½ − 3/C

(Using Hellinger): bits revealed ≥ (max − min)² = Ω(1/C²)

(New): bits revealed ≥ |diagonal − anti-diagonal| = Ω(1/C)
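A symbolic check (a sketch using Python's fractions; the entries are the ones quoted above) that the two matrices separate the two lower-bound methods: both have advantage 1/C, but the diagonal difference is 0 for the first and 4/C for the second.

```python
from fractions import Fraction

def checks(M, c):
    # Advantage >= c: "one" is the correct answer unless A = B = 1.
    assert all(M[ab] >= Fraction(1, 2) + c for ab in [(0, 0), (0, 1), (1, 0)])
    assert M[(1, 1)] <= Fraction(1, 2) - c
    return abs((M[(0, 0)] + M[(1, 1)]) - (M[(0, 1)] + M[(1, 0)]))

c = Fraction(1, 100)   # c = 1/C with C = 100
first = {(0, 0): Fraction(1, 2) + 5 * c, (0, 1): Fraction(1, 2) + c,
         (1, 0): Fraction(1, 2) + c,     (1, 1): Fraction(1, 2) - 3 * c}
second = {(0, 0): Fraction(1, 2) + c,    (0, 1): Fraction(1, 2) + c,
          (1, 0): Fraction(1, 2) + c,    (1, 1): Fraction(1, 2) - 3 * c}

print(checks(first, c), checks(second, c))   # 0 and 4/C
```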

How can we reduce to such a one-bit problem?

[Figure: Alice's string a and Bob's string b, each n bits: n/2 bits of real input next to n/2 bits of padding holding (0,0), (0,1), and (1,0) pairs]

Symmetrized Protocol T’: (a code sketch of the padding step follows the claims below)
  Generate a random partition (x, y, z) of n/2 and fill in x (0,0) pairs, y (0,1) pairs, and z (1,0) pairs
  Permute the n bits uniformly at random
  Run T on the n-bit string and return the output

Claim: The protocol T’ has bias 1/C for DISJ.

Claim: Yet flipping a pair (a_i, b_i) between (0,0), (0,1), or (1,0) results in two distributions p, q (on inputs to T) with |p − q|₁ ≤ 1/n

Proof idea: flipping one pair changes the counts (x, y, z) by at most one, and after the uniformly random permutation the induced distributions on T's input differ by at most 1/n in total variation.
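A sketch of the input-padding step of T’ (assuming numpy; the real n/2-bit inputs and the partition sampling are placeholders, not the talk's exact distribution):

```python
import numpy as np

rng = np.random.default_rng(4)
half = 8                                   # n/2; total input length n = 16
a0 = rng.integers(0, 2, half)              # Alice's real bits (placeholder)
b0 = rng.integers(0, 2, half) * (1 - a0)   # Bob's real bits, avoiding (1,1) pairs

# Random partition x + y + z = n/2; fill x (0,0), y (0,1), z (1,0) pairs.
x, y = sorted(rng.integers(0, half + 1, 2))
pad_a = np.array([0] * y + [1] * (half - y))
pad_b = np.array([0] * x + [1] * (y - x) + [0] * (half - y))

# Permute all n coordinates uniformly at random, then run T on (a, b).
perm = rng.permutation(2 * half)
a = np.concatenate([a0, pad_a])[perm]
b = np.concatenate([b0, pad_b])[perm]
```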

Theorem: Any protocol for UDISJ with advantage 1/C must reveal Ω(n/C) bits (because it can be symmetrized)

Similarly, sampling a pair (a_j, b_j) needs entropy log₂ 3 − δ, where δ is the difference of diagonals

Theorem: For any K with P_corr ⊆ K ⊆ (C+1)·P_corr, the extension complexity of K is at least exp(Ω(n/C))

Can our framework be used to prove further lower bounds on extension complexity? For average-case instances? For SDPs?

Thanks!

Any Questions?
