On the Impossibility of Dimension Reduction for Doubling Subsets of $L_p$
Yair Bartal, Lee-Ad Gottlieb, Ofer Neiman
Embedding and Distortion
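• $L_p$ spaces: $L_p^k$ is the metric space $(\mathbb{R}^k, \|\cdot\|_p)$, where $\|x\|_p = \left(\sum_{i=1}^{k} |x_i|^p\right)^{1/p}$
• Let $(X,d)$ be a finite metric space
• A map $f : X \to L_p^k$ is called an embedding
• The embedding is non-expansive and has distortion $D$ if for all $x,y \in X$:
  $\frac{d(x,y)}{D} \le \|f(x)-f(y)\|_p \le d(x,y)$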
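The definition can be checked numerically on small examples. The sketch below is illustrative only: the names `lp_distance` and `distortion` and the 4-cycle example are assumptions, not part of the talk. It computes the general distortion (worst expansion times worst contraction), which for a non-expansive map reduces to the worst contraction.

```python
import math
from itertools import combinations

def lp_distance(u, v, p):
    """Distance between vectors u and v in L_p^k (p may be math.inf)."""
    diffs = [abs(a - b) for a, b in zip(u, v)]
    if math.isinf(p):
        return max(diffs)
    return sum(t ** p for t in diffs) ** (1.0 / p)

def distortion(points, d, f, p):
    """Distortion of an injective map f (dict: point -> vector) into L_p^k."""
    expansion = 0.0    # largest ratio (embedded distance) / (original distance)
    contraction = 0.0  # largest ratio (original distance) / (embedded distance)
    for x, y in combinations(points, 2):
        ratio = lp_distance(f[x], f[y], p) / d(x, y)
        expansion = max(expansion, ratio)
        contraction = max(contraction, 1.0 / ratio)
    return expansion * contraction

# Hypothetical example: the 4-cycle with its shortest-path metric,
# mapped to the corners of the unit square in L_2^2.
cycle = [0, 1, 2, 3]

def cycle_metric(x, y):
    return min((x - y) % 4, (y - x) % 4)

square = {0: (0, 0), 1: (1, 0), 2: (1, 1), 3: (0, 1)}
D = distortion(cycle, cycle_metric, square, 2)
```

In this example the map is non-expansive: adjacent vertices keep distance 1, and only the two antipodal pairs are contracted.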
JL Lemma
• Lemma: Any $n$ points in $L_2$ can be embedded into $L_2^k$, $k = O((\log n)/\varepsilon^2)$, with $1+\varepsilon$ distortion
• Extremely useful for many applications:
– Machine learning
– Compressive sensing
– Nearest Neighbor search
– Many others…
• Limitations: specific to $L_2$, dimension depends on $n$
– There are lower bounds for dimension reduction in $L_1$, $L_\infty$
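One common way to realize the lemma is a random Gaussian projection. The following is a minimal sketch under that assumption, using only the Python standard library. The function names, the seed handling, and the example sizes are hypothetical, and no attempt is made to pick $k$ from $\varepsilon$.

```python
import math
import random

def random_projection(points, k, seed=0):
    """Map each point to k dimensions with a random Gaussian matrix."""
    rng = random.Random(seed)
    dim = len(points[0])
    scale = 1.0 / math.sqrt(k)  # entries are N(0, 1/k), so norms are preserved in expectation
    matrix = [[rng.gauss(0.0, 1.0) * scale for _ in range(dim)] for _ in range(k)]
    return [[sum(r * x for r, x in zip(row, p)) for row in matrix] for p in points]

def euclidean(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

# Hypothetical data: 8 random points in 500 dimensions, projected to 200.
rng = random.Random(1)
points = [[rng.uniform(-1, 1) for _ in range(500)] for _ in range(8)]
images = random_projection(points, k=200)
ratios = [euclidean(images[i], images[j]) / euclidean(points[i], points[j])
          for i in range(8) for j in range(i + 1, 8)]
```

The values in `ratios` show how far each pairwise distance moved; they concentrate around 1 as `k` grows, independently of the original dimension.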
Lower bounds on Dimension Reduction
• For general $n$-point sets in $L_p$, $\Omega(\log_D n)$ dimensions are required for distortion $D$ (volume argument)
• BC’03 (and also LN’04, ACNN’11, R’12) showed strong impossibility results in $L_1$
– The dimension must be $n^{\Omega(1/D^2)}$ for distortion $D$
Doubling Dimension
• Doubling constant: the minimal $\lambda$ such that every ball of radius $2r$ can be covered by $\lambda$ balls of radius $r$
• Doubling dimension: $\log_2 \lambda$
• A measure of the dimensionality of a metric space
• Generalizes the dimension of a normed space: $L_p^k$ has doubling dimension $\Theta(k)$
• The volume argument holds only for metrics with high doubling dimension
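For a small finite metric, the doubling constant can be bounded from above by brute force. The sketch below is an illustration, not a method from the talk. It assumes closed balls, restricts ball centers to points of the set, and uses a greedy cover, so it returns an upper bound on the number of balls needed rather than the exact minimum. All names are hypothetical.

```python
def greedy_cover_size(targets, d, r):
    """Number of radius-r balls a greedy pass uses to cover `targets`."""
    uncovered = list(targets)
    count = 0
    while uncovered:
        center = uncovered[0]  # open a ball around the first uncovered point
        uncovered = [x for x in uncovered if d(center, x) > r]
        count += 1
    return count

def doubling_constant_upper_bound(points, d):
    """Greedy upper bound on the doubling constant of a finite metric."""
    distances = {d(x, y) for x in points for y in points if x != y}
    # Balls only change at radii equal to a distance or half a distance.
    radii = distances | {t / 2 for t in distances}
    worst = 1
    for r in radii:
        for center in points:
            big_ball = [x for x in points if d(center, x) <= 2 * r]
            worst = max(worst, greedy_cover_size(big_ball, d, r))
    return worst

# Hypothetical example: 8 equally spaced points on a line.
line = list(range(8))
lam = doubling_constant_upper_bound(line, lambda x, y: abs(x - y))
```

The running time is polynomial but high, so this is only suitable for toy inputs.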
Overcoming the Lower Bounds?
• One could hope for an analogous version of the JL-Lemma for doubling subsets
• Question: Does every set of points in $L_2$ of constant doubling dimension embed into a constant-dimensional space with constant distortion?
• More ambitiously: can any subset of $L_2$ with doubling constant $\lambda$ be embedded into $L_2^k$, $k = O((\log \lambda)/\varepsilon^2)$, with $1+\varepsilon$ distortion?
Our Result
• Such a dimension reduction is impossible in the $L_p$ spaces with $p>2$
• Thm: For any $p>2$ there is a constant $c$ such that for any $n$ there is a subset $A$ of $L_p$ of size $n$ with doubling constant $O(1)$, and any embedding of $A$ into $L_p^k$ with distortion at most $D$ satisfies $D \ge \Omega\left(\left(\frac{c \log n}{k}\right)^{1/2 - 1/p}\right)$
• Note: any sub-logarithmic dimension requires non-constant distortion
• We also show a similar bound for embedding from $L_p$ into $L_q$, for all $q \neq 2$
• Lafforgue and Naor concurrently proved this using analytic tools, and their counterexample is based on the Heisenberg group
Implications
• Rules out a class of algorithms for NN-search, clustering, routing etc.
• The first non-trivial result on non-linear dimension reduction for $L_p$ with $p \neq 1, 2, \infty$
• Comment: For $p=1$ there is a stronger lower bound for doubling subsets: the dimension of any embedding with distortion $D$ (into $L_1$) must be at least $n^{\Omega(1/D^2)}$ (LMN’05)
The Laakso Graph
• A recursive graph: $G_{i+1}$ is obtained from $G_i$ by replacing every edge with a copy of $G_1$
• A series-parallel graph
• Has doubling constant 6
[Figure: the graphs $G_0$, $G_1$, $G_2$]
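The recursive construction can be sketched in a few lines. The shape of $G_1$ is not spelled out in the text, so the code assumes the usual six-edge gadget on vertices a, s, u, v, t, b (edges a-s, s-u, s-v, u-t, v-t, t-b), matching the vertex names used on the later slides. The function names and integer vertex ids are hypothetical.

```python
from collections import deque

def laakso(levels):
    """Return (number of vertices, edge list) of the assumed Laakso graph G_levels."""
    edges = [(0, 1)]  # G_0 is a single edge
    num_vertices = 2
    for _ in range(levels):
        refined = []
        for a, b in edges:
            # Replace edge (a, b) by a copy of G_1 on four fresh vertices.
            s, u, v, t = range(num_vertices, num_vertices + 4)
            num_vertices += 4
            refined += [(a, s), (s, u), (s, v), (u, t), (v, t), (t, b)]
        edges = refined
    return num_vertices, edges

def hop_distance(num_vertices, edges, source, target):
    """Breadth-first search distance, counting edges."""
    adjacency = [[] for _ in range(num_vertices)]
    for x, y in edges:
        adjacency[x].append(y)
        adjacency[y].append(x)
    dist = {source: 0}
    queue = deque([source])
    while queue:
        x = queue.popleft()
        for y in adjacency[x]:
            if y not in dist:
                dist[y] = dist[x] + 1
                queue.append(y)
    return dist[target]

n2, e2 = laakso(2)
```

Under this assumption each level multiplies the number of edges by 6 and the distance between the two original endpoints by 4, so the number of levels is logarithmic in the number of points.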
Simple Case: p=∞
• The Laakso graph lies in high-dimensional $L_\infty$
• Assume w.l.o.g. that there is a non-expansive embedding $f$ with distortion $D$ into $L_\infty^k$
• Proof idea:
– Follow the recursive construction
– At each step, find an edge whose $L_2$ stretch is increased by some value, compared to the stretch of its parent edge
– When stretch$(u,v) > k$ we will have a contradiction, as non-expansiveness gives
  $\sum_{j=1}^{k} \left(\frac{f_j(u)-f_j(v)}{d(u,v)}\right)^2 \le k$
Simple Case: p=∞
• Consider a single iteration
• The pair $a,b$ is an edge of the previous iteration
• Let $f_j$ be the $j$-th coordinate
• There is a natural embedding that does not increase stretch...
• But then $u,v$ may be distorted
[Figure: one gadget on the vertices a, s, u, v, t, b, with the coordinate values $f_j(a)$ and $f_j(b)$ marked]
Simple Case: p=∞
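• For simplicity (and w.l.o.g.) assume
– $f_j(s) = f_j(a) + (f_j(b)-f_j(a))/4$
– $f_j(t) = f_j(a) + 3(f_j(b)-f_j(a))/4$
– $f_j(v) = f_j(a) + (f_j(b)-f_j(a))/2$
• Let $\Delta_j(u)$ be the difference between $f_j(u)$ and $f_j(v)$
• The distortion $D$ requirement imposes that for some $j$, $|\Delta_j(u)| \ge 1/D$ (normalizing so that $d(u,v)=1$)
[Figure: the gadget in coordinate $j$, with $f_j(a)$, $f_j(b)$ and the offset $\Delta_j(u)$ marked]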
Simple Case: p=∞
• The stretch of $u,s$ will increase due to the $j$-th coordinate
• But it may decrease due to other coordinates
• Need to prove that for one of the pairs $\{u,s\}$, $\{u,t\}$, the total $L_2$ stretch increases by at least $1/D^2$
– Compared to the stretch of $a,b$
[Figure: the gadget in coordinate $j$, where $u$ is offset by $\Delta_j(u)$, and in another coordinate $h$, where $u$ is offset by $-\Delta_h(u)$]
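Simple Case: p=∞
• Observe that in the $j$-th coordinate:
– If the distance between $u,s$ increases by $\Delta_j(u)$,
– then the distance between $u,t$ decreases by $\Delta_j(u)$ (and vice versa)
• Denote by $x$ the stretch of $a,b$ in coordinate $j$
• The average of the $L_2$ stretch of $\{u,s\}$ and $\{u,t\}$ (in the $j$-th coordinate alone) is:
  $\frac{1}{2}\left[(x+\Delta_j(u))^2 + (x-\Delta_j(u))^2\right] = x^2 + \Delta_j(u)^2$
[Figure: the gadget in coordinate $j$, with $f_j(a)$, $f_j(b)$ and the offset $\Delta_j(u)$ marked]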
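The averaging step extends from one coordinate to all of them: the cross terms cancel in the average, so at least one of the two pairs gains the full sum of squared offsets. The sketch below checks this on made-up numbers. It assumes edge lengths are normalized to 1, so that stretch and coordinate difference coincide; the function names and sample values are hypothetical.

```python
def l2_stretch(per_coordinate_stretch):
    """Sum of squared per-coordinate stretches."""
    return sum(s * s for s in per_coordinate_stretch)

def child_stretches(x, delta):
    """L2 stretch of {u,s} and {u,t}, given the parent's per-coordinate
    stretch x and the per-coordinate offsets delta of u."""
    us = l2_stretch([xj + dj for xj, dj in zip(x, delta)])
    ut = l2_stretch([xj - dj for xj, dj in zip(x, delta)])
    return us, ut

# Made-up values for three coordinates.
x = [0.5, 0.2, 0.1]
delta = [0.3, -0.4, 0.0]
us, ut = child_stretches(x, delta)
parent = l2_stretch(x)
gain = l2_stretch(delta)
```

Here the average of `us` and `ut` equals `parent + gain`, so the larger of the two is at least `parent + gain`, which is the pair the argument continues with.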
Simple Case: p=∞
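• For one of the pairs $\{u,s\}$, $\{u,t\}$, the total $L_2$ stretch (over all coordinates) increases by at least $1/D^2$
• Continue with this edge
• The number of iterations must be at most $kD^2$ (otherwise the stretch will be greater than $k$)
• But # of iterations ≈ $\log n$
• Finally, $kD^2 \ge \Omega(\log n)$, i.e. $D \ge \Omega\left(\sqrt{(\log n)/k}\right)$
[Figure: the gadget on the vertices a, s, u, v, t, b]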
Going Beyond Infinity
• For $p<\infty$, we cannot use the Laakso graph
– Requires high distortion to embed it into $L_p$
• Instead, we build an instance in $L_p$, inspired by the Laakso graph
• The new points $u,v$ will use a new dimension
• Parameter $\varepsilon$ determines the (scaled) $u,v$ distance
[Figure: the gadget on the points a, s, u, v, t, b, with the $u,v$ distance marked $\varepsilon$]
Going Beyond Infinity
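• Problem: the $u,s$ distance is now larger than 1, roughly $1+\varepsilon^p$
• Causes a loss of $\approx \varepsilon^p$ in the stretch of each level
• Since $u,v$ are at distance $\varepsilon$, the increase to the stretch is now only $(\varepsilon/D)^2$
• When $p>2$, there is a choice of $\varepsilon$ for which the increase overcomes the loss
[Figure: the gadget on the points a, s, u, v, t, b, with the $u,v$ distance marked $\varepsilon$]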
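The trade-off between the gain and the loss can be made explicit with a back-of-the-envelope calculation. The following is a heuristic sketch with all constants suppressed. The specific choice of $\varepsilon$ is an illustrative assumption, not necessarily the parameter used in the proof.

```latex
% Net change in stretch per level (constants suppressed):
\[
  \underbrace{\frac{\varepsilon^{2}}{D^{2}}}_{\text{gain}}
  \;-\;
  \underbrace{\varepsilon^{p}}_{\text{loss}}
  \;>\; 0
  \quad\Longleftrightarrow\quad
  \varepsilon^{\,p-2} < \frac{1}{D^{2}}
  \quad\Longleftrightarrow\quad
  \varepsilon < D^{-2/(p-2)} \qquad (p > 2).
\]
% Illustrative choice: \varepsilon = (2D^{2})^{-1/(p-2)}, so that
% \varepsilon^{p} = \varepsilon^{2}\cdot\varepsilon^{p-2} = \varepsilon^{2}/(2D^{2}):
\[
  \frac{\varepsilon^{2}}{D^{2}} - \varepsilon^{p}
  \;=\; \frac{\varepsilon^{2}}{D^{2}} - \frac{\varepsilon^{2}}{2D^{2}}
  \;=\; \frac{\varepsilon^{2}}{2D^{2}} \;>\; 0 .
\]
```

For $p=2$ both terms scale like $\varepsilon^2$, so no choice of $\varepsilon$ makes the gain dominate once $D>1$, which is consistent with the argument being limited to $p>2$.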
Conclusion
• We show a strong lower bound against dimension reduction for doubling subsets of $L_p$, for any $p>2$
• Can our techniques be extended to $1<p<2$?
– The $u,s$ distance when $p<2$ is quite large, $\approx 1+(p-1)\varepsilon^2$, so a different approach is required
• General doubling metrics embed into $L_p$ with distortion $O(\log^{1/p} n)$ (for $p \ge 2$)
– Can this distortion bound be obtained in constant dimension?