Maximizing Time-decaying Influence
in Social Networks
2016/9/22 ECML-PKDD 2016
Naoto Ohsaka (UTokyo)
Yutaro Yamaguchi (Osaka Univ.)
Naonori Kakimura (UTokyo)
Ken-ichi Kawarabayashi (NII)
ERATO Kawarabayashi Large Graph Project
Given
โถ Social graph ๐บ = ๐, ๐ธโถ Diffusion model โณโถ Integer ๐
Goal
โถ maximize ๐โณ ๐ ( ๐ = ๐)๐โณ ๐ โ ๐[size of cascade initiated by ๐ under โณ]
Influence maximization[Kempe-Kleinberg-Tardos. KDD'03]
2
Viral marketing application[Domingos-Richardson. KDD'01]
incentive
incentive
incentive
Formulation by[Kempe-Kleinberg-Tardos. KDD'03]
3
Used classical diffusion models
independent cascade [Goldenberg-Libai-Muller. Market. Lett.'01]
linear threshold [Granovetter. Am. J. Sociol.'78]
๏ผ Tractable
Monotonicity & submodularity of ๐ โ [Kempe-Kleinberg-Tardos. KDD'03]
Near-linear time approx. algorithms[Borgs-Brautbar-Chayes-Lucier. SODA'14] [Tang-Shi-Xiao. SIGMOD'15]
๏ผ Too simple
No time-decay effects, no time-delay propagation
โ๐ โ ๐, ๐ฃ โ ๐๐ ๐ โช ๐ฃ โ ๐ ๐ โฅ ๐ ๐ โช ๐ฃ โ ๐ ๐
The power of influence (rumor)
depends on arrival time
decreases over time
Our aspect : Time-decay effects
4
Our question
What if incorporate time-decay effects?Monotonicity? Submodularity? Scalable algorithm?
Time-delay propagations have been studied[Saito-Kimura-Ohara-Motoda. ACML'09] [Goyal-Bonchi-Lakshmanan. WSDM'10]
[Saito-Kimura-Ohara-Motoda. PKDD'10] [Rodriguez-Balduzzi-Schรถlkopf. ICML'11]
[Chen-Lu-Zhang. AAAI'12] [Liu-Cong-Xu-Zeng. ICDM'12]
Time-decay effects have not much โฆ[Cui-Yang-Homan. CIKM'14] No guarantee for influence maximization
u
Incentive
vDay 1
v
s uDay 2
Day 1
Incentive
Our contribution
5
Generalize independent cascade & linear threshold with
Time-decay effects & Time-delay propagationNon-increasing probability Arbitrary length distribution
Diffusion process of TV-ICReachability on a random
shrinking dynamic graph=โถ Characterization
โถ Monotonicity & submodularity of ๐TVIC โ
โถ Scalable and accurate greedy algorithmsbased on reverse influence set [Tang-Shi-Xiao. SIGMOD'15]
๐ช ๐ ๐ธ + ๐ธ 2
๐
log2 |๐|
๐2time & 1 โ eโ1 โ ๐ -approx.
Essential
Time-varying independent cascade (TV-IC) model
๐บ = ๐, ๐ธ Social graph
๐๐๐ Non-increasing probability function
๐๐๐ Arbitrary length distribution
6
Objective function
๐TVIC ๐ โ ๐[# active vertices initiated by ๐]
0. ๐ข became active at ๐ก๐ข1. Sample a delay ๐ฟ๐ข๐ฃ from ๐๐๐
v
Incentive
u
s
time ๐ก๐ข
0. ๐ข became active at ๐ก๐ข1. Sample a delay ๐ฟ๐ข๐ฃ from ๐๐๐2A. Success w.p. ๐๐๐ ๐๐ + ๐น๐๐
๐ฃ becomes active at ๐ก๐ข + ๐ฟ๐ข๐ฃ
Time-varying independent cascade (TV-IC) model
๐บ = ๐, ๐ธ Social graph
๐๐๐ Non-increasing probability function
๐๐๐ Arbitrary length distribution
7
v
Incentive
u
s
time ๐ก๐ข time ๐ก๐ข + ๐ฟ๐ข๐ฃ
Objective function
๐TVIC ๐ โ ๐[# active vertices initiated by ๐]
Success
0. ๐ข became active at ๐ก๐ข1. Sample a delay ๐ฟ๐ข๐ฃ from ๐๐๐2B. Failure w.p. ๐ โ ๐๐๐ ๐๐ + ๐น๐๐
๐ฃ remains inactive
Time-varying independent cascade (TV-IC) model
๐บ = ๐, ๐ธ Social graph
๐๐๐ Non-increasing probability function
๐๐๐ Arbitrary length distribution
8
v
Incentive
u
s
time ๐ก๐ข
Objective function
๐TVIC ๐ โ ๐[# active vertices initiated by ๐]
Failure
Characterization of the IC model[Kempe-Kleinberg-Tardos. KDD'03]
9
๐ influences ๐งin the IC process
๐ can reach ๐งin a random subgraph ๐บโฒโ
S z S z
No need to consider time
โถ Dynamic โฆ
๐บt changes over time
โต ๐๐ข๐ฃ changes over time
โถ Shrinking โฆ
๐บt has only edge deletions
โต ๐๐ข๐ฃ is non-increasing
Characterization of the TV-IC model
10
๐ influences ๐งin the TV-IC process
๐ can reach ๐ง in a random
shrinking dynamic graph ๐บtโ
S z S z
S z
S z
time 1
time 2
time 3
Monotonicity & submodularity of ๐TVIC(โ )
11
โ Greedy algorithm is 1 โ eโ1 -approx.[Nemhauser-Wolsey-Fisher. Math. Program.'78]
๐TVIC ๐ = ๐๐บt # vertices reachable from ๐ in ๐บt
Our results : Monotone & submodularShrinking is a MUST
๐ influences ๐งin the TV-IC process
๐ can reach ๐ง in a random
shrinking dynamic graph ๐บtโ
Reverse influence set [Tang-Shi-Xiao. SIGMOD'15]
๐ ๐ โ {vertices that would have influenced ๐ง๐}
๐ ๐ โ ๐[# sets in โ intersecting ๐]
Efficient approximation of ๐TVIC โ
12
โ chosen u.a.r.
BFS-like alg. for the IC [Borgs-Brautbar-Chayes-Lucier. SODA'14]
Dijkstra-like alg. for the continuous IC [Tang-Shi-Xiao. SIGMOD'15]
Cannot be applied to our model
โ =
Greedy algorithm requires estimations of ๐TVIC โ
a
bz1
ca
f
z3
az2
๐ 1 ๐ 2 ๐ 3
Naive reverse influence set generation
under the TV-IC model
13
๐ฃ โ ๐ ๐๐ฃ can reach ๐ง๐ in a random
shrinking dynamic graphโ
โ ฮฉ ๐ โ ๐ธ time
Check each vertex separately
Efficient reverse influence set generation
under the TV-IC model
14
Computing ๐ ๐ฃโ ๐ฃ must pass through ๐ฃ,๐ค for some ๐ค to reach ๐ง๐ โ
๐ ๐ฃ = max๐ฃ,๐ค โ๐ธ
min ๐ ๐ค โ ๐ฟ๐ฃ๐ค, ๐๐ฃ๐คโ1 ๐ฅ๐ฃ๐ค
๐ฃ โ ๐ ๐ ๐ ๐ โฅ ๐โ
Key = latest activation time ๐ ๐ฃmax. ๐ ๐ฃ s.t. (๐ฃ gets active in ๐ ๐ฃ ) โ (๐ง๐ gets active)
Dynamic programming yields
+โ = ๐ ๐ง๐ โฅ ๐ ๐ฃ1 โฅ โฏ โฅ ๐ ๐ฃ๐ โฅ 0
v
w1
w2
w3
zi
Performance analysis
15
Lemma 2 Correctness
Lemma 3 Running time = ๐ช ๐ธ โ OPT
๐log ๐ in exp.
Theorem 4
Our reverse influence set generation method +
IMM (Influence Maximization via Martingales) framework[Tang-Shi-Xiao. SIGMOD'15]
1 โ eโ1 โ ๐ -approx. w.h.p.
๐ช ๐ ๐ธ + ๐ธ 2
๐
log2 |๐|
๐2expected time
๐ธ
๐is small for real graphs
Experiments: Setting
โถComparative algorithms
This work
IMM-CTIC[Tang-Shi-Xiao. SIGMOD'15]
IMM-IC[Tang-Shi-Xiao. SIGMOD'15]
LazyGreedy[Minoux. Optim. Tech.'78]
Degree
Efficient algorithmsunder different models
Accurately estimate๐TVIC โ by simulations
Simple baseline
โถ๐๐ข๐ฃ ๐ก = inโdeg ๐ฃ โ ๐ โ ๐ก โ1
โถ๐๐ข๐ฃ โ = (Weibull distribution) [Lawless. '02]
Experiments: Solution quality
17
LazyGreedy > This work > IMM-IC > IMM-CTIC > Degree
ego-Twitter |๐|=81K ๐ธ =2,421Kca-GrQc |๐|=5K ๐ธ =29K
๐TVIC โ of seed sets produced by each method
Bett
er
Accuracy guarantee Ignore time-decay effects
Dataset : Stanford Large Network Dataset Collection. http://snap.stanford.edu/data
Environment : Intel Xeon E5540 2.53 GHz CPU + 48 GB memory & g++v4.8.2 (-O2)
Ob
ject
ive
va
lue
Seed size k Seed size k
Ob
ject
ive
va
lue
Experiments: Scalability
18
Degree โซ This work โ IMM-IC > IMM-CTIC โซ LazyGreedy
Running time for selecting seed sets of size ๐ for each method
Quadratic timeNear-linear time
Just sorting
Bett
er
Dataset : Stanford Large Network Dataset Collection. http://snap.stanford.edu/data
Environment : Intel Xeon E5540 2.53 GHz CPU + 48 GB memory & g++v4.8.2 (-O2)
Seed size k Seed size k
ego-Twitter |๐|=81K ๐ธ =2,421Kca-GrQc |๐|=5K ๐ธ =29K
Summary & future work
19
ClassicalConstant probability
ๅณ
Monotone & submodular
[Kempe+'03]
Influence maximization
๐ช ๐ ๐ + ๐ธlog ๐
๐2time
1 โ eโ1 โ ๐ -approx.[Tang+'15]
Our studyTime-decay probability
ๅณ
Monotone & submodular
Influence maximization
๐ช ๐ ๐ธ + ๐ธ 2
๐
log2 |๐|
๐2time
1 โ eโ1 โ ๐ -approx.
More generalTime-varying probability
ๅณ
Simple extension violates
submodularity
Efficient influence max.?
๐ก
๐๐ข๐ฃ ๐ก
๐ก
๐๐ข๐ฃ ๐ก
๐ก
๐๐ข๐ฃ ๐ก
generalize generalize