35
DeepWalk: Online Learning of Social Representations Bryan Perozzi, Rami Al-Rfou, Steven Skiena Stony Brook University ACM SIG-KDD August 26, 2014

DeepWalk: Online Learning of Representations

Embed Size (px)

Citation preview

Page 1: DeepWalk: Online Learning of Representations

DeepWalk: Online Learning of Social Representations

Bryan Perozzi, Rami Al-Rfou, Steven SkienaStony Brook University

ACM SIG-KDDAugust 26, 2014

Page 2: DeepWalk: Online Learning of Representations

Outline

Bryan Perozzi DeepWalk: Online Learning of Social Representations

● Introduction: Graphs as Features● Language Modeling● DeepWalk● Evaluation: Network Classification● Conclusions & Future Work

Page 3: DeepWalk: Online Learning of Representations

Features From Graphs

● Anomaly Detection● Attribute Prediction● Clustering● Link Prediction● ...A

djac

ency

Mat

rix

|V|

Bryan Perozzi DeepWalk: Online Learning of Social Representations

A first step in machine learning for graphs is to extract graph features:● node: degree● pairs: # of common neighbors● groups: cluster assignments

Page 4: DeepWalk: Online Learning of Representations

What is a Graph Representation?

● Anomaly Detection● Attribute Prediction● Clustering● Link Prediction● ...

|V| d << |V|

Latent Dimensions

Adj

acen

cy M

atrix

Bryan Perozzi DeepWalk: Online Learning of Social Representations

We can also create features by transforming the graph into a lower dimensional latent representation.

Page 5: DeepWalk: Online Learning of Representations

DeepWalk

● Anomaly Detection● Attribute Prediction● Clustering● Link Prediction● ...

|V|

DeepWalk

d << |V|

Latent Dimensions

Adj

acen

cy M

atrix

Bryan Perozzi DeepWalk: Online Learning of Social Representations

DeepWalk learns a latent representation of adjacency matrices using deep learning techniques developed for language modeling.

Page 6: DeepWalk: Online Learning of Representations

Visual Example

Bryan Perozzi DeepWalk: Online Learning of Social Representations

On Zachary’s Karate Graph:

Input Output

Page 7: DeepWalk: Online Learning of Representations

Advantages of DeepWalk

● Anomaly Detection● Attribute Prediction● Clustering● Link Prediction● ...

|V|

DeepWalk

d << |V|

Latent Dimensions

Adj

acen

cy M

atrix

Bryan Perozzi DeepWalk: Online Learning of Social Representations

● Scalable - An online algorithm that does not use entire graph at once

● Walks as sentences metaphor● Works great!● Implementation available: bit.ly/deepwalk

Page 8: DeepWalk: Online Learning of Representations

Outline

Bryan Perozzi DeepWalk: Online Learning of Social Representations

● Introduction: Graphs as Features● Language Modeling● DeepWalk● Evaluation: Network Classification● Conclusions & Future Work

Page 9: DeepWalk: Online Learning of Representations

Language ModelingLearning a representationmeans learning a mapping function from word co-occurrence

Bryan Perozzi DeepWalk: Online Learning of Social Representations [Rumelhart+, 2003]

We hope that the learned representations capture inherent structure

[Baroni et al, 2009]

Page 10: DeepWalk: Online Learning of Representations

World of Word Embeddings

This is a very active research topic in NLP.• Importance sampling and hierarchical classification were proposed to speed up training.

[F. Morin and Y.Bengio, AISTATS 2005] [Y. Bengio and J. Sencal, IEEENN 2008] [A. Mnih, G. Hinton, NIPS 2008]

• NLP applications based on learned representations.[Colbert et al. NLP (Almost) from Scratch, (JMLR), 2011.]

• Recurrent networks were proposed to learn sequential representations.[Tomas Mikolov et al. ICASSP 2011]

• Composed representations learned through recursive networks were used for parsing, paraphrase detection, and sentiment analysis.[ R. Socher, C. Manning, A. Ng, EMNLP (2011, 2012, 2013) NIPS (2011, 2012) ACL (2012, 2013) ]

• Vector spaces of representations are developed to simplify compositionality.[ T. Mikolov, G. Corrado, K. Chen and J. Dean, ICLR 2013, NIPS 2013]

Page 11: DeepWalk: Online Learning of Representations

Word Frequency in Natural Language

Co-Occurrence Matrix

■ Words frequency in a natural language corpus follows a power law.

Bryan Perozzi DeepWalk: Online Learning of Social Representations

Page 12: DeepWalk: Online Learning of Representations

Connection: Power Laws

Vertex frequency in random walks on scale free graphs also follows a power law.

Bryan Perozzi DeepWalk: Online Learning of Social Representations

Page 13: DeepWalk: Online Learning of Representations

Vertex Frequency in SFG

■ Short truncated random walks are sentences in an artificial language!

■ Random walk distance is known to be good features for many problems

Scale Free Graph

Bryan Perozzi DeepWalk: Online Learning of Social Representations

Page 14: DeepWalk: Online Learning of Representations

The Cool Idea

Short random walks = sentences

Bryan Perozzi DeepWalk: Online Learning of Social Representations

Page 15: DeepWalk: Online Learning of Representations

Outline

Bryan Perozzi DeepWalk: Online Learning of Social Representations

● Introduction: Graphs as Features● Language Modeling● DeepWalk● Evaluation: Network Classification● Conclusions & Future Work

Page 16: DeepWalk: Online Learning of Representations

Deep Learning for Networks

Bryan Perozzi DeepWalk: Online Learning of Social Representations

Input: Graph

Random Walks

Representation Mapping

2

1 3

4 5Hierarchical Softmax Output: Representation

Page 17: DeepWalk: Online Learning of Representations

Deep Learning for Networks

Bryan Perozzi DeepWalk: Online Learning of Social Representations

Input: Graph

Random Walks

Representation Mapping

2

1 3

4 5Hierarchical Softmax Output: Representation

Page 18: DeepWalk: Online Learning of Representations

Random Walks

■ We generate random walks for each vertex in the graph.

■ Each short random walk has length .

■ Pick the next step uniformly from the vertex neighbors.

■ Example:

Bryan Perozzi DeepWalk: Online Learning of Social Representations

Page 19: DeepWalk: Online Learning of Representations

Deep Learning for Networks

Bryan Perozzi DeepWalk: Online Learning of Social Representations

Input: Graph

Random Walks

Representation Mapping

2

1 3

4 5Hierarchical Softmax Output: Representation

Page 20: DeepWalk: Online Learning of Representations

Representation Mapping

Bryan Perozzi DeepWalk: Online Learning of Social Representations

■ Map the vertex under focus ( ) to its representation.

■ Define a window of size

■ If = 1 and =

Maximize:

Page 21: DeepWalk: Online Learning of Representations

Deep Learning for Networks

Bryan Perozzi DeepWalk: Online Learning of Social Representations

Random Walks

2

1 3

4 5

Input: Graph Representation Mapping

Hierarchical Softmax Output: Representation

Page 22: DeepWalk: Online Learning of Representations

Hierarchical SoftmaxCalculating involves O(V) operations for each update! Instead:

Bryan Perozzi DeepWalk: Online Learning of Social Representations

● Consider the graph vertices as leaves of a balanced binary tree.

● Maximizing is equivalent to maximizing the probability of the path from the root to the node. specifically, maximizing C1

C2

C3

Each of {C1, C2, C3} is a logistic binary classifier.

Page 23: DeepWalk: Online Learning of Representations

Learning■ Learned parameters:

■ Vertex representations■ Tree binary classifiers weights

■ Randomly initialize the representations.

■ For each {C1, C2, C3} calculate the loss function.

■ Use Stochastic Gradient Descent to update both the classifier weights and the vertex representation simultaneously.

■Bryan Perozzi DeepWalk: Online Learning of Social Representations

[Mikolov+, 2013]

Page 24: DeepWalk: Online Learning of Representations

Deep Learning for Networks

Bryan Perozzi DeepWalk: Online Learning of Social Representations

Input: Graph

Random Walks

Representation Mapping

2

1 3

4 5Hierarchical Softmax Output: Representation

Page 25: DeepWalk: Online Learning of Representations

Outline

Bryan Perozzi DeepWalk: Online Learning of Social Representations

● Introduction: Graphs as Features● Language Modeling● DeepWalk● Evaluation: Network Classification● Conclusions & Future Work

Page 26: DeepWalk: Online Learning of Representations

Attribute PredictionThe Semi-Supervised Network Classification problem:

Bryan Perozzi DeepWalk: Online Learning of Social Representations

21

3

46

5

Stony BrookstudentsGooglers

INPUTA partially labelled graph with node attributes.

OUTPUTAttributes for nodes which do not have them.

Page 27: DeepWalk: Online Learning of Representations

Baselines

■ Approximate Inference Techniques:❑ weighted vote Relational Neighbor (wvRN)

■ Latent Dimensions❑ Spectral Methods

■ SpectralClustering [Tang+, ‘11]

■ MaxModularity [Tang+, ‘09]

❑ k-means■ EdgeCluster [Tang+, ‘09]

Bryan Perozzi DeepWalk: Online Learning of Social Representations

[Macskassy+, ‘03]

Page 28: DeepWalk: Online Learning of Representations

Results: BlogCatalog

Bryan Perozzi

DeepWalk performs well, especially when labels are sparse.

Page 29: DeepWalk: Online Learning of Representations

Results: Flickr

Bryan Perozzi

Page 30: DeepWalk: Online Learning of Representations

Results: YouTube

Bryan Perozzi

Spectral Methods do not scale to large graphs.

Page 31: DeepWalk: Online Learning of Representations

Parallelization

Bryan Perozzi DeepWalk: Online Learning of Social Representations

● Parallelization doesn’t affect representation quality.

● The sparser the graph, the easier to achieve linear scalability. (Feng+, NIPS ‘11)

Page 32: DeepWalk: Online Learning of Representations

Outline

Bryan Perozzi DeepWalk: Online Learning of Social Representations

● Introduction: Graphs as Features● Language Modeling● DeepWalk● Evaluation: Network Classification● Conclusions & Future Work

Page 33: DeepWalk: Online Learning of Representations

Variants / Future Work■ Streaming

❑ No need to ever store entire graph❑ Can build & update representation as new data

comes in.

Bryan Perozzi DeepWalk: Online Learning of Social Representations

■ “Non-Random” Walks❑ Many graphs occur through as a by-product of

interactions❑ One could outside processes (users, etc) to feed

the modeling phase❑ [This is what language modeling is doing]

Page 34: DeepWalk: Online Learning of Representations

Take-away

Language Modeling techniques can be used for online learning of network

representations.

Bryan Perozzi DeepWalk: Online Learning of Social Representations

Page 35: DeepWalk: Online Learning of Representations

Thanks!

Bryan Perozzi DeepWalk: Online Learning of Social Representations

Bryan Perozzi@phanein

[email protected]

DeepWalk available at:http://bit.ly/deepwalk