Narayanan-Shmatikov attack (IEEE S&P 2009)

Preview:

DESCRIPTION

Narayanan-Shmatikov attack (IEEE S&P 2009). A. Narayanan and V. Shmatikov, De-anonymizing Social Networks , IEEE S&P 2009 . Anonymized data: Twitter (crawled in late 2007) A microblogging service 224K users, 8.5M edges Auxiliary data: Flicker (crawled in late 2007/early 2008) - PowerPoint PPT Presentation

Citation preview

Structural Data De-anonymization: Quantification, Practice, and Implications

Structural Data De-anonymization: Quantification, Practice, and Implications

Shouling Ji, Weiqing Li, and Raheem Beyah Georgia Institute of Technology

Mudhakar SrivatsaIBM T. J. Watson Research Center

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Narayanan-Shmatikov attack (IEEE S&P 2009)• A. Narayanan and V. Shmatikov, De-anonymizing Social Networks, IEEE S&P 2009.

• Anonymized data: Twitter (crawled in late 2007)

– A microblogging service

– 224K users, 8.5M edges

• Auxiliary data: Flicker (crawled in late 2007/early 2008)

– A photo-sharing service

– 3.3M users, 53M edges

• Result: 30.8% of the users are successfully de-anonymized

Twitter Flicker

User mapping

HeuristicsEccentricityEdge directionalityNode degreeRevisiting nodesReverse match

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Srivatsa-Hicks Attacks (ACM CCS 2012)• M. Srivatsa and M. Hicks, De-anonymizing Mobility Traces: using Social Networks as a Side-

Channel, ACM CCS 2012.

• Anonymized data

– Mobility traces: St Andrews, Smallblue, and Infocom 2006

• Auxiliary data

– Social networks: Facebook, and DBLP

• Over 80% users can be successfully de-anonymized

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Motivation• Question 1: Why can structural data be de-anonymized?

• Question 2: What are the conditions for successful data de-anonymization?

• Question 3: What portion of users can be de-anonymized in a structural dataset?

[1] P. Pedarsani and M. Grossglauser, On the Privacy of Anonymized Networks, KDD 2011.

[2] L. Yartseva and M. Grossglauser, On the Performance of Percolation Graph Matching, COSN 2013.

[3] N. Korula and S. Lattanzi, An Efficient Reconciliation Algorithm for Social Networks, VLDB 2014.

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Motivation• Question 1: Why can structural data be de-anonymized?

• Question 2: What are the conditions for successful data de-anonymization?

• Question 3: What portion of users can be de-anonymized in a structural dataset?

Our Constribution Address the above three open questions under a practical

data model.

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Outline

• Introduction and Motivation• System Model• De-anonymization Quantification• Evaluation • Implication 1: Optimization based De-anonymization (ODA) Practice• Implication 2: Secure Data Publishing• Conclusion

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

System Model

• Anonymized Data

• Auxiliary Data

• De-anonymization

• Measurement1

2

3

4 5

6

78

9

10

1

2

3

4 5

6

78

9

10

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

System Model

• Anonymized Data

• Auxiliary Data

• De-anonymization

• Measurement

• Quantification

1

2

3

4 5

6

78

9

10

1

2

3

4 5

6

78

9

10

1

2

3

4 5

6

78

9

10

conceptual underlying graph

Configuration ModelG can have an arbitrary degree sequence

that follows any distribution

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Outline

• Introduction and Motivation• System Model• De-anonymization Quantification• Evaluation • Implication 1: Optimization based De-anonymization (ODA) Practice• Implication 2: Secure Data Publishing• Conclusion

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

De-anonymization Quantification• Perfect De-anonymization Quantification

Structural Similarity Condition Graph/Data Size Condition

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

De-anonymization Quantification• -Perfect De-anonymization Quantification

Structural Similarity ConditionGraph/Data Size Condition

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Outline

• Introduction and Motivation• System Model• De-anonymization Quantification• Evaluation • Implication 1: Optimization based De-anonymization (ODA) Practice• Implication 2: Secure Data Publishing• Conclusion

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Evaluation• Datasets

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Evaluation• Perfect De-anonymization Condition Structural Similarity Condition

Graph/Data Size Condition

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Evaluation• -Perfect De-anonymization Condition

Structural Similarity Condition Graph/Data Size ConditionProjection/Sampling Condition

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Evaluation• -Perfect De-anonymizability Structural Similarity Condition

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Evaluation• -Perfect De-anonymizability Structural Similarity Condition

How many users can be successfully de-anonymized

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Outline

• Introduction and Motivation• System Model• De-anonymization Quantification• Evaluation • Implication 1: Optimization based De-anonymization (ODA) Practice• Implication 2: Secure Data Publishing• Conclusion

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Optimization based De-anonymization (ODA)• Our quantification implies

– An optimum de-anonymization solution exists

– However, it is difficult to find it.

Select candidate users from unmapped users with top degrees

Mapping candidate users by minimizing the Edge Error

function

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Optimization based De-anonymization (ODA)• Our quantification implies

– An optimum de-anonymization solution exists

– However, it is difficult to find it.

Space complexity

Time complexity

ODA Features1. Cold start (seed-free)

2. Can be used by other attacks for landmark (seed) identification

3. Optimization based

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

ODA Evaluation • Dataset

– Google+ (4.7M users, 90.8M edges): using random sampling to get anonymized graphs and auxiliary graphs

– Gowalla: • Anonymized graphs: constructed based on 6.4M check-ins <UserID, latitude, longitude,

timestamp, location ID> generated by .2M users

• Auxiliary graph: the Gowalla social graph of the .2 users (1M edges)

– Results: landmark identification

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

ODA Evaluation • Dataset

– Google+ (4.7M users, 90.8M edges): using random sampling to get anonymized graphs and auxiliary graphs

– Gowalla: • Anonymized graphs: constructed based on 6.4M check-ins <UserID, latitude, longitude,

timestamp, location ID> generated by .2M users

• Auxiliary graph: the Gowalla social graph of the .2 users (1M edges)

– Results: de-anonymization

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Outline

• Introduction and Motivation• System Model• De-anonymization Quantification• Evaluation • Implication 1: Optimization based De-anonymization (ODA) Practice• Implication 2: Secure Data Publishing• Conclusion

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Secure Structural Data Publishing

• Structural information is important

• Based on our quantification– Secure structural data publishing is difficult, at least theoretically

• Open problem …

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Conclusion

– We proposed the first quantification framework for structural data de-anonymization under a practical data model

– We conducted a large-scale de-anonymizability evaluation of 26 real world structural datasets

– We designed a cold-start optimization-based de-anonymization algorithm

AcknowledgementWe thank the anonymous reviewers very much

for their valuable comments!

S. Ji, W. Li, M. Srivatsa and R. Beyah Structural Data De-anonymization

Thank you!Shouling Ji

sji@gatech.edu

http://users.ece.gatech.edu/sji/

Recommended