The Curse of Big Data in Mobile Analytics Dr. Guodong (Gordon) Gao M-CERSI Workshop, 9/11/2015

Mobile devices = Big Data

User-generated data
Facebook ingests 500 terabytes of new data every day.
Text messages, diet logs, photos, videos, …

System-generated data
App downloads and usage
Gestures, touches
Communications with other wearable devices

Sensor-generated data
6 billion mobile phones
Geo-location data, pedometer, heartbeat sensor, and oxygen saturation sensor

Even more data

Causal inference

Most statistical methods measure correlation, not causation.

For actionable knowledge, we need causation! Does the rooster's crowing cause the sun to rise?

Confusing correlation with causality can be dangerous.
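The rooster question can be illustrated with a small simulation. This is only a sketch under stated assumptions: the variables x, y, and z and the function names are hypothetical and not from the talk. A hidden common cause z drives both x and y, while x has no effect on y at all. The raw correlation between x and y still comes out strong, and it largely vanishes once z is accounted for.

```python
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def simulate_confounding(n=5000, seed=0):
    """Return (raw correlation, correlation after removing the confounder)."""
    rng = random.Random(seed)
    # Assumption: z is a hidden common cause (think "dawn is approaching").
    z = [rng.gauss(0, 1) for _ in range(n)]
    # x and y each depend on z plus independent noise; x never enters y.
    x = [zi + rng.gauss(0, 0.5) for zi in z]
    y = [zi + rng.gauss(0, 0.5) for zi in z]
    # Subtract the known contribution of z, leaving only the noise terms.
    x_resid = [xi - zi for xi, zi in zip(x, z)]
    y_resid = [yi - zi for yi, zi in zip(y, z)]
    return pearson(x, y), pearson(x_resid, y_resid)


if __name__ == "__main__":
    raw, adjusted = simulate_confounding()
    print(f"raw correlation:      {raw:.2f}")
    print(f"adjusted correlation: {adjusted:.2f}")
```

In real data the confounder is rarely observed this cleanly, which is why a strong correlation alone does not support an intervention on x.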

Does Anne Hathaway help Warren Buffett get richer?

The curse of big data
Heterogeneity in Treatment Effects (HTE)

Sub-group analysis helps answer:

Which sub-group will benefit from this treatment?
Should I prescribe the treatment to this particular patient?

With dozens of variables and thousands of combinations, we can define sub-groups in many ways.

E.g., with 10 variables, each with 3 levels, there are 3^10 = 59,049 combinations!

With that many sub-groups to test, we are doomed to find something statistically significant in some of them by chance alone.
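The arithmetic behind this warning can be sketched in a few lines. The function names below are hypothetical, and two simplifying assumptions are made: the sub-group tests are independent, and under "no true effect" each p-value is uniformly distributed on [0, 1]. The significance level of 0.05 is a conventional choice, not a number from the talk.

```python
import random


def count_subgroups(n_variables, n_levels):
    """Number of distinct sub-groups when every variable takes one level."""
    return n_levels ** n_variables


def prob_any_false_positive(n_tests, alpha=0.05):
    """Chance of at least one 'significant' result when no effect exists.

    Assumes the n_tests tests are independent.
    """
    return 1 - (1 - alpha) ** n_tests


def simulate_null_subgroups(n_tests=1000, alpha=0.05, seed=0):
    """Count sub-groups that look significant although nothing is going on.

    Assumes each null p-value is an independent uniform draw on [0, 1].
    """
    rng = random.Random(seed)
    return sum(rng.random() < alpha for _ in range(n_tests))


if __name__ == "__main__":
    print(count_subgroups(10, 3))           # 3^10 = 59049
    print(prob_any_false_positive(20))      # already well above one half
    print(simulate_null_subgroups(1000))    # roughly alpha * n_tests hits
```

Even 20 sub-group tests give a better-than-even chance of a spurious finding under these assumptions, so with tens of thousands of sub-groups some "significant" result is close to guaranteed unless the analysis corrects for multiple comparisons.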

Yet another curse of big data

Do not ignore the fundamentals

Patient #11
