Good Ugly Bad

Big Data Analytics

शिवकुमार G. Sivakumar சிவகுமார்

Computer Science and Engineering
भारतीय प्रौद्योगिकी संस्थान मुंबई (IIT Bombay)

October 17, 2013

• The Good (Data Deluge, Web 1.0, 2.0, 3.0, Analytics, Opportunities, Benefits)
• The Ugly? (Underlying Technologies, Hadoop, MapReduce, NoSQL)
• The Bad (Hype, Challenges, Cost)

Blind men and the Elephant - अन्ध-गज न्यायः (Andha-Gaja Nyaya)

Dengue Trends

Twitter Trends

Google Trends

Popular search terms on Oct 16th.

What’s Wonobo?

Dengue Trends (Validation)

Thought Experiments: How did they do this? Is this really Big Data Analytics?
Compare with: Cyclone Phailin, real-time traffic control for Mumbai/Chennai, Amazon/Facebook recommendations.

Internet (Web 1.0)

[Figure: Internet growth milestones from the 1970s to 2002 (source: http://www.isc.org/). Series: Hosts, Users, Countries, Domains, WWW sites. Phases marked: Academic (DoD funds), LAN boom (TCP/IP), WWW (steroids), Java, Commercial Users, E-commerce, All.]

Motto: Information AnyTime, AnyWhere, AnyForm, AnyDevice, ...
WebTone like DialTone

Basic Hardware (sine qua non!)
Democratized access to information! (Digital Divide)

Social Networking (Web 2.0)

The OS/system software that empowers users to become producers of knowledge and ensures their right to collaboration/assembly.
Examples: Wikipedia, Flickr, Orkut, Twitter, ...
Mantras: Architecture of participation; Wisdom of crowds; Better as more people use it; Long tail; Tagging, commenting, blogs; Open access (source/content) for Remix/Mashup

पूर्व पक्ष (Purva Paksha)

Web 1.0 may have democratized access to information, but it is like drinking water from a fire hose!
Search engines provide partial solutions, but cannot combine, categorize and infer!

Web 2.0 may have allowed the right to assembly/collaboration, but
• Proliferated unreliable, contradictory information.
• Facilitated malicious uses including loss of privacy, security.

What do you want from Web 3.0?
What do you want to see/hear when you wake up?
I have a dream ...

Semantic Web

The application layer tapping the hardware (Web 1.0) and OS (Web 2.0)?

[Figure: knowledge graph centred on Ramana Maharishi]
• author-of: Naan Yaar?, Aksharamanamalai, Vichara Mani Mala, Reality in Forty Verses
• contemporaries: Kanchi Chandrasekara Saraswathi, Jiddu Krishnamurti
• Place: Tiruvannamali, Tamil Nadu
• Lived: 30/12/1879 to 14/4/1950

Combined, categorized information inferred from various sites, languages. www.dbpedia.org comes close today!
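
One way to picture "combined, categorized information" is as a set of subject/predicate/object triples that a program can query. The sketch below stores the facts from this slide's example graph as plain Python tuples. The predicate names and the `match` helper are illustrative assumptions, not a real RDF vocabulary or the DBpedia API.

```python
# Facts from the slide's example graph as (subject, predicate, object) triples.
# Predicate names are made up for illustration; they are not a real vocabulary.
TRIPLES = [
    ("Ramana Maharishi", "author-of", "Naan Yaar?"),
    ("Ramana Maharishi", "author-of", "Aksharamanamalai"),
    ("Ramana Maharishi", "author-of", "Vichara Mani Mala"),
    ("Ramana Maharishi", "author-of", "Reality in Forty Verses"),
    ("Ramana Maharishi", "contemporary-of", "Kanchi Chandrasekara Saraswathi"),
    ("Ramana Maharishi", "contemporary-of", "Jiddu Krishnamurti"),
    ("Ramana Maharishi", "place", "Tiruvannamali, Tamil Nadu"),
    ("Ramana Maharishi", "lived-from", "1879-12-30"),
    ("Ramana Maharishi", "lived-to", "1950-04-14"),
]


def match(triples, subject=None, predicate=None, obj=None):
    """Return all triples matching the pattern; None acts as a wildcard."""
    return [
        (s, p, o)
        for (s, p, o) in triples
        if (subject is None or s == subject)
        and (predicate is None or p == predicate)
        and (obj is None or o == obj)
    ]


# "Which works did Ramana Maharishi write?"
works = [o for (_, _, o) in match(TRIPLES, "Ramana Maharishi", "author-of")]
# "Who were his contemporaries?"
contemporaries = [o for (_, _, o) in match(TRIPLES, predicate="contemporary-of")]
print(works)
print(contemporaries)
```

Because every fact has the same three-part shape, triples harvested from different sites or languages can be merged into one list and queried with the same pattern matcher.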

Approach 1: Artificial Intelligence

Can AI of computers match NS of humans? Consider chess, once the holy grail of AI.

Does not play the human way at all!
Mostly parallelized search in hardware (200 million positions/second!) and almost no learning/improvement. But, enough to beat humans. Has changed chess irrevocably! (see Rybka analysis on left).
Very tough challenge: Natural Language! (out of sight, out of mind)

Approach 2: Smart Data

Embed all the knowledge in the data.
The Semantic Web will enable machines to understand semantic documents and data, NOT human speech and writing.

Natural language is a real problem (invisible idiot! ;-)
Two broad approaches here also.

• Mostly manual process. Exhaustive annotation with metadata.
• Computers do most of the grunt work. Humans only supervise and edit.

Big Data Analytics is only the tip of the iceberg, with main successes in marketing and advertisement (Amazon, Google, Facebook, Twitter).

Tomorrow’s Information, Yesterday’s Technologies

How much useful data is accessible today?
What types? (structured, semi-structured, unstructured)
What characteristics? (4Vs)
What actionable inferences can we make? (wireless moksha!)

4-Vs of Big Data (IBM’s Version)

So, What’s Big Data?

Size of Web (Indexed)

data.gov.in

data.gov.in DataSets

Why Analytics?

चिन्तनीया हि विपदां आदावेव प्रतिक्रिया । न कूपखननं युक्तं प्रदीप्ते वह्निना गृहे ॥
The effect of disasters should be thought of beforehand. It is not appropriate to start digging a well when the house is ablaze with fire.

There is a tide in the affairs of men, Which taken at the flood, leads on to fortune. Omitted, all the voyage of their life is bound in shallows and in miseries. (Shakespeare)

What is Analytics?

Includes the following.
Databases, Queries, Text Mining, Data Mining, Data Warehouse, OLAP, Statistics, Pattern Recognition, Machine Learning, Artificial Intelligence

Looking for value!
• Need to build models for data analysis. (who?)
• Must ultimately result in some meaningful action and return on investment
• 3 step process: Identify objectives, Identify controls, Collect data needed and Analyze

Bhuta Bhavya Bhavatprabhu (भूतभव्यभवत्प्रभुः: Lord of the past, future and present)

॥ हरिः ॐ ॥ विश्वं विष्णुर्वषट्कारो भूतभव्यभवत्प्रभुः ।

• Bhuta (What happened? Reactive)
    Reports: standard, ad-hoc, query drill-down
• Bhavat (What is happening?)
    Statistical analysis, anomalies, alerts
• Bhavya (What will happen? Pro-active)
    Predictive forecast, optimize

Analytics can convert data to knowledge to wisdom.
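
The three tenses can be sketched on one small series. The example below is only an illustration: the daily counts are invented, the anomaly rule (more than three standard deviations from the past mean) is one common convention among many, and the forecast is a naive least-squares line.

```python
from statistics import mean, stdev

# Hypothetical daily order counts; the last value is today's reading.
history = [100, 104, 98, 103, 101, 99, 105, 102, 100, 140]
past, today = history[:-1], history[-1]

# Bhuta (what happened?): a plain summary report over past data.
report = {"days": len(past), "total": sum(past), "average": mean(past)}

# Bhavat (what is happening?): raise an alert if today's value lies more
# than three standard deviations away from the past average.
z_score = (today - mean(past)) / stdev(past)
is_anomaly = abs(z_score) > 3

# Bhavya (what will happen?): fit a least-squares line to the past values
# (x = day index) and extrapolate one day ahead.
xs = list(range(len(past)))
x_bar, y_bar = mean(xs), mean(past)
slope = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, past)) / sum(
    (x - x_bar) ** 2 for x in xs
)
forecast_next = y_bar + slope * (len(past) - x_bar)

print(report)
print("anomaly today:", is_anomaly)
print("forecast for next day:", round(forecast_next, 1))
```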

Typical Examples of Analysis

• BI:
    Which products give higher revenue, how it varies over months (seasonality), which contribute to profitability, which regions, which salesmen are doing well, ...
• Predictive/online analytics:
    Likely response to new products, repeat customers, high-risk customers, who visits our online resources, which videos get good ratings, is there negative sentiment, ...
• Marketing analytics:
    Which advertising media to select, how sales will be affected, ...
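
The first BI question (which products bring in more revenue, and how that varies by month) reduces to a grouped sum. A minimal sketch, assuming an invented list of `(product, month, revenue)` records:

```python
from collections import defaultdict

# Hypothetical sales records: (product, month, revenue).
sales = [
    ("umbrella", "Jun", 500), ("umbrella", "Jul", 900), ("umbrella", "Dec", 100),
    ("sweater", "Jun", 50), ("sweater", "Jul", 40), ("sweater", "Dec", 800),
]

revenue_by_product = defaultdict(int)
revenue_by_product_month = defaultdict(lambda: defaultdict(int))
for product, month, revenue in sales:
    revenue_by_product[product] += revenue
    revenue_by_product_month[product][month] += revenue

# Which product gives the highest revenue overall?
top_product = max(revenue_by_product, key=revenue_by_product.get)

# Seasonality: in which month does each product peak?
peak_month = {
    product: max(months, key=months.get)
    for product, months in revenue_by_product_month.items()
}

print(top_product, dict(revenue_by_product))
print(peak_month)
```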

What’s special in Big Data Analytics?

What more/different do we need from
• Databases (RDBMS)
    Used for transaction processing, consolidation, reports, business performance
• Data Warehouse
    Used to correlate various data, strategies for pricing, supply chains, cross-sell, ...

Can these not handle Web logs for profiling and recommendations, sentiment analysis for product evaluation, positioning, marketing?
No. The biggest problem is the ACID properties requirement, which makes scale-out and fault-tolerance near impossible.

Some Weaknesses:
• Disk oriented storage and indexing structures
• Multithreading to hide latency
• Locking-based concurrency control mechanisms
• Log-based recovery

Big Data Analytics Requirements

Think: Facebook likes, Shopping cart recommendations, ...
• Ingest data at very high speeds and rates
• Scale easily to meet growth and demand peaks
• Support integrated fault tolerance
• Support a wide range of real-time (or “near-real-time”) analytics
• Integrate easily with high volume analytic datastores

High Speed Data Ingestion

• Support millions of write operations per second at scale
• Read and write latencies below 50 milliseconds
• Do not need ACID-level consistency guarantees (Eventual is fine!)
• Support one or more well-known application interfaces
    • SQL
    • Key/Value
    • Documents (NLP)

Scaling Requirements

• Scale-out on commodity hardware
• Built-in database partitioning
    Manual sharding and/or add-on solutions not practical
    Database must automatically implement defined partitioning strategy
• Application should see a single database instance
• Database should encourage scalability best practices
    For example, replication of reference data minimizes need for multi-partition operations

Fault-tolerance Requirements

• Database should transparently support built-in “Tandem-style” HA
• Users should be able to easily increase/decrease fault tolerance levels
• Database should be easily and quickly recoverable in the event of severe hardware failures
• Database should be able to automatically detect and manage a variety of partition fault conditions
• Downed nodes should be rejoinable without the need for service windows

hadoop.apache.org

Open Source Big Data Stack

Goals of HDFS/HBase

• Need to process multi-petabyte datasets
• Very large distributed file system
    10K nodes, 100 million files, 10 PB
    Provides very high aggregate bandwidth
• Assumes commodity hardware
    Provide very high reliability: replication; detect failures and recover
    Nodes fail every day
• Optimized for batch processing
• Data locations exposed so that computations can move to where data resides
• Files are insufficient data abstractions: HBase
    Need tables, schemas, partitions, indices
• Need for an open data format
    Flexible schema

HDFS Architecture

Files are broken into 128 MB blocks, each block replicated on multiple data nodes.
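
The block-and-replica idea can be sketched as a small planning function. This is a toy model, not HDFS code: the 128 MB block size comes from the slide, three replicas per block is an assumption, the node names are invented, and the round-robin placement stands in for HDFS's real rack-aware policy.

```python
import math

BLOCK_SIZE = 128 * 1024 * 1024  # 128 MB, as stated on the slide
REPLICATION = 3                 # assumption: three replicas per block


def plan_blocks(file_size, data_nodes, replication=REPLICATION):
    """Return a list of (block_index, block_length, replica_nodes).

    Assumes replication <= len(data_nodes) so replicas land on distinct nodes.
    """
    num_blocks = math.ceil(file_size / BLOCK_SIZE)
    plan = []
    for i in range(num_blocks):
        # The last block may be shorter than BLOCK_SIZE.
        length = min(BLOCK_SIZE, file_size - i * BLOCK_SIZE)
        # Toy placement: consecutive nodes, rotating by block index.
        replicas = [data_nodes[(i + r) % len(data_nodes)] for r in range(replication)]
        plan.append((i, length, replicas))
    return plan


nodes = ["dn1", "dn2", "dn3", "dn4", "dn5"]   # hypothetical data nodes
plan = plan_blocks(1024 * 1024 * 1024, nodes)  # a 1 GB file
for index, length, replicas in plan:
    print(index, length, replicas)
```

With every block on three different nodes, losing any single node leaves at least two copies of each block, which is what lets the system treat daily node failures as routine.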

Map-Reduce

Borrowed from functional programming (1980s!)
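
The functional-programming roots show in the classic word-count example: a map step emits key/value pairs, a shuffle groups them by key, and a reduce step folds each group. The sketch below runs in a single process with invented input; a framework such as Hadoop distributes the same three phases across many machines.

```python
from collections import defaultdict
from functools import reduce


def map_phase(document):
    """Map: emit a (word, 1) pair for every word in the document."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: fold the values for one key into a single count."""
    return key, reduce(lambda a, b: a + b, values)


documents = ["big data is big", "data about data"]  # hypothetical input
mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())
print(counts)
```

Because `map_phase` looks at one document at a time and `reduce_phase` at one key at a time, both steps can run in parallel without coordination; only the shuffle moves data between workers.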

Hive: (No)SQL over Hadoop

This is the Big Analytics...
• Supports normal SQL operators
    Projections, equi-joins, group by, order by
• Can run on selected samples of data
• Provides custom scripts using map-reduce
• Map-reduce based implementation of joins, group by, sorting (order by)
• Optimization
    Merging sequential map-reduce jobs, sharing common reads, early selections and projections
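
To give a feel for how SQL operators can be expressed as map-reduce, here is a single-process sketch of a reduce-side equi-join followed by a group-by. The tables and column names are invented, and this is a textbook formulation under those assumptions, not a description of the plans Hive actually generates.

```python
from collections import defaultdict

# Hypothetical tables as lists of rows.
users = [{"uid": 1, "name": "asha"}, {"uid": 2, "name": "ravi"}]
orders = [
    {"uid": 1, "amount": 250},
    {"uid": 1, "amount": 100},
    {"uid": 2, "amount": 75},
]


def map_tagged(rows, table, key):
    """Map: emit (join key, (table tag, row)) so the reducer can tell sides apart."""
    return [(row[key], (table, row)) for row in rows]


def shuffle(pairs):
    """Shuffle: bring all rows with the same join key together."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_join(tagged_rows):
    """Reduce: pair every users row with every orders row sharing the key."""
    left = [row for tag, row in tagged_rows if tag == "users"]
    right = [row for tag, row in tagged_rows if tag == "orders"]
    return [{**l, **r} for l in left for r in right]


# Job 1, roughly: SELECT * FROM users JOIN orders ON users.uid = orders.uid
pairs = map_tagged(users, "users", "uid") + map_tagged(orders, "orders", "uid")
joined = [row for rows in shuffle(pairs).values() for row in reduce_join(rows)]

# Job 2, roughly: SELECT name, SUM(amount) FROM joined GROUP BY name
totals = defaultdict(int)
for row in joined:
    totals[row["name"]] += row["amount"]
print(dict(totals))
```

Written this way the query needs two passes over the data, which is why merging sequential map-reduce jobs appears in the slide's list of optimizations.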

Hype or Real?

In the Big Data Analytics context, consider the following.

Wikipedia Definition:
Technology is the usage and knowledge of tools, techniques, crafts, systems or methods of organization in order to solve a problem or serve some purpose.

Technology is not created in a vacuum. It has a context and is socially relevant.
Mother: Necessity. Father: Profit!
As can be expected, mother is always good, father can be very bad! IPR/Profit: Boon or Bane?
There are many who believe that Google/Facebook has damaged the human brain! ;-)

Big Data Challenges (Gartner Survey)

Hype Cycle (Gartner 2013)

MIT’s DataHub (VLDB 2013 Keynote talk)

The middle-path?
