© Bull, 2014. Andrew Carr, CEO, Bull UK & Ireland

The Good, The Bad, and The Ugly

High Performance Computing and Big Data Conference

Page 1: The Good, The Bad, and The Ugly

Andrew Carr, CEO, Bull UK & Ireland

Page 2: The Good, The Bad, and The Ugly

High Performance Computing and Big Data Conference

Data: the Good, the Bad, and the Ugly

Page 3: The Good, The Bad, and The Ugly

Page 5: The Good, The Bad, and The Ugly

The IT market is at an inflection point:

Its main driver transitioning from TECHNOLOGY to USAGE

[Figure: timeline from 1970 to 2020 with a marker at 2010, labelled TECHNOLOGY and USAGE. Stages shown: Centralised IT, Distributed IT, IT as-a-Service, Information as-a-Service.]

Page 6: The Good, The Bad, and The Ugly

The IT market is at an inflection point:

[Figure: the stages Centralised IT, Distributed IT, IT as-a-Service and Information as-a-Service, with the labels TRANSPARENT PLATFORMS, "enabling" and VALUE FROM DATA. Capability areas shown: IT INFRASTRUCTURE, COMPLEX INTEGRATION, HIGH PERFORMANCE COMPUTING, SECURITY, M2M, CLOUD, BIG DATA.]

Page 7: The Good, The Bad, and The Ugly

Time to results…

Speed has Value Greater than Size

Think Fast Data more than Big Data

Page 8: The Good, The Bad, and The Ugly

A real Big Data problem…but Fast Results?

14 Jan 2014 - Illumina Announces the Thousand Dollar Genome
• $800 for reagents, $60 for sample preparation, $137 for ‘hardware’ over lifetime
• Assuming you can afford 10 HiSeq X machines at $1 Million each, you will be able to process 5 whole genomes/day per machine – 18,000 a year for X10

So just 30 systems non-stop 24/7 to meet Genomics England 100K 2017 goal!
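The figures on this slide can be sanity-checked with a little arithmetic. The sketch below is only a back-of-envelope check using the numbers quoted above. The function names and the 365-day year are assumptions, and it reads "5 whole genomes/day" as a per-machine rate, which is what makes the 18,000-a-year figure for ten machines work out.

```python
# Back-of-envelope check of the sequencing figures quoted on the slide.
# Assumptions (not from the slide): 365-day year, 5 genomes/day is per machine.

REAGENTS_USD = 800
SAMPLE_PREP_USD = 60
HARDWARE_USD = 137
GENOMES_PER_MACHINE_PER_DAY = 5


def cost_per_genome() -> int:
    """Sum of the three cost components quoted for one genome."""
    return REAGENTS_USD + SAMPLE_PREP_USD + HARDWARE_USD


def genomes_per_year(machines: int, days: int = 365) -> int:
    """Genomes sequenced in a year if every machine runs non-stop."""
    return machines * GENOMES_PER_MACHINE_PER_DAY * days


def machine_days_needed(target_genomes: int) -> float:
    """Total machine-days of sequencing needed to reach a genome target."""
    return target_genomes / GENOMES_PER_MACHINE_PER_DAY


if __name__ == "__main__":
    print(cost_per_genome())             # 997
    print(genomes_per_year(10))          # 18250
    print(machine_days_needed(100_000))  # 20000.0
```

The 100,000-genome target therefore needs about 20,000 machine-days of sequencing. If "30 systems" is read as 30 individual machines (an interpretation, not something the slide states), that is roughly 667 days of non-stop running.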

Page 9: The Good, The Bad, and The Ugly

A real Big Data problem…but Fast Results?

But when you’ve done that, how to process the results?
• You now have 30-50 Terabytes of raw data per machine per week
• HiSeq X10 cluster will require ~175,000 CPU core hours just to align results, and even more to perform variant analysis to detect cancer anomalies

Delivering 250,000 core hours/week 24/7 and storing results is not trivial
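To give a feel for what a weekly core-hour budget means in hardware terms, the sketch below converts core hours per week into the number of cores that would have to be busy around the clock. It assumes a 168-hour week and, by default, perfect utilisation; the function name and the `utilisation` parameter are illustrative rather than anything from the talk.

```python
import math

HOURS_PER_WEEK = 24 * 7  # 168


def cores_needed(core_hours_per_week: float, utilisation: float = 1.0) -> int:
    """Cores that must run continuously to deliver a weekly core-hour budget.

    utilisation is the assumed fraction of each core's time doing useful work.
    """
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    return math.ceil(core_hours_per_week / (HOURS_PER_WEEK * utilisation))


if __name__ == "__main__":
    print(cores_needed(250_000))  # 1489
    print(cores_needed(175_000))  # 1042
```

So even under ideal conditions, 250,000 core hours a week corresponds to roughly 1,500 cores that are never idle, before any allowance for storage or failures.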

Page 10: The Good, The Bad, and The Ugly

Why is data important?

Page 12: The Good, The Bad, and The Ugly

Turning Fans into Customers…

Page 13: The Good, The Bad, and The Ugly

Smart Stadiums…

Smart Stadiums Value:
• 90% increase in RESPECT services & ‘Report an incident’
• 12% new revenue: £1 per bet on ‘Man of the match’ / first-sub betting
• 85% increase in social media usage
• 35% increase in stadium-sponsored betting
• 8%-15% increase in club merchandising
• Discounts on food & beverage to remove wastage
• Twitter wall for live interactions (advertorials)
• Real-time non-contentious replays
• Access to secure club content (premium)

Become aware:
• Traffic management
• Security challenges
• Weather
• Crowd control
• Foot-fall management

Page 14: The Good, The Bad, and The Ugly

Professor Stephen Jarvis
Director for Computing Research, University of Warwick

Page 15: The Good, The Bad, and The Ugly

Smart Cities

Retail

Police

Telecoms

Government

Healthcare

Forensic science

Interpol

Opinion polls

Page 16: The Good, The Bad, and The Ugly

Source camera identification: used by Interpol to classify and group explicit images

Fingerprint analysis: used in UNHCR camps

Biometric solutions: FBI certified

Performance tuning and debugging tools: used on the world’s largest supercomputers

Page 17: The Good, The Bad, and The Ugly

1. Characteristics of the problem domain

Volume – terabytes to exabytes of existing data to process

Variety – structured, unstructured, text, multimedia

Veracity – uncertainty due to incompleteness or ambiguities

Velocity – streaming data, milliseconds to seconds response time

2. Characteristics of the solution

Storage – should this increase your data storage requirements?

Processing – should data processing be done sequentially or in parallel?

Speed – where should you minimise latency: memory, network, both?

Let’s investigate some case studies …

Page 18: The Good, The Bad, and The Ugly

Case study 1: You like pink milk

Page 19: The Good, The Bad, and The Ugly

• In 1993, Tesco’s CEO was looking to replace Green Shield trading stamps

• dunnhumby, a small London start-up, introduced the notion of a Clubcard

“You know more about my customers after three months than I know after 30 years”

Lord MacLaurin, Tesco Chairman

Case study 1: You like pink milk

Page 20: The Good, The Bad, and The Ugly

• Single most significant factor in the success of the company

• 43M Clubcard holders worldwide

• Allows Tesco to stock unpopular brands for big-spending customers

• 6M transactions per day present significant volume

• Wide application: Calorie counting with Diabetes UK

Case study 1: You like pink milk

Page 21: The Good, The Bad, and The Ugly

Case study 1: You like pink milk

BIG DATA

Characteristics: Terabytes to exabytes of existing data are processed

Processing: Batch and in parallel

Storage: Very large volumes of data stored

Speed: Access of data from disk; transfer of data to / from memory; delivery of results potentially slow
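As a rough illustration of "batch and in parallel", the sketch below splits a batch of transactions into chunks, aggregates each chunk independently, and merges the partial results. The transaction layout, the customer IDs and the function names are invented for illustration and are not taken from any Tesco or dunnhumby system; a thread pool simply stands in for whatever cluster framework would be used at real scale.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor
from typing import Iterator, List, Tuple

# Hypothetical layout: (customer_id, spend in pence)
Transaction = Tuple[str, int]


def chunked(items: List[Transaction], size: int) -> Iterator[List[Transaction]]:
    """Split the batch into fixed-size chunks that can be processed independently."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def spend_per_customer(chunk: List[Transaction]) -> Counter:
    """Aggregate one chunk: total spend per customer."""
    totals: Counter = Counter()
    for customer_id, spend in chunk:
        totals[customer_id] += spend
    return totals


def batch_totals(transactions: List[Transaction],
                 chunk_size: int = 2, workers: int = 4) -> Counter:
    """Process chunks in parallel, then merge the partial totals."""
    merged: Counter = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for partial in pool.map(spend_per_customer, chunked(transactions, chunk_size)):
            merged.update(partial)  # Counter.update adds values key by key
    return merged


if __name__ == "__main__":
    day = [("a", 250), ("b", 100), ("a", 175), ("c", 900), ("b", 50)]
    print(dict(batch_totals(day)))
```

Because each chunk is aggregated without reference to the others, the same split-and-merge shape carries over to many machines; the cost noted on the slide is that results only arrive once the whole batch has been read from storage.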

Page 22: The Good, The Bad, and The Ugly

Case study 2: Take heart

Page 23: The Good, The Bad, and The Ugly

• Some problems are not so much volume as velocity, as you want to analyse data in motion

• Non-relational data, such as email, text, voice, video, data from instruments

Case study 2: Take heart

Page 24: The Good, The Bad, and The Ugly

• Monitoring needs to be real-time and continuous

• Not so much a question of storage, as of spotting outliers
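One way to picture "spotting outliers" without storing the stream is a running z-score. The sketch below uses Welford's online algorithm to keep a running mean and variance in constant memory and raises an alert when a reading is more than a chosen number of standard deviations from the mean. This is a generic technique chosen for illustration, not the method used in any clinical system mentioned in the talk; the threshold, warm-up length and sample readings are hypothetical.

```python
import math
from typing import Iterable, Iterator, Tuple


class RunningStats:
    """Welford's online algorithm: mean and variance without storing the stream."""

    def __init__(self) -> None:
        self.n = 0
        self.mean = 0.0
        self._m2 = 0.0  # running sum of squared deviations from the mean

    def update(self, x: float) -> None:
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self._m2 += delta * (x - self.mean)

    @property
    def std(self) -> float:
        """Sample standard deviation (0.0 until there are two readings)."""
        return math.sqrt(self._m2 / (self.n - 1)) if self.n > 1 else 0.0


def outliers(stream: Iterable[float], threshold: float = 3.0,
             warmup: int = 5) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) for readings far from the running mean."""
    stats = RunningStats()
    for i, x in enumerate(stream):
        is_outlier = (stats.n >= warmup and stats.std > 0
                      and abs(x - stats.mean) > threshold * stats.std)
        if is_outlier:
            yield i, x       # alert; the outlier is not folded into the baseline
        else:
            stats.update(x)  # normal reading: update the baseline


if __name__ == "__main__":
    readings = [72, 74, 71, 73, 75, 72, 140, 74, 73]  # made-up heart-rate values
    print(list(outliers(readings)))  # [(6, 140)]
```

The state is three numbers regardless of how long the stream runs, which matches the slide's point that the difficulty here is timely detection rather than storage.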

Case study 2: Take heart

Page 25: The Good, The Bad, and The Ugly

Case study 2: Take heart

• Streaming analytic solutions being deployed into intensive care and mobile continuous health monitoring

• Text analysis of social media for flu

Page 26: The Good, The Bad, and The Ugly

• Health analytics market estimated to be worth $21.3B by 2020

• Compound annual growth rate of 25%

Case study 2: Take heart

Page 27: The Good, The Bad, and The Ugly

BIG DATA

Characteristics: Streaming data; could be from heterogeneous sources from multiple sites

Processing: Real-time and in parallel; may trigger further batch processing

Storage: Minimal storage requirements

Speed: Transfer ‘from the pipe’ to registers for processing; results often delivered as alerts

Case study 2: Take heart

Page 28: The Good, The Bad, and The Ugly

Case study 3: We built this city

Page 29: The Good, The Bad, and The Ugly

• Annual global market for Smart Cities solutions is £200B

• Over 1,000 cities in the world with populations >500,000

• Smart Cities research shows us the variety of data

Case study 3: We built this city

• Transport cards (Oyster)
• Sensors (traffic, pollution, weather)
• Camera data (security, traffic)
• GIS (people, vehicles)
• Buildings (temperature, occupancy)

Page 30: The Good, The Bad, and The Ugly

Case study 3: We built this city

Page 31: The Good, The Bad, and The Ugly

Case study 3: We built this city

What 100 million calls to NYC 311 reveal

Page 32: The Good, The Bad, and The Ugly

BIG DATA

Characteristics: Streaming and/or batch analytics; from heterogeneous sources from multiple sites

Processing: Real-time and in parallel; may trigger further batch processing

Storage: Minimal storage requirements

Speed: Transfer ‘from the pipe’ to registers for processing; results often delivered as alerts

Case study 3: We built this city

Page 33: The Good, The Bad, and The Ugly

Case study 4: The Blackberry Riots

Page 34: The Good, The Bad, and The Ugly

Case study 4: The Blackberry Riots

• Between 6 and 10 August 2011, thousands of people took to the streets in London

• The disturbances began after a police shooting on 4 August in Tottenham

• The resulting chaos required mass police deployment

• The rioting soon spread to Birmingham, Bristol, Liverpool and Manchester

• “Everyone watching these horrific actions will be struck by how they were organised with social media” David Cameron, Prime Minister

Page 35: The Good, The Bad, and The Ugly

Case study 4: The Blackberry Riots

• Professor Rob Procter and a team from LSE and The Guardian set about investigating this claim

• One of the largest studies of social media analytics

• What can we learn from use of social media during times of crisis?

• What does this tell us about veracity of data?

Page 36: The Good, The Bad, and The Ugly

Case study 4: The Blackberry Riots

9pm on 8th August: @Twiggy_Garcia circulates unconfirmed reports that rioters are releasing animals at London Zoo

Re-tweeted by influential users with many followers. Rumours spread in viral-like way over non-hierarchical network

Opposition seeds within 13 minutes. Pictures are identified as fake

Page 37: The Good, The Bad, and The Ugly

BIG DATA

Characteristics: Uncertainty and incompleteness exist in all data; streaming has the advantage of ‘in-flight correction’

Processing: Real-time and in parallel; inc. background analysis

Storage: Minimal additional storage requirements

Speed: Inevitably impacts speed

Case study 4: The Blackberry Riots

Page 38: The Good, The Bad, and The Ugly

• Identifying characteristics of problem domain

• Working with experts, formulate technology (hardware/software) needs

• ‘Big data’ solutions are commonplace; ‘Fast data’ solutions are not

Page 39: The Good, The Bad, and The Ugly

® Copyright 2011 Gigaspaces Ltd. All Rights Reserved

Conclusion…

Page 40: The Good, The Bad, and The Ugly

Discussion

[email protected]

[email protected]

[email protected]

Page 41: The Good, The Bad, and The Ugly

[email protected]
@Bull_UK
Bull-Information-Systems

0870 240 0040
www.bull.co.uk
Hemel Hempstead HP2 7DZ