High Performance Computing and Big Data Conference
© Bull, 2014
Andrew Carr, CEO, Bull UK & Ireland
High Performance Computing and Big Data Conference
Data: the Good, the Bad, and the Ugly
The IT market is at an inflection point: its main driver is transitioning from TECHNOLOGY to USAGE
[Figure: timeline from 1970 to 2020, with 2010 marked, showing the stages Centralised IT, Distributed IT, IT as-a-Service and Information as-a-Service; axis labels TECHNOLOGY and USAGE]
The IT market is at an inflection point:
TRANSPARENT PLATFORMS enabling VALUE FROM DATA
[Figure: the four stages Centralised IT, Distributed IT, IT as-a-Service and Information as-a-Service, with the capability labels IT INFRASTRUCTURE, COMPLEX INTEGRATION, HIGH PERFORMANCE COMPUTING, SECURITY, M2M, CLOUD and BIG DATA]
Time to results…
Speed has Value Greater than Size
Think Fast Data more than Big Data
A real Big Data problem…but Fast Results?
14 Jan 2014 - Illumina Announces the Thousand Dollar Genome
• $800 for reagents, $60 for sample preparation, $137 for ‘hardware’ over lifetime
• Assuming you can afford 10 HiSeq X machines at $1 Million each
You will be able to process 5 whole genomes/day – 18,000 a year for X10
So just 30 systems non-stop 24/7 to meet Genomics England 100K 2017 goal!
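As a rough check on the figures above, the arithmetic can be sketched in Python. This is only an illustration: it assumes the 5 genomes/day figure is per machine (which is what brings ten machines to roughly 18,000 a year) and that ‘30 systems’ means 30 individual machines. The variable names are illustrative and not from the talk.

```python
# Per-genome cost components quoted on the slide (US dollars).
REAGENTS_USD = 800
SAMPLE_PREP_USD = 60
HARDWARE_USD = 137
cost_per_genome = REAGENTS_USD + SAMPLE_PREP_USD + HARDWARE_USD

# Assumption: 5 whole genomes/day is a per-machine rate.
GENOMES_PER_MACHINE_PER_DAY = 5
MACHINES_PER_X10 = 10
DAYS_PER_YEAR = 365
genomes_per_x10_per_year = (
    GENOMES_PER_MACHINE_PER_DAY * MACHINES_PER_X10 * DAYS_PER_YEAR
)

# Assumption: "30 systems" means 30 machines running non-stop.
TARGET_GENOMES = 100_000
MACHINES = 30
genomes_per_year = GENOMES_PER_MACHINE_PER_DAY * MACHINES * DAYS_PER_YEAR
years_needed = TARGET_GENOMES / genomes_per_year

print(f"Cost per genome: ${cost_per_genome}")
print(f"Genomes per X10 per year: {genomes_per_x10_per_year}")
print(f"Years for {MACHINES} machines to reach the target: {years_needed:.2f}")
```

Under these assumptions the cost components sum to just under $1,000, and 30 machines would need a little under two years of continuous running.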
But when you’ve done that, how to process the results?
• You now have 30-50 Terabytes of raw data per machine per week
• HiSeq X10 cluster will require ~175,000 CPU core hours just to align results, and even more to perform variant analysis to detect cancer anomalies
Delivering 250,000 core hours/week 24/7 and storing results is not trivial
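To put the core-hour figure in perspective, a small sketch converts it into the number of cores that must run continuously, and scales the raw data volume to a ten-machine cluster. It assumes a 168-hour week and perfect utilisation; the names are illustrative only.

```python
import math

HOURS_PER_WEEK = 7 * 24          # 168 hours of wall-clock time
CORE_HOURS_PER_WEEK = 250_000    # figure quoted on the slide

# Cores that must be busy around the clock to deliver that many core hours.
sustained_cores = math.ceil(CORE_HOURS_PER_WEEK / HOURS_PER_WEEK)

# Raw data per week for a cluster of ten machines (terabytes, low and high).
MACHINES = 10
raw_tb_per_machine = (30, 50)
raw_tb_per_cluster = tuple(tb * MACHINES for tb in raw_tb_per_machine)

print(f"Sustained cores needed: {sustained_cores}")
print(f"Raw data per cluster per week (TB): {raw_tb_per_cluster}")
```

Any real cluster would need headroom above this figure, since perfect utilisation is not achievable in practice.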
Why is data important?
Turning Fans into Customers…
Smart Stadiums…
Value:
• 90% increase in RESPECT services & ‘Report an incident’
• 12% new revenue: £1 per bet ‘Man of the match’ / first sub betting
• 85% increase in social media usage
• 35% increase in stadium-sponsored betting
• 8%-15% increase in club merchandising
• Discounts on food & beverage to remove wastage
• Twitter wall for live interactions (advertorials)
• Real-time non-contentious replays
• Access to secure club content (premium)
Become aware:
• Traffic management
• Security challenges
• Weather
• Crowd control
• Foot-fall management
Professor Stephen Jarvis, Director for Computing Research, University of Warwick
Smart Cities
Retail
Police
Telecoms
Government
Healthcare
Forensic science
Interpol
Opinion polls
Source camera identification: used by Interpol to classify and group explicit images
Fingerprint analysis: used in UNHCR camps
Biometric solutions: FBI certified
Performance tuning and debugging tools: used on the world’s largest supercomputers
1. Characteristics of the problem domain
Volume – terabytes to exabytes of existing data to process
Variety – structured, unstructured, text, multimedia
Veracity – uncertainty due to incompleteness or ambiguities
Velocity – streaming data, milliseconds to seconds response time
2. Characteristics of the solution
Storage – should this increase your data storage requirements?
Processing – should data processing be done sequentially or in parallel?
Speed – where should you minimise latency: memory, network, both?
Let’s investigate some case studies …
Case study 1: You like pink milk
• 1993, Tesco’s CEO was looking to replace Green Shield trading stamps
• DunnHumby, a small London start-up, introduced the notion of a clubcard
“You know more about my customers after three months than I know after 30 years” Lord MacLaurin, Tesco Chairman
• Single most significant factor in the success of the company
• 43M Clubcard holders worldwide
• Allows Tesco to stock unpopular brands for big-spending customers
• 6M transactions per day presents significant volume
• Wide application: Calorie counting with Diabetes UK
BIG DATA
Characteristics: Terabytes to exabytes of existing data is processed
Processing: Batch and in parallel
Storage: Very large volumes of data stored
Speed: Access of data from disk; transfer of data to / from memory; delivery of results potentially slow
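The ‘batch and in parallel’ pattern summarised above can be sketched as a split, process and merge step. This is a minimal illustration, not the retailer’s actual pipeline: the transaction records, card identifiers and function names are all hypothetical, and a thread pool stands in for whatever parallel engine would be used at real scale.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def partial_totals(chunk):
    """Sum spend per card within one chunk of (card_id, amount_pence) records."""
    totals = Counter()
    for card_id, amount_pence in chunk:
        totals[card_id] += amount_pence
    return totals


def batch_totals(transactions, chunk_size=2, workers=4):
    """Split the batch into chunks, total each chunk in parallel, then merge."""
    chunks = [
        transactions[i:i + chunk_size]
        for i in range(0, len(transactions), chunk_size)
    ]
    merged = Counter()
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for part in pool.map(partial_totals, chunks):
            merged.update(part)  # Counter.update adds the partial sums
    return merged


# Hypothetical sample batch: (card_id, amount in pence).
SAMPLE = [
    ("card-a", 120),
    ("card-b", 300),
    ("card-a", 80),
    ("card-c", 50),
    ("card-b", 25),
]
totals = batch_totals(SAMPLE)
print(dict(totals))
```

Because each chunk is totalled independently and the merge is a simple sum, the same structure carries over to process pools or a distributed framework when the batch no longer fits on one machine.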
Case study 2: Take heart
• Some problems are not so much volume as velocity, as you want to analyse data in motion
• Non-relational data, such as email, text, voice, video, data from instruments
• Monitoring needs to be real-time and continuous
• Not so much a question of storage, as of spotting outliers
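One simple way to spot outliers in a stream while storing very little is a rolling-window check, sketched below. This is an illustrative technique chosen for the example, not the method used in any deployed monitoring system; the window size, threshold and heart-rate readings are assumptions.

```python
from collections import deque
from statistics import mean, pstdev


def stream_outliers(readings, window=5, threshold=3.0):
    """Yield (index, value) for readings far from the recent rolling mean.

    Only the last `window` accepted readings are kept in memory.
    """
    recent = deque(maxlen=window)
    for index, value in enumerate(readings):
        if len(recent) == window:
            mu = mean(recent)
            sigma = pstdev(recent)
            # A perfectly flat window (sigma == 0) raises no alert here.
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                yield index, value
                continue  # keep the outlier out of the baseline
        recent.append(value)


# Hypothetical heart-rate readings in beats per minute.
heart_rate = [72, 74, 71, 73, 72, 75, 140, 73, 72]
alerts = list(stream_outliers(heart_rate))
print(alerts)
```

The detector consumes readings one at a time and emits alerts as it goes, which matches the point that the problem is about data in motion rather than storage.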
• Streaming analytic solutions being deployed into intensive care and mobile continuous health monitoring
• Text analysis of social media for flu
• Health analytics market estimated to be worth $21.3B by 2020
• Compound annual growth rate of 25%
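To see what a 25% compound annual growth rate implies, the sketch below works backwards from the 2020 estimate. The slides do not state a base year, so the six-year span (2014 to 2020) is purely an assumption for illustration, as are the function and variable names.

```python
def value_before(final_value, annual_growth, years):
    """Value `years` earlier, given compound growth at `annual_growth` per year."""
    return final_value / (1 + annual_growth) ** years


MARKET_2020_USD_BN = 21.3   # estimate quoted on the slide
CAGR = 0.25                 # compound annual growth rate quoted on the slide
YEARS = 6                   # assumption: growth measured from 2014 to 2020

implied_2014_usd_bn = value_before(MARKET_2020_USD_BN, CAGR, YEARS)
growth_multiple = (1 + CAGR) ** YEARS

print(f"Implied 2014 market size: ${implied_2014_usd_bn:.2f}B")
print(f"Growth multiple over {YEARS} years: {growth_multiple:.2f}x")
```

Under that assumed span, 25% a year compounds to a little under a fourfold increase.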
BIG DATA
Characteristics: Streaming data; could be from heterogeneous sources from multiple sites
Processing: Real-time and in parallel; alerts may trigger further batch processing
Storage: Minimal storage requirements
Speed: Transfer ‘from the pipe’ to registers for processing; results often delivered as alerts
Case study 3: We built this city
• Annual global market for Smart Cities solutions is £200B
• Over 1,000 cities in the world with populations >500,000
• Smart Cities research shows us the variety of data
• Transport cards (Oyster)
• Sensors (traffic, pollution, weather)
• Camera data (security, traffic)
• GIS (people, vehicles)
• Buildings (temperature, occupation)
What 100 million calls to NYC 311 reveal
BIG DATA
Characteristics: Streaming and/or batch analytics; from heterogeneous sources from multiple sites
Processing: Real-time and in parallel; alerts may trigger further batch processing
Storage: Minimal storage requirements
Speed: Transfer ‘from the pipe’ to registers for processing; results often delivered as alerts
Case study 4: The Blackberry Riots
• Between 6 and 10 August 2011, thousands of people took to the streets in London
• The disturbances began after a police shooting on 4 August in Tottenham
• The resulting chaos required mass police deployment
• The rioting soon spread to Birmingham, Bristol, Liverpool and Manchester
• “Everyone watching these horrific actions will be struck by how they were organised with social media” David Cameron, Prime Minister
• Professor Rob Procter and a team from LSE and The Guardian set about investigating this claim
• One of the largest studies of social media analytics
• What can we learn from use of social media during times of crisis?
• What does this tell us about veracity of data?
9pm on 8th August: @Twiggy_Garcia circulates unconfirmed reports that rioters are releasing animals at London Zoo
Re-tweeted by influential users with many followers. Rumours spread in viral-like way over non-hierarchical network
Opposition seeds within 13 minutes. Pictures are identified as fake
BIG DATA
Characteristics: Uncertainty and incompleteness exist in all data; streaming has the advantage of ‘in-flight correction’
Processing: Real-time and in parallel, including background analysis
Storage: Minimal additional storage requirements
Speed: Dealing with uncertainty inevitably impacts speed
• Identifying characteristics of problem domain
• Working with experts, formulate technology (hardware/software) needs
• ‘Big data’ solutions are commonplace; ‘Fast data’ solutions are not
Conclusion…
[email protected]
@Bull_UK
Bull-Information-Systems
0870 240 0040
www.bull.co.uk
Hemel Hempstead HP2 7DZ