1 Dealing with Continuous Data Processing, ConFoo 2012
DEALING WITH CONTINUOUS DATA PROCESSING
Michael Peacock, February, 2012
WHAT’S IN STORE
• Case study: a large, data-focused project
• Challenges and solutions we faced
• Dealing with legacy code and legacy issues
• Application availability
• Handling incoming data
• Processing incoming data
• Storing the data
• Querying the data quickly
• Reporting against the data
• Keeping the application responsive
A DISCLAIMER
• What we have done isn't perfect
• Requirement changes, rapid growth and the move from contractors to an in-house team meant everyone learnt more as we went along
• Some decisions have been good
• Some decisions have been bad
• We have to live with both because of the size of our application and the cost of alterations
• We have learnt a lot, and we have plenty of tips for architecture, code, planning and scaling
MICHAEL PEACOCK
Web Systems Developer, Telemetry Team, Smith Electric Vehicles Corp
Lead Developer / Technical Author / Occasional Speaker
SMITH ELECTRIC VEHICLES
NO. NOT MILK FLOATS (ANYMORE)
ALL ELECTRIC, COMMERCIAL VEHICLES.
Photo courtesy of kenjonbro: http://www.flickr.com/photos/kenjonbro/4037649210/in/set-72157623026469013
ALL-ELECTRIC COMMERCIAL VEHICLES
ELECTRIC VEHICLES
• Quick specs:
  • 16,500 – 26,000 lbs gross vehicle weight
  • 7,121 – 16,663 lbs payload
  • 50 – 240 km range
  • Top speed 80 km/h
• Applications:
  • Flat bed
  • Military applications
  • Minibuses
  • School buses
DATA CHALLENGES FROM ELECTRIC VEHICLES
• New, continually evolving technology
• Viability evidence required
• Government research
• Customers wanting to continually monitor performance:
  • How much energy are they using?
  • How much does each charge cost?
  • How far have they travelled?
  • How much energy did they regenerate?
  • Have their drivers been converted correctly?
ENTER TELEMETRY
• Monitor the vehicle's CAN bus:
  • Drive train
  • Battery management system
  • Battery modules
  • Charger
• Monitor additional sensors:
  • GPS location
  • Climate
• Collect and broadcast the information
• Team: contractors to staff
HUGE DATA VOLUMES
2,500 pieces of data, per second, per vehicle
CURRENT STATS
• ~500 telemetry-enabled vehicles in service
• Telemetry is now a standard feature on every new vehicle that rolls off the production line
• Large MySQL solution:
  • 1.5 billion inserts per day
  • Constant minimum of 4,000 inserts per second
• More data on the way
INITIAL MANDATE
For a small number of vehicles, collect a small amount of data that we can provide to customer-authorised grant authorities.
INITIAL SOLUTION
• Single table of data
• Key field to identify the vehicle
• Vehicles communicating directly with our servers
NEW MANDATE
• Collect everything
• We need data for:
  • Grant authorities
  • Customer performance data
  • Service issues
  • Warranty issues
  • Technology evaluation and continuous improvement
  • It's all important!
• Display data in real time
• Keep and display historical data
• Calculate and show performance data
DATA STARTS TO INCREASE
• Initial team took a sledgehammer to the huge database table: created a separate database for each vehicle
SCALING
Initial application-level sharding meant we could scale out with new databases and new database servers as the number of vehicles went up.

It doesn't help us if the data we collect, or the data retention policy, increases; it also still leaves us with slow-running queries.
PROBLEM #0: INSERTS
• We use a dedicated server and a number of dedicated applications to pull these messages and process them
• This needs to happen quickly enough for live data to be seen through the web interface
• Data is rapidly converted into batch SQL files, which are imported into MySQL via "LOAD DATA INFILE"
• Results in a high number of inserts per second (20,000 – 80,000)
• LOAD DATA INFILE isn't enough on its own...
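The batch-file approach can be sketched roughly like this; the table layout, column names and sample row here are illustrative assumptions, not the production schema:

```php
<?php
// Sketch: buffer rows into a tab-separated batch file, then hand the
// whole batch to MySQL in one LOAD DATA INFILE statement, which is far
// faster than row-by-row INSERTs. (Schema is illustrative.)
$rows = array(
    array(
        'vehicle_id' => 42,
        'recorded'   => '2012-02-28 10:00:01',
        'variable'   => 'battery_voltage',
        'value'      => 339.5,
    ),
);

// build the tab-separated batch file
$batchFile = tempnam( sys_get_temp_dir(), 'batch' );
$fh = fopen( $batchFile, 'w' );
foreach( $rows as $row )
{
    fwrite( $fh, implode( "\t", $row ) . "\n" );
}
fclose( $fh );

// one statement imports the whole batch
$sql = 'LOAD DATA INFILE ' . var_export( $batchFile, true )
     . ' INTO TABLE data'
     . " FIELDS TERMINATED BY '\\t'"
     . " LINES TERMINATED BY '\\n'"
     . ' ( vehicle_id, recorded, variable, value )';
// $pdo->exec( $sql ); // executed against the relevant shard's connection
```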
INITIAL ARCHITECTURE
PROBLEM #1: AVAILABILITY
PROBLEM #2: CAPACITY
Sometimes, the data is too much
CAPACITY
OPTION 1: CLOUD INFRASTRUCTURE
• Cloud-based infrastructure gives:
  • More capacity
  • More failover
  • Higher availability
PROBLEM WITH CLOUD INFRASTRUCTURE
• Huge volumes of data inserts into a MySQL solution: sub-optimal in virtualised environments
• Existing enterprise hardware investment
• Security and legal issues with storing the data off-site
SOLUTION #1: MQ
CLOUD BASED MESSAGE QUEUE
“THE CLOUD” ISN’T PERFECT
PROBLEM #3: STORAGE SYSTEM STARTS TO CRACK
SOLUTION: GENIUS DBA & VENDOR CONSULTANCY
• Constantly tweaking the servers and configuration to get more and more performance
• Pushing the capabilities of our SAN, tweaking configs where no DBA has gone before
• www.samlambert.com
  • http://www.samlambert.com/2011/07/how-to-push-your-san-with-open-iscsi_13.html
  • http://www.samlambert.com/2011/07/diagnosing-and-fixing-mysql-io.html
• [email protected]

Sam Lambert – DBA Extraordinaire
LIVE, REAL-TIME INFORMATION
LIVE DATA: PROBLEMS
• Showing data in real time causes a number of headaches:
  • Processing the huge number of inserts
  • Data and legacy application architecture
  • Race conditions
  • Accessing the data quickly
  • A global view
LIVE, REAL TIME INFORMATION: PROBLEM
• Original database design dictated:
  • All data points were stored in the same table
  • Each type of data point required a separate query, sub-query or join to obtain
• Workings of the remote device collecting the data, and of the processing server, dictated:
  • GPS co-ordinates can be up to 6 separate data points, including: longitude; latitude; altitude; speed; number of satellites used to get the location; direction
LIVE DATA: GLOBAL MAP
REAL TIME INFORMATION: CONCURRENT
• Initial solution from the original developers:
  • Pull as many pieces of real-time information through asynchronously as possible
  • Involved the use of Flash-based "widgets" which called separate PHP scripts to query the data
• Pages loaded relatively quickly
• Data points took a little time to load
• Not good enough, and caused race conditions
RACE CONDITIONS
The initial team made use of a Flash-based charting library. This was both good and bad.
• The good:
  • Requests for data were asynchronous
  • The main page loaded quickly
• The bad:
  • Each chart was a separate query
  • Each chart was a separate request
  • Session authentication introduced race conditions
LOTS OF DATA: RACE CONDITIONS
• Sessions in PHP close at the end of the execution cycle
• Unpredictable query times
• Large number of concurrent requests per screen

Session locking
• Completely locks out a user's session, as PHP hasn't closed the session
RACE CONDITIONS: PHP & SESSIONS
session_write_close()
Added after each write to the $_SESSION array; closes the current session.

(Requires a call to session_start() immediately before any further reads or writes.)
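A minimal sketch of that pattern (the session keys here are illustrative):

```php
<?php
// Write to the session, then release the lock straight away so that
// concurrent AJAX requests from the same user are not serialised.
session_start();
$_SESSION['vehicle'] = 42;   // illustrative session write
session_write_close();       // releases the session lock immediately

// ...long-running query work happens here with no session lock held...

// Need the session again? Re-open it first, then close it again.
session_start();
$vehicle = $_SESSION['vehicle'];
session_write_close();
```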
LIVE: STABLE BUT SLOW
CACHE THE DATA
• We cached the most up-to-date data points for each vehicle in an in-memory key-value store
• Memcache!
• Allows us to quickly access live data (which continually changes) without hitting the database
MEMCACHE
// instantiate
$mc = new Memcache();
// connect to the memcache server and port
$mc->connect( MEMCACHE_SERVER, MEMCACHE_PORT );
// get data based on a key
$value = $mc->get( 'key' );
MEMCACHE FAILOVER
Depending on the importance of your caching system, you might want to consider High Availability.
Tools such as the Moxi Memcached Proxy can do this, acting as a single memcached server while connecting to many.
CACHING WITHIN LEGACY CODE
Integrating caching with both legacy and new code was difficult. We used a lazy loading registry to pass around a common object which provided access to databases, caching servers, and other shared data.
LAZY LOADING REGISTRY

public function getObject( $key )
{
    if( in_array( $key, array_keys( $this->objects ) ) )
    {
        return $this->objects[ $key ];
    }
    elseif( in_array( $key, array_keys( $this->objectSetup ) ) )
    {
        if( ! is_null( $this->objectSetup[ $key ]['abstract'] ) )
        {
            require_once( FRAMEWORK_PATH . 'registry/aspects/' . $this->objectSetup[ $key ]['folder'] . '/' . $this->objectSetup[ $key ]['abstract'] . '.abstract.php' );
        }
        require_once( FRAMEWORK_PATH . 'registry/aspects/' . $this->objectSetup[ $key ]['folder'] . '/' . $this->objectSetup[ $key ]['file'] . '.class.php' );
        $o = new $this->objectSetup[ $key ]['class']( $this );
        $this->storeObject( $o, $key );
        return $o;
    }
    elseif( $key == 'memcache' )
    {
        // requesting memcache for the first time: instantiate, connect, store and return
        $mc = new Memcache();
        $mc->connect( MEMCACHE_SERVER, MEMCACHE_PORT );
        $this->storeObject( $mc, 'memcache' );
        return $mc;
    }
}
LLR: NOT FIT FOR PURPOSE
Although the Lazy Loading Registry works for us, the inclusion of the memcache connection stretches its flexibility.
Better approach: Dependency Injection Container
We could instead prepare configuration data and pass it within a DIC to the relevant features
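A minimal sketch of what that DIC approach could look like; the class and registration names are illustrative, not our production code:

```php
<?php
// A tiny dependency injection container: services are registered as
// factory closures, and only instantiated the first time they are
// requested. (Names here are illustrative assumptions.)
class Container
{
    private $factories = array();
    private $instances = array();

    public function register( $key, $factory )
    {
        $this->factories[ $key ] = $factory;
    }

    public function get( $key )
    {
        if( ! isset( $this->instances[ $key ] ) )
        {
            $factory = $this->factories[ $key ];
            $this->instances[ $key ] = $factory( $this );
        }
        return $this->instances[ $key ];
    }
}

// Configuration is prepared up front and passed in, instead of being
// baked into the registry itself.
$dic = new Container();
$dic->register( 'memcache', function( $c ) {
    $mc = new Memcache();
    $mc->connect( MEMCACHE_SERVER, MEMCACHE_PORT );
    return $mc;
} );
```

Each feature then asks the container for what it needs, rather than the registry knowing how to build every service.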
REAL TIME INFORMATION: # OF REQUESTS
• Currently, each piece of "live data" is loaded into a Flash graph or widget, which updates every 30 seconds using an AJAX request
• The move from MySQL to Memcache reduced database load, but the large number of requests still adds strain to the web server
• Making more use of text- and image-based representations, all of which can be updated from a single AJAX request
RACE CONDITIONS: USE A TEMPLATE ENGINE
• V1 of the system mixed PHP and HTML
• You can’t re-initialise your session once output has been sent
• All new code uses a template engine, so session interaction has no bearing on output: by the time the template is processed and output, all database and session work has long been completed
RACE CONDITIONS: USE A SINGLE ENTRY POINT
• Race conditions are further exacerbated by PHP timeout values
• Certain exports, actions and processes take longer than 30 seconds, so the default execution time is raised for them
• Initially the project lacked a single entry point, and execution flow was muddled
• A single entry point makes it easier to enforce a lower timeout, which intensive controllers or models can override
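The timeout policy can be sketched at the entry point like this; the controller names and limits are illustrative assumptions:

```php
<?php
// index.php — single entry point sketch. Enforce a conservative
// default timeout for every request...
set_time_limit( 10 );

// route to a controller (routing is simplified for illustration)
$controller = isset( $_GET['controller'] ) ? $_GET['controller'] : 'dashboard';

// ...and let known long-running controllers override it explicitly.
$longRunning = array( 'export' => 300, 'reports' => 120 );
if( isset( $longRunning[ $controller ] ) )
{
    set_time_limit( $longRunning[ $controller ] );
}
```

Because every request passes through the one script, the low default is the rule and the long timeouts are deliberate, visible exceptions.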
A LITTLE BREATHING SPACE
We now had an application which was:
• Stable: sessions were not locking
• Quick in parts:
  • Fleet overview (map)
  • Vehicle live screen
• Gaining the confidence of users
• Unfortunately... the speed was only an appearance
THINGS WERE STILL SLOW
• Generating performance data
• Backing up and archiving data
• Exporting data for grant authorities (our initial mandate!)
• Viewing historical data
• Viewing analytical data
GENERATING PERFORMANCE REPORTS
To look at detailed performance data we needed to:
• Extract a range of data points for a particular day and vehicle:
  • Battery current and voltage
  • Motor speed (revs per minute)
  • Pedal positions
• Perform some calculations:
  • Energy usage
  • Distance and speeds
  • Driving styles
NOSQL
NoSQL solutions sound like a prime candidate for this type of problem, and NoSQL is something we are looking to experiment with. However, we had a problem:
• The original team picked MySQL
• Our skill set is MySQL
• Testing, R&D and migration would take a long time
REGULAR PROCESSING
• Introduced regular, automated data processing
• Performance calculations were done overnight, for every vehicle
• Saved as a summary record, one per vehicle per day
• Really, really, really easy and quick to pull out, aggregate, compare and analyse
QUERY SPEED BECOMES MORE IMPORTANT
• We were now doing this processing for every vehicle, not just requested vehicles on demand

We needed to be able to pull the data out even faster than before.
SHARDING
• Pulling data out was still very slow, especially when analysing data over the course of a day
• We decided to shard the database again, this time based on the date: week number
• Data stored before we implemented sharding wasn't stored in a sharded table
WHICH BUCKET?
public function getTableNameFromDate( $date )
{
    // we can't query for future data, so if the date is in the future
    // reset it to today
    $date = ( $date > date( 'Y-m-d' ) ) ? date( 'Y-m-d' ) : $date;
    // get the time in seconds since the epoch
    $stt = strtotime( $date );
    // is the query date since we implemented sharding?
    if( $date >= $this->switchOver )
    {
        // calculate the year this week is from (ISO week 52 can spill
        // into early January of the following year)
        $year = ( date( 'm', $stt ) == 01 && date( 'W', $stt ) == 52 ) ? date( 'Y', $stt ) - 1 : date( 'Y', $stt );
        // add the year and the week number to the table name, and return
        return 'data_' . $year . '_' . date( 'W', $stt );
    }
    else
    {
        // return the legacy table
        return 'data';
    }
}
QUERY MAGIC
private function parseQuery( $sql, $date = null )
{
    $date = is_null( $date ) ? date( 'Y-m-d' ) : $date;
    return sprintf( $sql, $this->shardHelper->getTableNameFromDate( $date ) );
}
BACKUPS & ARCHIVES
• Sharding makes backing up and archiving easier
• Simply export a sharded week table and save it
• No long queries which have to find data based on date
• If it's an archive (not just a backup):
  • DROP the table afterwards
  • No need to run a slow DELETE query based on a wide-ranging WHERE clause
SHARDING: AN EXCUSE
DBMSs such as MySQL don't support sharding out of the box, so you need to alter every query that touches this data.

Use it as an excuse to:
• Check your data types
• Check your indexes
• Improve your queries
DATA TYPES
• When prototyping a system and designing a database schema, it's easy to be sloppy with your data types and fields
• DON'T BE
• Use as little storage space as you can
• Ensure each data type uses as little as possible
• Use only the fields you need
INDEX OPTIMISATION
• Keep it small
• From our legacy days of one database on one server, we had a column that told us which vehicle the data related to
• This was still there... as part of an index... despite the fact the application hadn't required it for months
QUERIES: OPTIMISATION TIPS
• Run every query through EXPLAIN EXTENDED
• Check it hits the indexes
• Remove functions like CURDATE() from queries, to ensure the query cache is hit
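The CURDATE() point can be sketched like this; MySQL's query cache only serves byte-identical statements, and non-deterministic functions make a statement uncacheable (the table and column names below are illustrative):

```php
<?php
// Cache-unfriendly: CURDATE() is non-deterministic, so MySQL will not
// serve this statement from the query cache.
$sql = "SELECT value FROM data WHERE recorded >= CURDATE()";

// Cache-friendly: interpolate the literal date in PHP instead. The
// statement text is now identical for every request made today, so
// repeated requests can be answered from the query cache.
$sql = sprintf(
    "SELECT value FROM data WHERE recorded >= '%s'",
    date( 'Y-m-d' )
);
```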
CONCURRENT DATABASE CONNECTIONS
Sharding to multiple databases and multiple servers means it's crucial to keep track of the connections.
• Export and processing scripts touched every database
• If you don't keep track and close connections when you are finished, you risk flooding MySQL with too many connections
REPORTS: DB CONNECTION MANAGEMENT
• In addition to our calculated reports, we also need to export key bits of information to grant authorities
• Initially our PHP-based export scripts held one database connection per database (~400 databases)
• Re-wrote them to maintain only one connection per server, and switch the database in use
• Toggles to instruct the export to only apply to one of the servers at a time
• Modulus magic to run multiple export scripts per server
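The one-connection-per-server idea can be sketched as a small pool helper; the PDO usage in the comments is illustrative, not our production export code:

```php
<?php
// Hand back an existing connection for a server, only asking the
// factory to open a new one the first time that host is seen. This is
// how ~400 per-database connections collapse into one per server.
function getServerConnection( array &$pool, $host, $factory )
{
    if( ! isset( $pool[ $host ] ) )
    {
        $pool[ $host ] = $factory( $host );
    }
    return $pool[ $host ];
}

// Usage sketch: reuse the server connection and just switch schema.
// $pdo = getServerConnection( $pool, $db['host'], function( $host ) {
//     return new PDO( 'mysql:host=' . $host, DB_USER, DB_PASS );
// } );
// $pdo->exec( 'USE `' . $db['name'] . '`' );
```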
CONCURRENT DATA PROCESSING
Depending on the batch processing we are doing, sometimes we can run multiple instances per database server.
$counter = 1;
foreach( $vehicles as $vehicle )
{
    if( ( $counter % $concurrent ) == $instance )
    {
        $vehiclesToProcess[] = $vehicle;
    }
    $counter++;
}
SYSTEM ARCHITECTURE
APPLICATION QUALITY
• When dealing with lots of data, quickly, you need to ensure:
  • You process it correctly
  • You can act fast if there is a bug
  • You can act fast when refactoring
DEPLOYMENT
• When dealing with a stream of data, rolling out new code can mean pausing the processing work that is done
• Put deployment measures in place to make a deployment switch-over instantaneous
ADDITIONAL POINTS
DON’T LOG “EVERYTHING”: EXTRAPOLATE AND ASSUME
• Our telemetry unit broadcasts each data point once per second
• Data doesn't change every second, e.g.:
  • Battery state of charge may take several minutes to lose a percentage point
  • Fault flags only change to 1 when there is a fault
• Make an assumption:
  • We compare the data to the last known value; if it's the same we don't insert, and instead assume it was the same
  • Unfortunately, this requires us to put additional checks and balances in place
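The compare-to-last-known-value filter can be sketched like this; the variable names and data shape are illustrative assumptions:

```php
<?php
// Only insert a data point when its value differs from the last known
// value for that variable; otherwise skip the insert and assume the
// old value still holds. ($lastKnown is keyed by variable name.)
function shouldInsert( array &$lastKnown, $variable, $value )
{
    if( isset( $lastKnown[ $variable ] ) && $lastKnown[ $variable ] === $value )
    {
        // unchanged: no insert needed
        return false;
    }
    $lastKnown[ $variable ] = $value;
    return true;
}
```

At 2,500 points per second per vehicle, skipping unchanged values cuts the insert volume substantially, at the cost of the extra checks mentioned above.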
EXTRAPOLATE AND ASSUME: “INTERLATION”
• Built a special library which:
  • Accepted a number of arrays, each representing a collection of data points for one variable on the truck
  • Used key indicators and time differences to work out if/when the truck was off, and extrapolation should stop
  • For each time data was recorded, pulled down data for the other variables for consistency
INTERLACE
* Add an array to the interlation
  public function addArray( $name, $array )
* Get the time that we first receive data in one of our arrays
  public function getFirst( $field )
* Get the time that we last received data in any of our arrays
  public function getLast( $field )
* Generate the interlaced array
  public function generate( $keyField, $valueField )
* Break the interlaced array down into separate days
  public function dayBreak( $interlationArray )
* Generate an interlaced array and fill for all timestamps within the range of _first_ to _last_
  public function generateAndFill( $keyField, $valueField )
* Populate the new combined array with key fields using the common field
  public function populateKeysFromField( $field, $valueField = null )
http://www.michaelpeacock.co.uk/interlation-library
ABSTRACTION
When dealing with large data sets, abstraction becomes key
You can’t easily change your data structure to fit the application, so the application needs to be able to rapidly change to fit the data
TECHNICAL TIPS
• Measure your application's performance, data throughput and so on
  • This is a data-at-scale problem in itself
• Use as much RAM on your servers as is safe to do so
  • We give MySQL 80% of each DB server's 100 – 140 GB
WHAT DO WE HAVE NOW?
Now we have a fast, stable, reliable system:
• Pulling in millions of messages from a queue per day
• Decoding those messages into 1.5 billion data points per day
• Inserting 1.5 billion data points into MySQL per day
• Performance data generated, and grant authority reports exported, daily
• More sleep at night than we used to get
CONCLUSIONS
• Plan for scale, as best you can
• Where data collection is critical, use a message queue, ideally hosted or "cloud based"
• Leverage products, services and platforms that excel at tasks you need help with, be it hosting, message queues or email services
• Carefully plan your database schema and data types; use only the data you need
• Hire a genius DBA to push your database engine and your storage solution
• Make use of data caching systems to reduce strain on the database
• Calculations and post-processing should be done during dead time and automated
• Add more tools to your toolbox: PHP needs lots of friends in these situations
• Watch out for session race conditions; where they can't be avoided, use session_write_close(), a template engine and a single entry point
• Reduce the number of continuous AJAX calls
Q & A
Michael Peacock
Web Systems Developer – Telemetry Team – Smith Electric Vehicles Corp
[email protected]

Senior / Lead Developer, Author & Speaker
[email protected]
www.michaelpeacock.co.uk
@michaelpeacock
http://www.slideshare.net/michaelpeacock