Visual Fusion of Mega-City Big Data: An Application to Traffic and Tweets Data Analysis of Metro Passengers
Fu-Ming Huang, 2015.03.25
Paper Presentation
Academia Sinica, IIS, Fu-Ming Huang
PUBLICATION
• Publication
  – 2014 IEEE International Conference on Big Data
• Authors
  – Masahiko Itoh, University of Tokyo
  – Daisaku Yokoyama, University of Tokyo
  – Masashi Toyoda, University of Tokyo
  – Yoshimitsu Tomita, Tokyo Metro Co.
  – Satoshi Kawamura, Tokyo Metro Co.
  – Masaru Kitsuregawa, University of Tokyo
Masahiko Itoh, University of Tokyo
INTRODUCTION
RELATED WORK
DATA SETS
INTRODUCTION
• Public transportation system
  – resilience to events
  – optimal resource operation
• Goal: understand how transportation systems are affected by changes in passengers' behaviors
• To implement real-time analysis and prediction of passenger behaviors in a complex transportation system
  – real-time transportation logs
  – social media streams
Introduction
• The system needs to satisfy these requirements:
  – Discovering unusual phenomena from a wide range of temporal overviews
  – Understanding changes in passenger flows and their spatial propagation
  – Exploring reasons for unusual phenomena, or their effects, from real users' voices
• We integrate these visualization techniques:
  – HeatMap view
  – AnimatedRibbon view
  – TweetBubble view
Related Work: Smart Card Data Analysis
• Underground station crowding patterns
  – [Ceapa 2012], London
• MRT passengers spatiotemporal density
  – [Sun 2012], Singapore
• Metro trouble effects propagation
  – [Itoh 2014], Japan
Related Work: Spatiotemporal Information Visualization
• Emphasize linear or cyclic temporal dependencies
  – [Tominski 2005], 3D icon
• Represent regional classification and time-varying quantities
  – [Thakur 2010], 2D and 3D icon
• Characterize important places
  – [Andrienko 2011], 3D space
• Visualize trajectory attribute data
  – [Tominski 2012], 3D color-coded bands
• Represent ST-attributes change on road network
  – [Cheng 2013], 3D stacked bands
• Explore human activity patterns
  – [Ferreira 2013], NYC
• Extract traffic jams and propagation
  – [Wang 2013], Beijing
• Visualize aggregated passenger behaviors
  – [Itoh 2014], heat map and animated ribbons
Related Work: Spatial Social Events Visualization
• Detect events from social media data and extract 4Ws information
  – [Dou 2012], LeadLine, Twitter
• Filter and visualize space-time-theme information
  – [MacEachren 2011], SensePlace2, Twitter data
• Detect traffic anomaly
  – [Zheng 2013], taxicabs and Twitter data
DATA SETS
• Smart Card Data
  – Tokyo Metro
  – 28 lines, 540 stations, 350 million trips
  – March 2011 to May 2014
  – separate weekdays and weekends (including national holidays and vacation seasons)
• Social Media Data
  – Twitter, Japanese users
  – March 2011 to May 2014
  – More than 2 million active users and 18 billion tweets
The Complex Tokyo Metro System
EXTRACTION OF PASSENGER FLOWS
EXTRACTION OF SITUATIONAL EXPLANATION
EXPLORATION ENVIRONMENT FOR PASSENGER FLOWS
CASE STUDIES
EXTRACTION OF PASSENGER FLOWS
• Estimating Daily Passenger Flows
  – Shortest time path
    • t = T + C + W
    • Dijkstra algorithm
  – Find unusual phenomena
    • Estimate the presumed path
    • Accumulate the number of passengers
    • Calculate simple moving average (SMA)
    • Calculate standard deviation
  – SMA reflects daily cyclical patterns
  – Unusual patterns can be detected by comparing it with log data
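The shortest-time path step can be sketched with a small Dijkstra search. This is a minimal illustration, not the authors' implementation: the slide does not define T, C, and W, so reading them as riding time, transfer walking time, and waiting time is an assumption, and the stations, lines, and minutes in `TOY_GRAPH` are hypothetical.

```python
import heapq

def shortest_time(graph, source, target):
    """Dijkstra over a dict {node: [(neighbor, minutes), ...]}."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            return d
        if d > dist.get(node, float("inf")):
            continue  # stale heap entry
        for nxt, minutes in graph.get(node, []):
            nd = d + minutes
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return float("inf")

# Hypothetical toy network. Nodes are (station, line) pairs.
# Ride edges carry T; the transfer edge carries C + W (assumed: 3 min walk + 2 min wait).
TOY_GRAPH = {
    ("A", "L1"): [(("B", "L1"), 5)],
    ("B", "L1"): [(("C", "L1"), 10), (("B", "L2"), 3 + 2)],
    ("B", "L2"): [(("C", "L2"), 3)],
    ("C", "L1"): [(("C", "exit"), 0)],
    ("C", "L2"): [(("C", "exit"), 0)],
}

best = shortest_time(TOY_GRAPH, ("A", "L1"), ("C", "exit"))
```

In this toy network, transferring at B (5 + 5 + 3 minutes) beats staying on line L1 (5 + 10 minutes), so the transfer cost is what decides the estimated path.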
Estimating Passenger Flows after Accidents
• Accidents make passengers take detours
  – The shortest path changes when service is suspended
• Recompute the shortest paths
  – Input the suspended lines and sections as constraints
• Visually check how passengers take detours and concentrate on particular lines
  – An accident in Machiya
  – (a) without suspension info
  – (b) with suspension info
• Open question: a probabilistic behavior model?
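One way to picture the recomputation is a shortest-path search that skips suspended sections. The sketch below is an assumption about how such a constraint could be passed in, not the method used in the paper; the `suspended` parameter and the three-station network are hypothetical.

```python
import heapq

def shortest_path(graph, source, target, suspended=frozenset()):
    """Dijkstra that ignores every (from, to) section listed in `suspended`."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist[node]:
            continue  # stale heap entry
        for nxt, minutes in graph.get(node, []):
            if (node, nxt) in suspended:
                continue  # section out of service
            nd = d + minutes
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    if target not in dist:
        return float("inf"), []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]

# Hypothetical network: a direct section A -> C and a detour through B.
NETWORK = {"A": [("C", 10), ("B", 8)], "B": [("C", 7)], "C": []}

normal = shortest_path(NETWORK, "A", "C")
detour = shortest_path(NETWORK, "A", "C", suspended={("A", "C")})
```

Comparing `normal` and `detour` mirrors panels (a) and (b): without suspension info the estimate keeps passengers on the direct section, with it they are rerouted through B.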
EXTRACTION OF SITUATIONAL EXPLANATION
• Social media
  – What people saw, thought, and did during and after events
  – More precise or fine-grained information than that from operating companies
• For overviewing and explaining the situation
  – Words are weighted by word frequencies, using a measure similar to tf-idf
    • tf(word, station/line, timewindow)
      – The frequency of every co-occurring word for each station
    • df(word, station/line)
      – The number of days on which each word appears for each station
  – Weight(word, station/line, date and time/timewindow)
    • Defined as tf × idf(word, station/line)
      – where idf = log(|date| / df(word, station/line) + 1)
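The weighting above can be sketched for a single station as follows. This is an illustrative reading of the slide's formula, taking |date| as the number of days in the corpus and using the natural logarithm; both are assumptions, as are the function name `word_weights` and the sample words.

```python
from collections import Counter
import math

def word_weights(window_words, words_by_day):
    """Weight = tf * log(|date| / df + 1) for one station.

    window_words: words co-occurring with the station in the time window.
    words_by_day: {day: words seen with the station on that day}.
    """
    tf = Counter(window_words)
    num_days = len(words_by_day)
    df = Counter()
    for words in words_by_day.values():
        df.update(set(words))  # count each word once per day
    return {w: tf[w] * math.log(num_days / df[w] + 1) for w in tf if df[w] > 0}

# Hypothetical data: "train" appears every day, "suspension" only on day d3.
words_by_day = {
    "d1": ["delay", "train"],
    "d2": ["train"],
    "d3": ["train", "suspension", "suspension"],
    "d4": ["train"],
}
weights = word_weights(words_by_day["d3"], words_by_day)
```

A word that shows up every day ("train") gets a low weight, while a word concentrated on one day ("suspension") stands out, which is the behavior the bubble sizes rely on.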
EXPLORATION ENVIRONMENT FOR PASSENGER FLOWS
• To explore passenger flows and spatiotemporal propagation of crowdedness or emptiness
  – HeatMap view
  – AnimatedRibbon view
• To explore situational explanations
  – TweetBubble view
• The views are coordinated with each other
HeatMap View
• An overview of temporal crowdedness or emptiness
  – Monthly overview (1 hour), Daily overview (10 minutes)
  – Fig 3: dramatic changes in passengers’ behavior after 16 March
• Color Encoding on HeatMap View
  – compared with the average situation, z-score
  – red, green, blue
  – S-th and L-th thresholds
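The color encoding can be illustrated with a z-score against past counts for the same time slot. The sketch below rests on several assumptions the slide does not spell out: red is taken to mean more crowded than average and blue emptier (green being normal), S-th is read as the threshold below which a cell stays green, L-th as the point where the color saturates, and the default threshold values are placeholders.

```python
from statistics import mean, pstdev

def z_score(observed, history):
    """Deviation of `observed` from the average of `history`, in standard deviations."""
    mu = mean(history)
    sigma = pstdev(history)
    if sigma == 0:
        return 0.0  # no variation in the history, treat as average
    return (observed - mu) / sigma

def color_for(z, s_th=1.0, l_th=3.0):
    """Map a z-score to (color, intensity in [0, 1]); thresholds are assumed values."""
    if abs(z) < s_th:
        return ("green", 0.0)
    intensity = min(1.0, (abs(z) - s_th) / (l_th - s_th))
    return ("red" if z > 0 else "blue", intensity)

# Hypothetical passenger counts for one station and time slot on past days.
history = [100, 110, 90, 100]
crowded = color_for(z_score(130, history))
usual = color_for(z_score(100, history))
empty = color_for(z_score(85, history))
```

Using the history of the same time slot keeps the daily cycle out of the deviation, so only departures from the usual pattern change color.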
Animated Ribbon View
• Dynamically visualizes animated temporal changes in the number of passengers
  – absolute number, height of 3D ribbons
  – deviation from average, color-coding
  – passenger numbers, 3D bar
• Color-encoding
  – z-scores, S-th, L-th
  – red, green, blue
• Perspective foreshortening
  – an orthogonal projection mode was developed
  – bands of the same height look the same in different places
• 2D bands would quickly suffer from overplotting and occlusion problems
TweetBubble View
• Shows an overview of aggregated words from people's tweets related to times and stations
  – center node → station
  – other nodes → co-occurring words
  – node size → weight
  – color → noun: green, verb: blue, adjective: pink
  – sparklines → tf variation
  – range sliders view → words filter
  – tweets view → normal: black, mention: blue, retweet: red
CASE STUDIES
• Show the usefulness of the system
• Explore changes in behavior of passengers and influences of events
  – natural disasters, accidents, public gatherings
• Interview customer service staff of a train operating company
  – correspondence, neglect, evidence
Case 1: Earthquake
• Passenger flows
  – 11 Mar. 2011
  – when the Great East Japan Earthquake occurred
• (a) before earthquake
  – green ribbons, normal operation
• (b) after earthquake
  – blue ribbons, service suspended
• (c) after lines resume
  – Shibuya, Asakusa
• (d) spread of tweets
  – resuming, Ginza Line, Shibuya station
• (e) the number of passengers who went to and exited Shibuya decreased rapidly around 21:50
  – such rapid and short-term decreases cannot be shown in the HeatMap view
Case 2: Spring Storm
• Passenger flows
  – 3 April 2012
  – spring storm, Japanese mainland
  – companies urged employees to go home early
• 5(b-i), 6(a)
  – line became very crowded before the normal rush hours
• 6(b)
  – many passengers exited Toyocho station
• 5(b-ii), red & blue
  – could not maintain normal operation
  – people had no routes to take
• 6(c)
  – Tozai Line resumed at 21:05
• 7, TweetBubble
  – suspension, free transfer, strong wind
  – taxi, bus, walk
• People in the operating company had not been aware of such extremely confusing situations, especially at Toyocho station.
• This gives them a new piece of evidence to support discussion and improvement.
Case 3: Fire Events
• Passenger flows
  – after the fire around JR Yurakucho station, 3 Jan. 2014
• (a), important gateway to 5 districts
  – distortion technique for overviewing
• (b), switch to Fukutoshin Line in place of the JR Yamanote Line
  – passengers increased between Ikebukuro and Shibuya stations
• (c), switch to Chiyoda Line in place of JR Joban Line
  – many passengers transferred at Kita-Senju station
• Such indirect effects of accidents are hard to understand
Case 4: Parade effects
• Passenger flows
  – Parade by London Olympic medalists, Ginza
  – 20 minutes from 11:00
  – about 500,000 people gathered
• (a)
  – quickly gathered, quickly left
• (b)(c)
  – extremely large waves
• (a-ii)
  – left Ginza just after the parade ended
• This is a surprising result
  – Because Ginza is one of the most famous shopping districts
  – But most people did not stay there for long
CONCLUSION
• A novel visual fusion environment to explore traffic flows
• Contributions
  – Passenger flows on a complicated metro network from large scale data from the smart card system
  – Unusual phenomena and their propagation on a spatiotemporal space
  – Two forms of big data fused into the system to explore causes and effects of unusual phenomena
• Future work
  – Provide automatic event detection, prediction, and visualization
  – Fuse various kinds of big data streams
  – Explore more complex transportation networks
Angus’ Comments
• To consider and distinguish the features of day and month in PLASH’s urban life log data analysis
• To explore more explanation and case studies in PLASH’s YouBike project and SpeedEvaluation project
• Power of data visualization + Power of human observation
• To find more interesting and practical relations among urban life data or governmental open data
Thank you for listening.