GPSInsights: Towards an efficient framework for storing and mining massive real-time vehicle location data
Linh-Truong Hoang, Duy-Khanh Bui, Viet-Trung Tran
Hanoi University of Science and Technology

Paper@Soict2015: GPSInsights: towards a scalable framework for mining massive transportation data

Agenda

•  Motivation
•  System architecture
•  Scalable map-matching
•  Experimentation
•  Conclusion

Global Navigation Satellite System (GNSS)

•  Autonomous geo-spatial positioning
   –  Position
   –  Velocity
   –  Time

•  "Great" points about GNSS
   –  Free
   –  Real-time
   –  No local infrastructure required

GNSS as part of an Intelligent Transport System (ITS)

•  "Precious" data for real-time traffic management
   –  Traffic dashboard
   –  Speed control
   –  Traffic jam monitoring

Need for collecting and mining massive GNSS data in REAL-TIME

GNSS data characteristics

•  Real-time
   –  Reported every second
•  Massive in volume
   –  From millions of cars
•  "Bad" data
•  Must be processed within the digital map topology
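
To get a feel for the scale these characteristics imply, the back-of-the-envelope sketch below combines a one-second reporting interval with a fleet size. The fleet of one million vehicles and the 100-byte record size are illustrative assumptions, not figures from the slides, and `daily_load` is a hypothetical helper name.

```python
# Rough ingest-rate estimate for per-second GNSS reporting.
# ASSUMPTIONS (not from the slides): fleet size and bytes per record.

SECONDS_PER_DAY = 24 * 60 * 60


def daily_load(vehicles: int, report_interval_s: float, bytes_per_record: int):
    """Return (records per second, records per day, bytes per day)."""
    records_per_second = vehicles / report_interval_s
    records_per_day = records_per_second * SECONDS_PER_DAY
    bytes_per_day = records_per_day * bytes_per_record
    return records_per_second, records_per_day, bytes_per_day


# Hypothetical fleet: 1 million vehicles, one report per second, 100 bytes each.
rps, rpd, bpd = daily_load(vehicles=1_000_000, report_interval_s=1.0, bytes_per_record=100)
print(f"{rps:,.0f} records/s, {rpd:,.0f} records/day, {bpd / 1e12:.2f} TB/day")
```

Changing the three arguments shows how quickly the daily volume grows with fleet size or reporting frequency, which is what motivates a scalable storage and processing layer.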

GNSS data exhibits the 5 Vs of Big Data

SYSTEM ARCHITECTURE

•  Store massive GNSS data
•  Real-time mining

[Figure: system architecture diagram. Labels as extracted:]

•  Elasticity, high throughput, fault tolerance
•  Scalable, first-class spatio-temporal API, high throughput, fault tolerance
•  Online processing, scalable, fault tolerance
•  Leverage open-source components

Apache Spark processing

•  Resilient Distributed Dataset (RDD)
   –  In-memory, backed by persistent storage (HDFS)
   –  Fault tolerance by lineage
   –  Supports interactive and iterative analysis
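
Fault tolerance by lineage can be illustrated without Spark itself. The toy class below is not Spark's API: `ToyRDD` and its `map`, `filter`, `collect`, and `drop_cache` methods are hypothetical names chosen for this sketch. Each dataset remembers its parent and the transformation that produced it, so lost in-memory data can be recomputed from the persistent source instead of being restored from a replica.

```python
class ToyRDD:
    """Toy model of lineage-based recovery (NOT the Spark API)."""

    def __init__(self, source=None, parent=None, transform=None):
        self._source = source        # stands in for persistent storage such as HDFS
        self._parent = parent        # lineage: the dataset this one derives from
        self._transform = transform  # lineage: how to derive it
        self._cache = None           # in-memory copy, which may be lost

    def map(self, fn):
        return ToyRDD(parent=self, transform=lambda rows: [fn(r) for r in rows])

    def filter(self, pred):
        return ToyRDD(parent=self, transform=lambda rows: [r for r in rows if pred(r)])

    def collect(self):
        """Return the data, recomputing it from the lineage if the cache is empty."""
        if self._cache is None:
            if self._parent is None:
                self._cache = list(self._source)
            else:
                self._cache = self._transform(self._parent.collect())
        return self._cache

    def drop_cache(self):
        """Simulate losing the in-memory data on a failed node."""
        self._cache = None


# Hypothetical vehicle speeds in km/h.
speeds = ToyRDD(source=[12.0, 55.5, 80.2, 3.1])
fast = speeds.filter(lambda v: v > 50).map(int)

before = list(fast.collect())
fast.drop_cache()            # the cached result is gone...
after = fast.collect()       # ...and is rebuilt by replaying the lineage
```

In this sketch `before` and `after` hold the same values, because the second `collect` replays the recorded `filter` and `map` steps.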

Spark Streaming

Apache Storm

MongoDB with geo-indexing

GeoMesa: Accumulo + geo-indexing

SCALABLE MAP-MATCHING ALGORITHM

Map-matching

•  Online vs. Offline

•  OSM map

Algorithm

•  OSM map format

•  Filling intermediate points
   –  Millions more points
   –  Massive data, but simple calculations
•  Real-time, scalable
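
One way to read the "filling intermediate points" step is as densifying each road segment so that consecutive points are never more than a fixed spacing apart. The sketch below follows that reading and is not the authors' implementation: the name `densify_segment`, the 10 m default spacing, the sample coordinates, and the use of linear interpolation in latitude/longitude are all assumptions made for illustration.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def densify_segment(start, end, max_spacing_m=10.0):
    """Return points from start to end (inclusive), at most max_spacing_m apart.

    Linear interpolation in lat/lon is an approximation that is reasonable
    for the short segments found in a road network.
    """
    length = haversine_m(*start, *end)
    n_steps = max(1, math.ceil(length / max_spacing_m))
    return [
        (start[0] + (end[0] - start[0]) * i / n_steps,
         start[1] + (end[1] - start[1]) * i / n_steps)
        for i in range(n_steps + 1)
    ]


# Hypothetical road segment spanning 0.001 degrees of latitude (about 111 m).
points = densify_segment((21.0280, 105.8340), (21.0290, 105.8340))
print(len(points), "points on the densified segment")
```

Each segment is processed independently with a handful of arithmetic operations, which matches the slide's point that the data volume is large but the per-record calculation is simple and therefore easy to parallelise.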

K-d tree for closest neighbours

•  Run on Apache Spark/Storm
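
A minimal pure-Python sketch of nearest-neighbour lookup with a k-d tree follows, showing how a GPS fix could be matched to the closest road point. It treats coordinates as planar (x, y) values and compares squared Euclidean distances, an assumption that is only reasonable over small areas. `build_kdtree`, `nearest`, and the sample points are hypothetical, and the distribution of lookups across Spark or Storm workers is not shown.

```python
from typing import NamedTuple, Optional, Sequence, Tuple

Point = Tuple[float, float]


class Node(NamedTuple):
    point: Point
    axis: int
    left: Optional["Node"]
    right: Optional["Node"]


def build_kdtree(points: Sequence[Point], depth: int = 0) -> Optional[Node]:
    """Build a balanced 2-d tree by splitting on the median of alternating axes."""
    if not points:
        return None
    axis = depth % 2
    ordered = sorted(points, key=lambda p: p[axis])
    mid = len(ordered) // 2
    return Node(ordered[mid], axis,
                build_kdtree(ordered[:mid], depth + 1),
                build_kdtree(ordered[mid + 1:], depth + 1))


def sq_dist(a: Point, b: Point) -> float:
    return (a[0] - b[0]) ** 2 + (a[1] - b[1]) ** 2


def nearest(node: Optional[Node], query: Point,
            best: Optional[Point] = None) -> Optional[Point]:
    """Return the stored point closest to query (squared Euclidean distance)."""
    if node is None:
        return best
    if best is None or sq_dist(query, node.point) < sq_dist(query, best):
        best = node.point
    diff = query[node.axis] - node.point[node.axis]
    near, far = (node.left, node.right) if diff < 0 else (node.right, node.left)
    best = nearest(near, query, best)
    # Search the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < sq_dist(query, best):
        best = nearest(far, query, best)
    return best


# Hypothetical road points (planar coordinates) and one noisy GPS fix.
road_points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (2.0, 1.0), (2.0, 2.0)]
tree = build_kdtree(road_points)
matched = nearest(tree, (1.9, 1.2))
print(matched)
```

The tree is built once from the road points and then queried for every incoming GPS record, so the lookups are independent of one another and can be spread across workers.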

EXPERIMENTATION

Experiment setup

•  12 million GPS records collected in March 2014 by vehicles equipped with GPS receivers

•  4-node cluster
   –  8-core Intel Xeon 2.6 GHz CPU, 32 GB memory

Map-matching completion time

Latency

"Scalability"

Demonstration

Real-time traffic monitoring

Real-time shortest path

Conclusion

•  GPSInsights: scalable framework for storing and mining massive location data
   –  Built on open-source scalable components
   –  Scalable storage + real-time mining
   –  Pluggable components
   –  Demonstration with a scalable map-matching algorithm
•  Future work
   –  Advanced map-matching algorithms
   –  Traffic jam prediction

Current state of the art

•  PostGIS
   –  Spatial object management over Postgres
   –  Small size
   –  No mining support