Korea Univ. (고대) 8, Week 9: Big Data


Prof. (kangjm@korea.ac.kr; mooknc@gmail.com), Week 9, 2015.4.29.
Sources: http://www.korea.ac.kr/search/search.jsp, http://analyticstraining.com/wp-content/uploads/2014/09/30-Sept.jpg, https://www.pinterest.com/

www.slideshare.net/mooknc

Intro. Movie: https://www.youtube.com/watch?v=OptqxagZDfM

Sources: http://news.chosun.com/site/data/html_dir/2015/03/02/2015030202126.html, http://cfile10.uf.tistory.com/image/182A5D50506E440612E7FF

: - -, (EunJungKim),(HyeSunLee), , 12 5. 2013 pp.205-223 , (structured) , (unstructured) 2 . , . , , , , (text), (photo), , . (digital universe) 95% (: IDC, The Expanding Digital Universe, 2007), 90% .(IDC, The Expanding Digital Universe, 2011; , , , :, 2012. p.32) SNS , . 3V((volume), (Variety), (Velocity)) .9 vs : http://www.edureka.co/blog/answering-the-big-question-what-is-big-data/

10 : , (JoSephChoi),(YongSeokChoi), , 27 2. 2014 pp.169-183 2010 12TB(terabyte) (Gantz & Reinsel, 2010). Special Report(2010.02.25) , (Wal Mart) 100 , 2008 2,500TB . , 2011 1 1 1,000 (Chiang, 2011), 2020, 50 (Jeong, 2011).

The Explosion of Unstructured Data: http://www.edureka.co/blog/answering-the-big-question-what-is-big-data/

12 (Unstructured Data): (Big Data) , (numeric data) , .(: , , , 2012. p.14)

, , , , , .: - -, (EunJungKim),(HyeSunLee), , 12 5. 2013 pp.205-223 13 : , (JoSephChoi),(YongSeokChoi), , 27 2. 2014 pp.169-183 , , (link analysis) , (tweet) (unstructured big data) . , Kim Cho (2013) , , , , , , , , , , , Rhive R . , (distribution processing system) (MapReduce) , (unstructured big data) , (correspondence analysis) .14 (1): - -, (EunJungKim),(HyeSunLee), , 12 5. 2013 pp.205-223

15 (2): - -, (EunJungKim),(HyeSunLee), , 12 5. 2013 pp.205-223

16 : - -, (EunJungKim),(HyeSunLee), , 12 5. 2013 pp.205-223 , (life log) . , , , . , , .(, , 70 , ,2012. pp.180-181) (life log) (log file) . , GPS, , NFC (, ) . CRM , . , , .

Why DFS for Big Data?: http://www.edureka.co/blog/answering-the-big-question-what-is-big-data/, https://i-msdn.sec.s-msft.com/dynimg/IC197174.gif
A distributed file system (DFS) is a file system whose data is stored on a server but can be accessed and processed as if it were stored on the local machine. A DFS makes it convenient to share information in a controlled manner.

What is DFS?: http://www.edureka.co/blog/answering-the-big-question-what-is-big-data/
A DFS allows efficient, well-managed sharing of data and storage across a network. It also allows faster processing of huge amounts of data by processing the data at various locations and then combining the partial results into the desired output. In Big Data technologies such as Hadoop, a cluster can be scaled to hundreds or even thousands of nodes. MapReduce functions can then be executed on smaller subsets of a larger data set, which provides the scalability needed for Big Data processing.
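The idea of processing data at various locations and then combining the results can be illustrated with a minimal, single-machine sketch. This is not Hadoop code: the function names (`split_into_blocks`, `map_block`, `reduce_counts`, `word_count`) and the sample lines are hypothetical, and the "nodes" are simulated by plain list slices.

```python
from collections import Counter
from functools import reduce


def split_into_blocks(records, block_size):
    """Split records into fixed-size blocks, roughly as a DFS splits a file."""
    return [records[i:i + block_size] for i in range(0, len(records), block_size)]


def map_block(block):
    """Map step: count words in one block only, independently of other blocks."""
    counts = Counter()
    for line in block:
        counts.update(line.lower().split())
    return counts


def reduce_counts(left, right):
    """Reduce step: merge two partial counts into one."""
    return left + right


def word_count(records, block_size=2):
    """Hypothetical driver: map every block, then combine the partial results."""
    blocks = split_into_blocks(records, block_size)
    # In a real cluster each block would be processed on a separate node.
    partials = [map_block(block) for block in blocks]
    return reduce(reduce_counts, partials, Counter())


if __name__ == "__main__":
    sample = ["big data is big", "data moves fast", "big clusters scale"]
    print(word_count(sample).most_common(2))
```

Because each block is counted independently, the result does not depend on how the data is split; only the final merge needs to see all partial counts.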

XML DB Design (1): http://docs.oracle.com/cd/B19306_01/appdev.102/b14259/xdb02rep.htm

XML DB Design (2): http://docs.oracle.com/cd/B19306_01/appdev.102/b14259/xdb02rep.htm
This diagram shows four boxes: a, b, c, and d.

  • Box a
    Data Structure?

  • Box b
    Access? Repository Path Access, SQL Query Access

  • Box c
    Language? Java, JDBC, PL/SQL, and C or C++

  • Box d
    Processing and Data Manipulation? DOM, SQL inserts/updates, XSLT, Queriability, and Updatability

Data Storage Models (1): http://docs.oracle.com/cd/B19306_01/appdev.102/b14259/xdb02rep.htm

Data Storage Models (2): http://docs.oracle.com/cd/B19306_01/appdev.102/b14259/xdb02rep.htm
This diagram shows the data storage model. The words How Structured is Your Data? appear at the top of the diagram and connect below to three boxes named, from left to right, Structured Data, Semi-structured Pseudo-structured Data, and Unstructured Data.

  • Structured Data
    XML Schema Based? Use either: CLOB or Structured Storage.
    Non-Schema Based? Store as: CLOB in XMLType Table, File in Repository Folder Views, and Access through Resource APIs.

  • Semi-structured Pseudo-structured Data
    XML Schema Based? Use either: CLOB, Structured, and Hybrid Storage (semi-structured storage).
    Non-Schema Based? Store as: CLOB in XMLType Table, File in Repository Folder Views, and Access through Resource APIs.

  • Unstructured Data
    Store as: CLOB in XMLType Table, File in Repository Folder Views, and Access through Resource APIs.

Data Access Models (1): http://docs.oracle.com/cd/B19306_01/appdev.102/b14259/xdb02rep.htm

Data Access Models (2): http://docs.oracle.com/cd/B19306_01/appdev.102/b14259/xdb02rep.htm
The figure shows a tree structure with two branches. The top node is labeled Oracle XML DB Data Access Options. Its children are Query-Based Access and Path-Based Access.

  • Query-Based Access
    Use SQL. Available Language and XMLType APIs: JDBC, PL/SQL, and C (OCI).

  • Path-Based Access
    Use Repository. Available Languages and APIs: SQL (RESOURCE_/PATH_VIEW), FTP, and HTTP/WebDAV.

http://docs.oracle.com/cd/B19306_01/appdev.102/b14259/img/adxdb006.gif
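The storage and access decision trees described for the Oracle XML DB slides can be written down as plain lookup tables, which may make the branching easier to follow. This is only a transcription of the diagram descriptions into Python, not an Oracle API; the names `storage_options` and `access_apis` and the short keys such as "structured" or "query" are hypothetical.

```python
# Options listed under "Store as:" in the storage diagram description.
STORE_AS = [
    "CLOB in XMLType Table",
    "File in Repository Folder Views",
    "Access through Resource APIs",
]

# (how structured, XML Schema based?) -> options named in the diagram.
STORAGE_OPTIONS = {
    ("structured", True): ["CLOB", "Structured Storage"],
    ("structured", False): STORE_AS,
    ("semi-structured", True): [
        "CLOB",
        "Structured",
        "Hybrid Storage (semi-structured storage)",
    ],
    ("semi-structured", False): STORE_AS,
}

# Access branch -> what to use and which languages/APIs the figure lists.
ACCESS_OPTIONS = {
    "query": {"use": "SQL", "apis": ["JDBC", "PL/SQL", "C (OCI)"]},
    "path": {
        "use": "Repository",
        "apis": ["SQL (RESOURCE_/PATH_VIEW)", "FTP", "HTTP/WebDAV"],
    },
}


def storage_options(structure, schema_based=False):
    """Return the storage options the diagram lists for a kind of data."""
    if structure == "unstructured":
        # The unstructured branch has no schema question in the diagram.
        return list(STORE_AS)
    try:
        return list(STORAGE_OPTIONS[(structure, schema_based)])
    except KeyError:
        raise ValueError(f"unknown data structure: {structure!r}")


def access_apis(mode):
    """Return the languages/APIs listed under an access branch."""
    return list(ACCESS_OPTIONS[mode]["apis"])


if __name__ == "__main__":
    print(storage_options("semi-structured", schema_based=True))
    print(access_apis("path"))
```

Keeping the trees as data rather than nested if-statements mirrors the diagrams directly, so a change to one branch touches a single table entry.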

26 : , (JoSephChoi),(YongSeokChoi), , 27 2. 2014 pp.169-183

27 (10)10: ? ?

11 , , .

- 9 ( ) www.slideshare.net - PPT, PDF kangjm@korea.ac.kr

The following materials were made to supplement my shabby presentation. If you need anything, please e-mail me at this address at any time: mooknc@gmail.com

Thank you. kangjm@korea.ac.kr