Gfarm presentation and thesis topic introduction

This slide deck outlines general information about the Gfarm file system and the basis for the presenter's thesis.

Text of Gfarm presentation and thesis topic introduction

1. GFARM V2: A grid file system that supports high-performance distributed and parallel data computing. Osamu Tatebe, Satoshi Sekiguchi (AIST, Tsukuba, Japan); Youhei Morita (KEK, Tsukuba, Japan); Noriyuki Soda (SRA, Nagoya, Japan); Satoshi Matsuoka (Titech / NII, Tokyo, Japan). Presentation: Chawanat Nakasan / M1, Laboratory for Software Design and Analysis, Nara Institute of Science and Technology. Seminar II, First Presentation, 2013.12.04

2. Agenda
  • O. Tatebe, S. Sekiguchi, Y. Morita, N. Soda, and S. Matsuoka, "Gfarm v2: A Grid file system that supports high-performance distributed and parallel data computing," in Computing in High Energy Physics and Nuclear Physics, 2004, pp. 1172-1175.
  • What is Gfarm
  • Things similar to Gfarm
  • Replication in Gfarm
  • Networking issues in Gfarm
  • Research introduction
  [Side labels: Paper, Application]

3. Introduction

4. What is Gfarm?
  • Distributed file system with parallel processing
  [Figure: metaserver (META), processors (CPU), storage nodes]

5. What's different about Gfarm?
  • Other clustering solutions send files to where the jobs are.
  • Doesn't work well with BIG DATA.
  [Figure: META, files, jobs, CPUs]

6. What's different about Gfarm?
  • Instead, Gfarm sends jobs to nodes with files.
  [Figure: META, jobs, files, CPUs]

7. Replica Management

8. One Big Issue in Distributed Storage: Replication and replica management
  • The same files are copied and spread across the system.
  • Reasons:
      • Redundancy
      • Locality (in Gfarm: job location)
  • Problem: consistency.
  [Figure: META, files, CPUs]

9. Gfarm directs file opens to the same place.
  • This method is very effective for consistency control.
  • But it requires more coordination between the nodes, i.e. more network load and overhead.
  • (1) P1 opens file replica F1.
  • (2) P2 tries to open replica F2 (same file, different place).
  • (3) P2 is redirected to use F1 too, to limit the number of copies open.
  [Figure: P1, P2, F1, F2]

10. Summary: What is Gfarm?
  • A distributed file system with a parallel processing scheduler that sends jobs to files, not files to jobs, and only one replica can be written at a time!

11. Application: Improving Gfarm
  • Why do we have to improve it?

12. It sounds good, until implementations get too large.
  • When it becomes global-scale, we have to think differently.
  • This is what appears to us:
  [Figure: META, CPUs]

13. It sounds good, until implementations get too large.
  • But this is reality:
  [Figure: META, CPUs]

14. So how do we simplify this problem?
  • We put an overlay network on top.
  [Figure: Overlay Network (Gfarm sees), Physical Network (Reality)]

15. 1. It doesn't care about locality.
  • In this case, the two red arrows are the same length according to this topology, because each is just one hop apart.
  [Figure: Overlay Network]

16. 1. It doesn't care about locality.
  • However, they are not the same length when we look at the physical diagram.
  • Gfarm's overlay network doesn't recognize the true distances.
  [Figure: Physical Network]

17. 2. Conventional network doesn't use every route.
  • Examine this topology: there's more than one way for the circled nodes to reach each other.
  [Figure: Physical Network. Best route: always used. Other route(s): rarely used.]

18. 3. If we use every route, which would we use?
  • High latency, more bandwidth: good for data transfer
  • Low latency, less bandwidth: good for control messages
  [Figure: two CPUs connected by both paths]

19. We are about to use the SDN.
  • SDN = software-defined network
  • Concept: use software to dynamically add or change network data flows.
  • Figure: McKeown, N., & Anderson, T. (2008). OpenFlow: enabling innovation in campus networks.

20. What can the SDN do?
  • SDN can practically let us make a whole new protocol by programming a specific controller to do the job.
  • With SDN, we can:
      • Change settings dynamically
      • Implement specialized Quality-of-Service (QoS)
      • Differentiate many kinds of connections (by application, port, users, network addresses, groups, etc.)
      • Use multi-path routing efficiently
      • and much more!

21. So what do we want to do?
  • We want to accelerate wide-area distributed storage by using a software-defined network to optimize the overlay network.

23. INFORMATION for GENERAL PUBLIC
  • This work was made by a member of the Laboratory for Software Design and Analysis, Graduate School of Information Science, Nara Institute of Science and Technology.
  • This presentation is the first of two required for Master's degree graduation and is presented to faculty and students of the Institute.
  • This file has been modified for public disclosure. Actual content during presentation was different.

24. BACKUP SLIDES
  • Some of them may not make sense.

25. Gfarm job execution relies on file presence.

26. BACKUP: Gfarm's not Hadoop
  • Gfarm isn't Hadoop: it provides job scheduling that's not MapReduce.
  • Of course, Gfarm works with Hadoop if you want it to.
  • Let's just say Gfarm doesn't do this:

27. How to work with file replicas
  • To open a file in READ mode: any replica is OK.
  [Figure: processes writing to and reading from replicas]

28. How to work with file replicas
  • To open a file in WRITE mode (in this order):
      • If somebody is writing, use a replica already opened in WRITE mode.
      • If nobody is writing, use a replica already opened in READ mode.
      • If nobody is reading, use any replica.
  [Figure: processes writing to and reading from replicas]

29. BACKUP: 2. Why don't we use every possible route?
  • So what we can do might be:
      • Transfer File A over the red path
      • Transfer File B over the orange path
  • The overall bandwidth would be increased!
  [Figure: Physical Network]

30. BACKUP: 2. Why don't we use every possible route?
  • Problems of this solution:
      • TCP segmentation & reordering
      • UDP will result in A LOT of unwanted and uncorrectable reordering
  • Mitigation:
      • Separate data & control
      • Just divide the link at file level, so one file goes on link A, another file on link B, etc.
      • We can do this because it's a file system and may make use of many files at the same time.

31. BACKUP: Why don't bandwidth and latency correlate?
  • Bandwidth is limited by the link capacity and the rate of transmission and receiving.
  • Latency is caused by processing time.
  • Per-router processing time is increased in the WAN due to routers being overwhelmed by general public usage of the Internet.
  • There can be more than 10 hops to reach a node in another country.

32. Actually, why NOT SDN?
  • Configuration delay: it takes some time for a new route to be installed.
  • Single point of failure (for centralized SDNs like OpenFlow)
  • Cannot easily implement multiple SDN instances. We can, however, pre-slice the network and run SDN on each subnet, or use solutions like FlowVisor (proxy OpenFlow).
  • Controller bugs can break the existing thing (even the simplest controllers can have bugs!)

33. How can we use it with Gfarm?
  • Data: use multiple paths; prefer the bandwidth path
  • Control: QoS; prefer the low-latency path
  • These methods can be implemented in SDN.

34. How can we use it with Gfarm? Multi-path routing?
  • We can use multiple paths to add up bandwidth.
  • SDN can differentiate between each flow so paths can be separated.
  [Figure: Physical Network]

35. How can we use it with Gfarm? Application-aware routing?
  • Control messages prefer low latency.
  • Data transfers prefer greater bandwidth.
  • SDN knows the difference between these uses and can optimize.
  [Figure: two CPUs joined by a path with more latency and more bandwidth, and a path with less latency and less bandwidth]

36. How can we use it with Gfarm? Quality of Service?
  • Critical uses such as control messages can be given priority so they can skip the (potentially very long) queue of data packets.
  • Some SDNs like OpenFlow are beginning to support QoS.
  • Important: VoIP, streaming data, control messages, synchronous msgs
  • Can wait: scheduled jobs, data backup, background tasks, unimportant things
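
The routing preferences described in the "How can we use it with Gfarm?" slides could be sketched roughly as follows. This is only an illustration in plain Python, not Gfarm or OpenFlow controller code: the `Path` class, the `pick_path` and `spread_files` helpers, and every latency and bandwidth figure are hypothetical. It shows control traffic choosing the lowest-latency path, data traffic choosing the highest-bandwidth path, and whole files being spread over paths so that a single file never straddles two routes.

```python
from dataclasses import dataclass
from itertools import cycle


@dataclass(frozen=True)
class Path:
    """A candidate route between two nodes (all values are made up)."""
    name: str
    latency_ms: float
    bandwidth_mbps: float


def pick_path(paths, traffic):
    """Choose a path by traffic class: 'control' or 'data'."""
    if traffic == "control":
        # Control messages prefer the lowest latency.
        return min(paths, key=lambda p: p.latency_ms)
    if traffic == "data":
        # Data transfers prefer the greatest bandwidth.
        return max(paths, key=lambda p: p.bandwidth_mbps)
    raise ValueError(f"unknown traffic class: {traffic!r}")


def spread_files(files, paths):
    """Assign whole files to paths round-robin (file-level multi-path).

    Each file stays on one path, so packets belonging to a single file
    are not reordered across routes.
    """
    return {f: p.name for f, p in zip(files, cycle(paths))}


# Hypothetical paths, named after the red and orange paths on the slides.
paths = [
    Path("red", latency_ms=80.0, bandwidth_mbps=1000.0),
    Path("orange", latency_ms=20.0, bandwidth_mbps=100.0),
]

control = pick_path(paths, "control")
data = pick_path(paths, "data")
assignment = spread_files(["A", "B", "C"], paths)

print(control.name, data.name, assignment)
```

In a real deployment these decisions would be made by the SDN controller when it installs flow rules; the sketch only captures the selection logic, not how flows are classified or installed.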