"A Thorough Introduction to GraphDB": from structure and internals to use cases and a comparison of graph databases

  • Introduction to Graph DB (GraphDB). Presenter: doryokujin. Venue: Tokyo.Webmining #9-2

[Me] doryokujin
  • Communities: MongoDB JP, TokyoWebMining
  • Topics: MongoDB, GraphDB

#1 [MongoTokyo]
  • MongoDB Conference in Japan, 2011/03/01, with 10gen
  • http://www.10gen.com/conferences/mongotokyo2011

#2 [gihyo]
  • gihyo.jp: DocumentDB, GraphDB

Agenda
  • Graph
  • Graph DB
  • Graph Traversal
  • Graph DB products: Neo4j, sones, InfoGrid, OrientDB, InfiniteGraph
  • Tinker Pop: Gremlin, Blueprints, Pipes, Rexster, Mutant

Graph

[Graph]
  • A graph is a set of Dots and Lines, that is, vertices and edges.
  • Each edge expresses 1 relationship between two vertices.

[Undirected Graph]
  • Vertices joined by Edges whose relationship is symmetric.

[Directed Graph]
  • Vertices joined by Edges whose relationship is asymmetric.

Directed / Undirected Graph: examples
  • [Facebook] friend is mutual, so the friends network is an Undirected Graph.
  • [Twitter] follow is one-way, so the follow network is a Directed Graph.

Single-Relational Graph
  • Single-Relational Structures: an Undirected or Directed Graph with only 1 type of relationship.
  • Facebook friends and Twitter follow are both Single-Relational.
  • [Twitter] Reply, RT, DM, and Block (each with a count such as num:5) are different relationships between the same users, so a single Directed Graph of one relationship type cannot hold them together.
  • Facebook + Flickr: lives_in, is, follow, friend, and share mix Undirected and Directed relationships.

Multi-Relational Graph
  • Multi-Relational Structures: every edge is labeled with its relationship type.
  • lives_in: User to Country. share: Facebook, Flickr.
  • [Twitter] Reply, RT, DM, and Block become edge labels of one Multi-Relational Graph.
  • Facebook + Flickr: lives_in, has, follow, friend, and share become edge labels of one graph.

Property Graph
  • A Multi-Relational Graph whose vertices and edges also hold properties (key/value).
  • This is the model used by the Graph DBs covered in this talk.
  • [Figure: two account vertices, id_A (follow 100, follower 200) and id_B, joined by a follow edge that carries date 2011/01/23.]
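The Property Graph definition above can be made concrete with a small in-memory sketch. This is only an illustration under stated assumptions: the class and method names (PropertyGraphSketch, Vertex, Edge, connect, buildExample) are hypothetical and are not the API of any Graph DB in this talk, and the property values are illustrative.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration classes; not the API of any Graph DB in this talk.
class PropertyGraphSketch {

    /** A vertex holding key/value properties and its outgoing edges. */
    static final class Vertex {
        final Map<String, Object> properties = new LinkedHashMap<>();
        final List<Edge> outEdges = new ArrayList<>();
    }

    /** A directed, labeled edge that also holds key/value properties. */
    static final class Edge {
        final String label;
        final Vertex out; // vertex the edge leaves
        final Vertex in;  // vertex the edge enters
        final Map<String, Object> properties = new LinkedHashMap<>();

        Edge(String label, Vertex out, Vertex in) {
            this.label = label;
            this.out = out;
            this.in = in;
        }
    }

    /** Adds a labeled edge from 'out' to 'in' and returns it. */
    static Edge connect(Vertex out, String label, Vertex in) {
        Edge edge = new Edge(label, out, in);
        out.outEdges.add(edge);
        return edge;
    }

    /** Builds a two-account example: id_A follows id_B (values are illustrative). */
    static Vertex buildExample() {
        Vertex a = new Vertex();
        a.properties.put("id", "id_A");
        a.properties.put("follow", 100);
        a.properties.put("follower", 200);

        Vertex b = new Vertex();
        b.properties.put("id", "id_B");
        b.properties.put("follow", 500);

        connect(a, "follow", b).properties.put("date", "2011/01/23");
        return a;
    }

    public static void main(String[] args) {
        Vertex a = buildExample();
        for (Edge e : a.outEdges) {
            // prints: id_A -follow-> id_B {date=2011/01/23}
            System.out.println(a.properties.get("id") + " -" + e.label + "-> "
                    + e.in.properties.get("id") + " " + e.properties);
        }
    }
}
```

The edge label gives the Multi-Relational part, and the two property maps give the key/value part; a Single-Relational Graph would need neither.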
Property Graph: examples
  • [Twitter] Reply, RT, DM, and Block are edge labels; the count (num) becomes an edge property.
  • [Figure: a person vertex (name: doryokujin, sex: man, birth: 1985/05/14; also full name, mail xxx@yyy, address zzz) with lives_in, has, friend, and follow edges. The friend and follow edges carry date properties (2011/01/23, 2010/03/23). The account vertices are id_A (follow 100, follower 200) and id_B (follow 1000, follower 2000).]
  • Source: The Graph Traversal Pattern
  • The Property Graph is the model behind the Graph DBs and Tinker Pop tools covered later; the Hyper Graph is a further generalization.

Graph DB

Representing a Graph in other kinds of DB
The example graph has the edges A -> B, A -> C, C -> D, D -> A.

[Relational Database]

    outV   inV
    A      B
    A      C
    C      D
    D      A

[Document Database]

    {
      A : {out : [B, C], in : [D]},
      B : {in : [A]},
      C : {out : [D], in : [A]},
      D : {out : [A], in : [C]}
    }

[XML Database]
  • The same graph (A, B, C, D) can be encoded as an XML document.

Graph DB: definition
  • "A graph database is any storage system that provides index-free adjacency" (The Graph Traversal Programming Pattern)
  • That is, adjacent vertices are reached directly, without going through an index (index-free).

Non-Graph DB and Index-Based Adjacency
  • [Figure: vertices A to E next to a separate global index. 1. Start from A. 2. Look A up in the index, at log_2(n) time cost. 3. The index returns (B,C).]

Graph DB and Index-Free Adjacency
  • [Figure: each vertex carries its own Mini-Index. 1. A reads its neighbours (B,C) directly from itself.]
  • A vertex also holds its Property (key/value) data, for example id: id_B, follow 1000, follower 2000.
  • Source: The Graph Traversal Programming Pattern

GraphDB: Graph Traversal
  • In a Graph DB, Graph Query = Graph Traversal.
  • A Traversal starts from a Root vertex and walks the Graph from there, relying on Index-Free Adjacency.

Example 1 (Neo4j Wiki): print everyone reachable from a person through KNOWS

    // Neo4j 1.x core API: the types come from org.neo4j.graphdb
    // (Order is Traverser.Order; MyRelationshipTypes is a user-defined enum).
    private void printFriends( Node person )
    {
        Traverser traverser = person.traverse(
            Order.BREADTH_FIRST,                    // breadth-first search
            StopEvaluator.END_OF_GRAPH,             // keep going to the end of the Graph
            ReturnableEvaluator.ALL_BUT_START_NODE, // return every node except the Root Node
            MyRelationshipTypes.KNOWS,              // follow KNOWS relationships
            Direction.OUTGOING );                   // in the outgoing direction
        for ( Node friend : traverser )
        {
            // print the name property of each Node
            System.out.println( friend.getProperty( "name" ) );
        }
    }

[Figure: the example graph from the Neo4j Wiki; the traversal reaches Trinity, Morpheus, Cypher, and Agent Smith.]

Example 2 (Neo4j Wiki): find who coded the system

    private void findHackers( Node startNode )
    {
        Traverser traverser = startNode.traverse(
            Order.BREADTH_FIRST,                    // breadth-first search
            StopEvaluator.END_OF_GRAPH,             // keep going to the end of the Graph
            new ReturnableEvaluator()               // custom rule for which nodes to return
            {
                public boolean isReturnableNode( TraversalPosition currentPosition )
                {
                    Relationship rel = currentPosition.lastRelationshipTraversed();
                    if ( rel != null && rel.isType( MyRelationshipTypes.CODED_BY ) )
                    {
                        return true;                // reached through CODED_BY
                    }
                    return false;                   // anything else is skipped
                }
            },
            // 2 relationship types are followed
            MyRelationshipTypes.CODED_BY, Direction.OUTGOING,
            MyRelationshipTypes.KNOWS, Direction.OUTGOING );
        for ( Node hacker : traverser )
        {
            TraversalPosition position = traverser.currentPosition();
            System.out.println( "At depth " + position.depth() + " => "
                + hacker.getProperty( "name" ) );
        }
    }

Output:

    At depth 4 => The Architect

Where Graph DB fits
  • [Data Locality] Local Search, Social Network
  • [Transition] Web, Recommendation
  • [Graph Problems] Shortest Path
  • Traversal (Neo4jrb): Knows relationships
  • Contrast: Tables, Documents, and Key/Value Models, and set-style operations (Union, Intersection, Join)

Graph DB: summary
  • Property Graph
  • Index-Free Adjacency
  • Graph Query = Graph Traversal
  • Data Locality
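To connect Index-Free Adjacency with Graph Query = Graph Traversal, here is a minimal in-memory sketch. It is not Neo4j's implementation: the names (IndexFreeTraversalSketch, Node, breadthFirst, buildExample) are hypothetical, and the graph is the A, B, C, D example from the slides (A -> B, A -> C, C -> D, D -> A). Each node holds direct references to its neighbours, and a breadth-first traversal from a root returns every reachable node except the root, similar in spirit to ALL_BUT_START_NODE.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical in-memory sketch; not how any particular Graph DB is implemented.
class IndexFreeTraversalSketch {

    /** Each node keeps direct references to its neighbours (its own "mini-index"). */
    static final class Node {
        final String name;
        final List<Node> out = new ArrayList<>();

        Node(String name) {
            this.name = name;
        }
    }

    /** Breadth-first traversal from root; returns every reachable node except the root. */
    static List<String> breadthFirst(Node root) {
        List<String> visitedOrder = new ArrayList<>();
        Set<Node> seen = new HashSet<>();
        Deque<Node> queue = new ArrayDeque<>();
        seen.add(root);
        queue.add(root);
        while (!queue.isEmpty()) {
            Node current = queue.remove();
            // Neighbours come straight from the node: no global index lookup.
            for (Node next : current.out) {
                if (seen.add(next)) {
                    visitedOrder.add(next.name);
                    queue.add(next);
                }
            }
        }
        return visitedOrder;
    }

    /** Builds the slide's example graph and returns vertex A. */
    static Node buildExample() {
        Node a = new Node("A");
        Node b = new Node("B");
        Node c = new Node("C");
        Node d = new Node("D");
        a.out.add(b);
        a.out.add(c);
        c.out.add(d);
        d.out.add(a);
        return a;
    }

    public static void main(String[] args) {
        System.out.println(breadthFirst(buildExample())); // prints [B, C, D]
    }
}
```

The cost of each step depends only on the number of neighbours of the current node, not on the total number of vertices, which is the point of the index-free design.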
Neo4j

Neo4j [Overview]
  • Language: Java
  • License: AGPLv3
  • 2003: in 24/7 production use
  • 2009: VC funding
  • ACID
  • Property Graph Model / Gremlin
  • Lucene (indexing)

Neo4j [Language Binding - Framework]
  • Python - Django
  • Ruby - Ruby on Rails
  • Clojure
  • Scala
  • Groovy - Griffon / Grails
  • Java - Spring Framework

Neo4j [Tools]
  • Shell: Graph Traverse from the Shell
  • Indexing
  • neo4j-server: Neo4j REST API
  • Admin tools
  • Online BackUp
  • Neoclipse
  • Neo4j Batch Insert

Neo4j [ver. 1.2]
  • Neo4j Server: REST API, Admin Interface
  • High Availability
  • Kernel

sones

sones [Overview]
  • Language: C#
  • License: AGPLv3
  • 2011: VC funding
  • ACID
  • REST Interface
  • Property Graph Model / Gremlin
  • Model: Property Hyper Graph
  • Graph Query Language (GQL)

sones [GQL]
  • SQL-like Traversal Query; see the GQL Cheat Sheet.

    FROM User SELECT User.Friends.Friends.Name
    // aggregation
    SELECT COUNT(User.Friends)
    SELECT User.Friends.Random(2)
    SELECT User.Friends.Name.Substring(2,5)

Orient DB

Orient DB [Overview]
  • Language: Java
  • License: Apache2.0
  • 1997: started in C++, later moved to Java
  • Document-Graph DB
  • ACID
  • Shell / REST Interface
  • Property Graph Model / Gremlin

Orient DB [Document-Graph DB]
  • Orient DB is a Document DB at its base; Object DB, Graph DB, and Key/Value Server are offered on top of it.

    // DATABASE OPEN
    ODatabaseDocumentTx db =
        new ODatabaseDocumentTx("remote:localhost/petshop").open("admin", "admin");
    // create a Document
    ODocument doc = new ODocument(db, "Person");
    doc.field( "name", "Luke" );
    doc.field( "surname", "Skywalker" );
    doc.field( "city", new ODocument(db, "City").field("name","Rome").field("country","Italy") );
    // Transaction: save and close
    doc.save();
    db.close();

Orient DB [Document-Graph DB]
  • OGraphVertex and OGraphEdge are OGraphElement types, and OGraphElement is an ODocumentWrapper around a Document.
  • Graphs are queried with SQL:

    SELECT FROM OGraphVertex WHERE outEdges CONTAINS ( label = 'knows' )
    // follow knows up to depth 7
    SELECT FROM OGraphVertex WHERE outEdges TRAVERSE(0,7,'out,outEdges')
        ( @class = 'OGraphEdge' and label = 'knows' )

Orient DB [Language Binding Using Binary Protocol]
  • Java
  • C
  • PHP
  • JRuby (Ruby: soon)

Orient DB [Language Binding Using REST Protocol]
  • Python
  • JavaScript

InfoGrid

InfoGrid [Overview]
  • Language: Java
  • License: AGPLv3
  • ACID
  • REST Interface
  • Model: MeshObject Graph

    MeshBase _GDB = StoreMeshBase.create(_MySQLStore);
    MeshObject _xkcd = _GDB.getMeshObjectLifecycleManager().createMeshObject();
    _xkcd.setProperty("Name", "xkcd");
    _xkcd.setProperty("Url", "http://www.xkcd.com");
    _xkcd.relate(_good); // _good: another MeshObject

Infinite Graph

Infinite Graph [Overview]
  • Language: C++
  • License: Academic and StartUp
  • 2010/6
  • Distributed Graph DB
  • Objectivity/DB: distributed database server

Graph DB: comparison

    GraphDB        License    Language  Protocol          Data Model                   Gremlin  Binding                      SQL-Like Query
    Neo4j          AGPLv3     Java      REST/JSON         Property Graph               Yes      Ruby, Python, Scala,...      -
    sones          AGPLv3     C#        REST/JSON (XML)   Property Graph (+Extend)     Yes      -                            Yes
    OrientDB       Apache2.0  Java      REST/JSON         Property Graph               Yes      PHP, JRuby, Python, JS,...   Yes
    InfoGrid       AGPLv3     Java      REST/JSON         Property Graph? (MeshObject) -        -                            -
    InfiniteGraph  Product    C++       -                 Property Graph               -        -                            -

Graph DB: current state
  • Neo4j
  • "Open Source Social Graph Software Not Ready Yet"

Other graph systems
  • Hypergraph: Property Graph to HyperGraph
  • Pregel: bulk synchronous parallel model, Distributed DB (Google)
  • FlockDB: Distributed DB for storing adjacency lists (Twitter)

Tinker Pop

Tinker Pop [Overview]
  • A tool stack for Property Graph Model GraphDBs.
  • Blueprints: A Property Graph Model Interface
  • Gremlin: A Graph Traversal Language
  • Pipes: A Data Flow Framework using Process Graphs
  • Rexster: A RESTful Graph Shell
  • Mutant: A Poly-ScriptEngine ScriptEngine

Tinker Pop: Blueprints

Blueprints [Overview]
  • The JDBC of GraphDBs: a common Property Graph Model interface across GraphDBs.
  • [Now] TinkerGraph: in-memory property graph model; Sail: Open RDF; Neo4j, Orient DB, sones, ...
  • [Future] Redis, Infinite Graph, Dex

Blueprints: switching the GraphDB changes only the first line.

    Graph graph = new Neo4jGraph("/tmp/graph/neo4j");
    // Graph graph = new OrientGraph("/tmp/graph/orientdb");
    Vertex a = graph.addVertex(null);
    Vertex b = graph.addVertex(null);
    a.setProperty("name","marko");
    b.setProperty("name","aaron");
    Edge e = graph.addEdge(null,a,b,"knows");
    e.setProperty("since",2010);
    graph.shutdown();

Blueprints: Transaction

    graph.startTransaction();
    try {
        Vertex luca = graph.addVertex(null);
        luca.setProperty( "name", "Luca" );
        Vertex marko = graph.addVertex(null);
        marko.setProperty( "name", "Marko" );
        Edge lucaKnowsMarko = graph.addEdge(null, luca, marko, "knows");
        graph.stopTransaction(Conclusion.SUCCESS);
    } catch( Exception e ) {
        graph.stopTransaction(Conclusion.FAILURE);
    }

Tinker Pop: Gremlin

Gremlin [Overview]
  • Gremlin = Graph Programming Language
  • Works with any Blueprints-enabled GraphDB.
  • Usable as a Shell for querying a GraphDB.
  • Java + Groovy
  • Gremlin operates on the Property Graph.

Getting Started: Basic Graph Traversals

    doryokujin$ ./gremlin.sh
             \,,,/
             (o o)
    -----oOOo-(_)-oOOo-----
    gremlin> g = TinkerGraphFactory.createTinkerGraph()
    ==>tinkergraph[vertices:6 edges:6]    // 6 vertices, 6 edges
    gremlin> g.class
    ==>class com.tinkerpop.blueprints.pgm.impls.tg.TinkerGraph
    gremlin> // all vertices
    gremlin> g.V
    ==>v[3]
    ==>v[2]
    ...
    gremlin> // all edges
    gremlin> g.E
    ==>e[10][4-created->5]
    ==>e[7][1-knows->2]
    ==>e[9][1-created->3]
    ...

Getting Started

    gremlin> v = g.v(1)      // the vertex with id=1
    ==>v[1]
    gremlin> v.keys()        // property keys
    ==>age
    ==>name
    gremlin> v.values()      // property values
    ==>29
    ==>marko
    gremlin> v.name + ' is ' + v.age + ' years old.'
    ==>marko is 29 years old.
    gremlin> // outgoing edges of the vertex id=1, name=marko
    gremlin> v.outE
    ==>e[7][1-knows->2]
    ==>e[9][1-created->3]
    ==>e[8][1-knows->4]
    gremlin> // weight property of those edges
    gremlin> v.outE.weight
    ==>0.5
    ==>0.4
    ==>1.0

Getting Started

    gremlin> // vertices reached from id=1 over edges with weight below 1.0
    gremlin> v.outE{it.weight < 1.0}.inV
    ==>v[2]
    ==>v[3]
    gremlin> // store the result in a list
    gremlin> list = []
    gremlin> v.outE{it.weight < 1.0}.inV >> list
    ==>v[2]
    ==>v[3]
    gremlin> // property maps of the vertices in list
    gremlin> list.collect{ it.map() }
    ==>{name=vadas, age=27}
    ==>{name=lop, lang=java}
    gremlin> // incoming edges of the vertices in list
    gremlin> list.inE()
    ==>e[7][1-knows->2]
    ==>e[9][1-created->3]
    ...

Getting Started

    gremlin> list.inE{it.label=='knows'}            // only knows edges
    ==>e[7][1-knows->2]
    gremlin> list.inE()[[label:'knows']]            // same filter, property form
    ==>e[7][1-knows->2]
    gremlin> list.inE()[[label:'knows']].outV.name  // name of the source vertex
    ==>marko

Filter performance: ClosureFilterPipe vs. PropertyFilterPipe

    ~20000ms: g.V.outE{it['label']=='followed_by'}.inV.outE{it['label']=='followed_by'}.inV.outE{it['label']=='followed_by'}.inV >> -1
     ~9000ms: g.V.outE{it.label=='followed_by'}.inV.outE{it.label=='followed_by'}.inV.outE{it.label=='followed_by'}.inV >> -1
     ~8500ms: g.V.outE{it.getLabel()=='followed_by'}.inV.outE{it.getLabel()=='followed_by'}.inV.outE{it.getLabel()=='followed_by'}.inV >> -1
     ~6000ms: g.V.outE[[label:'followed_by']].inV.outE[[label:'followed_by']].inV.outE[[label:'followed_by']].inV >> -1

Tinker Pop: Pipes

Pipes [Overview]
  • Pipes = Data Flow Framework
  • Each step of a Graph Traversal maps to 1 Pipe.
  • Pipes for filtering, splitting, merging, traversing,...

The Gremlin expression

    g:id-v('a')/outE[@label='knows']/inV/outE[@label='develops']/inV/@name

corresponds to this pipeline (source: A Graph Processing Stack):

    Pipe pipe1 = new VertexEdgePipe(Step.OUT_EDGES);
    Pipe pipe2 = new LabelFilterPipe("knows", Filter.NOT_EQUALS);
    Pipe pipe3 = new EdgeVertexPipe(Step.IN_VERTEX);
    Pipe pipe4 = new VertexEdgePipe(Step.OUT_EDGES);
    Pipe pipe5 = new LabelFilterPipe("develops", Filter.NOT_EQUALS);
    Pipe pipe6 = new EdgeVertexPipe(Step.IN_VERTEX);
    Pipe pipe7 = new PropertyPipe("name");
    Pipe<Vertex,String> pipeline =
        new Pipeline<Vertex,String>(pipe1,pipe2,pipe3,pipe4,pipe5,pipe6,pipe7);
    pipeline.setStarts(new SingleIterator<Vertex>(graph.getVertex("a")));
    for (String name : pipeline) {
        System.out.println(name);
    }

Pipes: writing your own Pipe (source: A Graph Processing Stack)

    public class NumCharsPipe extends AbstractPipe<String,Integer> {
        public Integer processNextStart() {
            String word = this.starts.next();
            return word.length();
        }
    }

Tinker Pop: Rexster

Rexster [Overview]
  • Rexster = A RESTful Graph Shell
  • Exposes any Blueprints-enabled GraphDB through a RESTful API (JSON).
  • Gremlin can be used in requests.

Request and response (source: A Graph Processing Stack):

    > http://localhost:8182/examplegraph/vertices/b
    {"version":"0.1","results": {"_type":"vertex","_id":"b","name":"aaron","type":"person"},"query_time":0.1537}

Using Gremlin code in a request:

    // g:key-v('name','DARK STAR')[0]
    > http://localhost:8182/gratefulgraph/traversals/gremlin?script=g:key-v%28%27name%27,%27DARK%20STAR%27%29[0]
    {
      "results": [{
        "_type":"vertex",
        "_id":"89",
        "name":"DARK STAR",
        "song_type":"original",
        "performances":219,
        "type":"song"}
      ],
      "query_time":6.753024,
      "success":true,
      "version"}

Tinker Pop: Mutant

Mutant [Overview]
  • Mutant = A Poly-ScriptEngine ScriptEngine
  • Runs several JVM Script Engines in one console and shares variables between them.

Mutant Console: Basic Examples

    marko:~/software/mutant$ ./mutant.sh
    MuTanT 0.1-SNAPSHOT    [ ?h = help ]
    [gremlin] gremlin 0.6-SNAPSHOT
    [Groovy] Groovy Scripting Engine 2.0
    [ruby] JSR 223 JRuby Engine 1.5.5
    [ECMAScript] Mozilla Rhino 1.6 release 2
    [AppleScript] AppleScriptEngine 1.0
    mutant[gremlin]> $x := 12
    [12]
    mutant[gremlin]> ?x
    mutant[AppleScript]> ?x
    mutant[Groovy]> $x
    12
    mutant[Groovy]> ?x
    mutant[ruby]> $x
    12
    mutant[ruby]> ?x
    mutant[ECMAScript]> $x
    12

Closing
  • Topics: Graph DB, Graph Partitioning, Pregel
  • Neo4j
  • Graph datasets: http://snap.stanford.edu/data/index.html
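As a supplement to the Pipes section, the idea that each step is 1 Pipe and that pipes are chained into a data flow can be sketched in plain Java. This is a hypothetical miniature, not the Pipes library: the names MiniPipesSketch, MiniPipe, MinLengthPipe, LengthPipe, and run are assumptions made for illustration, and only the processNextStart idea is borrowed from the NumCharsPipe example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.NoSuchElementException;

// Hypothetical miniature of the idea; these classes are not the Pipes API.
class MiniPipesSketch {

    /** A pipe lazily pulls items of type S from upstream and emits items of type E. */
    abstract static class MiniPipe<S, E> implements Iterator<E> {
        protected Iterator<S> starts;
        private E buffered;
        private boolean hasBuffered;

        void setStarts(Iterator<S> starts) {
            this.starts = starts;
        }

        /** Returns the next output, or throws NoSuchElementException when exhausted. */
        protected abstract E processNextStart();

        @Override
        public boolean hasNext() {
            if (hasBuffered) {
                return true;
            }
            try {
                buffered = processNextStart();
                hasBuffered = true;
                return true;
            } catch (NoSuchElementException e) {
                return false;
            }
        }

        @Override
        public E next() {
            if (!hasNext()) {
                throw new NoSuchElementException();
            }
            hasBuffered = false;
            return buffered;
        }
    }

    /** Filtering step: passes only words with at least 'min' characters. */
    static final class MinLengthPipe extends MiniPipe<String, String> {
        private final int min;

        MinLengthPipe(int min) {
            this.min = min;
        }

        @Override
        protected String processNextStart() {
            while (true) {
                String word = starts.next(); // throws NoSuchElementException at the end
                if (word.length() >= min) {
                    return word;
                }
            }
        }
    }

    /** Transforming step: maps a word to its number of characters. */
    static final class LengthPipe extends MiniPipe<String, Integer> {
        @Override
        protected Integer processNextStart() {
            return starts.next().length();
        }
    }

    /** Chains the two pipes: the filter's output feeds the length step. */
    static List<Integer> run(List<String> words, int min) {
        MinLengthPipe filter = new MinLengthPipe(min);
        LengthPipe length = new LengthPipe();
        filter.setStarts(words.iterator());
        length.setStarts(filter);
        List<Integer> result = new ArrayList<>();
        while (length.hasNext()) {
            result.add(length.next());
        }
        return result;
    }

    public static void main(String[] args) {
        // "lop" is filtered out; prints [5, 5, 5]
        System.out.println(run(Arrays.asList("marko", "aaron", "lop", "vadas"), 5));
    }
}
```

Because every pipe is itself an iterator, a pipe can be used as the start of the next one, and nothing is computed until the last pipe is asked for a value.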