Alvin Richards, Technical Director, 10gen Inc. (@jonnyeight)
Basic Application & Schema Design

MongoDB Tokyo Design

DESCRIPTION

In this talk we will discuss how objects in your application code map to your MongoDB schema. We will then discuss various options for mapping one-to-many, many-to-many, trees and queues structures to a schema.

Page 1: MongoDB Tokyo Design

Alvin Richards - Technical Director, 10gen Inc. @jonnyeight

Basic Application & Schema Design

Page 2: MongoDB Tokyo Design

Topics

Schema design is easy!
• Data as Objects in code

Common patterns
• Single table inheritance
• One-to-Many & Many-to-Many
• Trees
• Queues

Page 3: MongoDB Tokyo Design

Use MongoDB with your language

10gen Supported Drivers
• Ruby, Python, Perl, PHP, JavaScript
• Java, C/C++, C#, Scala
• Erlang, Haskell

Object Data Mappers
• Morphia - Java
• Mongoid, MongoMapper - Ruby

Community Drivers
• F#, Smalltalk, Clojure, Go, Groovy

Page 4: MongoDB Tokyo Design

So today’s example will use...

Page 5: MongoDB Tokyo Design

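Design your objects in your code - Java using Driver

import com.mongodb.*;
import java.util.*;

// Get a connection to the database and pick a collection.
// getDB() returns a DB, not a DBCollection; "posts" is an assumed collection name.
DBCollection coll = new Mongo().getDB("blogs").getCollection("posts");

// Create the Object
Map<String, Object> obj = new HashMap<String, Object>();
obj.put("author", "Hergé");
obj.put("text", "Destination Moon");
obj.put("date", new Date());

// Insert the object into MongoDB
coll.insert(new BasicDBObject(obj));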

Page 6: MongoDB Tokyo Design

Design your objects in your code - Java using Object Data Mapper

// Use Morphia annotations
@Entity
class Blog {
    @Id String author;
    @Indexed Date date;
    String text;
}

Page 7: MongoDB Tokyo Design

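Design your objects in your code - Java using Object Data Mapper

// Create the data store
// (createDatastore takes a connection and a database name; "blogs" is assumed here)
Datastore ds = new Morphia().createDatastore(new Mongo(), "blogs");

// Create the Object (assumes Blog declares a matching constructor)
Blog entry = new Blog("Hergé", new Date(), "Destination Moon");

// Insert object into MongoDB
ds.save(entry);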

Page 8: MongoDB Tokyo Design

Terminology

RDBMS            MongoDB
Table            Collection
Row(s)           JSON Document
Index            Index
Join             Embedding & Linking
Partition        Shard
Partition Key    Shard Key

Page 9: MongoDB Tokyo Design

Schema Design
Relational Database

Page 10: MongoDB Tokyo Design

Schema Design
MongoDB

Page 11: MongoDB Tokyo Design

Schema Design
MongoDB embedding

Page 12: MongoDB Tokyo Design

Schema Design
MongoDB linking

Page 13: MongoDB Tokyo Design

Design Session

Design documents that simply map to your application

> post = {author: "Hergé",
          date: ISODate("2011-09-18T09:56:06.298Z"),
          text: "Destination Moon",
          tags: ["comic", "adventure"]}

> db.posts.save(post)

Page 14: MongoDB Tokyo Design

Find the document

> db.posts.find()

  { _id: ObjectId("4c4ba5c0672c685e5e8aabf3"),
    author: "Hergé",
    date: ISODate("2011-09-18T09:56:06.298Z"),
    text: "Destination Moon",
    tags: [ "comic", "adventure" ] }

Notes:
• ID must be unique, but can be anything you’d like
• MongoDB will generate a default ID if one is not supplied
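The two notes about IDs can be illustrated without a database. The following is a minimal plain-JavaScript sketch, not MongoDB itself: `makeCollection`, the `generated-N` ID format, and the sample keys are all hypothetical stand-ins (MongoDB generates an ObjectId instead).

```javascript
// Toy in-memory collection that enforces the two ID rules from the notes.
// Everything here is illustrative; it does not use a MongoDB driver.
function makeCollection() {
  const docsById = new Map(); // _id -> stored document
  let counter = 0;            // stand-in for ObjectId generation

  return {
    insert(doc) {
      // Generate a default _id when the caller did not supply one.
      const id = "_id" in doc ? doc._id : `generated-${++counter}`;
      // IDs must be unique within the collection.
      if (docsById.has(id)) {
        throw new Error(`duplicate _id: ${id}`);
      }
      const stored = { ...doc, _id: id };
      docsById.set(id, stored);
      return stored;
    },
    find() {
      return [...docsById.values()];
    },
  };
}

const posts = makeCollection();
const first = posts.insert({ author: "Hergé", text: "Destination Moon" });
const second = posts.insert({ _id: "my-own-key", author: "Kyle" });
console.log(first._id, second._id);
```

Inserting a second document with `_id: "my-own-key"` would throw in this sketch, which mirrors the uniqueness rule.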

Page 15: MongoDB Tokyo Design

Add an index, find via index

Secondary index for “author”

// 1 means ascending, -1 means descending
> db.posts.ensureIndex({author: 1})

> db.posts.find({author: 'Hergé'})
  { _id: ObjectId("4c4ba5c0672c685e5e8aabf3"),
    date: ISODate("2011-09-18T09:56:06.298Z"),
    author: "Hergé",
    ... }

Page 16: MongoDB Tokyo Design

Examine the query plan

> db.posts.find({author: "Hergé"}).explain()
{
  "cursor" : "BtreeCursor author_1",
  "nscanned" : 1,
  "nscannedObjects" : 1,
  "n" : 1,
  "millis" : 5,
  "indexBounds" : {
    "author" : [
      [ "Hergé", "Hergé" ]
    ]
  }
}
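The `nscanned` figure in the plan is the reason the index matters. As a rough illustration only, the plain-JavaScript sketch below compares a full scan with a lookup through an index on `author`. It uses a `Map` rather than a B-tree, and the sample documents and function names are assumptions made for the example.

```javascript
// Compare documents examined with and without a secondary index.
// Not MongoDB code: a Map stands in for the B-tree index.
const posts = [
  { _id: 1, author: "Hergé", text: "Destination Moon" },
  { _id: 2, author: "Kyle", text: "Schema notes" },
  { _id: 3, author: "Jim", text: "Trees" },
];

// Without an index every document must be examined.
function scanByAuthor(docs, author) {
  let nscanned = 0;
  const results = docs.filter((d) => {
    nscanned += 1;
    return d.author === author;
  });
  return { results, nscanned };
}

// Build the index once: author -> list of documents.
function buildAuthorIndex(docs) {
  const index = new Map();
  for (const d of docs) {
    if (!index.has(d.author)) index.set(d.author, []);
    index.get(d.author).push(d);
  }
  return index;
}

// With the index only the matching entries are touched.
function findByAuthor(index, author) {
  const results = index.get(author) || [];
  return { results, nscanned: results.length };
}

const authorIndex = buildAuthorIndex(posts);
const viaScan = scanByAuthor(posts, "Hergé");
const viaIndex = findByAuthor(authorIndex, "Hergé");
console.log(viaScan.nscanned, viaIndex.nscanned); // prints "3 1"
```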

Page 19: MongoDB Tokyo Design

Query operators

Conditional operators:
$ne, $in, $nin, $mod, $all, $size, $exists, $type, ..., $lt, $lte, $gt, $gte

// find posts with any tags
> db.posts.find({tags: {$exists: true}})

Regular expressions:
// posts where author starts with h
> db.posts.find({author: /^h/i})

Counting:
// number of posts written by Hergé
> db.posts.find({author: "Hergé"}).count()
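To make the operator semantics concrete, here is a small plain-JavaScript sketch of how a handful of them could be evaluated against in-memory documents. It is not MongoDB's implementation; `matchesCondition`, `find`, and the sample data are assumptions, and only a few operators are covered.

```javascript
// Evaluate one field condition: a regular expression, an operator object,
// or a plain value (equality, or membership when the field is an array).
function matchesCondition(value, condition) {
  if (condition instanceof RegExp) {
    return typeof value === "string" && condition.test(value);
  }
  if (condition !== null && typeof condition === "object") {
    return Object.entries(condition).every(([op, arg]) => {
      switch (op) {
        case "$exists": return (value !== undefined) === arg;
        case "$gt": return value > arg;
        case "$lt": return value < arg;
        case "$ne": return value !== arg;
        case "$in": return arg.includes(value);
        default: throw new Error(`unsupported operator: ${op}`);
      }
    });
  }
  return Array.isArray(value) ? value.includes(condition) : value === condition;
}

// A document matches when every field in the query matches.
function find(docs, query) {
  return docs.filter((doc) =>
    Object.entries(query).every(([field, cond]) => matchesCondition(doc[field], cond))
  );
}

const posts = [
  { author: "Hergé", text: "Destination Moon", tags: ["comic", "adventure"] },
  { author: "Kyle", text: "great book" },
];

const tagged = find(posts, { tags: { $exists: true } });
const startsWithH = find(posts, { author: /^h/i });
const countHerge = find(posts, { author: "Hergé" }).length;
console.log(tagged.length, startsWithH.length, countHerge); // prints "1 1 1"
```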

Page 20: MongoDB Tokyo Design

Extending the Schema

> new_comment = {author: "Kyle",
                 date: new Date(),
                 text: "great book"}

> db.posts.update(
    {text: "Destination Moon"},
    {"$push": {comments: new_comment},
     "$inc":  {comments_count: 1}})
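The update pairs `$push` with `$inc` so the comment list and its counter change together. A minimal plain-JavaScript sketch of that behaviour follows, assuming a hypothetical `applyUpdate` helper that works on a copy of one document; it is an illustration, not the server's update logic.

```javascript
// Apply $push and $inc to a copy of one document.
function applyUpdate(doc, update) {
  const result = { ...doc };
  for (const [field, value] of Object.entries(update.$push || {})) {
    // $push appends, creating the array when the field is missing.
    result[field] = [...(result[field] || []), value];
  }
  for (const [field, amount] of Object.entries(update.$inc || {})) {
    // $inc adds to the number, treating a missing field as 0.
    result[field] = (result[field] || 0) + amount;
  }
  return result;
}

const post = { author: "Hergé", text: "Destination Moon" };
const newComment = { author: "Kyle", text: "great book" };
const update = { $push: { comments: newComment }, $inc: { comments_count: 1 } };

const once = applyUpdate(post, update);
const twice = applyUpdate(once, update);
console.log(once.comments.length, once.comments_count);   // prints "1 1"
console.log(twice.comments.length, twice.comments_count); // prints "2 2"
```

Neither field existed on the original post, which is the point of the slide: the schema is extended by the update itself.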

Page 21: MongoDB Tokyo Design

Extending the Schema

> db.posts.find({_id: ObjectId("4c4ba5c0672c685e5e8aabf3")})

  { _id: ObjectId("4c4ba5c0672c685e5e8aabf3"),
    author: "Hergé",
    date: ISODate("2011-09-18T09:56:06.298Z"),
    text: "Destination Moon",
    tags: [ "comic", "adventure" ],
    comments: [
      { author: "Kyle",
        date: ISODate("2011-09-19T09:56:06.298Z"),
        text: "great book" }
    ],
    comments_count: 1 }

Page 24: MongoDB Tokyo Design

Extending the Schema

// create index on nested documents:
> db.posts.ensureIndex({"comments.author": 1})

> db.posts.find({"comments.author": "Kyle"})

// find last 5 posts:
> db.posts.find().sort({date: -1}).limit(5)

// most commented post:
> db.posts.find().sort({comments_count: -1}).limit(1)

When sorting, check if you need an index

Page 25: MongoDB Tokyo Design

Common Patterns

Page 26: MongoDB Tokyo Design

Inheritance

Page 27: MongoDB Tokyo Design

Single Table Inheritance - RDBMS

shapes table

id  type    area  radius  length  width
1   circle  3.14  1
2   square  4             2
3   rect    10            5       2

Page 28: MongoDB Tokyo Design

Single Table Inheritance - MongoDB

> db.shapes.find()
  { _id: "1", type: "circle", area: 3.14, radius: 1 }
  { _id: "2", type: "square", area: 4, length: 2 }
  { _id: "3", type: "rect", area: 10, length: 5, width: 2 }

missing values not stored!

Page 30: MongoDB Tokyo Design

Single Table Inheritance - MongoDB

// find shapes where radius > 0
> db.shapes.find({radius: {$gt: 0}})

// create index
> db.shapes.ensureIndex({radius: 1}, {sparse: true})

index only values present!
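A sparse index holds entries only for documents that actually contain the indexed field. The plain-JavaScript sketch below shows the idea on the shapes data; `buildSparseIndex` is a hypothetical helper and a sorted array stands in for the real index structure.

```javascript
const shapes = [
  { _id: "1", type: "circle", area: 3.14, radius: 1 },
  { _id: "2", type: "square", area: 4, length: 2 },
  { _id: "3", type: "rect", area: 10, length: 5, width: 2 },
];

// Only documents that contain the field get an index entry;
// entries are kept sorted by key, ascending.
function buildSparseIndex(docs, field) {
  return docs
    .filter((d) => field in d)
    .map((d) => ({ key: d[field], _id: d._id }))
    .sort((a, b) => a.key - b.key);
}

const radiusIndex = buildSparseIndex(shapes, "radius");

// radius > 0, answered from the index entries alone
const idsWithRadius = radiusIndex.filter((e) => e.key > 0).map((e) => e._id);
console.log(radiusIndex.length, idsWithRadius); // one entry, for shape "1"
```

Of the three shapes only the circle has a radius, so the index holds one entry instead of three.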

Page 31: MongoDB Tokyo Design

One to Many

One to Many relationships can specify
• degree of association between objects
• containment
• life-cycle

Page 32: MongoDB Tokyo Design

One to Many

- Embedded Array
  - $slice operator to return subset of comments
  - some queries harder, e.g. find latest comments across all blogs

blogs: { author: "Hergé",
         date: ISODate("2011-09-18T09:56:06.298Z"),
         comments: [
           { author: "Kyle",
             date: ISODate("2011-09-19T09:56:06.298Z"),
             text: "great book" }
         ] }

Page 33: MongoDB Tokyo Design

One to Many

- Normalized (2 collections)
  - most flexible
  - more queries

blogs: { _id: 1000,
         author: "Hergé",
         date: ISODate("2011-09-18T09:56:06.298Z"),
         comments: [ {comment: 1} ] }

comments: { _id: 1,
            blog: 1000,
            author: "Kyle",
            date: ISODate("2011-09-19T09:56:06.298Z") }

// findOne returns a single document (find returns a cursor), so blog._id is defined
> blog = db.blogs.findOne({text: "Destination Moon"});
> db.comments.find({blog: blog._id});
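The "more queries" cost of the normalized layout, and the flexibility it buys, can be sketched with two in-memory arrays. This is plain JavaScript rather than shell code; the second and third comments, the `text` fields, and the function names are made-up sample material.

```javascript
const blogs = [{ _id: 1000, author: "Hergé", text: "Destination Moon" }];
const comments = [
  { _id: 1, blog: 1000, author: "Kyle", text: "great book" },
  { _id: 2, blog: 1000, author: "Jim", text: "a classic" },
  { _id: 3, blog: 2000, author: "Kyle", text: "unrelated" },
];

// Two lookups are needed: first the blog, then its comments.
function commentsForBlogText(text) {
  const blog = blogs.find((b) => b.text === text); // query 1
  if (!blog) return [];
  return comments.filter((c) => c.blog === blog._id); // query 2
}

// A cross-blog question is a single pass over the comments collection,
// which is the kind of query that is harder with embedded arrays.
function commentsByAuthor(author) {
  return comments.filter((c) => c.author === author);
}

console.log(commentsForBlogText("Destination Moon").length); // prints 2
console.log(commentsByAuthor("Kyle").length);                 // prints 2
```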

Page 34: MongoDB Tokyo Design

One to Many - patterns

- Embedded Array / Array Keys
- Normalized

Page 35: MongoDB Tokyo Design

Many - Many

Example:
- Product can be in many categories
- Category can have many products

Page 38: MongoDB Tokyo Design

Many - Many

products:
  { _id: 10,
    name: "Destination Moon",
    category_ids: [ 20, 30 ] }

categories:
  { _id: 20,
    name: "adventure",
    product_ids: [ 10, 11, 12 ] }

categories:
  { _id: 21,
    name: "movie",
    product_ids: [ 10 ] }

// All categories for a given product
> db.categories.find({product_ids: 10})

Page 41: MongoDB Tokyo Design

Alternative

products:
  { _id: 10,
    name: "Destination Moon",
    category_ids: [ 20, 30 ] }

categories:
  { _id: 20,
    name: "adventure" }

// All products for a given category
> db.products.find({category_ids: 20})

// All categories for a given product
> product = db.products.findOne({_id: some_id})
> db.categories.find({_id: {$in: product.category_ids}})
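In this alternative only the product stores the relationship, so one direction is a direct array match and the other takes two steps. The plain-JavaScript sketch below mirrors both lookups on in-memory arrays; product 11, category 30's name, and the function names are invented for illustration.

```javascript
const products = [
  { _id: 10, name: "Destination Moon", category_ids: [20, 30] },
  { _id: 11, name: "Hypothetical Title", category_ids: [20] },
];
const categories = [
  { _id: 20, name: "adventure" },
  { _id: 30, name: "comic" },
];

// All products for a given category: match inside the array field.
function productsInCategory(categoryId) {
  return products.filter((p) => p.category_ids.includes(categoryId));
}

// All categories for a given product: fetch the product first,
// then select categories whose _id is in its list (the $in step).
function categoriesOfProduct(productId) {
  const product = products.find((p) => p._id === productId);
  if (!product) return [];
  return categories.filter((c) => product.category_ids.includes(c._id));
}

console.log(productsInCategory(20).map((p) => p._id));    // [ 10, 11 ]
console.log(categoriesOfProduct(10).map((c) => c.name));  // [ 'adventure', 'comic' ]
```

The trade-off is one extra lookup in exchange for keeping each relationship in a single place.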

Page 42: MongoDB Tokyo Design

Trees

Hierarchical information

   

Page 43: MongoDB Tokyo Design

Trees

Full Tree in Document

{ comments: [
    { author: "Kyle", text: "...",
      replies: [
        { author: "James", text: "...",
          replies: [] }
      ] }
] }

Pros: Single Document, Performance, Intuitive

Cons: Hard to search, Partial Results, 16MB limit
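"Hard to search" follows from the nesting: finding a comment means walking the tree in application code. A short plain-JavaScript sketch of that walk is shown here, with placeholder text values and hypothetical helper names.

```javascript
const post = {
  comments: [
    {
      author: "Kyle",
      text: "first",
      replies: [{ author: "James", text: "reply", replies: [] }],
    },
  ],
};

// Depth-first walk collecting every comment written by one author.
function findByAuthor(comments, author, found = []) {
  for (const c of comments) {
    if (c.author === author) found.push(c);
    findByAuthor(c.replies, author, found);
  }
  return found;
}

// Nesting depth of the reply tree (0 for no comments).
function depth(comments) {
  if (comments.length === 0) return 0;
  return 1 + Math.max(...comments.map((c) => depth(c.replies)));
}

const byJames = findByAuthor(post.comments, "James");
console.log(byJames.length, depth(post.comments)); // prints "1 2"
```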

   

Page 46: MongoDB Tokyo Design

Array of Ancestors

- Store all Ancestors of a node
  { _id: "a" }
  { _id: "b", thread: [ "a" ], replyTo: "a" }
  { _id: "c", thread: [ "a", "b" ], replyTo: "b" }
  { _id: "d", thread: [ "a", "b" ], replyTo: "b" }
  { _id: "e", thread: [ "a" ], replyTo: "a" }
  { _id: "f", thread: [ "a", "e" ], replyTo: "e" }

// find all messages whose thread includes "b"
> db.msg_tree.find({thread: "b"})

// find all direct replies to "b"
> db.msg_tree.find({replyTo: "b"})

// find all ancestors of "f":
> threads = db.msg_tree.findOne({_id: "f"}).thread
> db.msg_tree.find({_id: {$in: threads}})

(Diagram: tree rooted at A; B and E are children of A; C and D are children of B; F is a child of E.)
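The three queries on this slide can be checked against the sample messages in plain JavaScript. This is only a sketch over an in-memory array; the function names are assumptions, and each one corresponds to one of the shell queries.

```javascript
const msgTree = [
  { _id: "a" },
  { _id: "b", thread: ["a"], replyTo: "a" },
  { _id: "c", thread: ["a", "b"], replyTo: "b" },
  { _id: "d", thread: ["a", "b"], replyTo: "b" },
  { _id: "e", thread: ["a"], replyTo: "a" },
  { _id: "f", thread: ["a", "e"], replyTo: "e" },
];

const ids = (docs) => docs.map((d) => d._id);

// Messages that have `id` among their ancestors, i.e. {thread: id}.
function descendantsOf(id) {
  return msgTree.filter((m) => (m.thread || []).includes(id));
}

// Messages that reply directly to `id`, i.e. {replyTo: id}.
function directRepliesTo(id) {
  return msgTree.filter((m) => m.replyTo === id);
}

// Read the node's ancestor list, then fetch those messages ($in on _id).
function ancestorsOf(id) {
  const node = msgTree.find((m) => m._id === id);
  const thread = (node && node.thread) || [];
  return msgTree.filter((m) => thread.includes(m._id));
}

console.log(ids(descendantsOf("b")));   // [ 'c', 'd' ]
console.log(ids(directRepliesTo("a"))); // [ 'b', 'e' ]
console.log(ids(ancestorsOf("f")));     // [ 'a', 'e' ]
```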

Page 47: MongoDB Tokyo Design

Trees as Paths

Store hierarchy as a path expression
- Separate each node by a delimiter, e.g. “/”
- Use text search to find parts of a tree

{ comments: [
    { author: "Kyle", text: "initial post",
      path: "" },
    { author: "Jim", text: "jim’s comment",
      path: "jim" },
    { author: "Kyle", text: "Kyle’s reply to Jim",
      path: "jim/kyle" } ] }

// Find the conversations Jim was part of
// (path is nested inside the comments array, so use dot notation)
> db.posts.find({"comments.path": /^jim/i})
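A prefix match on the path selects a whole subtree. The plain-JavaScript sketch below applies the same idea to the sample comments. It goes one step beyond the slide's `/^jim/i` by requiring the prefix to end at a delimiter, so a hypothetical path such as "jimmy" would not match; the helper names are assumptions.

```javascript
const comments = [
  { author: "Kyle", text: "initial post", path: "" },
  { author: "Jim", text: "jim's comment", path: "jim" },
  { author: "Kyle", text: "Kyle's reply to Jim", path: "jim/kyle" },
];

// Escape characters that have special meaning in a regular expression.
function escapeRegExp(s) {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Comments at or below rootPath: the path starts with rootPath and the
// prefix is followed by "/" or by the end of the string.
function subtree(rootPath) {
  const pattern = new RegExp(`^${escapeRegExp(rootPath)}(/|$)`, "i");
  return comments.filter((c) => pattern.test(c.path));
}

// Depth of a node is the number of path segments (0 for the root).
function pathDepth(path) {
  return path === "" ? 0 : path.split("/").length;
}

console.log(subtree("jim").length, subtree("jim/kyle").length); // prints "2 1"
console.log(pathDepth("jim/kyle"));                              // prints 2
```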

Page 48: MongoDB Tokyo Design

Queue

• Need to maintain order and state
• Ensure that updates are atomic

db.jobs.save(
  { inprogress: false,
    priority: 1,
    ... });

// find highest priority job and mark as in-progress
job = db.jobs.findAndModify({
        query:  {inprogress: false},
        sort:   {priority: -1},
        update: {$set: {inprogress: true,
                        started: new Date()}},
        new: true})
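The claim step selects and updates in one operation, which is what keeps two workers from taking the same job. The plain-JavaScript sketch below mimics that step on an in-memory array. `claimNextJob` and the sample jobs are hypothetical, and the function running to completion without interleaving stands in for the atomicity the database provides.

```javascript
const jobs = [
  { _id: 1, inprogress: false, priority: 1 },
  { _id: 2, inprogress: false, priority: 5 },
  { _id: 3, inprogress: true, priority: 9 },
];

// Pick the highest-priority job that is not in progress, mark it,
// and return the updated job (like new: true). Returns null if none.
function claimNextJob(queue, now = new Date()) {
  const candidates = queue
    .filter((j) => !j.inprogress)                  // query
    .sort((a, b) => b.priority - a.priority);      // sort: priority descending
  if (candidates.length === 0) return null;
  const job = candidates[0];
  job.inprogress = true;                           // update: $set
  job.started = now;
  return job;
}

const first = claimNextJob(jobs);
const second = claimNextJob(jobs);
const third = claimNextJob(jobs);
console.log(first._id, second._id, third); // prints "2 1 null"
```

Job 3 is never returned because it was already in progress, even though it has the highest priority.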

Page 50: MongoDB Tokyo Design

Queue

{ inprogress: true,                               // updated
  priority: 1,
  started: ISODate("2011-09-18T09:56:06.298Z"),   // added
  ... }

Page 51: MongoDB Tokyo Design

Summary

Schema design is different in MongoDB

Basic data design principles stay the same

Focus on how the application manipulates data

Rapidly evolve schema to meet your requirements

Enjoy your new freedom, use it wisely :-)

Page 52: MongoDB Tokyo Design

@mongodb

conferences, appearances, and meetups
http://www.10gen.com/events

http://bit.ly/mongo> Facebook | Twitter | LinkedIn

http://linkd.in/joinmongo

download at mongodb.org
