10gen Mongodb Microsoftazure White Paper

Embed Size (px)

Citation preview

  • 7/29/2019 10gen Mongodb Microsoftazure White Paper

    1/8

    MongoDB on Windows Azure

    A 10gen White Paper

  • 7/29/2019 10gen Mongodb Microsoftazure White Paper

    2/8

    2

    MongoDB on Windows Azure brings the

    power of the leading NoSQL database

    to Microsofts exible, open, and

    scalable cloud.

    MongoDB is an open source, document-

    oriented database designed with scalability

    and developer agility in mind. Windows

    Azure is the cloud services operating sys-

    tem that provides the development, service

    hosting, and service management envi-

    ronment for the Azure Services Platform.

    Together, MongoDB and Windows Azure

    MongoDB on Windows Azure

    provide customers the tools to build limit-

    lessly scalable applications in the cloud.

    This paper begins with an overview of

    MongoDB. Next, we describe the two

    primary deployment options available on

    Microsofts cloud platform, Windows Azure

    Virtual Machines and Windows Azure Cloud

    Services. Finally, to help those evaluating

    deploying MongoDB on Windows Azure,

    we outline the pros and cons of the two

    deployment options available.

  • 7/29/2019 10gen Mongodb Microsoftazure White Paper

    3/8

    About MongoDBMongoDB is an open source, document-

    oriented database. MongoDB bridges the

    gap between key-value stores which are

    fast and scalable and relational databas-

    es which have rich functionality. Instead

    of storing data in tables and rows as one

    would with a relational database, MongoDB

    stores a binary form of JSON (BSON or

    binary JSON documents). An example of a

    document is shown in Figure 1.

    The document serves as the fundamental

    unit within MongoDB (like a row in an

    RDBMS); one can add elds (like a column

    in an RDBMS), as well as nested elds and

    embedded documents. Rather than impos-

    ing a at, rigid schema across an entire

    table, the schema is implicit in what elds

    are used in the documents. Thus, MongoDB

    allows developers to have variable schemas

    3

    {

    "_id": ObjectId("504e4dd43796b3da50183991"),

    "text": "Study Implicates Immune System in Parkinsons Disease Pathogenesis http://bit.ly/duhe4P",

    "source": "twitterfeed,

    "coordinates": null,

    "truncated": false,

    "entities": {

    "urls": [{

    "indices": [

    67,

    87],

    "url": "http://bit.ly/duhe4P",

    "expanded_url": null

    }],

    "hashtags": []

    },

    "retweeted": false,

    "place": null,

    "user": {

    "friends_count": 780,

    "created_at": "Fri Jan 08 17:40:11 +0000 2010", "description": "Latest medical news, articles, and features from Medscape Pathology.",

    "time_zone": "Eastern Time (US & Canada)",

    "url": "http://www.medscape.com/pathology",

    "screen_name": "MedscapePath",

    "utc_offset": -18000

    },

    "favorited": false,

    "in_reply_to_user_id": null,

    "id": NumberLong("22819397000")

    }

    Figure 1: Sample JSON Document

    across documents and to adapt schemas as

    their applications evolve.

    Unlike relational databases, MongoDB does

    not use SQL syntax. Rather, MongoDB has

    a query language based on JSON. It also

    has drivers for most modern languages,

    such as C#, Java, Ruby, Python, and many

    others. MongoDBs exible data model and

    support for modern programming languages

    simplify development and administration

    signicantly.

  • 7/29/2019 10gen Mongodb Microsoftazure White Paper

    4/8

    MongoDB ArchitectureMongoDBs core capabilities deliver reli-

    ability, high availability, high performance,

    and scalability.

    Replication through replica setsprovides for

    high availability and data safety. A replica

    setis comprised of one primary node and

    some number of secondary nodes (de-

    termined by the user). Figure 2 shows an

    example replica set, with one primary and

    two secondaries (a common deployment

    model). By default, the primary node takes

    all reads and writes from the application;

    the secondaries replicate asynchronously in

    the background. If the primary node goes

    down for any reason, one of the secondaries

    is automatically promoted to primary status

    and begins to take all reads and writes.

    Replica sets help protect applications fromhardware and data center-related down-

    time. Moreover, they make it easy for DBAs

    to conduct operational tasks, including

    software upgrades and hardware changes.

    Figure 2: Replica Sets with MongoDB

    4

    Secondary

    Secondary

    Application

    Read Write

    Asynchronous

    Replication

    Automatic

    Leader Election

    Primary

    Driver

    Figure 3: Sharding with MongoDB

    Shard A

    0...30

    Shard B

    31...60

    Shard C

    61...90

    Shard N

    n...n+30

    ...

    Horizontally Scalable

    Sharding enables users to scale horizon-

    tally as their data volumes grow and/or as

    demands on their data stores grow. A shard

    is a subset of the database, kind of like a

    partition of the data. In Figure 3, Shard A

    contains documents 1-30; Shard B contains

    documents 31-60; and so on. One can

    choose any key on which to shard the col-

    lection (e.g., user name), and MongoDB will

    automatically shard the data store based on

    this key. One can scale a database in-

    nitely using sharding by adding new nodes

    to a cluster. When a new node is added,

    MongoDB recognizes it and redistributes

    the data across the cluster. Because shard-

    ing distributes both the actual data and

    therefore the load (i.e., trafc), it enables

    horizontal scalability as well as high per-

    formance.

  • 7/29/2019 10gen Mongodb Microsoftazure White Paper

    5/8

    An overview of the MongoDB architecture is shown in Figure

    4. In a multi-shard environment, the application communi-cates with mongos, an intermediary router that directs reads

    and writes to the appropriate shard. Each shard is a replica

    set, providing scalability, availability, and performance to

    developers.

    MongoDB was built for the cloud. Cloud services like Windows

    Azure are therefore a natural t for MongoDB. By coupling

    MongoDBs easy-to-scale architecture and Azures elastic

    cloud capacity, users can quickly and easily build, scale, and

    manage their applications.

    About Windows Azure Servicesfor MongoDBWindows Azure is Microsofts suite of cloud services, providing

    developers on-demand compute and storage to create, host

    and manage scalable and available web applications through

    Microsoft data centers. When deploying MongoDB to Windows

    Azure, users can choose from two deployment options:

    Windows Azure Virtual Machines. Windows AzureVirtual Machines (VMs) is Microsofts Infrastructure-as-

    a-Service (IaaS) offering. Similar to Amazon Web Services

    EC2, Azure VMs give users access to elastic, on-demand

    virtual servers. Users can install Windows or Linux on a VM

    and congure it based on their own preferences or their

    apps specic needs. Users manage the VMs themselves,

    including scaling, installing security patches, and ongoing

    performance monitoring and management. Azure VMs give

    users a relatively signicant degree of control over their

    environments, but by the same token require users to take

    on the VM management. Note: This service is currently in

    preview (beta).

    Windows Azure Cloud Services. Windows Azure CloudServices (Worker Roles and Web Roles) is Microsofts

    PlatformasaService (PaaS) offering. Similar to Heroku,

    Worker Roles provide users with prebuilt, precongured

    instances of compute power. In contrast with Azure VMs,

    users do not have to congure or manage Azure Worker

    Roles. Windows Azure handles the deployment details

    from provisioning and load balancing to health monitoring

    for continuous availability. This can be helpful to some

    users who prefer not to manage their applications at the

    infrastructure level, though it restricts the level of control

    users have over their environments.

    5

    Application

    Replica Set A

    0...30

    Replica Set B

    31...60

    Replica Set C

    61...90

    Replica Set N

    n...n+30

    ...Secondary

    Secondary

    Primary

    Secondary

    Secondary

    Primary

    Secondary

    Secondary

    Primary

    Secondary

    Secondary

    Primary

    Driver

    mongos

    Figure 4: MongoDB Architecture

  • 7/29/2019 10gen Mongodb Microsoftazure White Paper

    6/8

    Understanding the Deployment OptionsGiven that MongoDB can be deployed on either Windows

    Azure Virtual Machines (IaaS) or Windows Azure Cloud Services

    (PaaS), it is important for users to consider the different capa-

    bilities and implementation details of each service to deter-

    mine which deployment model makes the most sense for their

    applications.

    Azure Virtual Machines

    BASIC SETUP

    After being granted access to the preview functionality for

    Azure Virtual Machines, users can launch an instance and

    install and congure MongoDB on it manually. Alternatively,

    users can use the recently released Windows Azure installer for

    MongoDB to set up a MongoDB replica set quickly and easily on

    Windows Azure VMs.

    The installer is built on top of Windows PowerShell and the

    Windows Azure command line tool. It contains a number of

    deployment scripts. The tool is designed to help users get

    single or multi-node MongoDB congurations up and running

    quickly. There are only two steps to installing and congur-

    ing a MongoDB replica set on Azure VMs. Note: the installer

    is designed to run on a users local machine (i.e., not directly

    on an Azure VM), and then to deploy output to Windows Azure

    VMs. To start , download the publish settings le. Next, run the

    installer from the command prompt.

    With Azure Virtual Machines, users can create their own VMs or

    they can create a VM instance from one of several pre-installed

    operating system congurations. Both Windows and Linux are

    supported on Azure Virtual Machines. To deploy MongoDB on

    Linux, visit the MongoDB wiki (wiki.mongodb.org) for step-by-

    step instructions.

    PROS AND CONS OF AZURE VIRTUAL MACHINES

    The pros and cons of deploying MongoDB on Azure Virtual Ma -

    chines are generally consistent with the considerations around

    using IaaS more broadly. Overall, Azure Virtual Machines allow

    users to ne-tune their deployments but by the same token

    require increased operational effort.

    The advantages of using MongoDB on Azure Virtual Machines

    are as follows:

    Increased Control. Users have more control over theirinfrastructural conguration relative to Azure CloudServices. For instance, they can install and congure

    services on the OS, dene policies, etc. This consideration

    may be important for enterprises that have regimented

    policies and processes for IT security and compliance.

    OS Choice. Users can use Windows or Linux.

    Azure Virtual Machines may not always be the right t for the

    following reasons:

    Increased Operational Effort. The increased control that Azure

    Virtual Machines provide comes with increase effort , as well.

    Users must dene and implement their own security measures,

    apply patches, and locate instances for fault tolerance. This

    consideration may be important for developers that lack expe-

    rience managing their own infrastructure or for companies that

    dont have the operational bandwidth to devote to managing

    this component of the stack.

    Beta. The Azure Virtual Machines service is still in

    Preview (beta).

    Azure Cloud Services

    BASIC SETUP

    Users can also deploy MongoDB on Azure Cloud Services. To

    do so, download the MongoDB Azure Worker Role package,

    which is a precongured Worker Role with MongoDB. When

    deployed, each replica set member runs as a separate Worker

    Role instance; MongoDB data les are stored in Azure Cloud

    Drives. For detailed instructions, visit the MongoDB wiki (wiki.

    mongodb.org).

    PROS AND CONS OF AZURE CLOUD SERVICES

    The pros and cons of running MongoDB on Azure Cloud

    Services are generally consistent with those of using PaaS in

    general, though there are some Azure-specic considerations.

    Overall, Windows Azure Cloud Services decreases the opera-

    tional burden on users but affords them less control from an

    infrastructure conguration standpoint. The advantages ofusing Azure Cloud Services are as follows:

    Lower Operational Effort . Microsoft manages OS updatesand security, decreasing the operational burden on the

    users.

    Built-in Fault Tolerance. When deploying multipleMongoDB worker role instances, Windows Azure

    automatically deploys the instances across multiple fault

    and update domains to guarantee better uptime.

    Secure by Default. Microsoft takes measures to ensure thatworker and web roles are secure. Endpoints on instances

    can be enabled for instance-to-instance communication

    without making them public. Thus, one can congure

    MongoDB to be secure by enabling it only for other roles in

    the same deployment.

    6

  • 7/29/2019 10gen Mongodb Microsoftazure White Paper

    7/8

    PROS CONS

    Initial

    Administrative

    Effort

    - Windows only

    - Fixed OS

    configuration

    IaaS

    Windows

    Azure Virtual

    Machines

    - Increased operational

    effort

    - Increased control

    - OS choice

    - Lower operational effort

    - Built-in fault tolerance

    - Secure by default

    Table 1: Pros and Cons Summary - Windows Azure Virtual Machines and

    Windows Azure Cloud Services

    By the same token, there are some aspects

    of Azure Cloud Services that may be consid-

    ered drawbacks:

    Windows Only. Worker Roles can only bedeployed with Windows; Linux is not an

    option. Fixed OS Confguration. Users cannotcongure the OS, and must therefore

    develop applications that run on the

    pre-dened machine congurations

    available.

    Table 1 summarizes the pros and cons of

    using Windows Azure Worker Roles and

    Windows Azure Virtual Machines.

    SummaryMongoDB was built for ease of use, scal-

    ability, availability, and performance, and

    its quickly becoming an attractive alterna-

    tive to relational databases. Windows Azure

    provides a exible cloud platform for host-

    ing MongoDB, with two deployment models

    to choose from. Developers and enterprises

    looking at deploying MongoDB on Windows

    Azure should consider the pros and cons

    discussed here when evaluating which

    option is most appropriate for them. We

    hope that this paper helps customers better

    understand these solutions, how they work,and how to assess them.

    To learn more about MongoDB and how

    to deploy it in the cloud, or to speak

    to a sales representative, please email

    [email protected].

    7

  • 7/29/2019 10gen Mongodb Microsoftazure White Paper

    8/8

    New York Palo Alto Washington, DC LondonDublin Barcelona Sydney

    US 866.237.8815 INTL +1 650.440.4474 [email protected]

    Copyright 2013 10gen, Inc. All Rights Reserved.