IMG Seminar, 17 January 2007
If Web Services are the Answer, What’s the Question?
Michael Parkin
Supervisor: John Brooke
Acknowledgments & Sources
• Thanks to Dean Kuo, Mark McKeown, Bruno Harbulot, Donal Fellows
• Lots of stuff taken from blogs and web articles
• All quotes are acknowledged and linked to where possible
Overview
• Grid computing & its requirements
• Introduction of architectural styles:
• Web Services Architecture (WSA) and WS-*
• Representational State Transfer (REST)
• Instant Messaging (IM)
• Evaluation of Web Services, REST and IM
• Using the requirements of Grid computing
• The question is?
Grid Computing
Grid Computing is...
... a new infrastructure that builds on today’s Internet and Web to enable and exploit large-scale sharing of resources within distributed groups.
“The Grid” Ian Foster, ClusterWorld Magazine (January 2004)
... coordinated resource sharing among dynamic collections of individuals, institutions, and resources.
“The Anatomy of the Grid” Ian Foster, et al (2001)
... internet-scale distributed computing.
“Distributed Computing Economics” Jim Gray [Microsoft Research]
Non-functional Requirements
• Scalability and interoperation
• Large/internet-scale dynamic sharing of resources
• We need to connect lots of diverse/heterogeneous entities together
• Pervasiveness
• ... through simple clients and consistent implementations
• Network efficiency
• ... this is a distributed system, after all
• Return to these later ...
Web Services Architecture & WS-*
Web Services Architecture
• WSA is just SOAP (structured XML) + WSDL
A Web service is a software system designed to support interoperable machine-to-machine interaction over a network. It has an interface described in a machine-processable format (WSDL). Other systems
interact with the Web service ... using SOAP messages.
Web Services Architecture, W3C
The only game in town
Anon (though attributed to many)
WSA Constraints
• A constraint is a condition that a solution must meet in order to be acceptable - a confinement of all possible solutions
• Lack of constraints on the WSA other than
• Use of SOAP + WSDL
• Service specific interfaces
• i.e. no restriction on interfaces or message representations
• Sender / receiver pattern
• “Effectively unconstrained” - all possible solutions are acceptable
[An Architecture’s] properties are created by the application of constraints
Roy Fielding’s PhD thesis
No constraints...
“The Sugar House” Richard Greaves, 2001
Web Services Standards
• Collectively known as Web Service stack or WS-*
• 50+ specifications covering aspects of reliability, security, transactions, etc.
• WS-* are constraints on the WSA
• Defined through standards bodies (OGF, W3C, OASIS, ECMA?) before being implemented
• No reference implementations
Web services standards compose together to provide interoperable protocols ... in loosely coupled systems. The specifications build
on top of the core XML and SOAP standards.
“Web Service Specifications” Microsoft
Grid & WS-*
• Since 2002 Grid computing has officially been implemented using WS-*
• First through Open Grid Services Infrastructure ...
• ... then through the WS-ResourceFramework (WS-RF) family (2004)
• WS-RF describes the relationship between WSA & WS-*
• As Web Services are stateless WS-Resource describes a stateful resource
• “Each message carries some information that identifies the [stateful] resource behind the service which should be the logical receiver of the message” Savas Parastatidis [Microsoft]
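The WS-Resource idea above can be illustrated with a minimal sketch: the service itself holds no conversation state, and each message names the stateful resource it is meant for. The `Counter` class, the `RESOURCES` table and the message fields are hypothetical names chosen for illustration; they are not taken from the WS-RF specifications.

```python
# Hypothetical sketch: a stateless service that finds the stateful resource
# to act on from an identifier carried inside each message.

class Counter:
    """A stateful resource living behind the service."""
    def __init__(self):
        self.value = 0

# Resources are kept outside the service logic and looked up by identifier.
RESOURCES = {"counter-1": Counter(), "counter-2": Counter()}

def handle(message):
    """Process one message; the message, not the service, says which resource is meant."""
    resource = RESOURCES[message["resource_id"]]
    if message["operation"] == "increment":
        resource.value += 1
    return resource.value

first = handle({"resource_id": "counter-1", "operation": "increment"})
second = handle({"resource_id": "counter-1", "operation": "increment"})
other = handle({"resource_id": "counter-2", "operation": "read"})
```

Two messages naming `counter-1` act on the same state, while `counter-2` is untouched.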
The WS-RF Family
• WS-Resource
• WS-ResourceProperty
• WS-ResourceLifetime
• WS-BaseFaults
• WS-RenewableReferences
• WS-ServiceGroup
• WS-Notification
REST Architecture
REST
• REpresentational State Transfer
• The architecture of the Web introduced in Roy Fielding’s PhD. thesis
• Not a standard, but it uses common standards
• Best known implementation of REST is HTTP
• Resource - not service - orientation
[Diagram: a Client fetches a representation of a Resource; a Representation is returned]
Service vs Resource Orientation
• The resource is the central abstraction - not the service

JobManagementService
+ ID submitJob (u_id: ID, j: Job)
+ void cancelJob (j_id: ID)
+ Job getJobDetail (j_id: ID)
+ Job[] getJobs (u_id: ID)

UserManagementService
+ ID createUser (u: User)
+ User getUser (u_id: ID)
+ User[] getUsers ()

<<interface>> Resource
GET
PUT
POST
DELETE

/jobs
GET - list all jobs
PUT - unused
POST - add new job
DELETE - unused

/job/[id]
GET - get job details
PUT - update job
POST - unused
DELETE - cancel job

/users
GET - list all users
PUT - unused
POST - create user
DELETE - unused

/user/[id]
GET - return user
PUT - update user
POST - unused
DELETE - unused

/user/[id]/jobs
GET - return user's jobs
PUT -
POST - create new job
DELETE -

Adapted from: “REST - The Better Web Services Model”, Stefan Tilkov
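As a rough sketch of resource orientation, the mapping on this slide can be written as a single routing table in which every resource answers the same four verbs. The `ROUTES` table and the `describe` function are illustrative names, not part of any framework. Treating the blank PUT and DELETE entries of /user/[id]/jobs as "unused" is an assumption.

```python
import re

# Route table transcribed from the slide: every resource exposes the same
# uniform interface, and only the meaning of each verb differs per resource.
ROUTES = [
    (r"^/jobs$", {"GET": "list all jobs", "POST": "add new job"}),
    (r"^/job/[^/]+$", {"GET": "get job details", "PUT": "update job",
                       "DELETE": "cancel job"}),
    (r"^/users$", {"GET": "list all users", "POST": "create user"}),
    (r"^/user/[^/]+$", {"GET": "return user", "PUT": "update user"}),
    (r"^/user/[^/]+/jobs$", {"GET": "return user's jobs",
                             "POST": "create new job"}),
]

def describe(method, path):
    """Return what a verb means on a resource, or 'unused' if it is not offered."""
    for pattern, verbs in ROUTES:
        if re.match(pattern, path):
            # Assumption: verbs not listed for a resource are treated as unused.
            return verbs.get(method, "unused")
    return "no such resource"
```

For example, `describe("DELETE", "/job/42")` gives "cancel job", replacing the service-specific `cancelJob(j_id)` operation with a verb that every client already understands.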
REST Constraints
• Highly constrained architecture (cf. WSA)
• Self-descriptive messages
• ... stateless interaction
• ... resource representation
• “Hypermedia as engine of client state”
• Resource identification
• Name things with unique URIs
• Uniform interface
Architecture with Constraints
REST and the Grid
• REST style is not used by the Grid
• Why can’t a Grid Virtual Organisation (VO) be a REST resource?
• Find VO member:
• Fetch /myVO/people/fred
• Returns representation of Fred in format requested
• If no representation returned, Fred’s not a member
• Used for role-based access control as part of my work
• Very lightweight and interoperable ...
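The membership lookup above might look something like the following sketch. It uses an in-memory dictionary in place of real HTTP resources, so the `VO_RESOURCES` table, the `fetch` helper, the role names and the two representation formats are all assumptions made for illustration.

```python
import json

# Hypothetical in-memory VO; in a real deployment these would be HTTP resources.
VO_RESOURCES = {
    "/myVO/people/fred": {"name": "Fred", "roles": ["submitter"]},
}

def fetch(path, accept="application/json"):
    """Return (status, body) in the spirit of an HTTP GET with an Accept header."""
    person = VO_RESOURCES.get(path)
    if person is None:
        return 404, None  # no representation: not a member
    if accept == "text/plain":
        return 200, f"{person['name']}: {', '.join(person['roles'])}"
    return 200, json.dumps(person)

def is_member(vo, name):
    """Membership is simply whether a representation comes back."""
    status, _ = fetch(f"/{vo}/people/{name}")
    return status == 200

def has_role(vo, name, role):
    """Role-based access check built on the same single fetch."""
    status, body = fetch(f"/{vo}/people/{name}")
    return status == 200 and role in json.loads(body)["roles"]
```

Here `is_member("myVO", "fred")` is true and `is_member("myVO", "wilma")` is false, with no membership-specific operation needed beyond a fetch.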
Instant Messaging Architecture
Instant Messaging
• Hub and spoke/email-type infrastructure
• Not just text: Google’s Jingle library can stream VoIP and video
IM Constraints
• Asynchronous, routed messaging
• Open connection to server is used as bi-directional message pipe
• Document-style interaction
• Complexity is in the message - not the interface (cf. REST self-describing message)
• Unique naming of servers and clients
• Number (ICQ) or email style (name@domain)
• Presence of clients can be determined
• Clients may have multiple sessions with one or more other clients and multi-user chats
IMG Seminar17 January 2007
Wider Constraints
Eixample, Barcelona
Just for Multi-User Chat?
• Multi User Chat can be used as a messaging ‘bus’
• ‘Plug-in’ services across domains
• Each client listens for a message it’s interested in
• Ignores other messages
• A lightweight collaborative environment
• Implemented using XMPP
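The bus idea can be sketched without any XMPP library: a room broadcasts every message to all occupants, and each client acts only on the message types it cares about. The `Room` and `Client` classes and the message types below are hypothetical, in-memory stand-ins for a real multi-user chat.

```python
class Room:
    """A multi-user chat room used as a broadcast bus."""
    def __init__(self):
        self.occupants = []

    def join(self, client):
        self.occupants.append(client)

    def broadcast(self, message):
        # Every occupant sees every message, as in a chat room.
        for client in self.occupants:
            client.receive(message)

class Client:
    """A plug-in service that listens only for the messages it is interested in."""
    def __init__(self, name, interests):
        self.name = name
        self.interests = set(interests)
        self.handled = []

    def receive(self, message):
        if message["type"] in self.interests:
            self.handled.append(message)
        # Anything else is silently ignored.

room = Room()
scheduler = Client("scheduler", {"job-submitted"})
logger = Client("logger", {"job-submitted", "job-finished"})
room.join(scheduler)
room.join(logger)
room.broadcast({"type": "job-submitted", "id": 1})
room.broadcast({"type": "job-finished", "id": 1})
```

A new service is plugged in by joining the room; existing clients need no changes.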
Grid Computing Requirements
• Scalability
• Interoperation
• Pervasiveness / simple clients
• Network efficiency
• How does each architecture do?
Scalability / Interoperability
WSA Scalability
• Remember - no constraints
• ... therefore an infinite number of ways to do things...
• ... which promotes coupling
• “Web services are effectively unconstrained, and therefore much more tightly coupled.” Mark Baker [W3C Working Groups]
• Loose coupling is better than tight coupling for scalability
WS-* Scalability / Interoperability
• Requires each service to be predictable and implement the same set of specifications in the same way
• Specifications MUST be unambiguous
• Interoperability through consistency of implementations
• Most WS-* specifications aren’t unambiguous ...
• ... and sometimes they’re deliberately ambiguous - e.g. WS-BA
• There’s no reference implementation to code to
• No consistency, therefore no reliability
WS-* Scalability / Interoperability
This document does a disservice to the community; it is ambiguous in many places, incomplete in others, and riddled with errors throughout. Indeed any company that implemented it would
end up with an unreliable system.
Werner Vogels [CTO, Amazon] on WS-Reliability
?
“Untitled” Richard Greaves, 2001
WS-Interoperability
Conformance to the Web services Addressing Test Suite does not by itself enable a party to claim conformance with the Web
Services Addressing specification.
Disclaimer from the WS-Addressing test suite
WS-Incoherence
• WS-Reliability or WS-ReliableMessaging ?
• WS-Management or WS-DistributedManagement ?
• WS-Eventing or WS-Notification ?
• WS-Context or WS-Session ?
• WS-RF or WS-Transfer ?
• Multiple ways of doing things
• e.g. SOAP vs Literal encoding, Document vs RPC style
• e.g. “4 ways to send SOAP attachments (DIME, MTOM, SOAP w. Attachments, WS-I Attachments)” Dan Diephouse [XFire author]
WS-* Stack Criticism
No matter how hard I try, I still think the WS-* stack is bloated, opaque, and insanely complex. I think it’s going to be hard to understand, hard to
implement, hard to interoperate, and hard to secure.
Tim Bray [Director of Web Technologies, Sun Microsystems]
Much of [WS-* is] recently invented, untested and unproven in the real world.
Sean McGrath [CTO, Propylon]
“If you don’t like it, don’t use it”
• OK, if within the same administrative domain - you are ‘God’
• The sixth fallacy of distributed computing is that “there is a single administrator” Peter Deutsch [author of the definitive Smalltalk implementation] via Mark Baker
Are there too many specs? If you think so, then there are. So don't use them all.
You don't have to use any of them.
Matt Powell [Web Platform and Tools Marketing Team, Microsoft]
Across Administrative Domains
• Grid computing is about creating virtual organisations across organisational boundaries
• “[Grid computing] is distributed computing across multiple administrative domains” Dave Snelling (via Mark McKeown)
• You can’t control who implements what WS-* specifications ...
• ... or how they interpret and implement them ...
• ... or which version they implement
• Tighter - not looser - coupling is required to solve these problems
WS-Tooling Support
Tools are being created by people everywhere to make it so you can just indicate the capabilities you need and the rest will be done for you.
Matt Powell
Each toolkit vendor is likely to interpret the specification somewhat differently.
Werner Vogels [CTO, Amazon]
REST Scalability & Interoperability
• The best known implementation of REST is HTTP
• The Web:
• All clients can immediately interact with a resource once they know its address, because of the uniform interface and operation semantics
I look at Google and Amazon and EBay and Salesforce and see them doing tens of millions of transactions a day involving pumping XML back and forth over HTTP, and I can’t help noticing that they don’t seem to need
much WS-apparatus.
Tim Bray
Even Looser Coupling
• Loose coupling in time is also possible
• HTTP has asynchronous communications built in ...
• GET empty job representation
• POST completed job representation to asynchronous job submission resource (URL contained in empty job)
• Server returns a ‘202 Accepted’ message and either:
1. With “Reply-To” request header: a notification is sent to that URL when the job is complete
2. Without “Reply-To” request header: a “Location” response header is returned with a URL with the status/result
• ... in a standard way
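The server side of this exchange could be sketched as below. The “Reply-To” request header is used exactly as the slide describes it. The `submit_job` function, the `PENDING_NOTIFICATIONS` table, the example URLs and the shape of the status URL are assumptions for illustration, and no real HTTP server is involved.

```python
import itertools

_job_ids = itertools.count(1)
PENDING_NOTIFICATIONS = {}  # job id -> URL to notify when the job completes

def submit_job(headers, body):
    """Handle a POST to the asynchronous submission resource.

    Returns (status, response_headers). Both branches answer 202 Accepted.
    """
    job_id = next(_job_ids)
    reply_to = headers.get("Reply-To")
    if reply_to:
        # Case 1: remember where to send the completion notification.
        PENDING_NOTIFICATIONS[job_id] = reply_to
        return 202, {}
    # Case 2: tell the client where to look for the status/result.
    # The URL layout here is hypothetical.
    return 202, {"Location": f"/job/{job_id}/status"}

status1, headers1 = submit_job({"Reply-To": "http://client.example/notify"}, "<job/>")
status2, headers2 = submit_job({}, "<job/>")
```

In both cases the client is released immediately, which is the loose coupling in time the slide refers to.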
IM Scalability & Interoperability
• Considering XMPP, an open XML message based IM implementation...
• Scales very well
• Jabber has 40-50 million users (source: Wikipedia)
• Each XMPP server can support ~30,000-50,000 concurrent users
• Interoperability seems to be no problem
• Lots of heterogeneous clients connecting to lots of heterogeneous servers
• Gateways to ‘closed’ IM implementations
• AIM, ICQ, etc.
• Well-defined and accepted message semantics
Network Efficiency
Network Efficiency - A Note
• In a network-based application it is often the communications which are the bottleneck
• Network efficiency is generally increased through the use of network intermediaries
• Caches, proxy servers, etc.
• Reduce the ‘distance’ a request has to go
• Relieve stress on the server/application
• An intermediary can only make a decision when a client’s request shows, in the clear, the targeted resource, and the method requested is understood
Adapted from: “REST”, Roger L. Costello
WSA Network Efficiency
• Most (all?) Web Services use SOAP/HTTP
• Can they re-use Web caches to increase efficiency?
• A SOAP message is always an HTTP POST operation
• A cache cannot know from the HTTP method whether the client is doing a read, write, update or delete
• A SOAP URI is always to the SOAP server, not to the actual target
• A cache server cannot know from the URI what resource is being requested
[Diagram: a Client sends a Message (operation = highFive()) through a Web Cache to an EPR. The Web Cache: “I don't know what operation is being requested. I don't know which resource is being requested. I must forward the request.”]
Reuse or Abuse?
[Web Services are] on the web but they aren’t of the web. They don’t use any of the web’s features, or interact with anything else
on the web. They just use HTTP as a transport protocol. They could just as easily run over TCP and get better performance.
Leonard Richardson [co-author “RESTful Web Services”]
REST Network Efficiency
• Uniform, clear semantics of the interface’s operations
• GET always means “retrieve the resource identified”
• PUT always means “replace resource identified with this new one”
• POST always means “submit this data for processing”
• DELETE always means “delete the resource identified”
• URIs can be compared
• Allows optimization of the network (i.e. caching, comparison of metainformation)
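A toy intermediary shows why uniform semantics matter for caching. The sketch below is not a real HTTP cache (it ignores freshness, headers and validation). The `CachingProxy` class, the `origin` function and the example URIs are illustrative names. It can reuse a stored response only when the method is GET and the URI names the resource; anything sent as POST to a single endpoint has to be forwarded every time.

```python
class CachingProxy:
    """A toy intermediary that decides using only the method and the URI."""
    def __init__(self, origin):
        self.origin = origin  # callable(method, uri, body) -> response
        self.store = {}       # uri -> cached GET response
        self.forwarded = 0    # how many requests reached the origin

    def request(self, method, uri, body=None):
        if method == "GET" and uri in self.store:
            return self.store[uri]  # safe: GET only retrieves
        self.forwarded += 1
        response = self.origin(method, uri, body)
        if method == "GET":
            self.store[uri] = response
        else:
            # PUT/POST/DELETE may change the resource: drop any cached copy.
            self.store.pop(uri, None)
        return response

def origin(method, uri, body):
    """Stand-in for the origin server."""
    return f"{method} {uri}"

proxy = CachingProxy(origin)
proxy.request("GET", "/job/1")
proxy.request("GET", "/job/1")  # answered by the proxy
rest_forwarded = proxy.forwarded

# A read expressed as a POST to one endpoint looks like a write to the proxy.
proxy.request("POST", "/soap", "<getJob id='1'/>")
proxy.request("POST", "/soap", "<getJob id='1'/>")
soap_forwarded = proxy.forwarded - rest_forwarded
```

Two identical GETs cost one trip to the origin, while two identical POST-wrapped reads cost two.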
IM Network Efficiency
• Connection overheads removed (client only connects once to server)
• Login to server once
• Open pipe is used as an asynchronous bi-directional messaging bus
• Only send messages once
• Stored on server if destination is unavailable
• Guaranteed in-order delivery when client reconnects
• No need to poll - response comes asynchronously when ready
• But, all messages are routed via server - not the most efficient route
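The store-and-forward behaviour described above can be sketched with a queue per offline recipient. The `Server` class and the address used are hypothetical, and this models only the ordering guarantee, not a real IM protocol.

```python
from collections import defaultdict, deque

class Server:
    """A toy IM server that holds messages for clients that are offline."""
    def __init__(self):
        self.online = {}                         # address -> delivered messages
        self.offline_queue = defaultdict(deque)  # address -> waiting messages

    def connect(self, address):
        """Open the client's pipe and flush stored messages in arrival order."""
        inbox = self.online.setdefault(address, [])
        queue = self.offline_queue.pop(address, deque())
        while queue:
            inbox.append(queue.popleft())
        return inbox

    def disconnect(self, address):
        self.online.pop(address, None)

    def send(self, to, message):
        """The sender sends once; the server either delivers or stores."""
        if to in self.online:
            self.online[to].append(message)
        else:
            self.offline_queue[to].append(message)

server = Server()
server.send("worker@grid.example", "job 1")  # recipient offline: stored
server.send("worker@grid.example", "job 2")
inbox = server.connect("worker@grid.example")
server.send("worker@grid.example", "job 3")  # recipient online: pushed, no polling
```

The sender never retries or polls, and the recipient sees the messages in the order they were sent.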
Pervasiveness
pervade, v. “To spread, extend, diffuse; to be present and apparent throughout.” (Source: OED)
WS-* Pervasiveness
• Complexity of WS-* limits overall uptake of technology
• Clients are service specific
• Clients must be ‘active’
• ... no information about the possible state transitions for the client in SOAP messages
• ... must process each message even if content has not changed
• ... therefore client more complex
• Stack overhead unsuitable for restricted/lightweight clients
REST / IM Pervasiveness
• REST: simpler clients than WS clients, as all possible state transitions are contained in the returned message through hypermedia
• “REST is particularly useful for limited-profile devices such as PDAs and mobile phones” Sun Microsystems
• IM: more complex clients than REST, but simpler than WS-*
• New York Times: “... instant messaging is used in more than 80 percent of corporations”
• Number of IM accounts in 2006: 995 million (Source: Radicati Group)
Evaluation
        Client Simplicity   Scalability   Interoperation   Network Efficiency
WS-*    -                   -             --               -
REST    ++                  ++            ++               +
IM      +                   +/++          +/++             +
Finally ... What is the question?
• If Web Services are the answer:
• The question should be:
What foundations do we choose for a limited-scale, tightly-coupled, implementation-dependent Grid?
Or...
What foundations do we choose for a massively-scalable, loosely-coupled, implementation-independent, consistent Grid?
No architectural constraints? vs Architectural constraints?
Summary
• Grid computing is based on WS-*
• WS-* is widely acknowledged to be complex
• Across administrative domains you can’t control who implements what and how they implement it
• Other architectures and technologies that meet the Grid’s requirements should be considered
• These have been proven to be interoperable and scalable - the ultimate goal of the Grid