Friday, September 18, 2009

Book Review: Effective REST Services via .NET For .NET Framework 3.5

I’m currently reading Kenn Scribner and Scott Seely’s book Effective REST Services via .NET For .NET Framework 3.5. This is a great book which gives a thorough background on the history of REST, quoting often from Roy Fielding’s Ph.D. dissertation on the subject. The authors do not treat the subject with “kid gloves” or “framework gloves”. Instead, they explain the HTTP protocol and its purpose, benefits, and drawbacks.

Warning: I’m not really that good at giving book reviews for a general audience, and I feel like there’s already plenty of information on the web about all these subjects, but for my own personal benefit it helps to summarize subjects in my own words to clarify them. What works best for me is to review the table of contents and then give my best shot at summarizing what I’ve read in brief paragraphs using examples. This misses lots of details, of course, and some of what I remember won’t even come from the book.

For me, this is crucial research material for two projects I am working on. One of these is currently public and is a simple RESTful gaming platform. You can see the progress so far on that at http://apps.ultravioletconsulting.com/Skookzee.

So, don’t mistake this as a book review for anyone other than myself, but I hope you enjoy it anyway.

Chapter Summaries

1: RESTful Systems: Back to the Future

 

This chapter discusses how, early on, web “services” consisted simply of web servers and URIs that delivered chunks of data. Sometimes this was done through CGI processing that responded to each incoming request, but eventually most web sites became simply file servers. Later, SOAP-based services became popular, but SOAP services are intrinsically Remote Procedure Call (RPC) centric. The problem with this is that SOAP services typically encapsulate both NOUNS and VERBS (or Resources and Actions, or Objects and Methods; pick your metaphor) behind a single URI.

You might ask why this is a problem. One reason is that HTTP and URIs were designed, as Roy Fielding describes in his dissertation, to work well in a distributed network and to enable distributed caching. Because SOAP specifically, and RPC in general, couple an invocation-time command to a remote resource, there is no architectural support for caching the results. Indeed, caching a call to a SOAP web service is something each service would have to implement entirely on its own.
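To make the caching point concrete for myself, here is a minimal Python sketch (mine, not the book’s). The `TinyCache` class, the `origin` function, and the `/posts/42` and `/service` paths are all made up for illustration, and the sketch ignores everything a real HTTP cache does with expiry and validation headers. It only shows that an intermediary can reuse a GET response keyed by URI, while it has no way to reuse an RPC call tunneled through POST.

```python
from typing import Callable, Dict, Optional

# Hypothetical origin server: takes (method, uri, body), returns a response body.
Origin = Callable[[str, str, Optional[str]], str]


class TinyCache:
    """A toy intermediary that only understands the generic HTTP interface."""

    def __init__(self, origin: Origin) -> None:
        self.origin = origin
        self.store: Dict[str, str] = {}
        self.origin_hits = 0

    def request(self, method: str, uri: str, body: Optional[str] = None) -> str:
        # GET is defined as safe, so a stored response may be reused for the same URI.
        if method == "GET" and uri in self.store:
            return self.store[uri]
        self.origin_hits += 1
        response = self.origin(method, uri, body)
        if method == "GET":
            self.store[uri] = response
        # A POST body is opaque to the cache: it cannot tell a read-only call
        # from a destructive one, so it never reuses a stored response for it.
        return response


def origin(method: str, uri: str, body: Optional[str]) -> str:
    """Stand-in for the real server (assumption for this sketch)."""
    return f"response to {method} {uri}"


cache = TinyCache(origin)

# RESTful style: the URI names the resource, so the second GET is a cache hit.
cache.request("GET", "/posts/42")
cache.request("GET", "/posts/42")
restful_hits = cache.origin_hits

# RPC style: every call is a POST to one endpoint, so both calls reach the origin.
cache.request("POST", "/service", "<GetPost id='42'/>")
cache.request("POST", "/service", "<GetPost id='42'/>")
rpc_hits = cache.origin_hits - restful_hits
```

The two GET requests cost one trip to the origin, while the two identical POST calls cost two, even though the second one asked for exactly the same thing.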

Conversely, the web was actually designed with a few key principles in mind to enable a distributed, cacheable structure of resources. First of all, URI stands for Uniform Resource Identifier (older documents say Universal), and a resource is a noun; it’s not an action or a method call. SOAP services are about exposing a single noun, the service entry point, and then encapsulating any number of nouns and verbs behind that one single noun.

So, in a RESTful system, each URI refers to a single, unique representation of a remote resource. For example, http://agilefromthegroundup.blogspot.com/2009/09/done-features-equals-velocity.html refers specifically to the HTML representation of my post about development velocity and tracking. There is, however, another URI which refers to this resource in another way: http://agilefromthegroundup.blogspot.com/feeds/posts/default/4352538909387142862. If you click this link you will get the content of that same blog post, but as XML inside of an “<entry>” element.

A good depiction of this distinction is found in the document Architecture of The World Wide Web on the W3C web site:

[Image: diagram from Architecture of the World Wide Web]

This becomes important because of REST’s concept of a single, generic interface, or a limited number of verbs/actions/methods. In terms of HTTP this turns out to be the standard set of HTTP methods: GET, PUT, POST, DELETE, HEAD, OPTIONS. Each of these methods must behave in a consistent, expected way when combined with a requested URI.

Another good reference, for quicker reading, is the summary of the web architecture document. This gives a list of prescriptive good practices for design.

For example, assuming I were authenticated and that Blogger supported this ability (I’m not sure whether it does), issuing a request with each of the common HTTP methods above against the single URI http://agilefromthegroundup.blogspot.com/feeds/posts/default/4352538909387142862 would produce the following results:

| Method | Entity Body Contents | Expected Result |
| --- | --- | --- |
| GET | Blank | Returns the HTTP headers for, and the current representation of, the resource, in this case an XML document. It does not MODIFY the resource in any way specified by the request. |
| HEAD | Blank | Returns only the HTTP headers associated with this resource, but not the XML document itself. |
| OPTIONS | Blank | Returns the list of HTTP methods this resource supports, such as GET, PUT, POST, DELETE. |
| PUT | A complete replacement XML <entry> document | A response code indicating whether the replacement was accepted. |
| POST | Arbitrary content, perhaps a string of content intended to be added as a comment to the blog post | A response code indicating the result, and possibly the URI of a newly created, subordinate resource, such as a direct URI to the comment added to the entry. |
| DELETE | Blank | A response code indicating whether the resource was deleted. |
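The behavior in the table above can be sketched as one small dispatch function. This is my own illustration in Python, not code from the book and certainly not how Blogger works: the `handle` function, the in-memory `resources` dictionary, the `/posts/1` path, the `application/xml` content type, and the specific status codes chosen are all assumptions.

```python
from typing import Dict, Optional, Tuple

# Hypothetical in-memory store: URI -> current representation (an XML string).
resources: Dict[str, str] = {"/posts/1": "<entry>first draft</entry>"}
ALLOWED = "GET, HEAD, OPTIONS, PUT, POST, DELETE"

Response = Tuple[int, Dict[str, str], str]  # (status code, headers, entity body)


def handle(method: str, uri: str, body: Optional[str] = None) -> Response:
    """Apply one of the generic HTTP methods to the resource named by uri."""
    if method == "OPTIONS":
        # Report which methods the resource supports.
        return 200, {"Allow": ALLOWED}, ""
    if method == "PUT":
        # Replace the whole representation (or create it at this URI).
        status = 200 if uri in resources else 201
        resources[uri] = body or ""
        return status, {}, ""
    if uri not in resources:
        return 404, {}, ""
    if method == "GET":
        # Headers plus the current representation; nothing is modified.
        return 200, {"Content-Type": "application/xml"}, resources[uri]
    if method == "HEAD":
        # Same headers as GET, but no entity body.
        return 200, {"Content-Type": "application/xml"}, ""
    if method == "POST":
        # Create a subordinate resource (a comment) and report its URI.
        count = sum(1 for key in resources if key.startswith(uri + "/comments/"))
        child = f"{uri}/comments/{count + 1}"
        resources[child] = body or ""
        return 201, {"Location": child}, ""
    if method == "DELETE":
        del resources[uri]
        return 204, {}, ""
    return 405, {"Allow": ALLOWED}, ""
```

Calling `handle("POST", "/posts/1", "Nice post!")` adds a comment and returns its new URI in a `Location` header, while `handle("GET", "/posts/1")` can be repeated as often as you like without changing anything.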

There is much more to be said about this chapter, but I leave it to you to read it for yourself and enjoy what you learn.

2: The Hypertext Transfer Protocol and the Universal Resource Identifier

This chapter goes into more detail about the HTTP verbs and the standard response codes and header values, then discusses much more about specific HTTP and REST anti-patterns. These are, broadly:

| Anti-Pattern Name | Description | Real-World Examples |
| --- | --- | --- |
| GET Tunneling | Specifying method information to the server via GET, which is NOT supposed to modify resources in response to client-specified commands (safety) | Flickr used to have GET methods with commands like “delete” embedded in them. Google’s look-ahead pre-fetch tool caused HAVOC on many users’ accounts as a result! |
| POST Tunneling | Specifying method information inside of the POST entity body | SOAP-based RPC services are a wholesale example of this. They embed arbitrary command names inside the envelope. |
| Misused Content-Types | | |
| Misusing Status Codes | | |
| Misusing Cache | | |
| Cookies | | |
| Lack of Hypermedia Support | | |
| Lack of Self-Description | | |
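To see why GET tunneling bites, here is a small Python sketch of my own. The `photos.example` URLs, the `action=delete` query parameter, and the `tunneled_get` and `prefetch` functions are invented for illustration and are not the actual Flickr or Google interfaces. The prefetcher does nothing wrong by HTTP’s rules; it simply assumes that GET is safe.

```python
from typing import Dict, List
from urllib.parse import parse_qs, urlsplit

# Hypothetical photo store: id -> file name.
photos: Dict[str, str] = {"1": "sunset.jpg", "2": "harbor.jpg"}


def tunneled_get(url: str) -> str:
    """Anti-pattern: the real command hides in the query string of a GET."""
    query = parse_qs(urlsplit(url).query)
    if query.get("action") == ["delete"]:
        photos.pop(query["id"][0], None)
        return "deleted"
    return "ok"


def prefetch(links: List[str]) -> None:
    """A naive prefetcher: follows every link, trusting that GET is safe."""
    for link in links:
        tunneled_get(link)


# Links found on a hypothetical "my photos" page.
page_links = [
    "http://photos.example/view?id=1",
    "http://photos.example/photo?action=delete&id=1",
    "http://photos.example/photo?action=delete&id=2",
]
prefetch(page_links)
remaining = dict(photos)  # empty: both photos are gone and nobody clicked delete
```

Had the delete been exposed as a DELETE request against the photo’s own URI, a tool that only issues GET requests could never have triggered it.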
 

The concept of URI and Addressability is covered thoroughly. This is important for too many reasons to get into here, but suffice it to say that for Search-Engine-Optimization (googleiciousness / bingability) a URI should identify a single resource unambiguously so that search engine indexes can link to it.

Aside from this there is a wide range of computer science concepts wrapped up inside the old trusty URI.

For example, when you look at the URI http://agilefromthegroundup.blogspot.com/feeds/posts/default/4352538909387142862, you actually have a specification that tells a user agent how to attempt fetching the resource (the HTTP protocol), where to start looking for it (agilefromthegroundup.blogspot.com), and where within that space to look further (/feeds/posts/ etc.). What you also have is a contract that is late-bound by definition. The intent of the URI is to provide, of course, a Uniform Resource Identifier, which specifies NOTHING about what actually resides behind that identifier. What resides behind it is completely under the control of the server. It can be absolutely anything. It just so happens to usually be HTML, but of course it can be XML, PDF, TXT, etc.
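The three parts described above can be pulled out with Python’s standard `urllib.parse.urlsplit`. This is just a small sketch to make the breakdown visible; the variable names are my own.

```python
from urllib.parse import urlsplit

uri = "http://agilefromthegroundup.blogspot.com/feeds/posts/default/4352538909387142862"
parts = urlsplit(uri)

scheme = parts.scheme  # how to attempt fetching the resource
host = parts.netloc    # where to start looking for it
# Where within that space to look further, one path segment at a time.
path_segments = [segment for segment in parts.path.split("/") if segment]

print(scheme)
print(host)
print(path_segments)
```

Notice that nothing in these parts says whether the server will answer with HTML, XML, or anything else. That is only known once the server responds, which is the late binding mentioned above.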

3: Desktop Client Operations
4: Web Client Operations
5: IIS and ASP.NET Internals and Instrumentation
