rajsingh.org blog

the geoweb, interoperability, OGC, and random rants
August 10th, 2007

REST has been a hot topic this year in the geo world. There’s a discussion group, a geographic data server, many blog posts, and email discussions. I’ve been mulling over what this means to OGC over the last couple months, reading RESTful Web Services, and discussing with the various advocates around the community. After all this, I think I know what’s going on, but I don’t think there’s any one clear explanation (despite some nice pieces of the puzzle here and here) available, and there has certainly been little effort to analyze the REST architecture in relation to geographic information systems theory, so that’s what I’ll try to do now.

At its very core, a REST architecture is centered on gaining access to a generally indivisible piece of information, which is generally called a resource. A REST system expects resources to exist at standard, unchanging, URLs (everybody knows cool URLs don’t change). In the case of Atom–the most cited example–this piece of information is the <entry>. Sure there are other resource types, but their primary purpose is to help you find and understand the information resources.

Right away we have a culture clash when it comes to GIS (geographic information systems). This crowd has been trained in geography departments to think of the data set as the primary piece of information. A data set is often called a layer or coverage, and represents some real-world phenomenon like vegetation or rainfall or census tracts. The points, lines, polygons or pixels that comprise the data set are of secondary importance to the concept of the data set itself. In fact, it’s considered un-cool to think of the shapes (or pixels in the case of imagery) that make up the data set as concrete objects at all. Rather, they are useful proxies for the real objects of concern. For example, you can’t really capture rainfall in a database because it’s not practical to measure every drop of rain. We estimate and approximate and end up storing some model of rainfall in the database, but we don’t ever have the temerity to say we are capturing the one and only representation of rainfall.

Now a programmer might respond, “screw your ivory tower theoretical nonsense. You’ve got information in a computer, and you read and write to it, so keep your conceptual information models to yourself.” And in large part I agree with this viewpoint, but there are some places where we’ll run into trouble. For example, most of the REST material I’ve read treats images as indivisible, binary resources. This won’t work in the geo field, where images are in the gigabyte and terabyte range and you’ve got to design services that provide access to portions of those images. However, aside from imagery I don’t see much of a problem with using a resource-oriented approach to information access. In fact, in the sensor web arena I think a REST approach makes a ton of sense.

UPDATE: Sean is convincing me that imagery isn’t a tough problem for RESTies, even if I don’t know what “empirical orthogonal function decomposition” is. Is this related to what I know as imagery formats using wavelet compression?
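To make the imagery point a bit more concrete, here is a minimal sketch of one way a portion of a large image could be treated as a resource in its own right, with its own stable URL. Everything in it is an assumption made for illustration: the `/coverages/<name>/window/x,y,width,height` URL layout, the `Window` type, and the function names are hypothetical and do not come from any OGC spec or existing server.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Window:
    """A rectangular pixel window inside a larger image (hypothetical type)."""
    x: int
    y: int
    width: int
    height: int


def window_url(base: str, coverage: str, win: Window) -> str:
    """Build a URL that identifies one window of one coverage.

    The path layout is an assumed convention, not a standard.
    """
    return f"{base}/coverages/{coverage}/window/{win.x},{win.y},{win.width},{win.height}"


def parse_window(url: str) -> Window:
    """Recover the window from the last path segment of such a URL."""
    x, y, w, h = (int(v) for v in url.rsplit("/", 1)[-1].split(","))
    return Window(x, y, w, h)


def read_window(pixels, win: Window):
    """Cut the window out of an in-memory raster given as a list of rows."""
    rows = pixels[win.y:win.y + win.height]
    return [row[win.x:win.x + win.width] for row in rows]
```

The same window always maps to the same URL, so a client never has to download the whole image, and ordinary Web caches can store each piece like any other resource.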

Whew. I’ve spent a lot of time talking about resources, but if you don’t get the basics right, everything else is meaningless. Now I’ll move on to the other basic, which is accessing resources through “standard” URLs, and additionally, taking full advantage of the HTTP headers available in URL requests. I won’t get too deep into the technology here, because others do that better, but I will say that to me it seems the big point being made is that URLs work in a lot of places–Web browsers, email, all programming environments, etc. Don’t use anything more than URLs unless you really have to. And if you think you really have to, think again and again.

The other piece of the URL issue is using HTTP headers to do things like set expiration dates on information, say some resource is unavailable at the moment, or unavailable forever because it’s been deleted, and lots of other things that HTTP header geeks know about. Now some might say that the arcane wizardry of HTTP header usage is just as complex as the things REST advocates complain about, like SOAP headers and XML query languages. I agree, but the big difference is that the core infrastructure of the Internet–Web servers and routers for example–comes with built-in software to do useful things with HTTP headers, and if you don’t take advantage of them, you are losing the opportunity to get an amazing amount of useful functionality at a much lower cost than paying programmers to build it into your application. This includes all kinds of content description (MIME types), error handling, caching and load balancing to name a few. I don’t see any problem with this principle. It aligns perfectly with the OGC principle to leverage as much of the common Web infrastructure as possible before inventing new, geo-specific technologies. Even advocates of heavier-weight service-oriented architectures should be able to get behind this idea. So read the HTTP spec and use it. And once again, if you find yourself in a situation where you can’t use it, think again, and then once more…
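As a rough illustration of the header and status-code ideas above, here is a small sketch of how a service might decide what to send back for a GET request. The resource identifiers, the three in-memory collections, and the `respond` function are all hypothetical; only the status codes and header names themselves come from HTTP.

```python
from http import HTTPStatus

# Hypothetical in-memory state for a small sensor-observation service.
RESOURCES = {"obs/1": b"<observation>12.5</observation>"}
DELETED = {"obs/0"}   # removed for good
OFFLINE = {"obs/2"}   # temporarily unavailable


def respond(resource_id: str):
    """Return (status, headers, body) for a GET on resource_id."""
    if resource_id in DELETED:
        # 410 tells clients and caches the resource is gone forever.
        return HTTPStatus.GONE, {}, b""
    if resource_id in OFFLINE:
        # 503 plus Retry-After says "not now, try again in 120 seconds".
        return HTTPStatus.SERVICE_UNAVAILABLE, {"Retry-After": "120"}, b""
    if resource_id not in RESOURCES:
        return HTTPStatus.NOT_FOUND, {}, b""
    headers = {
        "Content-Type": "application/xml",   # content description via MIME type
        "Cache-Control": "max-age=3600",     # caches may reuse this for an hour
    }
    return HTTPStatus.OK, headers, RESOURCES[resource_id]
```

Nothing here is geo-specific, which is the point: a browser, proxy, or cache that has never heard of GIS can still act sensibly on every one of these responses.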

To summarize, I talked at length about information resources, URLs, and HTTP. This is far from a comprehensive introduction to REST, but it is hopefully a nice companion to the more general, programming-oriented content available elsewhere. Next time I’ll talk about search and retrieval. And if I get enough positive feedback, I might even try to describe how this would all play out in building a collaborative spatial data infrastructure. I’ll be accepting votes in the form of drinks and/or cool ideas at the upcoming URISA and FOSS4G conferences.

6 Responses to “REST and GIS”

  1. import cartography Says:

    REST Can’t Handle Rasters and Coverages?…

    There may be geospatial problems that REST can’t tackle, but access to arbitrary regions of a
    coverage is not one of them….

  2. map butcher » I feel…..confused Says:

    [...] to be reading about REST. So I begun to trawl the net for some good resources and as if by demand Raj has begun a great mini series to help idiots like me try and get a better handle on REST and its [...]

  3. Sean Gillies Says:

    Good stuff here, but you’re wrong about images. Some comments here: http://zcologia.com/news/541/rest-cant-handle-rasters-and-coverages/.

    I also have a quibble with “… the OGC principle to leverage as much of the common Web infrastructure as possible before inventing new, geo-specific technologies.” First of all, OGC specs name HTTP as one of the possible distributed computing platforms, and thereby attempt to abstract away the Web (a pointless exercise, in my opinion). Furthermore, the OWS Common spec (one of the most recent) has several glaring errors with respect to HTTP. Section 11.1 states that HTTP supports 2 methods: GET and POST. Only if you ignore PUT and DELETE which have been around since 1999. In that same section, the spec recommends a gross abuse of POST to fetch capabilities or features. Section 11.7 on service responses is a bit better, but fails to make recommendations for HTTP status codes. The result is that services generally return errors with an erroneous “200 OK”, rendering intermediaries useless. Finally, a number of us feel that “paged” responses is a Web best practice that OGC services could adopt:

    http://chris.narx.net/2007/04/25/wfs-feature-paging-yes-please/

  4. rajsingh Says:

    Sean,
    A lot of the foundational work on the OGC architecture was done around or before 2000. Looking back on it now, it does seem a little pointless to maintain much focus on any other platform than the Web, but back then things like CORBA looked like they might get widespread adoption. We should probably (and I think we are) think seriously about being a Web-only organization. But that’s just my opinion at this point.

    In my version of OWS Common (OGC 06-121r3) Section 11.1 states that HTTP supports 2 *request* methods: GET and POST. A small difference, and we have ignored PUT and DELETE, and I know you think using POST for requests is wrong. Same thing on the HTTP status codes issue. If these REST design styles start solving problems better than what exists now, they’ll probably begin to show up much more in OGC standards.

  5. Sean Gillies Says:

    Since request/response is HTTP’s only means of communication, I left out the redundant “request”. PUT and DELETE are request methods just as truly as GET and POST.

    I appreciate your acknowledgment that OWS is not very Web-oriented. I’m not looking for Maoist self-criticism from the OGC here — just recognition that there is a problem, which is a necessary first step forward.

  6. Software Development Guide Says:

    Software Development Guide…

    I couldn’t understand some parts of this article, but it sounds interesting…
