Introduction
As technologists, we’re all familiar with REST – Representational State Transfer:
Representational state transfer (REST) is a style of software architecture for distributed hypermedia systems such as the World Wide Web. As such, it is not strictly a method for building what are sometimes called “web services.” The terms “representational state transfer” and “REST” were introduced in 2000 in the doctoral dissertation of Roy Fielding, one of the principal authors of the Hypertext Transfer Protocol (HTTP) specification.
REST talks about how we address and use information on the World Wide Web. I’d like to introduce the concept of WORK – Web Object Records – which defines how we think about data being transmitted across the web.
WORK is not a prescriptive standard – it is not telling you what to do, it’s describing what you are doing. The hope is that by having a delineated description of what we are doing, we can then write tools to cut through the babel of API standards currently being promulgated by a multitude of vendors; we can standardize the unstandardized.
Definition
A WORK item:
- is conceptually a JSON-like dictionary, consisting of string keys and object values
- each value in the dictionary is a (usually-) shallow JSON-like object, that is:
  - a dictionary, list or basic value type
  - the basic value types are Unicode strings, floating point numbers, integers and booleans
- the difference between strings and other basic value types is fuzzy (data encoded in XML, HTML form data)
- null/None is rarely explicitly sent, instead it is the absence of a value being defined
- the difference between a list of objects and a single object is fuzzy and fluid (XML children)
- the data model defined implicitly by “what you see” is as useful as a formal definition elsewhere
- there are no cycles or explicit ways of cross referencing within a WORK item
- WORK items can be – and often are – nested within another WORK item, but only one level deep
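As a rough illustration of the description above, here is a minimal Python sketch of a checker for "does this look like a WORK item?". The function names and the `max_depth` cut-off are my own assumptions, not part of any standard. The cut-off stands in for both "usually shallow" and "no cycles", and an explicit None is rejected because null is modelled as absence (the list only says it is rarely sent, so that is a stricter reading).

```python
BASIC_TYPES = (str, int, float, bool)  # the basic value types listed above


def is_work_value(value, depth=0, max_depth=4):
    """Return True if value is a dictionary, list or basic value type."""
    if isinstance(value, BASIC_TYPES):
        return True
    if depth >= max_depth:
        # Too deep to be "shallow"; this also stops cyclic structures.
        return False
    if isinstance(value, list):
        return all(is_work_value(v, depth + 1, max_depth) for v in value)
    if isinstance(value, dict):
        return all(
            isinstance(k, str) and is_work_value(v, depth + 1, max_depth)
            for k, v in value.items()
        )
    # None and anything else (sets, custom objects, ...) is rejected.
    return False


def is_work_item(obj, max_depth=4):
    """A WORK item is a dictionary with string keys and JSON-like values."""
    return isinstance(obj, dict) and is_work_value(obj, 0, max_depth)
```

A dictionary that contains itself fails the check because the recursion hits `max_depth` before it finds a basic value.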
Benefits
The fact that we technologists inherently use a WORK model of data explains:
- why we prefer XML over CSV – because we like to store more than a single atomic value in a “cell”
- why we prefer JSON to XML – because we think about data as JSON-like WORK objects, not as nested text constructs
- why we don’t adopt RDF (in its variants) for transmitting data, implementing APIs and so forth – because we don’t think in graphs
- why we find it easier to work with web data in Python and Ruby than in Java – because those languages natively store data in the same model we use to think about it
Examples
Here are a few examples of how one can view common API / feed results as WORK items.
RSS feeds
RSS is defined by a two level WORK hierarchy. The first level is:
{
"channel" : CHANNEL-WORK,
"item" : [ ITEM-WORK, ITEM-WORK, ... ]
}
An ITEM-WORK looks like:
{
"title" : STRING,
"link" : STRING,
"description" : STRING
}
If you look at the XML for an RSS feed with only one ITEM, there’s no way to tell, without reading the spec, that ITEM repeats. This is what we mean by saying that the difference between a single object and a list is sometimes fuzzy.
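To make the ambiguity concrete, here is a small Python sketch using the standard library's `xml.etree.ElementTree`. The converter `element_to_work`, the helper `as_list` and the two tiny feeds are hypothetical, simplified to match the two-level shape shown above rather than any real feed. A naive XML-to-dictionary conversion cannot know that `item` repeats, so it produces a single object for one item and a list for two.

```python
import xml.etree.ElementTree as ET


def element_to_work(elem):
    """Naively convert an XML element into a JSON-like value."""
    children = list(elem)
    if not children:
        return (elem.text or "").strip()
    result = {}
    for child in children:
        value = element_to_work(child)
        if child.tag in result:
            # Second occurrence of a tag: promote the value to a list.
            if not isinstance(result[child.tag], list):
                result[child.tag] = [result[child.tag]]
            result[child.tag].append(value)
        else:
            result[child.tag] = value
    return result


def as_list(value):
    """Smooth over the 'one object or a list of objects' ambiguity."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


one = "<rss><channel><title>Feed</title></channel><item><title>A</title></item></rss>"
two = ("<rss><channel><title>Feed</title></channel>"
       "<item><title>A</title></item><item><title>B</title></item></rss>")

work_one = element_to_work(ET.fromstring(one))  # "item" is a single dict
work_two = element_to_work(ET.fromstring(two))  # "item" is a list of dicts

for work in (work_one, work_two):
    for item in as_list(work.get("item")):
        print(item["title"])
```

Wrapping every access to a possibly repeating key in something like `as_list` is one way for a tool to treat both feeds the same way without consulting the spec.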
White Pages API
The White Pages API is also a two-level WORK hierarchy (this pattern is very common). Here’s the first level, slightly more complicated than RSS due to the XML serialization:
{
"meta" : META-WORK,
"listings" : {
"listing" : [ LISTING-WORK, LISTING-WORK, ... ]
}
}
A LISTING-WORK looks like:
{
"geodata" : OBJECT,
"phonenumbers" : OBJECT,
"business" : { "businessname" : "Fred's Pizza" },
"address" : OBJECT
}
The OBJECTs above in the White Pages API are somewhat complicated, but tractable (as we shall see in another post).
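Since a missing value usually shows up as an absent key rather than an explicit null, a tool working with responses shaped like this one needs a forgiving way to reach into nested dictionaries. The following Python sketch is one possible approach; `get_path` and the sample `response` are hypothetical, and only the "Fred's Pizza" business name comes from the example above.

```python
def get_path(work, path, default=None):
    """Follow a dotted path through nested dicts.

    Returns default as soon as any step along the path is absent.
    """
    current = work
    for key in path.split("."):
        if not isinstance(current, dict) or key not in current:
            return default
        current = current[key]
    return current


# Hypothetical response in the shape sketched above.
response = {
    "meta": {},
    "listings": {
        "listing": [
            {"business": {"businessname": "Fred's Pizza"}},
            {"address": {}},  # a listing with no business entry at all
        ]
    },
}

names = [
    get_path(listing, "business.businessname")
    for listing in get_path(response, "listings.listing", [])
]
print(names)
```

The second listing has no "business" key, so its name comes back as None instead of raising a KeyError.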
Amazon AWS API
The Amazon Associates Web Service allows one to retrieve information about Amazon products via XML responses. The response is a little convoluted but still recognizable:
{
"Items" : {
"RequestHeader" : REQUEST-HEADER-WORK,
"Item" : [ ITEM-WORK, ITEM-WORK, ... ]
},
"OperationRequest" : { ... }
}
The individual ITEM-WORKs describe products:
{
"ASIN" : STRING,
"ImageSets": {
"ImageSet": {
"LargeImage": {
"URL": "http://ecx.images-amazon.com/images/I/31e55zf53VL.jpg",
"Width": "300",
"Height": "300"
}
}
},
"ItemAttributes": {
"Title": "Under a Blood Red Sky - Deluxe Edition CD/DVD",
"Manufacturer": "Island",
"ProductGroup": "Music",
"Artist": "U2"
}
}
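Note that "Width" and "Height" arrive as the strings "300" because the data was serialized as XML; this is the fuzzy line between strings and other basic value types. Here is a hedged Python sketch of how a tool might coerce such values. The names `coerce` and `coerce_work` and the `skip` parameter are my own assumptions for illustration.

```python
def coerce(value):
    """Best-effort conversion of a string to a boolean, integer or float."""
    if not isinstance(value, str):
        return value
    text = value.strip()
    if text.lower() in ("true", "false"):
        return text.lower() == "true"
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return value  # not number-like: leave the string untouched


def coerce_work(obj, skip=frozenset()):
    """Apply coerce() throughout a WORK item, leaving keys in skip alone."""
    if isinstance(obj, dict):
        return {
            k: (v if k in skip else coerce_work(v, skip))
            for k, v in obj.items()
        }
    if isinstance(obj, list):
        return [coerce_work(v, skip) for v in obj]
    return coerce(obj)


large_image = {
    "URL": "http://ecx.images-amazon.com/images/I/31e55zf53VL.jpg",
    "Width": "300",
    "Height": "300",
}
print(coerce_work(large_image))
```

Blind coercion is lossy: an identifier made only of digits with leading zeros would lose those zeros if turned into an integer, which is why the sketch lets the caller name keys (for example "ASIN") that should stay as strings.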
Google search result
We can also look at HTML pages as if they’re returning data as WORK items. This could be explicit if rules such as microformats or RDFa were used, or once again it could be just a convenient way of modeling the data. Here’s a hypothetical WORK item for a single result returned from a Google search:
{
"title" : "Bombardier Inc. - Bombardier - Home",
"url" : "http://www.bombardier.com/",
"description" : "Manufacturers of a large range of regional...",
"links" : [
{
"title" : "Careers",
"url" : "..."
},
{
"title" : "Business Aircraft",
"url" : "..."
},
...
]
}
Conclusion
WORK gives us a powerful way of looking at – and simplifying – data that’s retrieved over the Internet via REST calls. If we can view API results as being made up of standardized components – WORK items – then the effort needed to work with new APIs can be minimized.
Designing and writing some of these tools is my next task.