David Janes' Code Weblog

March 13, 2009

AUAPI: how to encode geographic location

auapi · David Janes · 7:55 am ·

Geographic location is a little more problematic to encode than other data types we’ve looked at for the Almost Universal API. The issue is that standards for adding geographic information are either overly complicated or they encode information in a hard-to-use-from-JSON manner.

Using GeoRSS

GeoRSS provides two ways (well, way more, I’m sure) of adding geographic point information to a blog post (which is essentially the same as an AUAPI item). From the webpage:

<georss:point>45.256 -71.92</georss:point>

which would be encoded as:

{
 "georss:point" : "45.256 -71.92"
}

and

<georss:where>
   <gml:Point>
      <gml:pos>45.256 -71.92</gml:pos>
   </gml:Point>
</georss:where>

which would be encoded as:

{
 "georss:where" : {
  "gml:Point" : {
   "gml:pos" : "45.256 -71.92"
  }
 }
}

Neither of these looks particularly satisfying, because the lat/lon is encoded as a string. We could encode it as an array of numbers, but then we would have to write a custom transcriber for converting to XML.

Using hCard

Another option would be to use the geo.latitude and geo.longitude attributes in hCard. Unfortunately, hCard requires elements such as FN and in general implies that we are talking about a person or organization, so this isn’t really satisfying either.

How AUAPI recommends geographic information should be encoded

'geo': {
   'latitude': 34.743763,
   'longitude': -86.572568
}

This is, in fact, exactly the microformats geo standard.
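Here is a minimal sketch of getting from the GeoRSS string form to the recommended shape, assuming the simple "latitude longitude" point format shown earlier. The function name georss_point_to_geo is hypothetical and is not part of the AUAPI code.

```python
def georss_point_to_geo(point):
    """Convert a GeoRSS 'lat lon' string into a geo dictionary."""
    parts = point.split()
    if len(parts) != 2:
        raise ValueError("expected 'latitude longitude', got %r" % (point,))
    # Numbers, not strings, so JSON consumers can use them directly.
    latitude, longitude = (float(part) for part in parts)
    return {"geo": {"latitude": latitude, "longitude": longitude}}


item = georss_point_to_geo("45.256 -71.92")
```

The point is only that the consumer gets numbers under well-known keys rather than a string it has to split itself.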

March 3, 2009

PortableContacts and the Atom syndication standard

auapi · David Janes · 6:56 am ·

Thanks to Kevin Marks, I’ve just stumbled onto the Portable Contacts effort. They seem to have thought quite a bit about the hCard/vCard serialization issue. I’ve added my two bits, in particular making the claim that they should consider making their proposal more compatible with existing consumers and infrastructure by piggybacking on top of the Atom syndication format, rather than requiring PC consumers to redevelop all this infrastructure around their format. This, in essence, is what the Almost Universal API idea is about.


I’ve been thinking and writing recently about how APIs can be made compatible with each other – that is, the same consumers can be used with the results from different APIs – and Kevin Marks pointed me this way, as I had written about hCard serialization.

However, it occurs to me that PC could be made to overlay the Atom syndication standard with very few changes (by the looks of it, you’re fairly late into the design cycle though). For example, this:

{
  "id": "703887",
  "displayName": "Mork Hashimoto",
  "name": {
    "familyName": "Hashimoto",
    "givenName": "Mork"
  },
  "birthday": "0000-01-16",
  "gender": "male",
  "drinker": "heavily",
  "tags": [
    "plaxo guy",
    "favorite"
  ]
}

Could be encoded like:

{
  "id": "703887",
  "title": "Mork Hashimoto",
  "updated" : "2008-...",
  "published" : "2007-...",
  "category" : [
   { "term" : "plaxo guy" },
   { "term" : "favorite" }
  ],
  "contact" : {
    "name": {
      "familyName": "Hashimoto",
      "givenName": "Mork"
    },
    "birthday": "0000-01-16",
    "gender": "male",
    "drinker": "heavily"
  }
}

Which is almost the same, except that now your XML serialization can be Atom (noting of course that there will be changes for the paging elements, etc. that you’ve defined). The immediate implication of this is that you’re working within a large existing infrastructure that knows about update notification, has tools for display, and so forth.
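The re-shaping above is mechanical enough to sketch in a few lines of Python. This is only an illustration of the mapping, not anything from the Portable Contacts spec or the AUAPI code: the function name contact_to_atom_item is hypothetical, and it leaves out "updated" and "published" because the original contact does not carry them.

```python
def contact_to_atom_item(contact):
    """Re-shape a Portable Contacts style dictionary into an Atom-like item."""
    remaining = dict(contact)  # shallow copy so the caller's data is untouched
    item = {}
    if "id" in remaining:
        item["id"] = remaining.pop("id")
    if "displayName" in remaining:
        item["title"] = remaining.pop("displayName")
    tags = remaining.pop("tags", [])
    if tags:
        item["category"] = [{"term": tag} for tag in tags]
    # Everything Atom has no obvious slot for stays under "contact".
    item["contact"] = remaining
    return item


contact = {
    "id": "703887",
    "displayName": "Mork Hashimoto",
    "name": {"familyName": "Hashimoto", "givenName": "Mork"},
    "birthday": "0000-01-16",
    "gender": "male",
    "drinker": "heavily",
    "tags": ["plaxo guy", "favorite"],
}
item = contact_to_atom_item(contact)
```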

March 2, 2009

AUAPI: encode images using media:rss

auapi · David Janes · 4:34 pm ·

MediaRSS is the way to encode images in the Almost Universal API (AUAPI). Because MediaRSS encodes its values in attributes, we use the @ symbol to prefix keys.

Example 1 – a single image

'media:content': {
    '@medium': 'image',
    '@type': 'image/jpeg',
    '@url': u'http://farm3.static.flickr.com/2165/2271778128_b59c01a695_m.jpg'
}

Example 2 – thumbnail

'media:thumbnail': {
    '@url': u'http://farm3.static.flickr.com/2165/2271778128_b59c01a695_t.jpg'
}

Example 3 – multiple images in different sizes

'media:group': {
    'media:content': [
        {'@medium': 'image',
        '@type': 'image/jpeg',
        '@url': u'http://farm3.static.flickr.com/2165/2271778128_b59c01a695_m.jpg'},
        {'@medium': 'image',
        '@type': 'image/jpeg',
        '@url': u'http://farm3.static.flickr.com/2165/2271778128_b59c01a695_s.jpg'},
        {'@medium': 'image',
        '@type': 'image/jpeg',
        '@url': u'http://farm3.static.flickr.com/2165/2271778128_b59c01a695_b.jpg'}
    ],
    'media:thumbnail': {
        '@url': u'http://farm3.static.flickr.com/2165/2271778128_b59c01a695_t.jpg'
    }
}
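As a rough sketch, the three shapes above can be produced by a pair of small helpers. The names media_content and media_group are hypothetical, the default MIME type is an assumption, and the example.com URLs are placeholders.

```python
def media_content(url, mime_type="image/jpeg", medium="image"):
    """Build one media:content entry; attributes carry the '@' prefix."""
    return {"@medium": medium, "@type": mime_type, "@url": url}


def media_group(image_urls, thumbnail_url=None):
    """Build a media:group holding several sizes of the same image."""
    group = {"media:content": [media_content(url) for url in image_urls]}
    if thumbnail_url is not None:
        group["media:thumbnail"] = {"@url": thumbnail_url}
    return {"media:group": group}


item = media_group(
    ["http://example.com/photo_m.jpg", "http://example.com/photo_b.jpg"],
    thumbnail_url="http://example.com/photo_t.jpg",
)
```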

AUAPI: encoding hCards in JSON

auapi,aumfp,semantic web · David Janes · 9:15 am ·

The best model for describing people is the vCard standard, RFC 2425 and RFC 2426. The microformats community has adapted the vCard standard for serialization into HTML using hCard. In the Almost Universal API (AUAPI), people and organizations should almost always be described using a JSON-encoded hCard.

It is difficult to describe, without going into great minutiae, what the difficulties are in transforming the hCard and vCard standards into a pleasant-looking and, more importantly, easy-to-use hierarchy: there are certainly a number of edge cases that one would have to deal with! There’s certainly an argument for just encoding hCard/vCards as a straight vCard serialization – at least in terms of simplicity of encoding. The issue is that the end consumer (which I believe should be the strongest focus) then has to do the dirty work of grouping everything together themselves.

Algorithm

This algorithm is destructive to the data structure it works upon, so generally you’ll want to make a copy first.

  • note that though we refer to hCard attributes in all upper case, mixed case, camel case and so forth, all attributes are actually physically encoded in lower case with “-” separators
  • let the “groupers” be ADR, GEO, N, ORG, TEL. Groupers group together attributes that are related (such as FirstName and LastName)
  • let the “narrowers” be Home, Work, Parcel, Postal (and no-narrower). Narrowers assign a specific meaning to a value, i.e. this is a Work phone number.
  • assume each value is described by a number of attributes, i.e. “416-515-5555” can be described by ( TEL, Work, Mobile )

Then:

  • for each Narrower, then for each Grouper
    • create a dictionary ‘subd’
    • for each value that is described by the ( Narrower, Grouper )
      • for each remaining attribute (besides Narrower and Grouper), add to subd
      • if the value was fully described by ( Narrower, Grouper ), add to subd under the key ‘@’
    • for key, value in subd
      • add to the final result
      • if narrower is not ‘no-narrower’, add ‘@narrower = narrower’
    • add subd to the result under the key Grouper
  • add all remaining values from the original hCard to the result, noting that
    • if the value is described by a Narrower, we encode it as a dictionary with ‘@narrower = narrower’

Clear? Well, the examples below will help. With the “416-515-5555” above we would get:

{
 "hcard:hcard" : {
  'tel' : {
   '@work' : 'work',
   'mobile' : '416-515-5555',
  }
 }
}
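To make the grouping idea concrete, here is a heavily simplified sketch in Python. It is not the decompose function from vcard.py. It assumes each value has exactly one grouper and at most one narrower, it skips values without a grouper, and it does not separate values with different narrowers under the same grouper. The name group_values and the (value, attributes) input shape are assumptions made for illustration.

```python
GROUPERS = ("adr", "geo", "n", "org", "tel")
NARROWERS = ("home", "work", "parcel", "postal")


def group_values(described_values):
    """Group (value, attributes) pairs under their grouper (simplified)."""
    result = {}
    for value, attributes in described_values:
        attributes = [attribute.lower() for attribute in attributes]
        groupers = [a for a in attributes if a in GROUPERS]
        if not groupers:
            continue  # the full algorithm copies these through separately
        grouper = groupers[0]
        narrowers = [a for a in attributes if a in NARROWERS]
        remaining = [a for a in attributes
                     if a != grouper and a not in NARROWERS]

        subd = result.setdefault(grouper, {})
        if narrowers:
            # e.g. '@work': 'work'
            subd["@" + narrowers[0]] = narrowers[0]
        if remaining:
            for attribute in remaining:
                subd[attribute] = value
        else:
            # fully described by ( Narrower, Grouper )
            subd["@"] = value
    return {"hcard:hcard": result}


card = group_values([("416-515-5555", ("TEL", "Work", "Mobile"))])
```

For the single phone number this produces the same structure as the example above: the grouper becomes the key, the narrower becomes an “@” entry, and the remaining attribute names the value.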

Code

The source code for this algorithm is in the AUMFP tree, in the file vcard.py, function decompose (see around line 1083).

Namespace

All JSON-encoded hCards are in the namespace hcard:. In the AUAPI serialization, this namespace should only be on the enclosing element; all children will be assumed to be in the namespace. I am currently using the URI http://purl.org/uF/hCard/1.0/ for this namespace (when XML serializing); this may change in the future.

Example 1 – home phone number from whitepages.com

{
 'hcard:hcard': {'adr': {'country-name': u'United States',
                         'locality': u'Huntsville',
                         'postal-code': '35801-2908',
                         'region': 'Alabama',
                         'street-address': u'1114 Humes Avenue NE'},
                 'fn': u'Jack Smith',
                 'geo': {'latitude': 34.743763000000001,
                         'longitude': -86.572568000000004},
                 'n': {'family-name': u'Smith', 'given-name': u'Jack'},
                 'tel': {'voice': u'256-539-8788'}},
}

Example 2 – work phone number from whitepages.com

{ 'hcard:hcard': {'adr': {'country-name': u'United States',
                         'locality': u'Gurley',
                         'postal-code': '35748-8715',
                         'region': 'Alabama',
                         'street-address': u'148 Little Cove Road'},
                 'fn': u'Jack Smith',
                 'geo': {'latitude': 34.698258000000003,
                         'longitude': -86.383027999999996},
                 'n': {'family-name': u'Smith', 'given-name': u'Jack'},
                 'org': {'organization-name': u'Alldyne Powder Technoliges'},
                 'tel': {'@work': 'work', 'voice': u'256-776-1238'}},
}

Example 3 – hCard directly to JSON

{ 'hcard:hcard': {
                 'adr': {u'country-name': u'United States of America',
                         u'locality': u'San Francisco',
                         u'region': u'CA'},
                 u'fn': u'Tantek \xc7elik',
                 u'logo': u'icon-2007-128px.png',
                 'n': {'family-name': u'\xc7elik',
                       'given-name': u'Tantek'},
                 u'photo': u'http://tantek.com/icon-2007-128px.png',
                 u'url': u'http://feeds.technorati.com/contact/tantek.com/#hcard'},
}

March 1, 2009

AUAPI: JSON to XML serialization

auapi,ideas,tips · David Janes · 8:40 am ·

Here is a brief outline of how one would “naively” transform the Almost Universal API’s (AUAPI) JSON into XML. We say “naive” because in general one wants to make a transformation into a specific XML application: Atom, RSS, OPML, KML, etc. In those cases, one has to rename and rework certain elements first for standards compliance, then complete the naive transformation for the remaining elements.

Walking

Walking JSON objects is done depth first. Most of the complexity involved is in handling dictionaries, which can be viewed as being composed of ( key, value ) pairs. For each dictionary, we create an XML node whose properties are defined as follows:

  • keys beginning with @@ are ignored
  • the key @ means “the text” of the node (the examples will make this more clear)
  • other keys beginning with @ are attributes of the node
  • all other keys are defining children of the node

There are a number of complexities that have to be addressed; for this I suggest looking at the examples or source code.
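The walking rules can be sketched with the standard library’s xml.etree.ElementTree. This is an illustration of the rules only, not the XMLAPIWriter code: the function names are hypothetical, namespaces are not handled, and lists are assumed to become repeated sibling elements.

```python
import xml.etree.ElementTree as ET


def add_value(parent, key, value):
    """Append `value` to `parent` as one or more child elements named `key`."""
    if isinstance(value, list):
        # a list becomes repeated sibling elements with the same name
        for entry in value:
            add_value(parent, key, entry)
    elif isinstance(value, dict):
        fill_node(ET.SubElement(parent, key), value)
    else:
        child = ET.SubElement(parent, key)
        if value is not None:  # None becomes an empty element
            child.text = str(value)


def fill_node(node, dictionary):
    """Apply the '@' key rules of one dictionary to an existing XML node."""
    for key, value in dictionary.items():
        if key.startswith("@@"):
            continue  # ignored
        elif key == "@":
            node.text = str(value)  # "the text" of the node
        elif key.startswith("@"):
            node.set(key[1:], str(value))  # an attribute of the node
        else:
            add_value(node, key, value)  # a child of the node


def to_xml(dictionary, root_name="root"):
    """Serialize a dictionary to an XML string under a single root node."""
    root = ET.Element(root_name)
    fill_node(root, dictionary)
    return ET.tostring(root, encoding="unicode")


xml_text = to_xml({"@attribute": "hello", "@bttribute": "there", "a": "some string"})
```

Calling to_xml on a dictionary with two “@” keys and one plain key gives a root element carrying two attributes and a single child, with no pretty-printing.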

Namespace handling

  • collect all the namespaces used in the JSON and add to the root XML node
  • if any JSON element has a namespace, assume that namespace is inherited by its children

Code

You can see the code for this in the AUAPI source base in api.py in XMLAPIWriter.TranscribeNode.

Example 1

{
    "numbers" : [ 1, -0.23, ],
    "strings" : [ "bob", "caf\xe9", ],
    "booleanish" : [ True, False, None, ],
}

<root>
    <numbers>1</numbers>
    <numbers>-0.23</numbers>
    <booleanish>True</booleanish>
    <booleanish>False</booleanish>
    <booleanish />
    <strings>bob</strings>
    <strings>caf\xc3\xa9</strings>
</root>

Example 2

{
    "a1" : {
        "b1" : 1,
        "b2" : 2,
    },
    "a2" : {
        "b3" : "hi",
        "b4" : "there",
    },
}

<root>
    <a1>
        <b1>1</b1>
        <b2>2</b2>
    </a1>
    <a2>
        <b4>there</b4>
        <b3>hi</b3>
    </a2>
</root>

Example 3

{
    "@attribute" : "hello",
    "@bttribute" : "there",
    "a" : "some string",
}

<root attribute="hello" bttribute="there">
    <a>some string</a>
</root>

February 28, 2009

AUAPI: the Atom core vocabulary for items

auapi · David Janes · 7:58 am ·

To quickly review, the AUAPI sees API results as composed of two parts: a “response” (formerly the “meta”) and “items”, composed of one or more items. This division is documented in Web Object Records.

Here is how we use Atom to encode API items in the AUAPI (we will document “response” in a different post). This sticks fairly closely to the Atom standard, the differences being in how each term is serialized into JSON for convenience and ease of use – and that there are no real required elements. Don’t forget that each “item” is a discrete unit returned from an API and represents “whatever” – a Flickr photo, an Amazon product, and so forth.

  • title (string, plain text) – the title; this is a plain text string, i.e. no entities or HTML allowed
  • id (string, plain text) – a unique ID (in the context of the API being called) for this result
  • content (string, HTML) – the complete text; this is HTML
  • summary (string, HTML) – the summary text; this is HTML.
  • link (string, URL) – this is the “main” link of whatever the item represents, a page on Amazon, a blog post’s original HTML and so forth
  • links (array of dictionary) – these are other links related to the item; the format is documented below
  • category (array of dictionary) – tags/categories; the format is documented below
  • author (a string or dictionary) – if a dictionary, the format is documented below
  • updated (string, Atom datetime format) – when the item was last updated
  • posted (string, Atom datetime format) – when the item was originally posted; if using updated and/or posted, always use updated and then use posted only if you have a meaningful difference.

Links Format

links are results related to the item. The most important link should be encoded in the link item. links are a list of dictionaries, each dictionary containing (further documentation):

  • href (string, URI)
  • rel (string, enumeration)
  • type (string, MIME type)
  • hreflang (string, language code)
  • title (string, plain text)
  • length (string, integer)

Category Format

category holds the tags for the item, and is a list of dictionaries, each dictionary containing (further documentation):

  • @ (string, plain text) – the tag
  • rel (string, from enumeration)
  • scheme (string, URI)

Author Format

The author can be encoded as a string (containing the author’s name), as a dictionary, or as a list of dictionaries. If a dictionary, this is how it should be encoded (further documentation):

  • @ (string, plain text) – the author’s name
  • uri (string, URI)
  • email (string, email address)

Example

This is an API result from Whitepages.com, encoded as AUAPI. I have removed non-Atom items and shortened long text for clarity.

{'content': '<div class="vcard">\n<div class="fn">Jack Smith</div>...\n</div>\n',
 'id': '40b296d1a95f3b379a8108b27daf009c',
 'links': [{'href': 'http://www.whitepages.com/16176/t...',
            'rel': 'related',
            'title': 'Find Neighbours',
            'type': 'text/html'},
           {'href': 'http://www.whitepages.com/16176/track...',
            'rel': 'related',
            'title': 'Map',
            'type': 'text/html'},
           {'href': 'http://www.whitepages.com/16...',
            'rel': 'related',
            'title': 'Map',
            'type': 'text/html'},
           {'href': 'http://www.whitepages.com/16176/track/102...',
            'rel': 'alternate',
            'title': 'Whitepages.com',
            'type': 'text/html'}],
 'title': u'Jack Smith'}

February 27, 2009

Introducing The Almost Universal API

auapi · David Janes · 9:54 am ·

The Almost Universal API is a culmination – or at least a local maximum – of several projects I’ve been working on for the last few months: in particular, Web Object Records, Pipe Cleaner and PyBM. The AUAPI is:

  • a way of presenting results returned from many popular APIs
  • a Python library to actually do this

I’ll be making several posts about how to use the AUAPI, including installation instructions. The plan is to make an easy_install version, but initially this will be an SVN-from-Google-Code thing.

The AUAPI is mainly about how to present results returned from APIs, not how to send data to APIs nor how to encode requests. The encoding is designed to “look good” in JSON and be easily and algorithmically encoded into XML. The AUAPI data model is based on:

  • Atom, the “core” vocabulary, particularly providing title, content, summary, updated, category, link and links
  • MediaRSS, for encoding images
  • hCard, for encoding information about people
  • hCalendar, for encoding information about events

There are several “maybe” standards too:

  • hProduct, for encoding information about things
  • Google’s SGN URLs, for providing a universal way of talking about accounts

I have already worked a fair number of APIs into the AUAPI. These are documented on the Mashematica Wiki:
