David Janes' Code Weblog

January 19, 2009

Atom as a Rosetta Stone for WORK objects

demo,pipe cleaner,work · David Janes · 6:31 am ·

WORK – Web Object Records – is a way of describing messages we pass over the web: a single header object called the “meta” and zero or more objects called “items”. Each object can be encoded as a JSON record, though we can access individual items within each WORK object using a WORK Path, which allows quite a bit of latitude for type coercion and vagaries in packaging.

Pipe Cleaner is a project I’ve been working on for the last two months that allows one to script data using WORK, to accomplish tasks such as remixing and filtering RSS feeds, reading or producing OPML, making JSON interfaces and so forth. I actually have one live deployment, which I will blog about soon, and I hope to have it beta productized for March.

Atom is a standard for syndicating feeds, not dissimilar to RSS but with a richer, better-described vocabulary. I already have one major “project” built around Atom: the hAtom microformat for describing microcontent and information that can be syndicated. hAtom has also been morphed by Microsoft to produce the Web Slice format, so you may be seeing that about. Atom conforms to WORK: there’s a “feed” meta header and zero or more “entry” items.

With Pipe Cleaner I’m trying not only to make a way where feeds and other data can be remixed, but also to make it easy to do so! To do that, I’ve decided that by default, even though you are working with (say) OPML or RSS, we’ll translate all the terms to their Atom equivalents as best as possible. You’ll have to read the spec yourselves, but here’s a quick rundown of common elements, not all required by any means:

  • author, with possible sub-fields uri and email
  • content – the body
  • summary – a summary of the body; currently my feeling is that content & summary must always be HTML
  • updated – when last updated
  • created – when created; assumed to be the same as updated if not present
  • link – the main URI
  • links – for alternate URIs (this is a variance from the Atom spec; it should be easy to find the main URI for an element; I may reconsider this before release)
  • id – a unique identifier
  • category – tags, encoded in a sub-field term

Note that I’m not slavish about making the output conformant to all the SHOULDs, MUSTs, etc. that are in the Atom spec: my pragmatic programming approach says “do the best we can” and if the user needs better, they can walk the extra mile.
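As a rough illustration of that translation step, here is a minimal Python 3 sketch that renames the fields of an RSS 2.0 item to the Atom terms listed above. The function name rss_item_to_atom and the plain-dict input are assumptions made for this example; this is not Pipe Cleaner’s actual code, and it follows the same “do the best we can” approach rather than the full Atom spec.

```python
from email.utils import parsedate_to_datetime


def rss_item_to_atom(item):
    """Translate one RSS 2.0 item (a plain dict) into Atom-style terms."""
    entry = {}
    if "title" in item:
        entry["title"] = item["title"]
    if "link" in item:
        entry["link"] = item["link"]
    if "guid" in item:
        entry["id"] = item["guid"]
    if "description" in item:
        entry["summary"] = item["description"]
    if "pubDate" in item:
        # RSS dates are RFC 822 style; Atom-style output here is ISO 8601.
        entry["updated"] = parsedate_to_datetime(item["pubDate"]).isoformat()
    # RSS has no creation date, so "created" is assumed to equal "updated".
    if "updated" in entry:
        entry["created"] = entry["updated"]
    # Tags are encoded in a "term" sub-field, one dict per category.
    categories = item.get("category", [])
    if not isinstance(categories, list):
        categories = [categories]
    if categories:
        entry["category"] = [{"term": term} for term in categories]
    return entry


if __name__ == "__main__":
    sample = {
        "title": "Atom as a Rosetta Stone for WORK objects",
        "link": "http://example.com/post",
        "guid": "http://example.com/post",
        "description": "<p>Summary</p>",
        "pubDate": "Mon, 19 Jan 2009 06:31:00 -0500",
        "category": ["demo", "work"],
    }
    print(rss_item_to_atom(sample))
```

Anything the source item does not carry is simply left out of the result, so the caller decides how to handle missing terms.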

Here’s some examples of data that’s been run through Pipe Cleaner, translating to Atom upon input and translating back to whatever is needed upon output. The JSON (actually pretty printed JSON) output is the most instructive for what’s going on inside Pipe Cleaner.

RSS Feed


Note how the OPML is “flattened”, with hierarchy being encoded into the Category. This can be turned off if needed.

hCard microformat (in HTML)

Note the neat namespacing in the RSS output. The OPML is almost devoid of useful information; further consideration is needed.

hCalendar microformat (in HTML)

Similar to hCard. We’ll probably also (or exclusively) encode the hCalendar data in an xCal extension.

hAtom microformat (in HTML)

hAtom -> RSS is basically turning an hAtom page into a feed!

Source example

Since no blog post is complete without a little source code, here’s a Pipe Cleaner script to parse the hCard document. If you’re following closely, the output format is selected by the user at runtime. All the other scripts are of similar terseness.

import module:api_microformat;
api_microformat.HCard uri:"http://tantek.com/" to:items meta:meta;

December 18, 2008

Pipe Cleaner

demo,djolt,dqt,html / javascript,ideas,jd,maps,pipe cleaner,pybm,work · David Janes · 6:38 pm ·

I’ve been working (in my decreasingly available spare time) on pulling together, into a project called “Pipe Cleaner”, all the various concepts I’ve been mentioning on this blog: Web Object Records (WORK) for API Access and object manipulation, Djolt for generating text from templates, Data/Query/Transform/Template (DQT) for transforming data and JD for scripting these elements together. The pieces came together enough this morning to put a demo together, and here it is – the Toronto Fires Pt II Demo.

How, you may ask, does this differ from the original Toronto Fires Demo? The answer is how it is put together, which we describe here.


This is the Djolt template that generates the output. The data fed to this template is generated by the JD script, described in the next section.

    <link rel="stylesheet" type="text/css" href="css.css" />
    {{ gmaps.js|safe }}
<div id="content_wrapper">
    <div id="map_wrapper">
        {{ gmaps.html|safe }}
    </div>
    <div id="text_wrapper">
{% for incident in incidents %}
    <div id="{{ incident.IncidentNumber }}">
        {{ incident.body_sb|safe }}
    </div>
{% endfor %}
    </div>
</div>

Quite simple … as you can see, most of the data is being pulled in from elsewhere. The elsewhere is provided by the script described in the next section.


This is the script that pulls all the pieces together. Note that I’m not 100% happy with the way the data is imported; I would like the geocoding to become part of this data flow too. In the next release perhaps.

First we pull in the “fire” module that we wrote in the previous Map examples. This is doing exactly what you think: importing a Python module. We may have to increase the security or restrict this to working with an API for general purpose use.

import module:"fire";

Next we define two headers – one that is going to appear in the Google Maps popup, the next that is going to appear in the sidebar. They need to be different as they refer to themselves. Note that the sidebar header “breaks” the encapsulation of Google Maps – this seems to be unavoidable. The to:"fitem.head.map" and to:"fitem.head.sb" are manipulating a WORK dictionary to store values.

Note also here that we’ve extended JD to accept Python multiline strings – this was unavoidable if JD was to be useful to me.

set to:"fitem.head.map" value:"""
<a href="#{{ IncidentNumber }}">{{ AlarmLevel}}: {{ IncidentType }} on {{ RawStreet }}</a>
""";

set to:"fitem.head.sb" value:"""
{% if latitude and longitude %}
<a href="javascript:js_maps.map.panTo(new GLatLng({{ latitude }}, {{ longitude }}))">*</a>
{% endif %}
<a href="#{{ IncidentNumber }}">{{ AlarmLevel}}: {{ IncidentType }} on {{ RawStreet }}</a>
""";

The next block defines the text of the body used to describe a fire incident. It follows much the same pattern as the previous block.

set to:"fitem.body" value:"""
Alarm Level: {{ AlarmLevel }}
<br />
Incident Type: {{ IncidentType }}
<br />
City: {{ City }}
<br />
Street: {{ Street }} ({{ CrossStreet }})
<br />
Units: {{ Units }}
""";

This is a map: it is translating the values in fire.GetGeocodedIncidents into a new format and storing that in incidents. The format that we are storing it in is understood by the Google Maps generating module.

We may rename this translate, as the word map is somewhat overloaded.
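Before the JD statement itself, here is a rough Python 3 sketch of what a map of this kind does: every input row is run through a dictionary of small templates to produce a new row. The names render and map_items are invented for this illustration, and the tiny regex renderer only handles plain {{ name }} variables; it stands in for Djolt and does not implement filters such as |safe or the * indirection.

```python
import re

# Matches a plain template variable such as {{ latitude }} or {{ AlarmLevel}}.
_VARIABLE = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, row):
    """Replace each {{ name }} with the row's value, or "" if it is missing."""
    return _VARIABLE.sub(lambda m: str(row.get(m.group(1), "")), template)


def map_items(rows, mapping):
    """Build one new dict per row by rendering every template in mapping."""
    return [
        {key: render(template, row) for key, template in mapping.items()}
        for row in rows
    ]


if __name__ == "__main__":
    # Made-up sample rows in the shape the fire data appears to have.
    rows = [
        {"AlarmLevel": 1, "IncidentType": "Alarm", "RawStreet": "KING ST",
         "IncidentNumber": "F08123456"},
    ]
    mapping = {
        "title": "{{ AlarmLevel}}: {{ IncidentType }} on {{ RawStreet }}",
        "uri": "#{{ IncidentNumber }}",
    }
    print(map_items(rows, mapping))
```

Unresolved variables render as the empty string rather than raising, which matches the forgiving attitude WORK takes toward missing data.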

map from:"fire.GetGeocodedIncidents" to:"incidents" map:{
    "latitude" : "{{ latitude }}",
    "longitude" : "{{ longitude }}",
    "title" : "{{ AlarmLevel}}: {{ IncidentType }} on {{ RawStreet }}",
    "uri" : "{{ HOME_URI }}#{{ IncidentNumber }}",
    "body" : "{{ *fitem.head.map|safe }}{{ *fitem.body|safe }}",
    "body_sb" : "{{ *fitem.head.sb|safe }}{{ *fitem.body|safe }}",
    "IncidentNumber" : "{{ IncidentNumber }}"
};

Next we set up the “meta” (see WORK meta description if you’re not following along) for the maps. The render_value:true declaration makes PC interpret the templates in strings. We then call our Google Maps generating code (which is actually more Pipe Cleaners) and that gets fed to the Djolt template we first showed you. Clear? Maybe not, we’ll have more examples coming…

set to:"map_meta" render_value:true value:{
    "id" : "maps",
    "latitude" : 43.67,
    "longitude" : -79.38,
    "uzoom" : -13,
    "gzoom" : 13,
    "api_key" : "{{ cfg.gmaps.api_key|otherwise:'...mykey...' }}",
    "html" : {
        "width" : "1024px",
        "height" : "800px"
    }
};

load template:"gmaps.js" items:"incidents" meta:"map_meta";
load template:"gmaps.html" items:"incidents" meta:"map_meta";

December 11, 2008

A brief survey of Yahoo Pipes as a DQT

demo,djolt,dqt,ideas,semantic web,work · David Janes · 7:19 am ·

Yahoo Pipes is a visual editor for mashups, allowing you to take data from sources on the net, transform it in various interesting ways and output the result as Atom, RSS or JSON. The primary downside of Pipes, of course, is that you’re totally dependent on Yahoo for the infrastructure: it runs at Yahoo, pulling feeds that have to be accessible through the public Internet.

It’s easy to use Pipes: just go to this page and start working with the sample example Pipe. You’ll need a Yahoo login ID, but most of us have that anyway. I’ve created an example that uses Yahoo Pipes to feed a Djolt template which you can see here.

We can analyze Pipes in terms of the DQT paradigm we’ve outlined in the previous post.

Data Sources and Queries

Sources and Queries are merged (quite logically) in the Pipes interface. You can read in depth documentation here.

  • Fetch CSV
  • Feed Autodiscovery – outputs syndication feeds found on a page (RSS feeds on a CBC page)
  • Fetch Feed
  • Fetch Page – will read a page and parse the contents with a regex
  • Fetch Site Feed – this is the logical combination of Fetch Feed and Feed Autodiscovery
  • Flickr – find images by tag near a location (photos of cats in Toronto)
  • Google Base – look up information in Google Base
  • Item Builder – a way of building new items from existing items
  • Yahoo Local
  • Yahoo Search


The operator documentation can be read here.

  • Count
  • Filter
  • Location Extractor – a geocoder that magically looks for locations
  • Loop
  • Regex
  • Rename
  • Reverse
  • Sort
  • Split
  • Sub-element – pulls a particular sub-element of an item and makes that the item. This is very much like WORK path manipulation
  • Tail
  • Truncate
  • Union
  • Unique
  • Web Service

Plus a number of specialized data services, for dealing with elements such as dates.


Pipes does not provide an arbitrary Djolt-like template producing HTML. Instead, they provide a number of pre-made code templates that output well known data types, including RSS, JSON and Atom (and some stranger choices, like PHP).

December 8, 2008

Coding backwards for simplicity

djolt,dqt,ideas,pybm,python,work · David Janes · 4:58 pm ·

I haven’t been posting here as much as I’d like for the last three weeks, not for lack of ideas but because I haven’t been able to consolidate what I’ve been working on into a coherent thought. I’m trying to come up with an overarching conceptual arch that covers WORK, Djolt and the various API interfaces I’ve been coding. Tentatively and horribly, I’m calling this Data/Query/Transform/Template right now, though I’m expecting this to change.

The first demo of this … without further explanation … can be seen here. More details about what this is actually demonstrating (besides formatting this blog) will be forthcoming.

What I want to draw attention to in this post is how I coded this. What I’ve been doing for the last several weeks is coding backwards: I start with what I want the final code to look like and then figure out all the libraries, little languages and so forth that would be needed to code that. After several false starts, my conceptual logjam broke about a week ago and code started radically simplifying.

The ideal code, in my mind, is almost entirely static declarations: no loops, no if statements, no while statements, no goto-type statements (god help us). We simply specify how the parts are connected, and hope that we can abstract the complexity into the libraries that make this all happen. The code that you see below is actually post all my conceptualizing: I just wanted to write some code and since I had almost all the parts together it fell together quite nicely:

import bm_wsgi
import bm_io

import djolt
import api_feed

from bm_log import Log

class Application(bm_wsgi.SimpleWrapper):
    def __init__(self, *av, **ad):
        bm_wsgi.SimpleWrapper.__init__(self, *av, **ad)

    def CustomizeSetup(self):
        self.html_template_src = bm_io.readfile("index.dj")
        self.html_template = djolt.Template(self.html_template_src)

        self.context = djolt.Context()
        self.context["paramd"] = {
            "feed" : "http://feeds.feedburner.com/DavidJanesCode",
            "template" : """\
{% for item in data.items %}
	<li><a href="{{ item.link }}">{{ item.title }}</a></li>
{% endfor %}
""",
        }
        self.context["paramd"] = self.paramd
        self.context["data"] = api_feed.RSS20(self.context.as_string("paramd.feed"))

    def CustomizeContent(self):
        yield   self.html_template.Render(self.context)

if __name__ == '__main__':
    pass  # the launch code is not shown in this post

There’s almost nothing there! In particular, note:

  • bm_wsgi.SimpleWrapper handles all the WSGI interface work, including determining when to output HTML headers, error trapping, and Unicode to UTF-8 encoding
  • the most complicated part of the application is setting up the Context. In particular, note that self.paramd is automatically populated by the QUERY_STRING passed to the application, and the double setting we do here allows us to have default values.
  • If you want to see the HTML template that drives the application it is here. Note two variations from Django templates: the {% asis %} block, which doesn’t interpret its content as Djolt code, and the {{ *paramd.template|safe }} variable, which interprets the variable’s contents as a template.
  • Methods called Customize-something are my convention for framework functions, i.e. methods that will be called for us rather than methods we call.
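The “double setting” of paramd can be pictured as defaults overlaid with whatever arrives in the query string. The following Python 3 sketch shows that idea only; build_paramd is a hypothetical helper, and how bm_wsgi.SimpleWrapper actually populates self.paramd is not shown in this post.

```python
from urllib.parse import parse_qs


def build_paramd(query_string, defaults):
    """Overlay QUERY_STRING values on a dictionary of default parameters."""
    paramd = dict(defaults)
    for key, values in parse_qs(query_string).items():
        # A key may repeat in a query string; this sketch keeps the last value.
        paramd[key] = values[-1]
    return paramd


if __name__ == "__main__":
    defaults = {
        "feed": "http://feeds.feedburner.com/DavidJanesCode",
        "template": "<li>{{ item.title }}</li>",
    }
    # Only "feed" is overridden; "template" keeps its default.
    print(build_paramd("feed=http%3A%2F%2Fexample.com%2Frss", defaults))
```

Keeping the defaults in a plain dictionary means the application still renders something sensible when it is called with no parameters at all.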

November 28, 2008

Djolt – Django-like Templates

djolt,pybm,python,work · David Janes · 4:34 pm ·

Djolt is a reimplementation of Django’s template language in Python. Why do this?

  • I like the Django template language
  • I wanted something small and independent of Django
  • I wanted something that will work with WORK paths (this was the real deal breaker for using Django)
  • I wanted something that I could take and reimplement in Javascript and maybe Java too
  • Some template engines, Cheetah for example, are far too heavy for the kind of light-weight applications I have in mind; note that I’ve had great success with Cheetah in the past
  • Some template engines, such as that in Python 2.6, are far too underfeatured

However, if you’re really looking for the whole Django template experience and don’t want to use Djolt, just start here.

How do I get it?

Djolt is packaged as part of the pybm library.

How do I use it?

import djolt

t = djolt.Template("""
{% for name in names %}
<li>{{ name }}</li>
{% endfor %}
""")
print t.Render({
    "names" : [ "Johnny", "Jack", "Ray", "Mary & Sam", ]
})

Which gives the results:

<li>Mary &amp; Sam</li>

Note the “autoescaping” of the & character.
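For readers unfamiliar with autoescaping, the effect is the same as passing each value through an HTML escaper before it is written into the page. This Python 3 sketch uses the standard library’s html.escape to mimic it; render_names is a made-up helper for illustration, not part of Djolt.

```python
from html import escape


def render_names(names):
    """Render each name as a list item, escaping HTML-special characters."""
    return "\n".join("<li>%s</li>" % escape(name) for name in names)


if __name__ == "__main__":
    print(render_names(["Johnny", "Jack", "Ray", "Mary & Sam"]))
```

Values that are already HTML would be double-escaped by this, which is the situation the safe filter exists for.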

What tags does it define?

  • autoescape/endautoescape
  • if/else/endif
  • equal/endequal
  • for/endfor
  • notequal/notendequal

It does not implement blocks.

What filters does it define?

  • add
  • cut
  • default (see otherwise below)
  • default_if_none
  • divisibleby
  • first
  • join
  • last
  • length
  • length_is
  • linebreaks
  • lower
  • pluralize
  • random
  • safe (respecting all the Django autoescape rules)
  • slug
  • upper

Unimplemented filters are due to laziness and will be done “on demand”. We also introduce a few new filters:

  • jslug – like slug, but more Javascript friendly
  • otherwise – like default, except the empty string/empty values trigger the filter also

Are there differences between Djolt and Django templates?

  • Djolt tags suck up whitespace if they’re on a line by themselves
  • If Djolt cannot resolve a variable, it resolves to the appropriate “empty” value (as opposed to failing). This is keeping in line with WORK philosophy

Beyond that you should be able to use most Django template examples (that don’t use block/implements) as-is.

Is it extensible?

Yes. You can add your own tags and filters by following the examples in code (djolt_nodes.py and djolt_filters.py respectively).

November 22, 2008

Toronto Fires

demo,djolt,ideas,maps,work · David Janes · 4:19 pm ·

Here’s a little mashup I’ve been putting together for the last few days: Toronto Fires.

It’s taking the data listed here on the City of Toronto’s Fire Services “Active Accidents”, scraping it (by pretending HTML is XHTML and treating it as WORK objects), geocoding it (using our WORK Google API) and mapping it (using this information).

This is very much a work in progress, but here’s a few more things that are involved:

  • we read body.table.tr.td[1].table.tr[1].td.table.tr as a list to get the rows in the table
  • we map those rows into the Geocoder using a new magic technology we’ll be explaining in the next few days: Djolt, Django-like templates
  • the output program is just one big Djolt template

I’m not quite satisfied with how the current page is constructed: I want the final result to be much simpler.

November 21, 2008

WORK paths

ideas,work · David Janes · 4:03 pm ·

A WORK object is simply a way of looking at any JSON-like dictionary – it’s an “attitude”. The primary difference is in how we use that dictionary, especially in the context of using APIs. Here’s a rough overview:

  • the WORK is the interface – we don’t need to write specialized methods to deal with an API because we know what we’re looking for anyway. It’s not like we spend weeks looking at our API interface – typically, it’d be more like minutes (once it’s coded) in a typical programming session
  • what is in the WORK is defined by how we want to use it. For example, if we want an integer, we ask for an integer from the WORK: it may be stored as a string, a boolean, an integer, or a float; it doesn’t matter. When using WORK objects we expect the conversion to be done for us at runtime
  • if you are looking for a list in the WORK and there’s another type of object, we pretend that it’s in a list of length 1
  • if you are looking for an object (by key) and you find a list, look in the first object in the list

These rules are written from a pragmatic vision of using APIs: data is sometimes in lists, sometimes it isn’t. Sometimes we know the types being sent on the wire, sometimes it’s just strings.
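To make the “ask for the type you want” rule concrete, here is a small Python 3 sketch of two coercion helpers. Both as_integer and as_list are hypothetical names, and choices such as returning a fallback for unparseable values or an empty list for None are assumptions of this sketch, not documented WORK behaviour.

```python
def as_integer(value, otherwise=None):
    """Coerce a wire value (string, bool, int or float) to an int."""
    if isinstance(value, int):
        # Covers bool as well, since bool is a subclass of int.
        return int(value)
    if isinstance(value, str):
        value = value.strip()
    try:
        return int(float(value))
    except (TypeError, ValueError, OverflowError):
        # Assumption: unparseable values fall back instead of raising.
        return otherwise


def as_list(value):
    """Pretend any non-list value is a list of length 1."""
    if value is None:
        return []  # assumption: nothing at all becomes an empty list
    return value if isinstance(value, list) else [value]


if __name__ == "__main__":
    print(as_integer("22"), as_integer(22.0), as_integer(True))
    print(as_list("Skiing"), as_list(["Skiing", "Windsurfing"]))
```

The caller states the type it needs and gets the best available answer, whatever the API happened to put on the wire.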

In order to use WORK objects efficiently, we define a “dot-path” for accessing items that may be hierarchically nested. We’ll address the type coercion issue in another post. To illustrate our point, we’ll be working with the following WORK object:

{
  "name" : "Sally Jones",
  "age" : 22,
  "hobbies" : [ "Skiing", "Windsurfing", "Chillaxing" ],
  "sites" : {
    "facebook" : "http://www.facebook.com",
    "gmail" : "http://gmail.com"
  },
  "address" : [
    {
      "street" : "1 Main Street",
      "city" : "Toronto",
      "province" : "Ontario"
    },
    {
      "street" : "RR4",
      "city" : "Bala",
      "province" : "Ontario"
    }
  ]
}

Here’s a few dot-paths and the value they’ll retrieve:

  • path: name
    value: "Sally Jones"
  • path: hobbies
    value: [ "Skiing", "Windsurfing", "Chillaxing" ]
  • path: hobbies[1]
    value: "Windsurfing"
  • path: address.street
    value: "1 Main Street"
    … this demonstrates seeing a list where we want a dictionary and just looking at the first object in the list
  • path: sites[0].facebook
    value: "http://www.facebook.com"
    … this is an example of looking for a list, not finding it and assuming there’s a list of length 1 there. sites[1].facebook would return null.
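The lookups above can be reproduced with a short resolver. This is a minimal Python 3 sketch written for this post’s examples only: work_get is a hypothetical name, it supports just key and key[index] steps, and the real pybm implementation may behave differently in cases not listed here.

```python
import re

# One path step: a key name optionally followed by an index, e.g. "hobbies[1]".
_STEP = re.compile(r"^([^\[\]]+)(?:\[(\d+)\])?$")


def work_get(obj, path):
    """Resolve a dot-path against a JSON-like object; None means not found."""
    current = obj
    for step in path.split("."):
        match = _STEP.match(step)
        if match is None:
            return None
        key, index = match.group(1), match.group(2)

        # Wanted a dictionary but found a list: look in the first object.
        if isinstance(current, list):
            current = current[0] if current else None
        if not isinstance(current, dict) or key not in current:
            return None
        current = current[key]

        if index is not None:
            # Wanted a list but found something else: treat it as length 1.
            if not isinstance(current, list):
                current = [current]
            position = int(index)
            if position >= len(current):
                return None
            current = current[position]
    return current


if __name__ == "__main__":
    sally = {
        "name": "Sally Jones",
        "hobbies": ["Skiing", "Windsurfing", "Chillaxing"],
        "sites": {"facebook": "http://www.facebook.com"},
        "address": [
            {"street": "1 Main Street", "city": "Toronto"},
            {"street": "RR4", "city": "Bala"},
        ],
    }
    for path in ("name", "hobbies[1]", "address.street",
                 "sites[0].facebook", "sites[1].facebook"):
        print(path, "->", work_get(sally, path))
```

Returning None for anything that cannot be resolved keeps callers free of try/except blocks, at the cost of not distinguishing a missing key from a stored null.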

November 18, 2008

Work API Teaser III – Google API

python,work · David Janes · 5:26 am ·

Here’s an example of implementing an API with many different endpoints. It’s the Google AJAX Search API which lets you access all of Google’s search engines programmatically! A few notes:

  • In the Javascript API Google provides “branding” functions to make sure search results are properly attributed. There doesn’t seem to be a corresponding AJAX call — that is, it’s probably implemented directly in the Javascript — but I’d still like to provide a corresponding function. It would be nice if API providers actually gave a branding end-point
  • The code doesn’t support (yet) multi-page results: coming soon
  • The clever bit is in _item_path, which describes how to pull WORK result objects out of the AJAX result
  • all this code is actually available right now, via SVN: the instructions are here. This library is standalone (and is in fact the basis for many of the other projects I have on Google code)
  • The Google API requires a _http_referer: the URL of the site that’s using the results
  • The Google API does not require an API key, but you can pass one (in the constructor or in individual search calls) under the key api_key. You can use the same API key that you’ve created for Google Maps.

Here’s the Google API class: quite simple. I’ll probably extend each individual search function to provide all the known parameters by name, rather than passing in a **ad catch-all.

class Google(bm_api.API):
    _base_query = {
        "v" : "1.0",
    }

    _item_path = "responseData.results"
    _meta_path = "responseData.cursor"
    _convert2work = bm_work.JSON2WORK()

    def __init__(self, _http_referer, **ad):
        bm_api.API.__init__(self, _http_referer = _http_referer, **ad)

    def WebSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/web"
        self.SearchOn(q = q, **ad)

    def LocalSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/local"
        self.SearchOn(q = q, **ad)

    def VideoSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/video"
        self.SearchOn(q = q, **ad)

    def BlogSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/blogs"
        self.SearchOn(q = q, **ad)

    def NewsSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/news"
        self.SearchOn(q = q, **ad)

    def BookSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/books"
        self.SearchOn(q = q, **ad)

    def ImageSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/images"
        self.SearchOn(q = q, **ad)

    def PatentSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/patentNew"
        self.SearchOn(q = q, **ad)
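To make the role of _item_path and _meta_path concrete, here is a minimal Python 3 sketch of splitting a decoded response into a meta object and a list of items. The split_response function and the sample data are made up for this example; how bm_api does this internally is not shown in this post and may differ.

```python
def _walk(obj, path):
    """Follow a dot-path of keys, looking inside the first element of lists."""
    for key in path.split("."):
        if isinstance(obj, list):
            obj = obj[0] if obj else None
        if not isinstance(obj, dict):
            return None
        obj = obj.get(key)
    return obj


def split_response(response, meta_path, item_path):
    """Return (meta, items) pulled from a decoded API response."""
    meta = _walk(response, meta_path) or {}
    items = _walk(response, item_path)
    if items is None:
        items = []
    elif not isinstance(items, list):
        # A single result is treated as a list of length 1.
        items = [items]
    return meta, items


if __name__ == "__main__":
    # Made-up response in the general shape of an AJAX search result.
    response = {
        "responseData": {
            "cursor": {"currentPageIndex": 0},
            "results": [{"title": "First"}, {"title": "Second"}],
        }
    }
    meta, items = split_response(
        response, "responseData.cursor", "responseData.results")
    print(meta)
    for item in items:
        print("-", item["title"])
```

With the extraction driven entirely by two path strings, supporting another endpoint comes down to declaring where its meta and items live.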

Here’s how you use it:

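import os
import pprint

api_key = os.environ["GMAPS_APIKEY"]
referer = "http://code.davidjanes.com"
query = "Paris Hilton"

api = Google(key = api_key, _http_referer = referer)
api.VideoSearch(query)  # the results shown below come from a video search

for item in api.IterItems():
    pprint.pprint(item)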

Here’s an example of the results, searching for “Paris Hilton” in Videos. I tried searching in Patents without luck.

{'@Index': 0,
 '@Page': 1,
 u'GsearchResultClass': u'GvideoSearch',
 u'content': u"Paris Hilton's new video clip for 'Nothing In This World'",
 u'duration': u'204',
 u'playUrl': u'http://www.youtube.com/v/...',
 u'published': u'Thu, 12 Oct 2006 09:33:23 PDT',
 u'publisher': u'www.youtube.com',
 u'rating': u'4.52872',
 u'tbHeight': u'240',
 u'tbUrl': u'http://0.gvt0.com/vi/Ki2M3-2W-cQ/0.jpg',
 u'tbWidth': u'320',
 u'title': u'Paris Hilton - Nothing In This World',
 u'titleNoFormatting': u'Paris Hilton - Nothing In This World',
 u'url': u'http://www.google.com/url?q=...',
 u'videoType': u'YouTube'}

November 12, 2008

Work API Teaser II – Praized API

demo,ideas,python,search,semantic web,work · David Janes · 6:46 pm ·

Implementing a merchant search using the Praized API took about 10 minutes (mainly finding the right documentation), using my WORK framework:

class PraizedMerchants(bm_api.API):
    """See: http://code.google.com/p/praized/wiki/A_Second_Tutorial_Search"""

    _uri_base = "http://api.praized.com/apitribe/merchants.xml"
    _meta_path = "community"
    _item_path = "merchants.merchant"
    _page_max_path = 'pagination.page_count'
    _page_max = -1

    def __init__(self, api_key, slug = "apitribe", **ad):
        bm_api.API.__init__(self, api_key = api_key, **ad)

        self._uri_base = "http://api.praized.com/%s/merchants.xml" % slug

    def CustomizePageURI(self, page_index):
        if page_index > 1:
            return  "page=%s" % page_index
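The CustomizePageURI hook only has to return the extra query fragment for a given page. As a rough picture of how a framework might use such a hook, here is a Python 3 sketch; page_uris and its parameters are invented for this illustration and are not the bm_api interface.

```python
from urllib.parse import urlencode


def page_uris(uri_base, query, customize_page_uri, page_max):
    """Yield one request URI per page, from page 1 up to page_max."""
    for page_index in range(1, page_max + 1):
        uri = uri_base + "?" + urlencode(query)
        # The hook returns nothing for pages that need no paging parameter.
        extra = customize_page_uri(page_index)
        if extra:
            uri += "&" + extra
        yield uri


if __name__ == "__main__":
    def praized_page(page_index):
        # Same rule as the class above: only pages after the first are marked.
        if page_index > 1:
            return "page=%s" % page_index

    query = {"q": "Bistro", "l": "Toronto"}
    for uri in page_uris("http://example.com/merchants.xml", query,
                         praized_page, 3):
        print(uri)
```

Keeping the paging rule in one small hook is what lets each API class stay almost purely declarative.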

Partially hardcoding ‘apitribe’ as a ‘community slug’ is probably a bad idea. Anyhoo, here’s how you call it…

import json
import os

api_key = os.environ["PRAIZED_APIKEY"]
api = PraizedMerchants(api_key = api_key, slug = "david-janess-code")
api.SearchOn(
    q = "Bistro",
    l = "Toronto",
)
for item in api.IterItems():
    print json.dumps(item, indent = 1)

… and a set of results, somewhat edited below. I’ll have to figure out what that “permalink” is all about (I’ve edited it to shorten it) … it could be something neat, but I haven’t quite grasped all the ins and outs of what Praized wants to accomplish as a business.

{
 "@Index": 0,
 "@Page": 1,
 "short_url": "http://przd.com/zAU-7",
 "pid": "af5bebd604f3d1517a8113e0a2e8cc58",
 "updated_at": "2008-10-04T20:49:34Z",
 "phone": "(416) 585-7896",
 "name": "Coffee Supreme Bistro",
 "created_at": "2008-10-04T20:49:34Z",
 "location": {
  "city": {
   "name": "Toronto"
  },
  "country": {
   "code": "CA",
   "name_fr": "Canada",
   "name": "Canada"
  },
  "longitude": "-79.384071",
  "regions": {
   "province": "Ontario"
  },
  "postal_code": "M5J 1T1",
  "latitude": "43.646347",
  "street_address": "40 University Avenue"
 }
}


ideas,python,semantic web,work · David Janes · 9:41 am ·

Following from the concepts I wrote about yesterday, here’s two examples of API parsers using a WORK model.

RSS 2.0

Class definition – that’s the whole thing there!:

class RSS20(API):
    _item_path = "channel.item"
    _meta_path = "channel"

    def __init__(self, uri):

        self._uri_base = uri

Using it:

api = RSS20(uri = 'http://feeds.feedburner.com/DavidJanesCode')
for item in api.IterItems():
    print "-", item['title']


- WORK - Web Object Records
- Syntax Error on Line 1
- Adding MapField to inputEx
- Switching between mapping APIs and universal zoom levels
- How to dynamically load map APIs
- How to use the Google Maps API
- How to use the Microsoft Virtual Earth API
- Tip - how to get your browser’s User Agent
- How to use the MapQuest API
- How to use the Yahoo Maps Service AJAX API
- How to detect internal link jumps
- GenX - first public demonstration
- Amazon’s OpenSearch: mostly useless
- More style updates
- How to do multi-column multilingual full text searching in Oracle
- Tip - fixing broken menus over form on IE6 and IE7
- New style for this weblog
- AUMFP - Demo
- Tip - use mod_rewrite to redirect to subdirectory
- AUMFP - The Almost Universal Microformats Parser

Amazon ECS

This will probably end up replacing PyECS!

Class definition:

class AmazonECS(API):
    _base_query = {
        "Sort" : "relevancerank",
        "Operation" : "ItemSearch",
        "Version" : "2008-08-19",
        "ResponseGroup" : [ "Small", ],
    }
    _uri_base = "http://ecs.amazonaws.com/onca/xml"
    _meta_path = "Items.Request"
    _item_path = "Items.Item"
    _page_max_path = 'Items.TotalPages'
    _item_max_path = 'Items.TotalResults'
    _page_max = -1

    def __init__(self, **ad):
        API.__init__(self, **ad)

    def CustomizePageURI(self, page_index):
        if page_index == 1:
            return None  # the first page needs no paging parameter

        return  "%s=%s" % ( "ItemPage", page_index )

Using it:

import os

api = AmazonECS(AWSAccessKeyId = os.environ["AWS_ECS_ACCESSKEYID"])
api.SearchOn(
    Keywords = "Larry Niven",
    SearchIndex = "Books",
    Condition = "New",
)
for item in api.IterItems():
    print "-", item['ItemAttributes.Title']

Results … note that this is fetching many pages of results:

- Fleet of Worlds
- Juggler of Worlds
- Escape from Hell
- Inferno
- N-Space
- The Ringworld Engineers (Ringworld)
- The Draco Tavern
- Legacy of Heorot: Legacy of Heorot
- Footfall
- The Burning City (Hardback)
- Protector
- Burning Tower
- Three Books of Known Space
- Ringworld Throne
- Tales of Known Space: The Universe of Larry Niven
- Scatterbrain
- Ringworld
- Lucifer's Hammer
... (continues) ...
