David Janes' Code Weblog

February 21, 2009

Using Pipe Cleaner to convert CSV list of Science Journals to an OPML subscription list

demo,pipe cleaner · David Janes · 3:43 pm ·

Here’s a Pipe Cleaner script that takes this text list of Science Journals and converts it into an OPML subscription list (here):

import module:api_csv.CSV;

CSV uri:'http://www.tictocs.ac.uk/text.php' delimeter:'\t';

items := map value:$items map:{
    "title" : "{{ C1 }}",
    "links" : {
        "href" : "{{ C2 }}",
        "type" : "text/xml",
        "rel" : "alternate",
    }
};

I decided not to use the “header name” feature of the CSV command because I had to remap anyway to create the links object. This has to be run with the following command (or from the web UI):

pc --format opml science-journals
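For readers without Pipe Cleaner handy, here is roughly what the map step does, sketched in plain Python 3. This is an illustration only, not Pipe Cleaner’s implementation: the function name rows_to_items and the sample rows are made up, and it assumes (as the script implies) that the first two tab-separated columns are the journal title and its feed URL.

```python
import csv
import io


def rows_to_items(text, delimiter="\t"):
    """Turn delimited text into a list of title/links records.

    Assumes column 1 is the title and column 2 is the feed URL,
    mirroring C1 and C2 in the script above.
    """
    items = []
    for row in csv.reader(io.StringIO(text), delimiter=delimiter):
        if len(row) < 2:
            continue  # skip blank or malformed lines
        items.append({
            "title": row[0],
            "links": {
                "href": row[1],
                "type": "text/xml",
                "rel": "alternate",
            },
        })
    return items


# Made-up sample data standing in for the real download.
sample = (
    "Journal of Examples\thttp://example.org/joe.xml\n"
    "Astrophysics Letters (sample)\thttp://example.org/apl.xml\n"
)
items = rows_to_items(sample)
```

Each record has the same title/links shape that the script builds, so it could be handed to any OPML writer that understands that shape.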

Of course, this is a little unwieldy in size so maybe you only want journals with “Astrophysics” in their title:

import module:api_csv.CSV;

CSV uri:'http://www.tictocs.ac.uk/text.php' delimeter:'\t';

items := map value:$items map:{
    "title" : "{{ C1 }}",
    "links" : {
        "href" : "{{ C2 }}",
        "type" : "text/xml",
        "rel" : "alternate",
    }
};

items := search value:$items for:"Astrophysics";

Cool, eh? Better still, this can be run entirely from the Web Interface with selectable strings, so (theoretically) a Pipe Cleaner user would have an API to this data.
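The search step can be pictured as a plain filter over the records. The sketch below is a guess at the behaviour, not Pipe Cleaner’s code: it only looks at the title and matches case-insensitively, and the real search command may well look at other fields or match differently. The sample titles are invented.

```python
def search(items, needle):
    """Keep records whose title contains needle.

    Case-insensitive here; Pipe Cleaner's own matching rules may differ.
    """
    needle = needle.lower()
    return [item for item in items if needle in item.get("title", "").lower()]


journals = [
    {"title": "Astrophysics and Space Science (sample)"},
    {"title": "Journal of Examples"},
]
matches = search(journals, "Astrophysics")
```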

January 26, 2009

Pipe Cleaner – a delicious example

demo,pipe cleaner · David Janes · 7:08 pm ·

This is going to be a very brief post: here’s how you use Pipe Cleaner to download every post in your delicious account tagged “python”. Outputting it as OPML, RSS, or Atom comes for free:

import module:api_delicious;
api_delicious.PostsList to:items tag:"python" authenticate:delicious;

January 25, 2009

Creating OPML subscription lists using Pipe Cleaner

authentication,demo,pipe cleaner,pybm,python · David Janes · 11:40 am ·

Here’s a neat API I completed this morning, called api_feeds. It takes a URL (or a list of them) and transforms them into:

  • the home page associated with the URL
  • the feed(s) for the URL
  • the name of the home page

If you’re following along at home, this is essentially the information needed for a single outline in an OPML subscription list.

Here’s a simple python example:

import pprint
import api_feeds

api = api_feeds.OneFeed()
api.request = {
    "uri" : "http://code.davidjanes.com/blog/2009/01/23/transparently-working-with-oauath/",
}

pprint.pprint(api.response, width = 1)

And here’s what the output looks like:

{'link': u'http://code.davidjanes.com/blog',
 'links': [{'href': u'http://feeds.feedburner.com/DavidJanesCode',
            'rel': 'alternate',
            'type': u'application/rss+xml'}],
 'title': u"David Janes' Code Weblog"}

There’s actually quite a bit going on here behind the scenes, most of it using code I didn’t initially write but have quite heavily hacked: the Universal Feed Parser and the Feed Finder.
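To make the link to OPML concrete, here is a small Python 3 sketch that turns a record like the one printed above into a single outline element. It is not the code behind api_opml; record_to_outline is a made-up name, and the attribute names simply copy the ones in the OPML shown later in this post.

```python
import xml.etree.ElementTree as ET


def record_to_outline(record):
    """Build one <outline> element from a OneFeed-style record.

    Attribute names follow the OPML printed later in this post
    (htmlUrl, rssUrl, text, type). Returns None if there is no feed.
    """
    links = record.get("links") or []
    if not links:
        return None
    return ET.Element("outline", {
        "htmlUrl": record["link"],
        "rssUrl": links[0]["href"],
        "text": record["title"],
        "type": "rss",
    })


record = {
    "link": "http://code.davidjanes.com/blog",
    "links": [{
        "href": "http://feeds.feedburner.com/DavidJanesCode",
        "rel": "alternate",
        "type": "application/rss+xml",
    }],
    "title": "David Janes' Code Weblog",
}
outline = record_to_outline(record)
xml_text = ET.tostring(outline, encoding="unicode")
```

Returning None for records without feeds is one way to get the “filter out items that don’t have feeds” behaviour: the caller just skips the empty results.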

What becomes really interesting is what happens when we combine this with other modules. Here’s an example of how we can build an OPML subscription list from all the posts I’ve tagged “python” and “django” in del.icio.us. The code looks up each link I’ve bookmarked, does the feed discovery above, filters out items that don’t have feeds, and outputs the result as OPML. Note the neat pipeline-like aspect of the code:

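import api_delicious
import api_feeds
import api_opml

# each stage gets its own name, so the modules are not shadowed
posts = api_delicious.PostsList(tag = "python django")
feeds = api_feeds.ManyFeeds(require_feed = True)
opml = api_opml.OPMLWriter()

# wire the stages together: delicious posts -> feed discovery -> OPML
feeds.items = posts.items
opml.items = feeds.items

print(opml.Produce())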

Producing the following OPML:

<opml encoding="utf-8" version="2.0">
  <head>
    <title>[Untitled]</title>
  </head>
  <body>
    <outline htmlUrl="http://push.cx"
      rssUrl="http://push.cx/feed"
      text="Push cx"
      type="rss"/>
    <outline htmlUrl="http://crankycoder.com"
      rssUrl="http://crankycoder.com/feed/"
      text="crankycoder.com"
      type="rss"/>
    <outline htmlUrl="http://blog.dowski.com"
      rssUrl="http://blog.dowski.com/feed/"
      text="the occasional occurrence"
      type="rss"/>
    <outline htmlUrl="http://www.b-list.org/feeds/entries/"
      rssUrl="http://feeds2.feedburner.com/b-list-entries"
      text="The B-List: Latest entries"
      type="rss"/>
    <outline htmlUrl="http://blog.thescoop.org"
      rssUrl="http://blog.thescoop.org/feed/"
      text="The Scoop"
      type="rss"/>
    <outline htmlUrl="http://effbot.org"
      rssUrl="http://effbot.org/zone/rss.xml"
      text="effbot.org"
      type="rss"/>
    <outline htmlUrl="http://blog.disqus.net"
      rssUrl="http://feeds.feedburner.com/BigHeadLabs"
      text="Disqus"
      type="rss"/>
    <outline htmlUrl="http://blog.ianbicking.org"
      rssUrl="http://blog.ianbicking.org/feed/atom/"
      text="Ian Bicking: a blog"
      type="rss"/>
    <outline htmlUrl="http://antoniocangiano.com"
      rssUrl="http://feeds.feedburner.com/ZenAndTheArtOfRubyProgramming"
      text="Zen and the Art of Programming"
      type="rss"/>
    <outline htmlUrl="http://www.carthage.edu/webdev"
      rssUrl="http://www.carthage.edu/webdev/?feed=rss2"
      text="carthage webdev"
      type="rss"/>
    <outline htmlUrl="http://www.eweek.com"
      rssUrl="http://www.eweek.com/rss-feeds-13.xml"
      text="Application Development - RSS Feeds"
      type="rss"/>
    <outline htmlUrl="http://jeffcroft.com/"
      rssUrl="http://feeds.feedburner.com/jeffcroft/blog"
      text="JeffCroft.com: Latest blog entries"
      type="rss"/>
  </body>
</opml>

This will be just as terse (terser, probably) when written as a Pipe Cleaner script; I’m just struggling over how to introduce the authentication code gracefully into the scripts.

January 23, 2009

Transparently working with OAuth

authentication,demo,pipe cleaner,pybm,python · David Janes · 5:03 am ·

This is part one of two posts I’m going to write about OAuth; the second will be somewhat more critical in tone. Before I criticize (and I know it’s hard to put together technologies like OAuth), I want to actually accomplish something with it, so that I at least appear to have somewhat of a clue about it. This is a report of what I’ve done.

bm_uri is a library and tool I’ve written for working with URIs, and in particular http:// and https:// URLs. Here are some of the advantages of using bm_uri over all the normal Python urllib and urllib2 methods:

  • downloads are cached: if a URL is temporarily unavailable, bm_uri returns the cached version; likewise, if the URL was downloaded recently, the cached version is returned rather than hitting the net again
  • downloads can be cooked, meaning converted into a more useful form such as TIDY-cleaned up HTML, JSON, Unicode text and so forth
  • bm_uri handles all the protocol stuff for you (such as User-Agent, Last-Modified and so forth) so you don’t have to
  • authentication is handled as “invisibly” as possible for you … at least after the initial setup
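The caching behaviour in the first bullet can be sketched in a few lines of Python 3. This is not bm_uri’s code: CachedLoader, max_age and the injected fetch function are all names invented for this illustration, and the cache lives in memory rather than on disk.

```python
import time


class CachedLoader:
    """Toy cache: reuse recent downloads, fall back to stale ones on error."""

    def __init__(self, fetch, max_age=300, clock=time.time):
        self.fetch = fetch        # callable: url -> content (may raise)
        self.max_age = max_age    # seconds a cached copy counts as fresh
        self.clock = clock
        self.cache = {}           # url -> (timestamp, content)

    def load(self, url):
        now = self.clock()
        cached = self.cache.get(url)
        if cached and now - cached[0] < self.max_age:
            return cached[1]      # fresh enough: don't hit the net
        try:
            content = self.fetch(url)
        except Exception:
            if cached:
                return cached[1]  # temporarily unavailable: serve stale copy
            raise
        self.cache[url] = (now, content)
        return content


# A fake fetcher that works once and then fails, to show the fallback.
calls = []


def fake_fetch(url):
    calls.append(url)
    if len(calls) > 1:
        raise IOError("network down")
    return "<html>hello</html>"


loader = CachedLoader(fake_fetch, max_age=0)
first = loader.load("http://example.org/")
second = loader.load("http://example.org/")  # fetch fails, stale copy returned
```

Passing the fetch function in, rather than calling urllib directly, keeps the sketch testable without a network connection.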

Here is an example of accessing an OAuth resource using bm_uri, returning my current location from Fire Eagle as a Python object. From a programming point of view, I believe I have reduced this to close to the minimum number of steps possible. Here’s the setup phase:

import bm_cfg
import bm_uri
import bm_oauth
import pprint

bm_cfg.cfg.initialize()

bm_oauth.OAuth(service_name = "fireeagle")

Here’s using it in code – note how there’s no reference to OAuth here whatsoever.

loader = bm_uri.JSONLoader('https://fireeagle.yahooapis.com/api/0.1/user.json?format=json')
loader.Load()

pprint.pprint(loader.GetCooked())

And here’s the output of the program:

{u'stat': u'ok',
 u'user': {u'location_hierarchy': [{u'best_guess': True,
         u'geometry': {u'coordinates': [-79.418426513699998,
                   43.731891632100002],
              u'type': u'Point'},
         u'id': 572261,
         u'label': None,
         u'level': 1,
         u'level_name': u'postal',
         u'located_at': u'2008-03-19T04:09:30-07:00',
...
         u'name': u'Canada',
         u'normal_name': None,
         u'place_id': u'EESRy8qbApgaeIkbsA',
         u'woeid': 23424775}],
     u'readable': True,
     u'writable': False}}

Gather information

The devil is in the details, obviously, and with OAuth the little satan is doing the initial setup. Here’s how I did this for Fire Eagle; there’ll be something analogous for whatever service you are using:

  • Log in or sign up (obviously)
  • Go to the Developers’ Page
  • Click on Create a New App
  • Copy the “Consumer Key” and the “Consumer Secret” … these will be long-ish strings of nonsense
  • Find out the Request Token URL, the Access Token URL, and the Authorization URL. These are public knowledge and for Fire Eagle are:
    • https://fireeagle.yahooapis.com/oauth/request_token
    • https://fireeagle.yahooapis.com/oauth/access_token
    • http://fireeagle.yahoo.net/oauth/authorize

Note how Yahoo has conveniently made that last URL similar looking to the others, but not quite the same. Thanks!

However you implement OAuth, you’re probably going to need to be able to persist information to disk or database. As documented here several weeks ago, we already have that covered with our bm_cfg module. In ~/.cfg/fireeagle.json, create the following JSON format file:

{
 "fireeagle": {
  "api_uri" : "https://fireeagle.yahooapis.com/",
  "oauth_access_token_url": "https://fireeagle.yahooapis.com/oauth/access_token",
  "oauth_authorization_url": "http://fireeagle.yahoo.net/oauth/authorize",
  "oauth_consumer_key": "ABCDEFGHIJKL",
  "oauth_consumer_secret": "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345",
  "oauth_token_url": "https://fireeagle.yahooapis.com/oauth/request_token"
 }
}

The only new item here is the api_uri: that’s the prefix of URLs that bm_uri will use OAuth with.
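If you want to sanity-check such a file before pointing bm_cfg at it, a few lines of standard-library Python 3 will do. This is a hedged sketch, not how bm_cfg reads its files: load_service_config, uses_oauth and REQUIRED_KEYS are invented names, and the prefix test is only my reading of “the prefix of URLs that bm_uri will use OAuth with”. Note that Python’s json module is strict, so a trailing comma after the last entry is an error.

```python
import json

# The keys used in the example file above.
REQUIRED_KEYS = (
    "api_uri",
    "oauth_access_token_url",
    "oauth_authorization_url",
    "oauth_consumer_key",
    "oauth_consumer_secret",
    "oauth_token_url",
)


def load_service_config(text, service_name):
    """Parse config JSON and return the section for one service.

    Raises KeyError if the section or any expected key is missing.
    """
    section = json.loads(text)[service_name]
    missing = [key for key in REQUIRED_KEYS if key not in section]
    if missing:
        raise KeyError("missing config keys: %s" % ", ".join(missing))
    return section


def uses_oauth(section, url):
    """True when url falls under the service's api_uri prefix."""
    return url.startswith(section["api_uri"])


# Placeholder values, as in the post.
sample = """{
 "fireeagle": {
  "api_uri": "https://fireeagle.yahooapis.com/",
  "oauth_access_token_url": "https://fireeagle.yahooapis.com/oauth/access_token",
  "oauth_authorization_url": "http://fireeagle.yahoo.net/oauth/authorize",
  "oauth_consumer_key": "ABCDEFGHIJKL",
  "oauth_consumer_secret": "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345",
  "oauth_token_url": "https://fireeagle.yahooapis.com/oauth/request_token"
 }
}"""
section = load_service_config(sample, "fireeagle")
```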

Set it up

Next you have to do all sorts of OAuth stuff to actually work with OAuth. If the why interests you, please go read the spec! I’m more of a how person myself, and this is what we need to do:

  • run: python bm_uri.py --service fireeagle --authorize
  • this will pop up a browser window; grant your application access and then…
  • run: python bm_uri.py --service fireeagle --exchange

And that’s it – you should now be able to just work with the Fire Eagle API in bm_uri without even having to know OAuth is there!

End notes

  • the current implementation only works with HTTP/REST GET; POST to come soon, DELETE and PUT as needed
  • bm_uri, bm_cfg and the rest of the code are freely licensed and available here. It is a constantly changing product, albeit converging on perfection in my own mind ;-)

January 19, 2009

Atom as a Rosetta Stone for WORK objects

demo,pipe cleaner,work · David Janes · 6:31 am ·

WORK (Web Object Records) is a way of describing messages we pass over the web: a single header object called the “meta” and zero or more objects called “items”. Each object can be encoded as a JSON record, though we can access individual items within each WORK object using a WORK Path, which allows quite a bit of latitude for type coercion and vagaries in packaging.
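As a rough picture of the idea (and emphatically not the real WORK Path implementation, whose syntax and coercion rules aren’t shown in this post), here is a Python 3 sketch of a meta/items pair and a forgiving dotted-path lookup. The name work_get, the dotted syntax and the “step into the first list element” rule are all assumptions made for illustration.

```python
def work_get(record, path, default=None):
    """Follow a dotted path through nested dicts.

    If a list is met along the way, step into its first element.
    Returns default when the path cannot be followed.
    """
    current = record
    for part in path.split("."):
        if isinstance(current, list):
            current = current[0] if current else None
        if not isinstance(current, dict) or part not in current:
            return default
        current = current[part]
    return current


# One "meta" header object and zero or more "items".
meta = {"title": "Example feed", "link": "http://example.org/"}
items = [
    {"title": "First entry", "author": {"name": "A. Writer"}},
    {"title": "Second entry", "links": [{"href": "http://example.org/2"}]},
]

author = work_get(items[0], "author.name")
href = work_get(items[1], "links.href")
email = work_get(items[0], "author.email", "")
```

The point of the forgiving lookup is that callers don’t need to know whether a value was packaged as a single object or a list of them.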

Pipe Cleaner is a project I’ve been working on for the last two months that allows one to script data using WORK, to accomplish tasks such as remixing and filtering RSS feeds, read or produce OPML, make JSON interfaces and so forth. I actually have one live deployment which I will blog about soon and hope to have it beta productized for March.

Atom is a standard for syndicating feeds, not dissimilar to RSS but with a richer, better described vocabulary. I already have one major “project” built around Atom: the hAtom microformat for describing microcontent and information that can be syndicated. hAtom has also been morphed by Microsoft to produce the Web Slice format, so you may be seeing that about. Atom conforms to WORK: there’s a “feed” meta header and zero or more “entry” items.

With Pipe Cleaner I’m trying not only to make a way for feeds and other data to be remixed, but also to make it easy to do so! To do that, I’ve decided that by default, even though you are working with (say) OPML or RSS, we’ll translate all the terms to their Atom equivalents as best as possible. You’ll have to read the spec yourselves, but here’s a quick rundown of common elements, not all required by any means:

  • author, with possible sub-fields uri and email
  • content – the body
  • summary – a summary of the body; currently my feeling is that content & summary must always be HTML
  • updated – when last updated
  • created – when created, assumed to be the same as updated if not present
  • link – the main URI
  • links – for alternate URIs (this is a variance from the Atom spec; it should be easy to find the main URI for an element; I may reconsider this before release)
  • id – a unique identifier
  • category – tags, encoded in a sub-field term

Note that I’m not slavish about making the output conformant to all the SHOULDs, MUSTs, etc. that are in the Atom spec: my pragmatic programming approach says “do the best we can” and if the user needs better, they can walk the extra mile.
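To give a feel for the “translate to Atom terms” step, here is a Python 3 sketch that renames the keys of an RSS-style item. The mapping table is my own guess at sensible equivalents (for instance pubDate to updated, guid to id), not Pipe Cleaner’s actual table, and rss_item_to_atomish is an invented name.

```python
# Assumed RSS -> Atom-style term mapping, for illustration only.
RSS_TO_ATOM = {
    "title": "title",
    "link": "link",
    "description": "summary",
    "pubDate": "updated",
    "guid": "id",
}


def rss_item_to_atomish(item):
    """Rename RSS item keys to Atom-style terms; unknown keys pass through."""
    entry = {}
    for key, value in item.items():
        if key == "category":
            # tags are encoded in a sub-field called term
            values = value if isinstance(value, list) else [value]
            entry["category"] = [{"term": term} for term in values]
        else:
            entry[RSS_TO_ATOM.get(key, key)] = value
    # created is assumed to equal updated when not present
    if "updated" in entry and "created" not in entry:
        entry["created"] = entry["updated"]
    return entry


item = {
    "title": "Hello",
    "link": "http://example.org/hello",
    "description": "<p>Hi</p>",
    "pubDate": "Thu, 13 Nov 2008 05:41:35 +0000",
    "guid": "http://example.org/hello",
    "category": ["python", "django"],
}
entry = rss_item_to_atomish(item)
```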

Here are some examples of data that’s been run through Pipe Cleaner, translating to Atom upon input and translating back to whatever is needed upon output. The JSON (actually pretty printed JSON) output is the most instructive for seeing what’s going on inside Pipe Cleaner.

RSS Feed

OPML Data

Note how the OPML is “flattened”, with hierarchy being encoded into the Category. This can be turned off if needed.
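Flattening can be pictured like this in Python 3. The sketch is an assumption about the general approach, not Pipe Cleaner’s code: in particular, joining folder names with “/” to form the category term is my choice, and the sample OPML uses the standard xmlUrl attribute with example.org addresses.

```python
import xml.etree.ElementTree as ET

OPML = """<opml version="2.0">
  <body>
    <outline text="Python">
      <outline text="Feed A" xmlUrl="http://example.org/a.xml"/>
      <outline text="Web">
        <outline text="Feed B" xmlUrl="http://example.org/b.xml"/>
      </outline>
    </outline>
  </body>
</opml>"""


def flatten(outline, path=()):
    """Yield one flat record per feed outline, folder path as category."""
    for child in outline.findall("outline"):
        if child.get("xmlUrl"):
            yield {
                "title": child.get("text"),
                "link": child.get("xmlUrl"),
                "category": [{"term": "/".join(path)}] if path else [],
            }
        else:
            # a folder: recurse, extending the category path
            for record in flatten(child, path + (child.get("text"),)):
                yield record


records = list(flatten(ET.fromstring(OPML).find("body")))
```

The nested folders disappear from the structure, but nothing is lost: each record still says where it came from.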

hCard microformat (in HTML)

Note the neat namespacing in the RSS output. The OPML is almost devoid of useful information, further consideration is needed.

hCalendar microformat (in HTML)

Similar to hCard. We’ll probably also (or exclusively) encode the hCalendar data in an xCal extension.

hAtom microformat (in HTML)

hAtom -> RSS is basically turning an hAtom page into a feed!

Source example

Since no blog post is complete without a little source code, here’s a Pipe Cleaner script to parse the hCard document. If you’re following closely, the output format is selected by the user at runtime. All the other scripts are of similar terseness.

import module:api_microformat;
api_microformat.HCard uri:"http://tantek.com/" to:items meta:meta;

December 23, 2008

Pipe Cleaner (II)

demo,maps,pipe cleaner · David Janes · 6:37 am ·

Here’s the latest evolution of Pipe Cleaner, mainly recorded here for historical interest. The big change is that there isn’t a separate outside template – everything is in the one index.jd file. The new directive is template, which can read and execute an outside module or actually produce the final output (as we see in the very last directive). I have not put this up as an independent demo.

#
#	Import the Python fire module
#	- used in: map from:"fire.GetGeocodedIncidents" to:"incidents"
#
import module:"fire";

#
#	Header for Google Maps popup
#	- used in: map from:"fire.GetGeocodedIncidents" to:"incidents"
#
#
set to:"fitem.head.map" value:"""
<h3>
<a href="#{{ IncidentNumber }}">{{ AlarmLevel}}: {{ IncidentType }} on {{ RawStreet }}</a>
</h3>
""";

#
#	Header for the sidebar
#	- used in: map from:"fire.GetGeocodedIncidents" to:"incidents"
#
set to:"fitem.head.sb" value:"""
<h3>
{% if latitude and longitude %}
<a href="javascript:js_maps.map.panTo(new GLatLng({{ latitude }}, {{ longitude }}))">*</a>
{% endif %}
<a href="#{{ IncidentNumber }}">{{ AlarmLevel}}: {{ IncidentType }} on {{ RawStreet }}</a>
</h3>
""";

#
#	Body for the Google Maps pop and the sidebar
#	- used in: map from:"fire.GetGeocodedIncidents" to:"incidents"
#
set to:"fitem.body" value:"""
<p>
Alarm Level: {{ AlarmLevel }}
<br />
Incident Type: {{ IncidentType }}
<br />
City: {{ City }}
<br />
Street: {{ Street }} ({{ CrossStreet }})
<br />
Units: {{ Units }}
</p>
""";

#
#	Convert all the incidents from the fire module
#	to the path 'incidents' using the mapping rules defined above
#
#	- incidents are used in "gmaps.js" and "gmaps.html"
#
map from:"fire.GetGeocodedIncidents" to:"incidents" map:{
	"latitude" : "{{ latitude }}",
	"longitude" : "{{ longitude }}",
	"title" : "{{ AlarmLevel}}: {{ IncidentType }} on {{ RawStreet }}",
	"uri" : "{{ HOME_URI }}#{{ IncidentNumber }}",
	"body" : "{{ *fitem.head.map|safe }}{{ *fitem.body|safe }}",
	"body_sb" : "{{ *fitem.head.sb|safe }}{{ *fitem.body|safe }}",
	"IncidentNumber" : "{{ IncidentNumber }}"
};

#
#	Load the 'gmaps' templates (for arbitrary geo-mapping),
#	using the 'incidents' for its items and the specified meta.
#
#	- used in in "gmaps.js" and "gmaps.html"
#
set to:"map_meta" value_render:true value:{
	"id" : "maps",
	"latitude" : 43.67,
	"longitude" : -79.38,
	"uzoom" : -13,
	"gzoom" : 13,
	"api_key" : "{{ cfg.gmaps.api_key|otherwise:'ABQIAAA...pIxzZQ' }}",
	"html" : {
		"width" : "1024px",
		"height" : "800px"
	}
};

#
#	Produce GMaps
#
template script:"gmaps" items:"incidents" meta:"map_meta";

#
#	Produce the final output
#
template value:"""
<html>
<head>
    <link rel="stylesheet" type="text/css" href="css.css" />
	{{ gmaps.js|safe }}
</head>
<body>
<div id="content_wrapper">
	<div id="map_wrapper">
		{{ gmaps.html|safe }}
	</div>
	<div id="text_wrapper">
{% for incident in incidents %}
	<div id="{{ incident.IncidentNumber }}">
		{{ incident.body_sb|safe }}
	</div>
{% endfor %}
</div>
</body>
</html>
""";

The gmaps.jd (imported in the second last directive) looks as follows (there will not be a test). It’s designed to be a universal “show a map and plot points on it” inclusion. I’ve added a few line breaks so the PRE box doesn’t break.

#
#
#
template to:"html" value:"""
<div id="id_{{ meta.id|jslug }}"
style="{% if meta.html.width %}width: {{ meta.html.width }};{% endif %}
{% if meta.html.height %} height: {{ meta.html.height }};{% endif %}
{% if meta.html.style %} style: {{ meta.html.style }};{% endif %}"
{% if meta.html.class %} class="{{ meta.html.class }}"{% endif %}
></div>
<script type="text/javascript">
js_{{ meta.id|jslug }}.onload();
</script>
""";

#
#
#
template to:"js" value:"""
<script
 type="text/javascript"
 src="http://maps.google.com/maps?file=api&v=2&key={{meta.api_key}}">
</script>
<script type="text/javascript">
js_{{ meta.id|jslug }} = {
 onload : function() {
  js_{{ meta.id|jslug }}.map = new GMap2(document.getElementById("id_{{ meta.id|jslug }}"));
  m = js_{{ meta.id|jslug }}.map;
  m.setCenter(new GLatLng({{ meta.latitude }}, {{ meta.longitude }}), {{ meta.gzoom }});

  // {{ items|length }} items follow
{% for itemd in items %}
{% if itemd.latitude and itemd.longitude %}

  // {{ itemd.title }}
  var ll = new GLatLng({{ itemd.latitude }}, {{ itemd.longitude }});
  var marker = js_{{ meta.id|jslug }}.make_marker(m, ll, "{{ itemd.body|safe|escapejs }}");
  m.addOverlay(marker);
{% else %}
  // an item is missing latitude or longitude
{% endif %}
{% endfor %}
 },

 make_marker : function(m, ll, html) {
  var marker = new GMarker(ll);
  GEvent.addListener(marker, "click", function() {
   m.openInfoWindowHtml(ll, html);
  });

  return marker;
 },

 end : 0
}
</script>
""";

December 22, 2008

Issues with utcoffset and pytz

demo,python · David Janes · 10:14 am ·

In the previous entry, we talked about the difficulty of finding out the delta from UTC for a timezone returned from the pytz module. In particular, consider the offset for St. John’s, Newfoundland, which should be -3:30.

dt_now = datetime.datetime.now()
tz = pytz.timezone('America/St_Johns')

offset = tz.utcoffset(dt_now)

Log(
    "using datetime.utcoffset",
    offset = format(offset),
)

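With the unexpected result (-12660 seconds is -3:31, which the format helper shown at the end of this post prints as -4:29 because it floors negative hours):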

  message: using datetime.utcoffset
  offset: -4:29 (-12660)

I did a fair bit of Google searching for an answer without finding a satisfactory result, so I did further research on my own. To find the correct offset value, I found that this works:

dt_sj = tz.localize(dt_now)
offset = dt_sj - pytz.UTC.localize(dt_now)

Log(
    "using delta to UTC",
    offset = format(offset),
)

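Which yields the correct magnitude (the sign comes out positive because the UTC datetime is subtracted from the St. John’s one, rather than the reverse):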

  message: using delta to UTC
  offset: 03:30 (12600)

Note that if you’re going to use the above method for finding deltas, you’re going to have to take Daylight Savings Time into consideration also. I have not done this here, as I’m a little pressed for time and just want to illustrate the problem.

The issue seems to be with the way that pytz uses the Olson database entry (from here) for St. John’s – and all other locations. It appears that pytz is using the first rule it sees, from 1884, rather than the rule for the date that was passed in. I think this is a bug.

#
# St John's has an apostrophe, but Posix file names can't have apostrophes.
# Zone  NAME        GMTOFF  RULES   FORMAT  [UNTIL]
Zone America/St_Johns   -3:30:52 -  LMT 1884
            -3:30:52 StJohns N%sT   1918
            -3:30:52 Canada N%sT    1919
            -3:30:52 StJohns N%sT   1935 Mar 30
            -3:30   StJohns N%sT    1942 May 11
            -3:30   Canada  N%sT    1946
            -3:30   StJohns N%sT

The setup code for the examples above is:

from bm_log import Log
import dateutil.parser
import pytz
import datetime

def format(td):
    seconds = td.seconds + td.days * ( 24 * 3600 )
    return  "%02d:%02d (%s)" % ( seconds // 3600, seconds % 3600 // 60, seconds, )
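One caveat about format: Python’s floor division rounds negative values toward minus infinity, so negative offsets that are not whole hours print oddly (-12660 seconds comes out as -4:29 rather than -3:31). A sign-aware variant could look like the Python 3 sketch below; format_offset is a name made up here, not part of the original code.

```python
import datetime


def format_offset(td):
    """Render a timedelta as [+-]HH:MM (total seconds).

    The sign is handled separately so hours and minutes are
    computed on the absolute value.
    """
    seconds = td.days * 24 * 3600 + td.seconds
    sign = "-" if seconds < 0 else "+"
    hours, remainder = divmod(abs(seconds), 3600)
    return "%s%02d:%02d (%d)" % (sign, hours, remainder // 60, seconds)


lmt = format_offset(datetime.timedelta(seconds=-12660))
nst = format_offset(datetime.timedelta(hours=-3, minutes=-30))
```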

Update 2010-03-09: This has been fixed in the code base and (presumably) will be in the next upcoming release.

Working with dates, times and timezones in Python

demo,python · David Janes · 7:37 am ·

Here are a few examples of working with dates, times and timezones in Python. We are using the following packages:

  • datetime (part of the standard Python distribution)
  • dateutil – for date parsing, though there’s a lot more depth to this package that I’m not touching here
  • pytz – for timezone handling, and specifically making available the Olson timezone database to Python

There’s a lot of complexity to working with datetimes in any language; I’m not going to get into that but would prefer instead to show a few practical examples. Keep the following in mind:

  • datetimes may or may not have timezones associated with them. If they do not, they are called “naive” and their meaning is effectively defined by the program; generally the assumption would be that a naive datetime is in the application’s current timezone or the user’s preferred timezone. In general, you want to work with non-naive datetimes
  • when working with datetimes, consider the strategy of converting everything to the universal UTC timezone, then converting back to the user’s timezone only when you need to display that to the user
  • if you are rolling your own code for handling dates, times and timezones and you haven’t done a lot of research, your implementation is garbage. Do yourself and everyone else a favor and use a library.
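The “convert to UTC, convert back only for display” strategy looks like this using nothing but the standard library (Python 3). Fixed offsets stand in for real timezones here purely to keep the sketch self-contained; for real zones with daylight saving rules you would use a timezone database such as pytz, as in the rest of this post.

```python
from datetime import datetime, timedelta, timezone

# A fixed -04:00 offset stands in for a real timezone database entry.
minus_four = timezone(timedelta(hours=-4))

naive = datetime(2008, 11, 13, 5, 41, 35)      # no tzinfo: meaning is up to us
aware = naive.replace(tzinfo=minus_four)       # fine for fixed offsets only

as_utc = aware.astimezone(timezone.utc)        # store and compute in UTC
for_user = as_utc.astimezone(timezone(timedelta(hours=-8)))  # display only
```

All three aware values describe the same instant; only the wall-clock rendering differs.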

Our standard imports. Log is from the pybm library and its purpose is rather obvious.

from bm_log import Log
import dateutil.parser
import pytz
import datetime

Here’s an example of parsing an e-mail or RSS type date using dateutil.

dts = "Thu, 13 Nov 2008 05:41:35 +0000"
dt = dateutil.parser.parse(dts)

Log(
    "Parsing an RFC type date",
    src = dts,
    dt = dt,
    iso = dt.isoformat(),
)
  message: Parsing an RFC type date
  dt: 2008-11-13 05:41:35+00:00
  iso: 2008-11-13T05:41:35+00:00
  src: Thu, 13 Nov 2008 05:41:35 +0000

Here’s an example of parsing an ISO Datetime

dts = '2008-11-13T05:41:35-0400'
dt = dateutil.parser.parse(dts)

Log(
    "Parsing an ISO Date with Timezone",
    src = dts,
    dt = dt,
    iso = dt.isoformat(),
)
  message: Parsing an ISO Date with Timezone
  dt: 2008-11-13 05:41:35-04:00
  iso: 2008-11-13T05:41:35-04:00
  src: 2008-11-13T05:41:35-0400

Here’s an example of parsing a naive datetime (one without a timezone).

dts = '2008-11-13T05:41:35'
dt = dateutil.parser.parse(dts)

Log(
    "Parsing an ISO Date without a Timezone",
    src = dts,
    dt = dt,
    iso = dt.isoformat(),
)
  message: Parsing an ISO Date without a Timezone
  dt: 2008-11-13 05:41:35
  iso: 2008-11-13T05:41:35
  src: 2008-11-13T05:41:35

Here are two similar examples, showing how to force the timezone if it’s not present. The timezone is applied in the first example, but not in the second, which already has one.

tz = pytz.timezone('America/Toronto')
dts = '2008-11-13T05:41:35'
dt = dateutil.parser.parse(dts)
if dt.tzinfo is None:
    # pytz zones should be attached with localize(), not replace()
    dt = tz.localize(dt)

Log(
    "Parsing an ISO Date without a Timezone BUT specifying default TZ",
    src = dts,
    dt = dt,
    iso = dt.isoformat(),
    tz = tz,
)

tz = pytz.timezone('America/Toronto')
dts = '2008-11-13T05:41:35-0400'
dt = dateutil.parser.parse(dts)
if dt.tzinfo is None:
    # not reached here: the parsed string already carries -0400
    dt = tz.localize(dt)

Log(
    "Parsing an ISO Date with a Timezone AND specifying default TZ",
    src = dts,
    dt = dt,
    iso = dt.isoformat(),
    tz = tz,
)
  message: Parsing an ISO Date without a Timezone BUT specifying default TZ
  dt: 2008-11-13 05:41:35-05:00
  iso: 2008-11-13T05:41:35-05:00
  src: 2008-11-13T05:41:35
  tz: America/Toronto

  message: Parsing an ISO Date with a Timezone AND specifying default TZ
  dt: 2008-11-13 05:41:35-04:00
  iso: 2008-11-13T05:41:35-04:00
  src: 2008-11-13T05:41:35-0400
  tz: America/Toronto

Update: here’s an example of moving datetimes to UTC and then to a different Timezone. Remember: you want your backend code to work with UTC datetimes for simplicity and correctness:

dts = '2008-11-13T05:41:35-0400'
dt_orig = dateutil.parser.parse(dts)
dt_utc = dt_orig.astimezone(pytz.UTC)

Log(
    "Changing a datetime to UTC",
    src = dts,
    dt_orig = dt_orig,
    dt_utc = dt_utc,
)

tz_vancouver = pytz.timezone('America/Vancouver')
dt_vancouver = dt_utc.astimezone(tz_vancouver)

Log(
    "Changing UTC datetime to a different timezone",
    dt_vancouver = dt_vancouver,
    dt_utc = dt_utc,
)
  message: Changing a datetime to UTC
  dt_orig: 2008-11-13 05:41:35-04:00
  dt_utc: 2008-11-13 09:41:35+00:00
  src: 2008-11-13T05:41:35-0400

  message: Changing UTC datetime to a different timezone
  dt_utc: 2008-11-13 09:41:35+00:00
  dt_vancouver: 2008-11-13 01:41:35-08:00

Here is an example of listing all “common” timezones using pytz. Note that “America” refers to the two continents, not the Irish word for the United States. Printing the actual timezone offset turned out to be a surprisingly complex task, which I will outline in a different blog post. For now, suffice it to say that with pytz you should try not to depend on utcoffset.

dt_now = datetime.datetime.now()

def tzname2offset(tzname):
    dt_in_utc = pytz.UTC.localize(dt_now)
    dt_in_tz = pytz.timezone(tzname).localize(dt_now)

    offset = dt_in_utc - dt_in_tz
    seconds = offset.seconds + offset.days * ( 24 * 3600 )

    return  "%02d:%02d" % ( seconds // 3600, seconds % 3600 // 60, )

Log(
    "Olson (pytz) common timezones and their UTC offsets",
    timezones = map(
        lambda tzname: ( tzname, tzname2offset(tzname), ),
        pytz.common_timezones,
    )
)
  message: Olson (pytz) common timezones and their UTC offsets
  timezones:
    [('Africa/Abidjan', '00:00'),
     ('Africa/Accra', '00:00'),
     ('Africa/Addis_Ababa', '03:00'),
     ('Africa/Algiers', '01:00'),
     ('Africa/Asmara', '03:00'),
...
     ('Pacific/Wake', '12:00'),
     ('Pacific/Wallis', '12:00'),
     ('US/Alaska', '-9:00'),
     ('US/Arizona', '-7:00'),
     ('US/Central', '-6:00'),
     ('US/Eastern', '-5:00'),
     ('US/Hawaii', '-10:00'),
     ('US/Mountain', '-7:00'),
     ('US/Pacific', '-8:00'),
     ('UTC', '00:00')]

December 18, 2008

Pipe Cleaner

demo,djolt,dqt,html / javascript,ideas,jd,maps,pipe cleaner,pybm,work · David Janes · 6:38 pm ·

I’ve been working (in my decreasingly available spare time) on a project called “Pipe Cleaner” that pulls together all the various concepts I’ve been mentioning on this blog: Web Object Records (WORK) for API access and object manipulation, Djolt for generating text from templates, Data/Query/Transform/Template (DQT) for transforming data, and JD for scripting these elements together. The pieces came together this morning well enough to put a demo together, and here it is: the Toronto Fires Pt II Demo.

How, you may ask, does this differ from the original Toronto Fires Demo? The answer is how it is put together, which we describe here.

Index.dj

This is the Djolt template that generates the output. The data fed to this template is generated by the JD script, described in the next section.

<html>
<head>
    <link rel="stylesheet" type="text/css" href="css.css" />
    {{ gmaps.js|safe }}
</head>
<body>
<div id="content_wrapper">
    <div id="map_wrapper">
        {{ gmaps.html|safe }}
    </div>
    <div id="text_wrapper">
{% for incident in incidents %}
    <div id="{{ incident.IncidentNumber }}">
        {{ incident.body_sb|safe }}
    </div>
{% endfor %}
</div>
</body>
</html>

Quite simple … as you can see, most of the data is being pulled in from elsewhere. The elsewhere is provided by the script described in the next section.

Index.jd

This is the script that pulls all the pieces together. Note that I’m not 100% happy with the way the data is imported; I would like the geocoding to become part of this data flow too. In the next release perhaps.

First we pull in the “fire” module that we wrote in the previous Map examples. This is doing exactly what you think: importing a Python module. We may have to increase the security or restrict this to working with an API for general purpose use.

import module:"fire";

Next we define two headers – one that is going to appear in the Google Maps popup, the next that is going to appear in the sidebar. They need to be different as they refer to themselves. Note that the sidebar header “breaks” the encapsulation of Google Maps – this seems to be unavoidable. The to:"fitem.head.map" and to:"fitem.head.sb" are manipulating a WORK dictionary to store values.

Note also here that we’ve extended JD to accept Python multiline strings – this was unavoidable if JD was to be useful to me.

set to:"fitem.head.map" value:"""
<h3>
<a href="#{{ IncidentNumber }}">{{ AlarmLevel}}: {{ IncidentType }} on {{ RawStreet }}</a>
</h3>
""";

set to:"fitem.head.sb" value:"""
<h3>
{% if latitude and longitude %}
<a href="javascript:js_maps.map.panTo(new GLatLng({{ latitude }}, {{ longitude }}))">*</a>
{% endif %}
<a href="#{{ IncidentNumber }}">{{ AlarmLevel}}: {{ IncidentType }} on {{ RawStreet }}</a>
</h3>
""";

The next block defines the text of the body used to describe a fire incident. It follows much the same pattern as the previous block.

set to:"fitem.body" value:"""
<p>
Alarm Level: {{ AlarmLevel }}
<br />
Incident Type: {{ IncidentType }}
<br />
City: {{ City }}
<br />
Street: {{ Street }} ({{ CrossStreet }})
<br />
Units: {{ Units }}
</p>
""";

This is a map: it is translating the values in fire.GetGeocodedIncidents into a new format and storing that in incidents. The format we are storing it in is understood by the Google Maps generating module.

We may rename this translate, as the word map is somewhat overloaded.

map from:"fire.GetGeocodedIncidents" to:"incidents" map:{
    "latitude" : "{{ latitude }}",
    "longitude" : "{{ longitude }}",
    "title" : "{{ AlarmLevel}}: {{ IncidentType }} on {{ RawStreet }}",
    "uri" : "{{ HOME_URI }}#{{ IncidentNumber }}",
    "body" : "{{ *fitem.head.map|safe }}{{ *fitem.body|safe }}",
    "body_sb" : "{{ *fitem.head.sb|safe }}{{ *fitem.body|safe }}",
    "IncidentNumber" : "{{ IncidentNumber }}"
};
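In spirit, the map directive renders each template in the mapping against every incoming record. The Python 3 sketch below is a toy stand-in, not Djolt or Pipe Cleaner: render only understands plain {{ name }} substitutions (no filters, no * references, no {% %} tags), and both function names and the sample incident are invented.

```python
import re


def render(template, record):
    """Replace {{ dotted.name }} with values from nested dicts.

    Unknown names render as an empty string.
    """
    def lookup(match):
        value = record
        for part in match.group(1).split("."):
            if not isinstance(value, dict) or part not in value:
                return ""
            value = value[part]
        return str(value)

    return re.sub(r"\{\{\s*([\w.]+)\s*\}\}", lookup, template)


def map_records(records, mapping):
    """Build a new record per input by rendering each template in mapping."""
    return [
        {key: render(template, record) for key, template in mapping.items()}
        for record in records
    ]


incidents = map_records(
    [{"AlarmLevel": 2, "IncidentType": "Fire", "RawStreet": "King St",
      "IncidentNumber": "F123"}],
    {
        "title": "{{ AlarmLevel}}: {{ IncidentType }} on {{ RawStreet }}",
        "IncidentNumber": "{{ IncidentNumber }}",
    },
)
```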

Next we set up the “meta” (see WORK meta description if you’re not following along) for the maps. (The render_value:true declaration makes PC interpret the templates in strings.) We then call our Google Maps generating code (which are actually more Pipe Cleaners) and that gets fed to the Djolt template we first showed you. Clear? Maybe not, we’ll have more examples coming…

set to:"map_meta" render_value:true value:{
    "id" : "maps",
    "latitude" : 43.67,
    "longitude" : -79.38,
    "uzoom" : -13,
    "gzoom" : 13,
    "api_key" : "{{ cfg.gmaps.api_key|otherwise:'...mykey...' }}",
    "html" : {
        "width" : "1024px",
        "height" : "800px"
    }
};

load template:"gmaps.js" items:"incidents" meta:"map_meta";
load template:"gmaps.html" items:"incidents" meta:"map_meta";

December 13, 2008

Brief notes on SIMILE Timeline

demo,html / javascript · David Janes · 8:01 am ·

SIMILE Timeline is “the Google Maps for time based events”. It used to be housed at MIT but has now graduated to Google Code. I’ve created an example application showing this year’s Oscar awards and a number of movies that are, umm, not all likely to be nominated.

  • the application source can be seen here; it is based on this demo and some work I had done previously. Note:
    • multiple scrolling bands linked together
    • custom icons
    • custom colors
  • we demonstrate populating the timeline widget using JSON data coded in the application; the most difficult part of putting this demo together was cutting and pasting all this data, a task we hope to make easier with our DQT code
  • the documentation for Timeline is starting to diverge from the source code; showEventText is now replaced by overview in the band creation code
  • there’s a large number of weaknesses (still) with Timeline for dealing with arbitrary data, these may be corrected in the Javascript code but I don’t have time to go through all this
    • note the incorrect placement of custom icons; using Firebug to inspect the HTML, I discovered that unfortunately everything is placed using style tags so it’s difficult to correct using CSS. Ideally I would like to be able to assign classes and ID tags to everything
    • there doesn’t seem to be an obvious way to control the widths of the labels; in fact, if I reduce the band spacing in the top band, the text starts to overlap in a horrible manner
    • popup information boxes get confused when there is too little space to display information
    • when I don’t add a description to events, it uses “undefined” (see the Oscar Nominations Period)
    • date display functions are inferring more resolution (i.e. an actual time as opposed to just the date) than I’m giving it

If anyone has corrections for me, I’ll update the demo.
