<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>David Janes&#039; Code Weblog &#187; python</title>
	<atom:link href="http://code.davidjanes.com/blog/category/python/feed/" rel="self" type="application/rss+xml" />
	<link>http://code.davidjanes.com/blog</link>
	<description>Just another WordPress weblog</description>
	<lastBuildDate>Sun, 11 Apr 2010 12:32:10 +0000</lastBuildDate>
	<generator>http://wordpress.org/?v=2.9.2</generator>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
			<item>
		<title>How to add XCode &#8216;#pragma mark&#8217;-like sections for Python code</title>
		<link>http://code.davidjanes.com/blog/2010/01/11/how-to-add-xcode-pragma-mark-like-sections-for-python-code/</link>
		<comments>http://code.davidjanes.com/blog/2010/01/11/how-to-add-xcode-pragma-mark-like-sections-for-python-code/#comments</comments>
		<pubDate>Mon, 11 Jan 2010 12:43:10 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[python]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=649</guid>
		<description><![CDATA[Just add in your Python comments:
# MARK: comment
# TODO: comment
# FIXME: comment
# !!!: comment
# ???: comment
From here.
]]></description>
			<content:encoded><![CDATA[<p>Just add in your Python comments:</p>
<pre># MARK: <em>comment</em>
# TODO: <em>comment</em>
# FIXME: <em>comment</em>
# !!!: <em>comment</em>
# ???: <em>comment</em></pre>
<p>From <a href="http://mail.python.org/pipermail/pythonmac-sig/2008-August/020375.html">here</a>.</p>
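<p>As an illustration, a module using these markers might look like the sketch below. The function names are made up for the example, and whether a given editor lists the markers in its navigation pop-up depends on the editor.</p>
<pre># MARK: Parsing

def parse_line(line):
    """Split a 'key=value' line into a (key, value) tuple."""
    key, _, value = line.partition("=")
    return key.strip(), value.strip()

# MARK: Formatting

# TODO: support custom separators
def format_pair(key, value):
    """Render a pair back into 'key=value' form."""
    return "%s=%s" % (key, value)

# FIXME: an empty value round-trips as 'key='
print(format_pair(*parse_line("name = David")))</pre>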
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2010/01/11/how-to-add-xcode-pragma-mark-like-sections-for-python-code/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>PIL, libjpeg, jpeg and Mac OS/X Snow Leopard</title>
		<link>http://code.davidjanes.com/blog/2009/11/16/pil-libjpeg-jpeg-and-mac-osx-snow-leopard/</link>
		<comments>http://code.davidjanes.com/blog/2009/11/16/pil-libjpeg-jpeg-and-mac-osx-snow-leopard/#comments</comments>
		<pubDate>Mon, 16 Nov 2009 12:47:50 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[macintosh]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=620</guid>
		<description><![CDATA[If you want to use the Python Imaging Library on Mac OS/X Snow Leopard, these instructions appear to be the best way to get libjpeg installed:
1. Download the source from http://libjpeg.sourceforge.net/
2. Extract, configure, make:

tar zxvf jpegsrc.v6b.tar.gz
cd jpeg-6b
cp /usr/share/libtool/config/config.sub .
cp /usr/share/libtool/config/config.guess .
./configure --enable-shared --enable-static
make

3. You may need to create the following directories:

sudo mkdir -p [...]]]></description>
			<content:encoded><![CDATA[<p>If you want to use the <a href="http://www.pythonware.com/products/pil/">Python Imaging Library</a> on Mac OS/X Snow Leopard, <a href="http://jetfar.com/libjpeg-and-python-imaging-pil-on-snow-leopard/">these instructions</a> appear to be the best way to get <code>libjpeg</code> installed:</p>
<blockquote><p>1. Download the source from <a href="http://libjpeg.sourceforge.net/">http://libjpeg.sourceforge.net/</a></p>
<p>2. Extract, configure, make:<br />
<code><br />
tar zxvf jpegsrc.v6b.tar.gz<br />
cd jpeg-6b<br />
cp /usr/share/libtool/config/config.sub .<br />
cp /usr/share/libtool/config/config.guess .<br />
./configure --enable-shared --enable-static<br />
make<br />
</code></p>
<p>3. You may need to create the following directories:<br />
<code><br />
sudo mkdir -p /usr/local/include<br />
sudo mkdir -p /usr/local/lib<br />
sudo mkdir -p /usr/local/man/man1<br />
</code></p>
<p>4. Now you can install it as usual.<br />
<code><br />
sudo make install</code></p></blockquote>
<p>I used to use <a href="http://www.finkproject.org/">Fink</a> on Leopard, but it didn&#8217;t seem to work too well this time. If you&#8217;ve previously made an attempt at installing PIL, make sure to <code>rm -rf build</code>.</p>
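<p>Before rebuilding PIL, it can help to confirm that the freshly installed library is visible. This is a minimal sketch using only the standard library; <code>have_libjpeg</code> is a name invented for this example, and a positive result does not guarantee that the PIL build will pick the library up.</p>
<pre>import ctypes.util

def have_libjpeg():
    """Return True if a JPEG shared library can be located."""
    # find_library returns a name or path on success, None otherwise
    return ctypes.util.find_library("jpeg") is not None

print("libjpeg found" if have_libjpeg() else "libjpeg not found")</pre>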
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2009/11/16/pil-libjpeg-jpeg-and-mac-osx-snow-leopard/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Django 1.1 and ImageField</title>
		<link>http://code.davidjanes.com/blog/2009/11/16/django-1-1-and-imagefield/</link>
		<comments>http://code.davidjanes.com/blog/2009/11/16/django-1-1-and-imagefield/#comments</comments>
		<pubDate>Mon, 16 Nov 2009 12:42:12 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[code fragments]]></category>
		<category><![CDATA[django]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=616</guid>
		<description><![CDATA[Having recently upgraded to Django 1.1, I suddenly started getting error messages that look like:
  File "/Library/Python/2.6/site-packages/django/db/models/fields/related.py", line 257, in __get__
    rel_obj = QuerySet(self.field.rel.to).get(**params)
  File "/Library/Python/2.6/site-packages/django/db/models/query.py", line 300, in get
    num = len(clone)
  File "/Library/Python/2.6/site-packages/django/db/models/query.py", line 81, in __len__
    self._result_cache = list(self.iterator())
 [...]]]></description>
			<content:encoded><![CDATA[<p>Having recently upgraded to Django 1.1, I suddenly started getting error messages that look like:</p>
<pre>  File "/Library/Python/2.6/site-packages/django/db/models/fields/related.py", line 257, in __get__
    rel_obj = QuerySet(self.field.rel.to).get(**params)
  File "/Library/Python/2.6/site-packages/django/db/models/query.py", line 300, in get
    num = len(clone)
  File "/Library/Python/2.6/site-packages/django/db/models/query.py", line 81, in __len__
    self._result_cache = list(self.iterator())
  File "/Library/Python/2.6/site-packages/django/db/models/query.py", line 251, in iterator
    obj = self.model(*row[index_start:aggregate_start])
  File "/Library/Python/2.6/site-packages/django/db/models/base.py", line 324, in __init__
    signals.post_init.send(sender=self.__class__, instance=self)
  File "/Library/Python/2.6/site-packages/django/dispatch/dispatcher.py", line 166, in send
    response = receiver(signal=self, sender=sender, **named)
  File "/Library/Python/2.6/site-packages/django/db/models/fields/files.py", line 368, in update_dimension_fields
    (self.width_field and not getattr(instance, self.width_field))
AttributeError: 'Icon' object has no attribute 'width'</pre>
<p>The issue turns out to be that you can&#8217;t just define the <code>ImageField</code> in your model; you also have to explicitly define the fields that will store the image&#8217;s width and height. The SQL generation tools for Django don&#8217;t do it for you.</p>
<p>For various reasons, I can&#8217;t do that at the moment, so I made the following hack, which I strongly recommend you don&#8217;t use (for efficiency reasons: with this, the height &amp; width have to be computed every time you access the image). This is added to <code>site-packages/django/db/models/fields/files.py</code> around line 367.</p>
<pre>if self.width_field and not hasattr(instance, self.width_field):
     dimension_fields_filled = False
else:
     dimension_fields_filled = not(
          (self.width_field and not getattr(instance, self.width_field))
          or (self.height_field and not getattr(instance, self.height_field))
     )</pre>
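<p>The reason the extra <code>hasattr</code> test matters can be shown without Django. In the sketch below, <code>FakeField</code> and <code>Icon</code> are stand-ins invented for illustration (they are not Django classes): <code>getattr</code> without a default raises <code>AttributeError</code> when the attribute is missing, which is the failure in the traceback above, while the guarded version simply reports the dimensions as not filled.</p>
<pre>class FakeField(object):
    """Stand-in for an ImageField configured with dimension field names."""
    width_field = "width"
    height_field = "height"

class Icon(object):
    """Stand-in model instance that never defined width/height."""
    pass

def dimensions_filled(field, instance):
    # Guard first: a missing attribute means "not filled", not an error.
    if field.width_field and not hasattr(instance, field.width_field):
        return False
    return not (
        (field.width_field and not getattr(instance, field.width_field))
        or (field.height_field and not getattr(instance, field.height_field))
    )

icon = Icon()
try:
    getattr(icon, FakeField.width_field)   # what the unpatched code does
except AttributeError as error:
    print("unguarded:", error)

print("guarded:", dimensions_filled(FakeField(), icon))</pre>
<p>Like the hack it mirrors, this only guards the width attribute; a model with a width but no height would still fail.</p>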
<p>The proper solutions probably involve:</p>
<ul>
<li>not adding the hack above and explicitly adding the fields, as per <a href="http://code.djangoproject.com/ticket/11196">here</a></li>
<li> updating the documentation (<a href="http://docs.djangoproject.com/en/dev/ref/models/fields/#imagefield">here</a> and <a href="http://docs.djangoproject.com/en/dev/topics/files/#using-files-in-models">here</a>) to say &#8220;you also have to add the fields to the DB&#8221;</li>
<li>making <code>syncdb</code>/<code>sql</code> automatically generate the width &amp; height fields</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2009/11/16/django-1-1-and-imagefield/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Fizz Buzz in one line of Python</title>
		<link>http://code.davidjanes.com/blog/2009/10/24/fizz-buzz-in-one-line-of-python/</link>
		<comments>http://code.davidjanes.com/blog/2009/10/24/fizz-buzz-in-one-line-of-python/#comments</comments>
		<pubDate>Sat, 24 Oct 2009 10:58:30 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[code fragments]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=589</guid>
		<description><![CDATA[Since Libin points to &#8220;Fizz Buzz&#8221; in one line of Ruby, I feel it&#8217;s only fair to do it in one line of Python:
print [ not i % 15 and "Fizz Buzz" or not i % 5 and "Buzz" or not i % 3 and "Fizz" or i for i in xrange(1, 101) ]
My preference [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://blog.libinpan.com/2008/02/fizz-buzz-fizzbuzz/">Since Libin points to &#8220;Fizz Buzz&#8221; in one line of Ruby</a>, I feel it&#8217;s only fair to do it in one line of Python:</p>
<pre>print [ not i % 15 and "Fizz Buzz" or not i % 5 and "Buzz" or not i % 3 and "Fizz" or i for i in xrange(1, 101) ]</pre>
<p>My preference is really to have a few more brackets in there for clarity, but apparently terseness is sometimes considered a virtue in and of itself. There are other implementations of this in one line of Python:</p>
<ul>
<li><a href="http://www.sogeti-phoenix.com/Blogs/category/FizzBuzz.aspx">Sogeti Phoenix</a></li>
<li><a href="http://codepad.org/xHBbgcnO">Scaevolus on Codepad</a> &#8211; have a drink before you look at this one</li>
</ul>
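<p>With a few more brackets, the same expression might read as follows. This sketch targets Python 3 (where <code>print</code> is a function and <code>range</code> replaces <code>xrange</code>) and uses conditional expressions instead of the <code>and</code>/<code>or</code> trick; <code>fizz_buzz</code> is just a name picked for the example.</p>
<pre>def fizz_buzz(limit=100):
    """Return the Fizz Buzz sequence for 1..limit as a list."""
    return [
        ("Fizz Buzz" if (i % 15 == 0) else
         ("Buzz" if (i % 5 == 0) else
          ("Fizz" if (i % 3 == 0) else i)))
        for i in range(1, limit + 1)
    ]

print(fizz_buzz(15))</pre>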
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2009/10/24/fizz-buzz-in-one-line-of-python/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Turning garbage &#8220;HTML&#8221; into XML parsable XHTML using Beautiful Soup</title>
		<link>http://code.davidjanes.com/blog/2009/02/05/turning-garbage-html-into-xml-parsable-xhtml-using-beautiful-soup/</link>
		<comments>http://code.davidjanes.com/blog/2009/02/05/turning-garbage-html-into-xml-parsable-xhtml-using-beautiful-soup/#comments</comments>
		<pubDate>Thu, 05 Feb 2009 11:56:41 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[html / javascript]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=471</guid>
		<description><![CDATA[Here&#8217;s our problem child HTML: Members of Provincial Parliament. Amongst the atrocities committed against humanity, we see:

use of undeclared namespaces in both tags (&#60;o:p&#62;) and attributes (&#60;st1:City w:st="on"&#62;)
XML processing instructions &#8211; incorrectly formatted! &#8211; dropped into the middle of the document in multiple places (&#60;?xml:namespace prefix = "o" ns = "urn:schemas-microsoft-com:office:office" /&#62;)
leading space before the [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s our problem child HTML: <a href="http://www.elections.on.ca/en-CA/Tools/MPP.htm">Members of Provincial Parliament</a>. Amongst the atrocities committed against humanity, we see:</p>
<ul>
<li>use of undeclared namespaces in both tags (<code>&lt;o:p&gt;</code>) and attributes (<code>&lt;st1:City w:st="on"&gt;</code>)</li>
<li>XML processing instructions &#8211; incorrectly formatted! &#8211; dropped into the middle of the document in multiple places (<code>&lt;?xml:namespace prefix = "o" ns = "urn:schemas-microsoft-com:office:office" /&gt;</code>)</li>
<li>leading space before the DOCTYPE</li>
</ul>
<p>This is so broken that even HTML TIDY chokes on it, producing a severely truncated file. This broken document did, however, give me an opportunity to play with the Python library <a href="http://www.crummy.com/software/BeautifulSoup/">Beautiful Soup</a>, which lists amongst its advantages:</p>
<ul>
<li>Beautiful Soup won&#8217;t choke if you give it bad markup. It yields a parse tree that makes approximately as much sense as your original document. This is usually good enough to collect the data you need and run away.</li>
<li>Beautiful Soup provides a few simple methods and Pythonic idioms for navigating, searching, and modifying a parse tree: a toolkit for dissecting a document and extracting what you need. You don&#8217;t have to create a custom parser for each application.</li>
<li>Beautiful Soup automatically converts incoming documents to Unicode and outgoing documents to UTF-8. You don&#8217;t have to think about encodings, unless the document doesn&#8217;t specify an encoding and Beautiful Soup can&#8217;t autodetect one. Then you just have to specify the original encoding.</li>
</ul>
<p>Alas, straight out of the box Beautiful Soup<em> didn&#8217;t</em> do it for me, perhaps because of some of my strange requirements (my data flow works something like this: raw document → XML → DOM parser → JSON). However, Beautiful Soup does provide the necessary calls to manipulate the document to do the trick. Here&#8217;s what I did:</p>
<p>First, we import Beautiful Soup and parse it to the object soup. We&#8217;re expecting an HTML node at the top, so we look for that.</p>
<pre>import BeautifulSoup
soup = BeautifulSoup.BeautifulSoup(raw)

if not hasattr(soup, "html"):
	return</pre>
<p>Next, we loop through every node in the document, using Beautiful Soup&#8217;s <code>findAll</code> interface. You will see several variants of this call here in the code. What we&#8217;re looking for is use of namespaces, which we then add to the HTML element as attributes using fake namespace declarations.</p>
<p>We need to find namespaces already declared:</p>
<pre>used = {}
for ns_key, ns_value in soup.html.attrs:
	if not ns_key.startswith("xmlns:"):
		continue

	used[ns_key[6:]] = 1</pre>
<p>Then we look for ones that are actually used:</p>
<pre>nsd = {}
for item in soup.findAll():
	name = item.name
	if name.find(':') &gt; -1:
		nsd[name[:name.find(':')]] = 1

	for name, value in item.attrs:
		if name.find(':') &gt; -1:
			nsd[name[:name.find(':')]] = 1</pre>
<p>Then we add all the missing namespaces to the HTML node.</p>
<pre>for ns in nsd.keys():
	if not used.get(ns):
		soup.html.attrs.append(( "xmlns:%s" % ns, "http://www.example.com#%s" % ns, ))</pre>
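<p>The same bookkeeping can be tried out without Beautiful Soup. The sketch below works on plain lists of names and attribute pairs; <code>missing_namespaces</code> is a helper invented for this example, and the <code>http://www.example.com#</code> URIs are the same placeholder declarations used above.</p>
<pre>def missing_namespaces(html_attrs, names):
    """Return fake xmlns declarations for prefixes used but not declared.

    html_attrs: (name, value) pairs found on the root element
    names: every tag and attribute name seen in the document
    """
    declared = set(
        key[len("xmlns:"):] for key, _ in html_attrs
        if key.startswith("xmlns:")
    )
    used = set(name.split(":", 1)[0] for name in names if ":" in name)
    return [
        ("xmlns:%s" % ns, "http://www.example.com#%s" % ns)
        for ns in sorted(used - declared)
    ]

print(missing_namespaces(
    [("xmlns:o", "urn:schemas-microsoft-com:office:office")],
    ["p", "o:p", "st1:City", "w:st"],
))</pre>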
<p>Next we look for attributes that aren&#8217;t properly XML declarations, e.g. HTML style <code>&lt;input checked /&gt;</code>-type items.</p>
<pre>for item in soup.findAll():
	for index, ( name, value ) in enumerate(item.attrs):
		if value is None:
			item.attrs[index] = ( name, name )</pre>
<p>Then we remove all nodes from the document that we aren&#8217;t expecting to see. If you keep the <code>script</code> tags you&#8217;re going to have to make sure that each node is properly CDATA encoded; I didn&#8217;t care about this so I just remove them.</p>
<pre>[item.extract() for item in soup.findAll('script')]
[item.extract() for item in soup.findAll(
    text = lambda text:isinstance(text, BeautifulSoup.ProcessingInstruction ))]
[item.extract() for item in soup.findAll(
    text = lambda text:isinstance(text, BeautifulSoup.Declaration ))]</pre>
<p>In the final step we convert the document to Unicode. This requires another step of post-processing: <code>html2xml</code> changes all entity uses that XML doesn&#8217;t recognize into a <code>&amp;#...;</code> style. E.g. we do change <code>&amp;nbsp;</code> but we don&#8217;t change <code>&amp;amp;</code>. At this point we now have a document that can be processed by standard DOM parsers (if you convert to UTF-8 bytes, sigh).</p>
<pre>cooked = unicode(soup)
cooked = bm_text.html2xml(cooked)</pre>
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2009/02/05/turning-garbage-html-into-xml-parsable-xhtml-using-beautiful-soup/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Creating OPML subscription lists using Pipe Cleaner</title>
		<link>http://code.davidjanes.com/blog/2009/01/25/creating-opml-subscription-lists-using-pipe-cleaner/</link>
		<comments>http://code.davidjanes.com/blog/2009/01/25/creating-opml-subscription-lists-using-pipe-cleaner/#comments</comments>
		<pubDate>Sun, 25 Jan 2009 16:40:30 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[authentication]]></category>
		<category><![CDATA[demo]]></category>
		<category><![CDATA[pipe cleaner]]></category>
		<category><![CDATA[pybm]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=433</guid>
		<description><![CDATA[Here&#8217;s a neat API I completed this morning, called api_feeds. It takes a URL (or a list of them) and transforms them into:

the home page associated with the URL
the feed(s) for the URL
the name of the home page

If you&#8217;re following along at home, this is essentially the information needed for a single outline in an [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a neat API I completed this morning, called <code>api_feeds</code>. It takes a URL (or a list of them) and transforms them into:</p>
<ul>
<li>the home page associated with the URL</li>
<li>the feed(s) for the URL</li>
<li>the name of the home page</li>
</ul>
<p>If you&#8217;re following along at home, this is essentially the information needed for a single <code>outline</code> in an <a href="http://www.opml.org/spec2#subscriptionLists">OPML subscription list</a>.</p>
<p>Here&#8217;s a simple python example:</p>
<pre>api = api_feeds.OneFeed()
api.request = {
    "uri" : "http://code.davidjanes.com/blog/2009/01/23/transparently-working-with-oauath/",
}

pprint.pprint(api.response, width = 1)</pre>
<p>And here&#8217;s what the output looks like:</p>
<pre>{'link': u'http://code.davidjanes.com/blog',
 'links': [{'href': u'http://feeds.feedburner.com/DavidJanesCode',
            'rel': 'alternate',
            'type': u'application/rss+xml'}],
 'title': u"David Janes' Code Weblog"}</pre>
<p>There&#8217;s actually quite a bit going on here behind the scenes, most of it using code I didn&#8217;t initially write but have quite heavily hacked: the <a href="http://www.feedparser.org/">Universal Feed Parser</a> and the <a href="http://pypi.python.org/pypi/feedfinder/1.371">Feed Finder</a>.</p>
<p>What becomes really interesting is what happens when we combine this with other modules. Here&#8217;s an example of how we can build an OPML subscription list from all the posts I&#8217;ve tagged &#8220;python&#8221; and &#8220;django&#8221; in <a href="http://delicious.com/dpjanes/python+django">del.icio.us</a>. The code looks up each link I&#8217;ve bookmarked, does the feed discovery above, filters out items that don&#8217;t have feeds, and outputs the result as OPML. Note the neat pipeline-like aspect of the code:</p>
<pre>api_delicious = api_delicious.PostsList(tag = "python django")
api_many = api_feeds.ManyFeeds(require_feed = True)
api_opml = api_opml.OPMLWriter()

api_many.items = api_delicious.items
api_opml.items = api_many.items

print api_opml.Produce()</pre>
<p>Producing the following OPML:</p>
<pre>&lt;opml encoding="utf-8" version="2.0"&gt;
  &lt;head&gt;
    &lt;title&gt;[Untitled]&lt;/title&gt;
  &lt;/head&gt;
  &lt;body&gt;
    &lt;outline htmlUrl="http://push.cx"
      rssUrl="http://push.cx/feed"
      text="Push cx"
      type="rss"/&gt;
    &lt;outline htmlUrl="http://crankycoder.com"
      rssUrl="http://crankycoder.com/feed/"
      text="crankycoder.com"
      type="rss"/&gt;
    &lt;outline htmlUrl="http://blog.dowski.com"
      rssUrl="http://blog.dowski.com/feed/"
      text="the occasional occurrence"
      type="rss"/&gt;
    &lt;outline htmlUrl="http://www.b-list.org/feeds/entries/"
      rssUrl="http://feeds2.feedburner.com/b-list-entries"
      text="The B-List: Latest entries"
      type="rss"/&gt;
    &lt;outline htmlUrl="http://blog.thescoop.org"
      rssUrl="http://blog.thescoop.org/feed/"
      text="The Scoop"
      type="rss"/&gt;
    &lt;outline htmlUrl="http://effbot.org"
      rssUrl="http://effbot.org/zone/rss.xml"
      text="effbot.org"
      type="rss"/&gt;
    &lt;outline htmlUrl="http://blog.disqus.net"
      rssUrl="http://feeds.feedburner.com/BigHeadLabs"
      text="Disqus"
      type="rss"/&gt;
    &lt;outline htmlUrl="http://blog.ianbicking.org"
      rssUrl="http://blog.ianbicking.org/feed/atom/"
      text="Ian Bicking: a blog"
      type="rss"/&gt;
    &lt;outline htmlUrl="http://antoniocangiano.com"
      rssUrl="http://feeds.feedburner.com/ZenAndTheArtOfRubyProgramming"
      text="Zen and the Art of Programming"
      type="rss"/&gt;
    &lt;outline htmlUrl="http://www.carthage.edu/webdev"
      rssUrl="http://www.carthage.edu/webdev/?feed=rss2"
      text="carthage webdev"
      type="rss"/&gt;
    &lt;outline htmlUrl="http://www.eweek.com"
      rssUrl="http://www.eweek.com/rss-feeds-13.xml"
      text="Application Development - RSS Feeds"
      type="rss"/&gt;
    &lt;outline htmlUrl="http://jeffcroft.com/"
      rssUrl="http://feeds.feedburner.com/jeffcroft/blog"
      text="JeffCroft.com: Latest blog entries"
      type="rss"/&gt;
  &lt;/body&gt;
&lt;/opml&gt;</pre>
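<p>How <code>api_opml.OPMLWriter</code> builds this output is not shown here. As a rough stand-in, an outline list of the same shape can be produced with the standard library. In this sketch, <code>write_opml</code> and the item keys <code>title</code>, <code>link</code> and <code>feed</code> are assumptions made for the example; note also that the OPML 2.0 specification names the feed attribute <code>xmlUrl</code>, so that is what the sketch emits rather than the <code>rssUrl</code> seen above.</p>
<pre>import xml.etree.ElementTree as ET

def write_opml(items, title="[Untitled]"):
    """Serialize feed descriptions as an OPML 2.0 subscription list."""
    opml = ET.Element("opml", version="2.0")
    head = ET.SubElement(opml, "head")
    ET.SubElement(head, "title").text = title
    body = ET.SubElement(opml, "body")
    for item in items:
        # One outline element per feed, as in a subscription list
        ET.SubElement(body, "outline", {
            "type": "rss",
            "text": item["title"],
            "htmlUrl": item["link"],
            "xmlUrl": item["feed"],
        })
    return ET.tostring(opml, encoding="unicode")

print(write_opml([
    {"title": "effbot.org", "link": "http://effbot.org",
     "feed": "http://effbot.org/zone/rss.xml"},
]))</pre>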
<p>This will be just as terse (terser, probably) when written as a Pipe Cleaner script; I&#8217;m just struggling over how to introduce the authentication code gracefully into the scripts.</p>
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2009/01/25/creating-opml-subscription-lists-using-pipe-cleaner/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Transparently working with OAuth</title>
		<link>http://code.davidjanes.com/blog/2009/01/23/transparently-working-with-oauath/</link>
		<comments>http://code.davidjanes.com/blog/2009/01/23/transparently-working-with-oauath/#comments</comments>
		<pubDate>Fri, 23 Jan 2009 10:03:45 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[authentication]]></category>
		<category><![CDATA[demo]]></category>
		<category><![CDATA[pipe cleaner]]></category>
		<category><![CDATA[pybm]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=414</guid>
		<description><![CDATA[This is part one of two posts I&#8217;m going to write about OAuth; the second will be somewhat more critical in tone. Before I criticize &#8211; and I know it&#8217;s hard to put together technological things like OAuth &#8211; I want to actually accomplish something with it, so that I at least appear to [...]]]></description>
			<content:encoded><![CDATA[<p>This is part one of two posts I&#8217;m going to write about <a href="http://oauth.net/">OAuth</a>; the second will be somewhat more critical in tone. Before I criticize &#8211; and I know it&#8217;s hard to put together technological things like OAuth &#8211; I want to actually accomplish something with it, so that I at least appear to have somewhat of a clue about it. This is a report of what I&#8217;ve done.</p>
<p><code>bm_uri</code> is a library and tool I&#8217;ve written for working with URIs, and in particular <code>http://</code> and <code>https://</code> URLs. Here are some of the advantages of using <code>bm_uri</code> over all the normal Python <code>urllib</code> and <code>urllib2</code> methods:</p>
<ul>
<li>downloads are cached: if a URL is temporarily not available, <code>bm_uri</code> will return the cached version; likewise, if it has been downloaded in the recent past, the cached version will be returned rather than hitting the net again</li>
<li>downloads can be <em>cooked</em>, meaning converted into a more useful form such as TIDY-cleaned up HTML, JSON, Unicode text and so forth</li>
<li><code>bm_uri</code> handles all the protocol stuff for you (such as User-Agent, Last-Modified and so forth) so you don&#8217;t have to</li>
<li>authentication is handled as &#8220;invisibly&#8221; as possible for you &#8230; at least after the initial setup</li>
</ul>
<p>Here is an example of accessing an OAuth resource using <code>bm_uri</code>, returning my current location from <a href="http://fireeagle.yahoo.net/">Fire Eagle</a> as a Python object. From a programming point of view, I believe I have reduced this to close to the minimum number of steps possible. Here&#8217;s the setup phase:</p>
<pre>import bm_cfg
import bm_uri
import bm_oauth
import pprint

bm_cfg.cfg.initialize()

bm_oauth.OAuth(service_name = "fireeagle")</pre>
<p>Here&#8217;s using it in code &#8211; note how there&#8217;s no reference to OAuth here whatsoever.</p>
<pre>loader = bm_uri.JSONLoader('https://fireeagle.yahooapis.com/api/0.1/user.json?format=json')
loader.Load()

pprint.pprint(loader.GetCooked())</pre>
<p>And here&#8217;s the output of the program:</p>
<pre>{u'stat': u'ok',
 u'user': {u'location_hierarchy': [{u'best_guess': True,
         u'geometry': {u'coordinates': [-79.418426513699998,
                   43.731891632100002],
              u'type': u'Point'},
         u'id': 572261,
         u'label': None,
         u'level': 1,
         u'level_name': u'postal',
         u'located_at': u'2008-03-19T04:09:30-07:00',
...
         u'name': u'Canada',
         u'normal_name': None,
         u'place_id': u'EESRy8qbApgaeIkbsA',
         u'woeid': 23424775}],
     u'readable': True,
     u'writable': False}}</pre>
<h4>Gather information</h4>
<p>The devil is in the details, obviously, and with OAuth the little satan is the initial setup. Here&#8217;s how I did this for Fire Eagle &#8211; there&#8217;ll be something analogous for whatever service you are using:</p>
<ul>
<li>Log in or sign up (obviously)</li>
<li>Go to the <a href="http://fireeagle.yahoo.net/developer">Developers&#8217; Page</a></li>
<li>Click on <a href="https://fireeagle.yahoo.net/developer/create">Create a New App</a></li>
<li>Copy the &#8220;Consumer Key&#8221; and the &#8220;Consumer Secret&#8221; &#8230; these will be long-ish strings of nonsense</li>
<li>Find out the Request Token URL, the Access Token URL, and the Authorization URL. These are public knowledge and for Fire Eagle are:
<ul>
<li>https://fireeagle.yahooapis.com/oauth/request_token</li>
<li>https://fireeagle.yahooapis.com/oauth/access_token</li>
<li>http://fireeagle.yahoo.net/oauth/authorize</li>
</ul>
</li>
</ul>
<p><em>Note how Yahoo has conveniently made that last URL similar looking to the others, but not quite the same. Thanks!</em></p>
<p>However you implement OAuth, you&#8217;re probably going to need to be able to persist information to disk or database. As documented here several weeks ago, <a href="http://code.davidjanes.com/blog/2009/01/09/thinking-about-configuration/">we already have that covered</a> with our bm_cfg module. In <code>~/.cfg/fireeagle.json</code>, create the following JSON format file:</p>
<pre>{
 "fireeagle": {
  "api_uri" : "https://fireeagle.yahooapis.com/",
  "oauth_access_token_url": "https://fireeagle.yahooapis.com/oauth/access_token",
  "oauth_authorization_url": "http://fireeagle.yahoo.net/oauth/authorize",
  "oauth_consumer_key": "ABCDEFGHIJKL",
  "oauth_consumer_secret": "ABCDEFGHIJKLMNOPQRSTUVWXYZ012345",
  "oauth_token_url": "https://fireeagle.yahooapis.com/oauth/request_token"
 }
}</pre>
<p>The only new item here is the <code>api_uri</code>: that&#8217;s the prefix of URLs that <code>bm_uri</code> will use OAuth with.</p>
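<p>As a sketch of how such a file might be consumed (this is not the actual <code>bm_cfg</code> or <code>bm_uri</code> code): load the JSON and treat any URL that starts with a service&#8217;s <code>api_uri</code> as one that needs OAuth. The function names <code>load_services</code> and <code>service_for</code> are invented for the example.</p>
<pre>import json

def load_services(text):
    """Parse a configuration document: a dictionary of dictionaries."""
    return json.loads(text)

def service_for(url, services):
    """Return the name of the service whose api_uri prefixes url, if any."""
    for name, settings in services.items():
        prefix = settings.get("api_uri")
        if prefix and url.startswith(prefix):
            return name
    return None

services = load_services("""
{
 "fireeagle": {
  "api_uri": "https://fireeagle.yahooapis.com/"
 }
}
""")

print(service_for("https://fireeagle.yahooapis.com/api/0.1/user.json", services))
print(service_for("http://fireeagle.yahoo.net/oauth/authorize", services))</pre>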
<h4>Set it up</h4>
<p>Next you have to do all sorts of OAuth stuff to actually work with OAuth. If the <em>why</em> interests you, <a href="http://oauth.net/core/1.0/">please go read the spec</a>! I&#8217;m more of a <em>how</em> person myself, and this is what we need to do:</p>
<ul>
<li>run: <code>python bm_uri.py --service fireeagle --authorize</code></li>
<li>this will pop up a browser window; grant your application access and then&#8230;</li>
<li>run: <code>python bm_uri.py --service fireeagle --exchange</code></li>
</ul>
<p>And that&#8217;s it &#8211; you should now be able to just work with the Fire Eagle API in bm_uri without even having to know OAuth is there!</p>
<h4>End notes</h4>
<ul>
<li>the current implementation only works with HTTP/<a href="http://en.wikipedia.org/wiki/Representational_State_Transfer">REST</a> GET; POST to come soon, DELETE and PUT as needed</li>
<li><code>bm_uri</code>, <code>bm_cfg</code> and the rest of the code are freely licensed and available <a href="http://code.google.com/p/pybm/">here</a>. It is a constantly changing product, albeit converging on perfection in my own mind ;-)</li>
</ul>
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2009/01/23/transparently-working-with-oauath/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Thinking about Configuration</title>
		<link>http://code.davidjanes.com/blog/2009/01/09/thinking-about-configuration/</link>
		<comments>http://code.davidjanes.com/blog/2009/01/09/thinking-about-configuration/#comments</comments>
		<pubDate>Fri, 09 Jan 2009 12:20:02 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[ideas]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=406</guid>
		<description><![CDATA[Happy New Year, everyone. I&#8217;ve been busy at paying work recently, plus cleaning up and testing existing code I&#8217;ve been discussing here over the last few months. At work I&#8217;ve been developing in WebObjects, which though a lovely platform is not the way of the future so I&#8217;m not documenting many of my experiences here.
The [...]]]></description>
			<content:encoded><![CDATA[<p>Happy New Year, everyone. I&#8217;ve been busy at paying work recently, plus cleaning up and testing existing code I&#8217;ve been discussing here over the last few months. At work I&#8217;ve been developing in WebObjects, which though a lovely platform is not the way of the future so I&#8217;m not documenting many of my experiences here.</p>
<p>The applications I&#8217;ve been working on recently, <a href="http://code.davidjanes.com/blog/category/pipe-cleaner/">Pipe Cleaner</a> and <a href="http://code.davidjanes.com/blog/category/genx/">GenX</a>, need &#8211; like most applications &#8211; configuration. This will store information which can be safely exposed to the public, such as my Google Maps API key, and information that I need to keep private within the application, such as my Freebase username and password (cf. however <a href="http://adactio.com/journal/1357/"><em>the password anti-pattern</em></a>). Furthermore, though the code I&#8217;m writing is in Python it is possible that the code that provides the UI will be written in another language, such as PHP inside of WordPress.</p>
<p>Given these considerations, here are my design choices:</p>
<ul>
<li>configuration files are stored as multiple individual files inside a directory (or directories)</li>
<li>configuration files are in JSON, and contain a dictionary of dictionaries (see below)</li>
<li>configuration files can be marked as private or public</li>
<li>the same logical configuration (say for Amazon, which has both public and private information) can be in a public and private file</li>
<li>the configuration is global, but is accessed through setter/getter properties</li>
<li>non-global versions of the configuration can be made</li>
</ul>
<p>That all said, here&#8217;s what I&#8217;ve written. First, <a href="http://wiki.python.org/moin/PythonDecoratorLibrary#PropertyDefinition">the setters and getters</a>:</p>
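<pre>import json
import os
import types

import bm_io
from bm_log import Log

class Cfg:
    def __init__(self):
        # per-instance dictionaries, so that non-global configurations
        # do not share state with the global one
        self._cfg_private = {}
        self._cfg_public = {}

    @apply
    def public():
        def fget(self):
            return  self._cfg_public

        return property(**locals())

    @apply
    def private():
        def fget(self):
            return  self._cfg_private

        return property(**locals())</pre>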
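<p>For readers on Python 3, where the <code>apply</code> built-in no longer exists, the same read-only accessors can be sketched with the <code>@property</code> decorator. This is an assumption about how one might port the class, not code from the original module; it creates the dictionaries per instance so that non-global configurations stay independent.</p>

```python
class Cfg:
    def __init__(self):
        # one pair of dictionaries per instance
        self._cfg_private = {}
        self._cfg_public = {}

    @property
    def public(self):
        # read-only accessor: no setter is defined, so cfg.public cannot be rebound
        return self._cfg_public

    @property
    def private(self):
        return self._cfg_private


cfg = Cfg()
cfg.private["amazon"] = {"Locale": "us"}

try:
    cfg.private = {}          # rebinding fails because there is no setter
    rebound = True
except AttributeError:
    rebound = False

print(rebound, cfg.private)
```

<p>The dictionaries themselves remain mutable; the property only prevents the attribute from being replaced wholesale.</p>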
<p>As an aside, I&#8217;m not 100% sure about Python decorators and wonder if my favorite language is being turned into a C++-like mess.</p>
<p>Next, the &#8216;add&#8217; function that adds information to the configuration ensuring private and public are handled correctly. Note that there can be multiple dictionaries inside of &#8216;d&#8217;, but &#8216;d&#8217; is either all Public or not.</p>
<pre>    def add(self, d):
        if type(d) != types.DictType:
            raise TypeError("only dictionaries can be added")

        if d.get('@Public'):
            #
            #   Public definitions never overwrite private definitions
            #
            for key, value in d.iteritems():
                if type(value) != types.DictType:
                    continue

                if not self._cfg_private.has_key(key):
                    self._cfg_private[key] = value

                self._cfg_public[key] = value
        else:
            self._cfg_private.update(d)</pre>
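<p>The merge rule is easier to see in isolation. What follows is a Python 3 restatement of the logic above (plain attributes instead of properties, <code>isinstance</code> instead of <code>types.DictType</code>), fed with the Amazon values shown further down in this post. Treat it as a sketch rather than the shipped code.</p>

```python
class Cfg:
    def __init__(self):
        self.private = {}
        self.public = {}

    def add(self, d):
        if not isinstance(d, dict):
            raise TypeError("only dictionaries can be added")

        if d.get("@Public"):
            for key, value in d.items():
                if not isinstance(value, dict):
                    continue                      # skips the "@Public" marker itself
                # public definitions never overwrite private ones
                self.private.setdefault(key, value)
                self.public[key] = value
        else:
            self.private.update(d)


private_file = {
    "amazon": {"Locale": "us", "AccessKeyID": "0......",
               "AssociateTag": "ona-20", "Private": "Don't See"},
}
public_file = {
    "@Public": 1,
    "amazon": {"Locale": "us", "AccessKeyID": "0......", "AssociateTag": "ona-20"},
}

cfg = Cfg()
cfg.add(private_file)
cfg.add(public_file)
print("Private" in cfg.private["amazon"])   # the private entry survived
print("Private" in cfg.public["amazon"])    # the public view never sees it

only_public = Cfg()
only_public.add(public_file)                # no private file available
print(only_public.private == only_public.public)
```

<p>Because the private branch uses <code>update</code>, loading the private file after the public one gives the same end state.</p>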
<p>And finally the loader, which gets everything in a directory or one level down. Note the &#8216;exception&#8217; parameter which makes me a bad person, but I don&#8217;t like code failing unless I tell it to.</p>
<pre>    def load(self, path, exception = False, depth = 0):
        try:
            if os.path.isdir(path) and depth &lt; 2:
                for file in os.listdir(path):
                    self.load(os.path.join(path, file), exception = exception, depth = depth + 1)
            elif os.path.isfile(path):
                if path.endswith(".json"):
                    self.add(json.loads(bm_io.readfile(path)))

        except:
            if exception:
                raise

            Log("ignoring exception", exception = True, path = path)</pre>
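<p>The &#8220;directory or one level down&#8221; rule can also be written as a standalone generator. This is a Python 3 sketch; <code>iter_json_files</code>, <code>subdir_levels</code> and the temporary directory layout are names invented for the illustration, not part of the module.</p>

```python
import json
import os
import tempfile


def iter_json_files(path, subdir_levels=1):
    """Yield the .json files in path and in at most subdir_levels levels below it."""
    for name in sorted(os.listdir(path)):
        child = os.path.join(path, name)
        if os.path.isfile(child) and child.endswith(".json"):
            yield child
        elif os.path.isdir(child) and subdir_levels:
            # descend, spending one of the remaining levels
            yield from iter_json_files(child, subdir_levels - 1)


def write_json(path, data):
    os.makedirs(os.path.dirname(path), exist_ok=True)
    with open(path, "w") as fout:
        json.dump(data, fout)


with tempfile.TemporaryDirectory() as root:
    write_json(os.path.join(root, "gmaps.json"), {"gmaps": {}})
    write_json(os.path.join(root, "extra", "yahoo.json"), {"yahoo": {}})
    write_json(os.path.join(root, "extra", "deeper", "ignored.json"), {})
    found = [os.path.relpath(p, root) for p in iter_json_files(root)]

print(found)
```

<p>The file two levels down is never visited, because the level budget runs out before the walk reaches it.</p>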
<p>And one more thing: make the global configuration:</p>
<pre>cfg = Cfg()</pre>
<p>Here&#8217;s how you use it:</p>
<pre>import pprint
import sys

import bm_cfg

# setup ... on a per-file or directory basis
for file in sys.argv[1:]:
    bm_cfg.cfg.load(file)

# use it
pprint.pprint({
    "private" : bm_cfg.cfg.private,
    "public" : bm_cfg.cfg.public,
}, width = 1)</pre>
<p>Here&#8217;s what my configuration directory looks like:</p>
<pre>$ pwd
/Users/davidjanes/Sites/pc/cfg
$ ls
amazon.json		freebase.json		praized.json
amazon.public.json	gmaps.json		yahoo.json</pre>
<p>Here&#8217;s the (private) <code>amazon.json</code>:</p>
<pre>{
    "amazon" : {
        "Locale" : "us",
        "AccessKeyID" : "0......",
        "AssociateTag" : "ona-20",
        "Private" : "Don't See"
    }
}</pre>
<p>And here&#8217;s the (public) <code>amazon.public.json</code>:</p>
<pre>{
    "@Public" : 1,
    "amazon" : {
        "Locale" : "us",
        "AccessKeyID" : "0......",
        "AssociateTag" : "ona-20"
    }
}</pre>
<p>Note that if the private version of the Amazon file wasn&#8217;t available, the contents of the public version would also appear in the private configuration. I.e. the private configuration is basically &#8220;everything&#8221; (noting the possible exceptions in the code above).</p>
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2009/01/09/thinking-about-configuration/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Issues with utcoffset and pytz</title>
		<link>http://code.davidjanes.com/blog/2008/12/22/pytz-utcoffset/</link>
		<comments>http://code.davidjanes.com/blog/2008/12/22/pytz-utcoffset/#comments</comments>
		<pubDate>Mon, 22 Dec 2008 15:14:45 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[demo]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=384</guid>
		<description><![CDATA[In the previous entry, we talked about the difficulty in finding out the delta from UTC for a timezone returned from the pytz module. In particular, consider the offset for St. John&#8217;s, Newfoundland which should be at -3:30.
dt_now = datetime.datetime.now()
tz = pytz.timezone('America/St_Johns')

offset = tz.utcoffset(dt_now)

Log(
    "using datetime.utcoffset",
    offset = format(offset),
)
With [...]]]></description>
			<content:encoded><![CDATA[<p>In the <a href="http://code.davidjanes.com/blog/2008/12/22/working-with-dates-times-and-timezones-in-python/">previous entry,</a> we talked about the difficulty in finding out the delta from UTC for a timezone returned from the <a href="http://pytz.sourceforge.net/">pytz</a> module. In particular, consider the offset for <a href="http://en.wikipedia.org/wiki/St._John%27s,_Newfoundland_and_Labrador">St. John&#8217;s, Newfoundland</a> which should be at -3:30.</p>
<pre>dt_now = datetime.datetime.now()
tz = pytz.timezone('America/St_Johns')

offset = tz.utcoffset(dt_now)

Log(
    "using datetime.utcoffset",
    offset = format(offset),
)</pre>
<p>With the unexpected result:</p>
<pre class="result">  message: using datetime.utcoffset
  offset: -4:29 (-12660)</pre>
<p>I did a fair bit of Google searching for an answer without finding a satisfactory result, so I did further research on my own. To find the correct offset value, I found that this works:</p>
<pre>dt_sj = tz.localize(dt_now)
offset = dt_sj - pytz.UTC.localize(dt_now)

Log(
    "using delta to UTC",
    offset = format(offset),
)</pre>
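<p>Which yields the correct magnitude (the sign comes out positive because the UTC datetime is subtracted from the St. John&#8217;s one rather than the other way around):</p>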
<pre class="result">  message: using delta to UTC
  offset: 03:30 (12600)</pre>
<p>Note that if you&#8217;re going to use the above method for finding deltas, you&#8217;re going to have to take Daylight Saving Time into consideration also. I have not done this here, as I&#8217;m a little pressed for time and just want to illustrate the problem.</p>
<p>The issue seems to be with the way that pytz uses the Olson database entry (from <a href="ftp://elsie.nci.nih.gov/pub/tzarchive.gz">here</a>) for St. John&#8217;s &#8211; and all other locations. It appears that pytz is using the first rule it sees, from 1884, rather than the rule for the date that was passed in. I think this is a bug.</p>
<pre>#
# St John's has an apostrophe, but Posix file names can't have apostrophes.
# Zone  NAME        GMTOFF  RULES   FORMAT  [UNTIL]
Zone America/St_Johns   -3:30:52 -  LMT 1884
            -3:30:52 StJohns N%sT   1918
            -3:30:52 Canada N%sT    1919
            -3:30:52 StJohns N%sT   1935 Mar 30
            -3:30   StJohns N%sT    1942 May 11
            -3:30   Canada  N%sT    1946
            -3:30   StJohns N%sT</pre>
<p>The setup code for the examples above is:</p>
<pre>from bm_log import Log
import dateutil.parser
import pytz
import datetime

def format(td):
    seconds = td.seconds + td.days * ( 24 * 3600 )
    return  "%02d:%02d (%s)" % ( seconds // 3600, seconds % 3600 // 60, seconds, )</pre>
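<p>One caveat about the <code>format</code> helper: <code>//</code> and <code>%</code> round toward negative infinity, so a negative number of seconds is displayed misleadingly. The -12660 seconds reported earlier are really -3:31 (presumably the -3:30:52 local mean time rounded to the minute), not -4:29. A sign-aware variant, here called <code>format_offset</code> (a name made up for this Python 3 sketch), could look like this:</p>

```python
import datetime


def format_offset(td):
    total = int(td.total_seconds())
    # a negative timedelta differs from its absolute value
    sign = "-" if td != abs(td) else ""
    hours, remainder = divmod(abs(total), 3600)
    return "%s%02d:%02d (%s)" % (sign, hours, remainder // 60, total)


# floor division is what turns -3:31 into "-4:29"
print(-12660 // 3600, -12660 % 3600 // 60)

print(format_offset(datetime.timedelta(seconds=-12660)))
print(format_offset(datetime.timedelta(seconds=12600)))
```

<p>Formatting the absolute value and adding the sign separately keeps hours and minutes consistent for offsets west of UTC.</p>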
<p><strong>Update 2010-03-09</strong>: <a href="https://bugs.launchpad.net/pytz/+bug/310606">This has been fixed in the code base</a> and (presumably) will be in the next release.</p>
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2008/12/22/pytz-utcoffset/feed/</wfw:commentRss>
		<slash:comments>2</slash:comments>
		</item>
		<item>
		<title>Working with dates, times and timezones in Python</title>
		<link>http://code.davidjanes.com/blog/2008/12/22/working-with-dates-times-and-timezones-in-python/</link>
		<comments>http://code.davidjanes.com/blog/2008/12/22/working-with-dates-times-and-timezones-in-python/#comments</comments>
		<pubDate>Mon, 22 Dec 2008 12:37:56 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[demo]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=377</guid>
		<description><![CDATA[Here&#8217;s a few examples of working with dates, times and timezones in Python. We are using the following packages:

datetime (part of the standard Python distribution)
dateutil &#8211; for date parsing, though there&#8217;s a lot more depth to this package that I&#8217;m not touching here
pytz &#8211; for timezone handling, and specifically making available the Olson timezone database [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a few examples of working with dates, times and timezones in Python. We are using the following packages:</p>
<ul>
<li><a href="http://docs.python.org/library/datetime.html">datetime</a> (part of the standard Python distribution)</li>
<li><a href="http://labix.org/python-dateutil">dateutil</a> &#8211; for date parsing, though there&#8217;s a lot more depth to this package that I&#8217;m not touching here</li>
<li><a href="http://pytz.sourceforge.net/">pytz</a> &#8211; for timezone handling, and specifically making available the <a href="http://en.wikipedia.org/wiki/Zoneinfo">Olson timezone database</a> to Python</li>
</ul>
<p>There&#8217;s a lot of complexity to working with datetimes in any language; I&#8217;m not going to get into that but would prefer instead to show a few practical examples. Keep the following in mind:</p>
<ul>
<li>datetimes may or may not have timezones associated with them. If they do not, they are called &#8220;naive&#8221; and their meaning is effectively defined by the program. In general, you want to work with non-naive datetimes. Generally the assumption would be that the naive datetime is in the application&#8217;s current timezone or the user&#8217;s preferred timezone</li>
<li>when working with datetimes, consider the strategy of converting everything to the universal UTC timezone, then converting back to the user&#8217;s timezone only when you need to display that to the user</li>
<li>if you are rolling your own code for handling dates, times and timezones and you haven&#8217;t done a lot of research, your implementation is <strong>garbage</strong>. Do yourself and everyone else a favor and use a library.</li>
</ul>
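<p>The naive/aware distinction can be demonstrated with nothing but the standard library. The sketch below assumes Python 3 and uses a fixed -05:00 offset as a stand-in for Toronto in winter; a fixed offset knows nothing about daylight saving rules, which is what pytz is for.</p>

```python
import datetime

naive = datetime.datetime(2008, 11, 13, 5, 41, 35)

# fixed offset from the standard library: no DST rules attached
est = datetime.timezone(datetime.timedelta(hours=-5), "EST")

aware = naive.replace(tzinfo=est)
as_utc = aware.astimezone(datetime.timezone.utc)

try:
    aware - naive                 # mixing naive and aware values is refused
    mixing_allowed = True
except TypeError:
    mixing_allowed = False

print(naive.tzinfo, as_utc.isoformat(), mixing_allowed)
```

<p>The <code>TypeError</code> is the useful part: Python will not guess what a naive datetime means, so the program has to decide.</p>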
<p>Our standard imports. <code>Log</code> is from the <a href="http://code.google.com/p/pybm/">pybm</a> library and its purpose is rather obvious.</p>
<pre>from bm_log import Log
import dateutil.parser
import pytz
import datetime</pre>
<p>Here&#8217;s an example of parsing an e-mail or RSS type date using <code>dateutil</code>.</p>
<pre>dts = "Thu, 13 Nov 2008 05:41:35 +0000"
dt = dateutil.parser.parse(dts)

Log(
    "Parsing an RFC type date",
    src = dts,
    dt = dt,
    iso = dt.isoformat(),
)</pre>
<pre class="result">
  message: Parsing an RFC type date
  dt: 2008-11-13 05:41:35+00:00
  iso: 2008-11-13T05:41:35+00:00
  src: Thu, 13 Nov 2008 05:41:35 +0000
</pre>
<p>Here&#8217;s an example of parsing an ISO datetime:</p>
<pre>dts = '2008-11-13T05:41:35-0400'
dt = dateutil.parser.parse(dts)

Log(
    "Parsing an ISO Date with Timezone",
    src = dts,
    dt = dt,
    iso = dt.isoformat(),
)</pre>
<pre class="result">
  message: Parsing an ISO Date with Timezone
  dt: 2008-11-13 05:41:35-04:00
  iso: 2008-11-13T05:41:35-04:00
  src: 2008-11-13T05:41:35-0400
</pre>
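<p>If installing dateutil is not an option, the two fixed formats above can also be parsed with the Python 3 standard library. This is an aside, not a replacement: the dateutil parser accepts far more variations than the two shown here.</p>

```python
import datetime
from email.utils import parsedate_to_datetime

# e-mail / RSS style date
rfc = parsedate_to_datetime("Thu, 13 Nov 2008 05:41:35 +0000")

# ISO style date with a numeric offset; %z accepts "-0400"
iso = datetime.datetime.strptime("2008-11-13T05:41:35-0400", "%Y-%m-%dT%H:%M:%S%z")

print(rfc.isoformat())
print(iso.isoformat())
```

<p>Both results are timezone-aware, so they can be compared or converted to UTC directly.</p>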
<p>Here&#8217;s an example of parsing a naive datetime, i.e. one without a timezone.</p>
<pre>dts = '2008-11-13T05:41:35'
dt = dateutil.parser.parse(dts)

Log(
    "Parsing an ISO Date without a Timezone",
    src = dts,
    dt = dt,
    iso = dt.isoformat(),
)</pre>
<pre class="result">
  message: Parsing an ISO Date without a Timezone
  dt: 2008-11-13 05:41:35
  iso: 2008-11-13T05:41:35
  src: 2008-11-13T05:41:35
</pre>
<p>Here are two similar examples, showing how to force the timezone if it&#8217;s not present. This will happen in the first part, but not the second.</p>
<pre>tz = pytz.timezone('America/Toronto')
dts = '2008-11-13T05:41:35'
dt = dateutil.parser.parse(dts)
if dt.tzinfo is None:
    # with pytz, attach the zone via localize(); replace(tzinfo = ...) can pick a historical offset
    dt = tz.localize(dt)

Log(
    "Parsing an ISO Date without a Timezone BUT specifying default TZ",
    src = dts,
    dt = dt,
    iso = dt.isoformat(),
    tz = tz,
)

tz = pytz.timezone('America/Toronto')
dts = '2008-11-13T05:41:35-0400'
dt = dateutil.parser.parse(dts)
if dt.tzinfo is None:
    # with pytz, attach the zone via localize(); replace(tzinfo = ...) can pick a historical offset
    dt = tz.localize(dt)

Log(
    "Parsing an ISO Date with a Timezone AND specifying default TZ",
    src = dts,
    dt = dt,
    iso = dt.isoformat(),
    tz = tz,
)</pre>
<pre class="result">
  message: Parsing an ISO Date without a Timezone BUT specifying default TZ
  dt: 2008-11-13 05:41:35-05:00
  iso: 2008-11-13T05:41:35-05:00
  src: 2008-11-13T05:41:35
  tz: America/Toronto

  message: Parsing an ISO Date with a Timezone AND specifying default TZ
  dt: 2008-11-13 05:41:35-04:00
  iso: 2008-11-13T05:41:35-04:00
  src: 2008-11-13T05:41:35-0400
  tz: America/Toronto
</pre>
<p>Update: here&#8217;s an example of moving datetimes to UTC and then to a different Timezone. Remember: you want your backend code to work with UTC datetimes for simplicity and correctness:</p>
<pre>
dts = '2008-11-13T05:41:35-0400'
dt_orig = dateutil.parser.parse(dts)
dt_utc = dt_orig.astimezone(pytz.UTC)

Log(
    "Changing a datetime to UTC",
    src = dts,
    dt_orig = dt_orig,
    dt_utc = dt_utc,
)

tz_vancouver = pytz.timezone('America/Vancouver')
dt_vancouver = dt_utc.astimezone(tz_vancouver)

Log(
    "Changing UTC datetime to a different timezone",
    dt_vancouver = dt_vancouver,
    dt_utc = dt_utc,
)
</pre>
<pre class="result">
  message: Changing a datetime to UTC
  dt_orig: 2008-11-13 05:41:35-04:00
  dt_utc: 2008-11-13 09:41:35+00:00
  src: 2008-11-13T05:41:35-0400

  message: Changing UTC datetime to a different timezone
  dt_utc: 2008-11-13 09:41:35+00:00
  dt_vancouver: 2008-11-13 01:41:35-08:00
</pre>
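<p>A point worth making explicit: converting between timezones never changes the instant, only how it is written down. The Python 3 sketch below reproduces the numbers above with standard-library fixed offsets; the hard-coded -08:00 for Vancouver is an assumption that only holds outside daylight saving time.</p>

```python
import datetime

dt_orig = datetime.datetime.strptime("2008-11-13T05:41:35-0400", "%Y-%m-%dT%H:%M:%S%z")
dt_utc = dt_orig.astimezone(datetime.timezone.utc)

# fixed offset standing in for America/Vancouver in November
pst = datetime.timezone(datetime.timedelta(hours=-8), "PST")
dt_vancouver = dt_utc.astimezone(pst)

print(dt_utc.isoformat())
print(dt_vancouver.isoformat())

# aware datetimes compare by instant, not by wall-clock fields
print(dt_orig == dt_utc == dt_vancouver)
```

<p>This is why storing UTC in the backend loses nothing: any local representation can be derived from it on demand.</p>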
<p>Here is an example of listing all &#8220;common&#8221; timezones using pytz. Note that &#8220;America&#8221; refers to the two continents, not the Irish word for the United States. Printing the actual timezone offset turned out to be a surprisingly complex task, which I will outline in a different blog post. For now let it suffice that with pytz try not to depend on <code>utcoffset</code>.</p>
<pre>dt_now = datetime.datetime.now()

def tzname2offset(tzname):
    dt_in_utc = pytz.UTC.localize(dt_now)
    dt_in_tz = pytz.timezone(tzname).localize(dt_now)

    offset = dt_in_utc - dt_in_tz
    seconds = offset.seconds + offset.days * ( 24 * 3600 )

    return  "%02d:%02d" % ( seconds // 3600, seconds % 3600 // 60, )

Log(
    "Olson (pytz) common timezones and their UTC offsets",
    timezones = map(
        lambda tzname: ( tzname, tzname2offset(tzname), ),
        pytz.common_timezones,
    )
)</pre>
<pre class="result">
  message: Olson (pytz) common timezones and their UTC offsets
  timezones:
    [('Africa/Abidjan', '00:00'),
     ('Africa/Accra', '00:00'),
     ('Africa/Addis_Ababa', '03:00'),
     ('Africa/Algiers', '01:00'),
     ('Africa/Asmara', '03:00'),
...
     ('Pacific/Wake', '12:00'),
     ('Pacific/Wallis', '12:00'),
     ('US/Alaska', '-9:00'),
     ('US/Arizona', '-7:00'),
     ('US/Central', '-6:00'),
     ('US/Eastern', '-5:00'),
     ('US/Hawaii', '-10:00'),
     ('US/Mountain', '-7:00'),
     ('US/Pacific', '-8:00'),
     ('UTC', '00:00')]
</pre>
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2008/12/22/working-with-dates-times-and-timezones-in-python/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Coding backwards for simplicity</title>
		<link>http://code.davidjanes.com/blog/2008/12/08/coding-backwards-for-simplicity/</link>
		<comments>http://code.davidjanes.com/blog/2008/12/08/coding-backwards-for-simplicity/#comments</comments>
		<pubDate>Mon, 08 Dec 2008 21:58:57 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[djolt]]></category>
		<category><![CDATA[dqt]]></category>
		<category><![CDATA[ideas]]></category>
		<category><![CDATA[pybm]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[work]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=311</guid>
		<description><![CDATA[I haven&#8217;t been posting as much as I&#8217;d like here for the last three weeks, not because of lack of ideas but because I haven&#8217;t been able to consolidate what I&#8217;ve been working on into a coherent thought. I&#8217;m trying to come up with an overarching concept that covers WORK, Djolt and the various [...]]]></description>
			<content:encoded><![CDATA[<p>I haven&#8217;t been posting as much as I&#8217;d like here for the last three weeks, not because of lack of ideas but because I haven&#8217;t been able to consolidate what I&#8217;ve been working on into a coherent thought. I&#8217;m trying to come up with an overarching concept that covers <a href="http://code.davidjanes.com/blog/category/work/">WORK</a>, <a href="http://code.davidjanes.com/blog/category/djolt/">Djolt</a> and the various API interfaces I&#8217;ve been coding. Tentatively and horribly, I&#8217;m calling this Data/Query/Transform/Template right now though I&#8217;m expecting this to change.</p>
<p>The first demo of this &#8230; without further explanation &#8230; <strong><a href="http://code.davidjanes.com/examples/2008-12-08/dqtt1/">can be seen here</a></strong>. More details about what this is actually demonstrating (besides formatting this blog) will be forthcoming.</p>
<p>What I want to draw attention to in this post is <em>how</em> I coded this. What I&#8217;ve been doing for the last several weeks is <em>coding backwards</em>: I start with what I want the final code to look like and then figure out all the libraries, little languages and so forth that would be needed to code that. After several false starts, my conceptual logjam broke about a week ago and code started radically simplifying.</p>
<p>The ideal code, in my mind, is almost entirely static declarations: no loops, no if statements, no while statements, no goto-type statements (god help us). We simply specify how the parts are connected, and hope that we can abstract the complexity into the libraries that make this all happen. The code that you see below is actually post all my conceptualizing: I just wanted to write some code and since I had almost all the parts together it fell together quite nicely:</p>
<pre>import bm_wsgi
import bm_io

import djolt
import api_feed

from bm_log import Log

class Application(bm_wsgi.SimpleWrapper):
    def __init__(self, *av, **ad):
        bm_wsgi.SimpleWrapper.__init__(self, *av, **ad)

    def CustomizeSetup(self):
        self.html_template_src = bm_io.readfile("index.dj")
        self.html_template = djolt.Template(self.html_template_src)

        self.context = djolt.Context()
        self.context["paramd"] = {
            "feed" : "http://feeds.feedburner.com/DavidJanesCode",
            "template" : """\
&lt;ul&gt;
{% for item in data.items %}
	&lt;li&gt;&lt;a href="{{ item.link }}"&gt;{{ item.title }}&lt;/a&gt;&lt;/li&gt;
{% endfor %}
&lt;/ul&gt;
""",
        }
        self.context.Push()
        self.context["paramd"] = self.paramd
        self.context["data"] = api_feed.RSS20(self.context.as_string("paramd.feed"))

    def CustomizeContent(self):
        yield   self.html_template.Render(self.context)

if __name__ == '__main__':
    Application.RunCGI()</pre>
<p>There&#8217;s almost nothing there! In particular, note:</p>
<ul>
<li><code>bm_wsgi.SimpleWrapper</code> handles all the WSGI interface work, including determining when to output HTML headers, error trapping, and Unicode to UTF-8 encoding</li>
<li>the most complicated part of the application is setting up the <code>Context</code>. In particular, note that self.paramd is automatically populated by the <code>QUERY_STRING</code> passed to the application, and the double setting we do here allows us to have default values.</li>
<li>If you want to see the HTML template that drives the application <a href="http://code.davidjanes.com/examples/2008-12-08/dqtt1/index.dj">it is here</a>. Note two variations from Django templates: the <code>{% asis %}</code> block, which doesn&#8217;t interpret its content as Djolt code, and the <code>{{ *paramd.template|safe }}</code> variable, which <em><a href="http://code.davidjanes.com/blog/2008/12/04/djolt-indirection/">interprets the variable&#8217;s contents as a template</a></em>.</li>
<li>Methods called <code>Customize</code>-something are my convention for framework functions, i.e. methods that will be called for us rather than methods we call.</li>
</ul>
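<p>The &#8220;double setting&#8221; of <code>paramd</code> is a layering trick: defaults at the bottom, query string values on top. The Python 3 sketch below shows the same idea with <code>collections.ChainMap</code>. It is a stand-in for illustration, not how <code>djolt.Context</code> is implemented, and <code>build_paramd</code> and the example query string are made up.</p>

```python
from collections import ChainMap
from urllib.parse import parse_qs

defaults = {
    "feed": "http://feeds.feedburner.com/DavidJanesCode",
    "template": "(default template)",
}


def build_paramd(query_string):
    # keep the last value given for each query parameter
    parsed = {key: values[-1] for key, values in parse_qs(query_string).items()}
    # lookups try the parsed values first, then fall back to the defaults
    return ChainMap(parsed, defaults)


paramd = build_paramd("feed=http%3A%2F%2Fexample.org%2Frss")
print(paramd["feed"])        # overridden by the query string
print(paramd["template"])    # falls through to the default
```

<p>With an empty query string every lookup falls through, so the application still has something to render.</p>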
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2008/12/08/coding-backwards-for-simplicity/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to JSON encode iterators</title>
		<link>http://code.davidjanes.com/blog/2008/12/08/json-encode-iterators/</link>
		<comments>http://code.davidjanes.com/blog/2008/12/08/json-encode-iterators/#comments</comments>
		<pubDate>Mon, 08 Dec 2008 19:32:03 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[ideas]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=308</guid>
		<description><![CDATA[As part of my recent explorations, I&#8217;ve been playing a lot with Python iterators/generators. The key efficiency of iterators is that when working with lengthy list-like objects, you need only create the part that&#8217;s being looked at. It&#8217;s just-in-time objects.
If you attempt to JSON serialize an object with an iterator/generator object in it, the json [...]]]></description>
			<content:encoded><![CDATA[<p>As part of my recent explorations, I&#8217;ve been playing a lot with Python <a href="http://www.python.org/dev/peps/pep-0255/">iterators/generators</a>. The key efficiency of iterators is that when working with lengthy list-like objects, you need only create the part that&#8217;s being looked at. It&#8217;s just-in-time objects.</p>
<p>If you attempt to JSON serialize an object with an iterator/generator object in it, the <code>json</code> module throws a cog: it doesn&#8217;t know how to serialize these types of objects. The <code>json</code> module is extensible and the documentation makes a suggestion how to do this:</p>
<pre>import json

class IterEncoder(json.JSONEncoder):
    def default(self, o):
        try:
            iterable = iter(o)
        except TypeError:
            pass
        else:
            return list(iterable)
        return json.JSONEncoder.default(self, o)

print json.dumps(xrange(4), cls = IterEncoder)</pre>
<p>This seems somewhat ugly to me. In particular, lots of objects can be wrapped by the <code>iter</code> function that don&#8217;t need to be, plus lots of objects will cause that TypeError to be thrown which seems to be rather a bit of waste. Here&#8217;s the solution I came up with:</p>
<pre>class IterEncoder(json.JSONEncoder):
    def default(self, o):
        try:
            return  json.JSONEncoder.default(self, o)
        except TypeError, x:
            try:
                return  list(o)
            except:
                raise   x</pre>
<p>This tries to encode the object the normal way. Only if that doesn&#8217;t work do we try to turn the object into a list. If that&#8217;s not convertible (i.e. the list object constructor fails) we go back and throw the <em>original</em> exception provided by JSONEncoder &#8211; we&#8217;ve really failed.</p>
<p>You use this as follows:</p>
<pre>
class X:
    def Iter(self):
        yield 1
        yield 2
        yield 3
        yield 4

xi = X().Iter()

print json.dumps(xi, cls = IterEncoder)
print json.dumps(xrange(4), cls = IterEncoder)
</pre>
<p>Which yields the expected:</p>
<pre>
[1, 2, 3, 4]
[0, 1, 2, 3]
</pre>
<p>Don&#8217;t be overly tempted to check the type of <code>o</code>: it may be <code>types.GeneratorType</code> or <code>types.XRangeType</code> or perhaps even something else that I haven&#8217;t found out yet.</p>
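<p>For reference, here is how the same encoder might look on Python 3, where <code>xrange</code> has become <code>range</code> and the <code>except</code> syntax has changed. It is a port written for this note rather than tested production code, and <code>count_to_four</code> is an invented helper.</p>

```python
import json


class IterEncoder(json.JSONEncoder):
    def default(self, o):
        try:
            # the normal route first
            return super().default(o)
        except TypeError as original:
            try:
                # fall back to materializing the iterable
                return list(o)
            except TypeError:
                # not iterable either: report the original failure
                raise original


def count_to_four():
    yield 1
    yield 2
    yield 3
    yield 4


print(json.dumps(count_to_four(), cls=IterEncoder))
print(json.dumps(range(4), cls=IterEncoder))
```

<p>An object that is neither serializable nor iterable still raises the <code>TypeError</code> from <code>JSONEncoder</code>, as described above.</p>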
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2008/12/08/json-encode-iterators/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Djolt Indirection</title>
		<link>http://code.davidjanes.com/blog/2008/12/04/djolt-indirection/</link>
		<comments>http://code.davidjanes.com/blog/2008/12/04/djolt-indirection/#comments</comments>
		<pubDate>Thu, 04 Dec 2008 11:05:28 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[demo]]></category>
		<category><![CDATA[djolt]]></category>
		<category><![CDATA[ideas]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=289</guid>
		<description><![CDATA[I&#8217;ve been working through a sticky problem with Djolt, trying to implement my Toronto Fires example in as few lines as possible. As part of this, I&#8217;ve come up with the idea of adding indirection to Djolt templates:
import djolt

d = {
    "a" : "It says: {{ b }}",
    "b" [...]]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been working through a sticky problem with Djolt, trying to implement my Toronto Fires example in as few lines as possible. As part of this, I&#8217;ve come up with the idea of adding indirection to Djolt templates:</p>
<pre>import djolt

d = {
    "a" : "It says: {{ b }}",
    "b" : "Hello, World"
}

t = djolt.Template("""
a: {{ a }}
b: {{ b }}
*a: {{ *a }}
""")

print t.Render(d)</pre>
<p>Which yields:</p>
<pre>a: It says: {{ b }}
b: Hello, World
*a: It says: Hello, World
</pre>
<p>This is significantly updated from the original version I posted here an hour ago. The indirection now makes the variable read as a template. This is a much more powerful concept.</p>
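<p>To make the idea concrete, here is a toy renderer in Python 3 that supports only <code>{{ name }}</code> and <code>{{ *name }}</code>. It is not Djolt and shares no code with it; <code>render</code> and the regular expression are assumptions made for the illustration.</p>

```python
import re

# matches {{ name }} and {{ *name }}, capturing the optional star and the name
VARIABLE = re.compile(r"\{\{\s*(\*?)(\w+)\s*\}\}")


def render(template, context):
    def substitute(match):
        star, name = match.groups()
        value = str(context.get(name, ""))   # unknown names become empty
        if star:
            # indirection: treat the value itself as a template
            return render(value, context)
        return value

    return VARIABLE.sub(substitute, template)


d = {
    "a": "It says: {{ b }}",
    "b": "Hello, World",
}

rendered = render("a: {{ a }}\nb: {{ b }}\n*a: {{ *a }}", d)
print(rendered)
```

<p>A value that refers to itself through <code>*</code> would recurse without end in this sketch, so a real implementation needs a depth limit or cycle check.</p>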
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2008/12/04/djolt-indirection/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Djolt &#8211; Django-like Templates</title>
		<link>http://code.davidjanes.com/blog/2008/11/28/djolt-django-like-templates/</link>
		<comments>http://code.davidjanes.com/blog/2008/11/28/djolt-django-like-templates/#comments</comments>
		<pubDate>Fri, 28 Nov 2008 21:34:39 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[djolt]]></category>
		<category><![CDATA[pybm]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[work]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=277</guid>
		<description><![CDATA[Djolt is a reimplementation of Django&#8217;s template language in Python. Why do this?

I like the Django template language
I wanted something small and independent of Django
I wanted something that will work with WORK paths (this was the real deal breaker for using Django)
I wanted something that I could take and reimplement in Javascript and maybe [...]]]></description>
			<content:encoded><![CDATA[<p>Djolt is a reimplementation of <a href="http://docs.djangoproject.com/en/dev/topics/templates/">Django&#8217;s template language</a> in Python. Why do this?</p>
<ul>
<li>I like the Django template language</li>
<li>I wanted something small and independent of Django</li>
<li>I wanted something that will work with <a href="http://code.davidjanes.com/blog/2008/11/21/work-paths/">WORK paths</a> (this was the real deal breaker for using Django)</li>
<li>I wanted something that I could take and reimplement in Javascript and maybe Java too</li>
<li>Some template engines, <a href="http://www.cheetahtemplate.org/">Cheetah</a> for example, are far too heavy for the kind of light-weight applications I have in mind; <em>note that I&#8217;ve had great success with Cheetah in the past </em></li>
<li>Some template engines, <a href="http://www.python.org/doc/2.6/library/string.html#format-string-syntax">such as that in Python 2.6</a>, are far too underfeatured</li>
</ul>
<p>However, if you&#8217;re really looking for the whole Django template experience and don&#8217;t want to use Djolt, just <a href="http://www.b-list.org/weblog/2007/sep/22/standalone-django-scripts/">start here</a>.</p>
<h4>How do I get it?</h4>
<p>Djolt is packaged as part of the <a href="http://code.google.com/p/pybm/">pybm</a> library.</p>
<h4>How do I use it?</h4>
<pre>import djolt

t = djolt.Template("""
&lt;ul&gt;
{% for name in names %}
&lt;li&gt;{{ name }}&lt;/li&gt;
{% endfor %}
&lt;/ul&gt;
""")
print t.Render({
    "names" : [ "Johnny", "Jack", "Ray", "Mary &amp; Sam", ]
})</pre>
<p>Which gives the results:</p>
<pre>&lt;ul&gt;
&lt;li&gt;Johnny&lt;/li&gt;
&lt;li&gt;Jack&lt;/li&gt;
&lt;li&gt;Ray&lt;/li&gt;
&lt;li&gt;Mary &amp;amp; Sam&lt;/li&gt;
&lt;/ul&gt;</pre>
<p>Note the &#8220;autoescaping&#8221; of the <code>&amp;</code> character.</p>
<h4>What tags does it define?</h4>
<ul>
<li>autoescape/endautoescape</li>
<li>if/else/endif</li>
<li>equal/endequal</li>
<li>for/endfor</li>
<li>notequal/notendequal</li>
</ul>
<p>It does not implement blocks.</p>
<h4>What filters does it define?</h4>
<ul>
<li>add</li>
<li>cut</li>
<li>default (see otherwise below)</li>
<li>default_if_none</li>
<li>divisibleby</li>
<li>first</li>
<li>join</li>
<li>last</li>
<li>length</li>
<li>length_is</li>
<li>linebreaks</li>
<li>lower</li>
<li>pluralize</li>
<li>random</li>
<li>safe (respecting all the Django autoescape rules)</li>
<li>slug</li>
<li>upper</li>
</ul>
<p>Unimplemented filters are due to laziness and will be done &#8220;on demand&#8221;. We also introduce a few new filters:</p>
<ul>
<li> jslug &#8211; like slug, but more Javascript friendly</li>
<li> otherwise &#8211; like default, except the empty string/empty values trigger the filter also</li>
</ul>
<h4>Are there differences between Djolt and Django templates?</h4>
<ul>
<li>Djolt tags suck up whitespace if they&#8217;re on a line by themselves</li>
<li>If Djolt cannot resolve a variable, it resolves to the appropriate &#8220;empty&#8221; value (as opposed to failing). This is keeping in line with <a href="http://code.davidjanes.com/blog/category/work/">WORK</a> philosophy</li>
</ul>
<p>Beyond that you should be able to use most Django template examples (that don&#8217;t use block/implements) as-is.</p>
<h4>Is it extensible?</h4>
<p>Yes. You can add your own tags and filters by following the examples in code (<code>djolt_nodes.py</code> and <code>djolt_filters.py</code> respectively).</p>
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2008/11/28/djolt-django-like-templates/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>How to dynamically load Python code</title>
		<link>http://code.davidjanes.com/blog/2008/11/27/how-to-dynamically-load-python-code/</link>
		<comments>http://code.davidjanes.com/blog/2008/11/27/how-to-dynamically-load-python-code/#comments</comments>
		<pubDate>Thu, 27 Nov 2008 12:13:01 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[python]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=272</guid>
		<description><![CDATA[The normal way to load Python code is through the import statement:
import pprint
pprint.pprint('Hello, world.')
But what do you do if you want to dynamically load a module? A classic example of where you&#8217;d like to do this is adding &#8216;extensions&#8217; to your application. Your application has no way of knowing the exact name of the module [...]]]></description>
			<content:encoded><![CDATA[<p>The normal way to load <a href="http://www.python.org/">Python</a> code is through the <code>import</code> statement:</p>
<pre>import pprint
pprint.pprint('Hello, world.')</pre>
<p>But what do you do if you want to <em>dynamically</em> load a module? A classic example of where you&#8217;d like to do this is adding &#8216;extensions&#8217; to your application. Your application has no way of knowing the exact name of the module that it&#8217;s going to use; it only knows the filename(s). The way to do this is with the <a href="http://docs.python.org/library/imp.html"><code>imp</code> module</a>:</p>
<pre>import md5
import os.path
import imp
import sys
import traceback

def load_module(code_path):
    try:
        try:
            code_dir = os.path.dirname(code_path)
            code_file = os.path.basename(code_path)

            fin = open(code_path, 'rb')

            return  imp.load_source(md5.new(code_path).hexdigest(), code_path, fin)
        finally:
            try: fin.close()
            except: pass
    except ImportError, x:
        traceback.print_exc(file = sys.stderr)
        raise
    except:
        traceback.print_exc(file = sys.stderr)
        raise</pre>
<p>A few notes:</p>
<ul>
<li>call <code>load_module</code> with the path to a <code>.py</code> file that you want to load</li>
<li>the <code>md5.new</code> generates a unique module identifier. If you don&#8217;t do this it&#8217;s difficult to import two modules in different directories with the same name!</li>
<li>the different <code>except</code>s are to give you a flavor of the issues you may see: <code>ImportError</code> is expected, the others are not</li>
</ul>
<p>The return value is a module, which is a Python object that you can address in all the normal ways that you&#8217;d use a module. For example, if you have the following file <code>extension.py</code>:</p>
<pre>def hello(x): print "Hello, %s" % x</pre>
<p>You can use it as follows to get <code>Hello, World</code>.</p>
<pre>m = load_module('extension.py')
m.hello("World")</pre>
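<p>On current Python versions the <code>md5</code> and <code>imp</code> modules used above are no longer available, so here is a rough Python 3 sketch of the same idea. It assumes <code>importlib.util</code> as the replacement for <code>imp.load_source</code> and swaps in <code>hashlib.sha256</code> for <code>md5.new</code>; the <code>ext_</code> name prefix is just an illustrative choice, not part of the original code.</p>

```python
import hashlib
import importlib.util
import os.path


def load_module(code_path):
    """Load the .py file at code_path and return it as a module object."""
    code_path = os.path.abspath(code_path)

    # Hash the full path so that two files with the same basename in
    # different directories get distinct module names.
    digest = hashlib.sha256(code_path.encode("utf-8")).hexdigest()
    module_name = "ext_" + digest  # "ext_" prefix is an arbitrary choice

    spec = importlib.util.spec_from_file_location(module_name, code_path)
    if spec is None or spec.loader is None:
        raise ImportError("cannot load %r" % code_path)

    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the file's top-level code
    return module
```

<p>Calling <code>load_module</code> with the path to an <code>extension.py</code> that defines <code>hello</code> returns a module whose <code>hello</code> you can call as usual. The module is deliberately not registered in <code>sys.modules</code>, so loading the same path twice gives two independent module objects.</p>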
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2008/11/27/how-to-dynamically-load-python-code/feed/</wfw:commentRss>
		<slash:comments>5</slash:comments>
		</item>
		<item>
		<title>The &#8220;Anything Goes&#8221; Pattern</title>
		<link>http://code.davidjanes.com/blog/2008/11/25/the-anything-goes-pattern/</link>
		<comments>http://code.davidjanes.com/blog/2008/11/25/the-anything-goes-pattern/#comments</comments>
		<pubDate>Tue, 25 Nov 2008 21:13:05 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[python]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=268</guid>
		<description><![CDATA[Here&#8217;s a Python code pattern that I find myself falling into every once in a while. If you&#8217;re a highly disciplined milspec-type non-pragmatic programmer, I suggest you stop reading here lest you burn your eyes.
The pattern is useful in two situations:

when you have an evolving superclass that may take new constructor (or method!) arguments in the future [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a Python code pattern that I find myself falling into every once in a while. If you&#8217;re a highly disciplined milspec-type non-pragmatic programmer, I suggest you stop reading here lest you burn your eyes.</p>
<p>The pattern is useful in two situations:</p>
<ul>
<li>when you have an evolving superclass that may take new constructor (or method!) arguments in the future and you don&#8217;t want to have to recode your subclasses to reflect those changes</li>
<li>you have a number of interchangeable subclasses that may or may not use certain arguments (say, because you&#8217;re constructing the object from a command line)</li>
</ul>
<pre>class Component:
    def __init__(self, a, b = None, *av, **ad):
        ...

class ComponentTemplate(Component):
    def __init__(self, *av, **ad):
        Component.__init__(self, *av, **ad)</pre>
<p><code>a</code> and <code>b</code> are two arguments that are used by the superclass. With this pattern you can add <code>c</code> to <code>Component</code> in the future without worrying about rewriting <code>ComponentTemplate</code>. Similarly, if an unexpected argument is passed down to <code>Component</code>, it will be silently ignored.</p>
<p>In case you&#8217;re wondering what <code>*av</code> and <code>**ad</code> are, they&#8217;re Python&#8217;s way of referring to arguments that have been passed in, by position and by name, but have not been explicitly listed in the method&#8217;s signature. The first is a tuple and the second a dictionary. If you&#8217;re a Python user and you&#8217;re not familiar with this, you can and should read more about this <a href="http://www.rexx.com/~dkuhlman/python_101/python_101.html#SECTION004510000000000000000">here</a>.</p>
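<p>To make the behaviour concrete, here is a small runnable sketch in Python 3 syntax. The attribute names <code>extra_positional</code> and <code>extra_named</code> and the <code>colour</code> argument are made up for illustration; the point is only to show where the unlisted arguments end up.</p>

```python
class Component:
    def __init__(self, a, b=None, *av, **ad):
        self.a = a
        self.b = b
        # Anything not named in the signature lands here; a real
        # Component would simply ignore it.
        self.extra_positional = av  # a tuple
        self.extra_named = ad       # a dict


class ComponentTemplate(Component):
    def __init__(self, *av, **ad):
        # Forward everything untouched, so this subclass never needs
        # to know the superclass signature.
        super().__init__(*av, **ad)


c = ComponentTemplate(1, 2, 3, colour="red")
print(c.a, c.b, c.extra_positional, c.extra_named)
# prints: 1 2 (3,) {'colour': 'red'}
```

<p>The flip side of the convenience is visible here too: a misspelled keyword argument ends up in the catch-all dictionary instead of raising a <code>TypeError</code>.</p>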
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2008/11/25/the-anything-goes-pattern/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Database roundup</title>
		<link>http://code.davidjanes.com/blog/2008/11/24/database-roundup/</link>
		<comments>http://code.davidjanes.com/blog/2008/11/24/database-roundup/#comments</comments>
		<pubDate>Mon, 24 Nov 2008 12:27:44 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[db]]></category>
		<category><![CDATA[ideas]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[semantic web]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=253</guid>
		<description><![CDATA[Here&#8217;s a few things I was reading about over the weekend.
SQLAlchemy
SQLAlchemy is a full-featured Design Pattern-heavy pythonic database ORM. I am totally going to use this for my next Python SQL database project and may even do some playing with old datasets (using the reflection features, yum) soon. If you are considering doing SQL work [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s a few things I was reading about over the weekend.</p>
<h4>SQLAlchemy</h4>
<p><a href="http://www.sqlalchemy.org/">SQLAlchemy</a> is a full-featured <a href="http://en.wikipedia.org/wiki/Design_pattern_(computer_science)">Design Pattern</a>-heavy pythonic database <a href="http://en.wikipedia.org/wiki/Object-relational_mapping">ORM</a>. I am totally going to use this for my next Python SQL database project and may even do some playing with old datasets (using the <a href="http://www.sqlalchemy.org/docs/05/metadata.html#metadata_tables_reflecting">reflection features</a>, yum) soon. If you are considering doing SQL work on your next Python project, don&#8217;t even bother with the usual <a href="http://www.python.org/dev/peps/pep-0249/">PEP 249</a> stuff, start with this.</p>
<p>Note that if you&#8217;re working with <a href="http://www.djangoproject.com/">Django</a> it handles the DB in its own way so SQLAlchemy may be of limited utility.</p>
<h4>CouchDB</h4>
<p><a href="http://incubator.apache.org/couchdb/">CouchDB</a> &#8220;is a distributed, fault-tolerant and schema-free document-oriented database accessible via a RESTful HTTP/JSON API&#8221;. I couldn&#8217;t have written that more succinctly myself, so I didn&#8217;t. I qualified the paragraph above on SQLAlchemy by saying I&#8217;ll use it for my next <em>SQL</em> project because I&#8217;m really champing at the bit to try CouchDB out. The CouchDB design philosophy &#8211; a REST API returning lists of JSON objects &#8211; reflects my <a href="http://code.davidjanes.com/blog/category/work/">current design paradigm</a> very closely, and the only question I have is whether in practice it scales to millions of rows.</p>
<p>One caveat is that it&#8217;s written in the-cool-nerds-are-doing-it language <a href="http://www.erlang.org/">Erlang</a>, but because you don&#8217;t have to interact with that it should be OK for us mortals.</p>
<p>CouchDB is <a href="http://mail-archives.apache.org/mod_mbox/incubator-couchdb-dev/200811.mbox/%3C3F352A54-5FC8-4CB0-8A6B-7D3446F07462@jaguNET.com%3E">about to officially become a &#8220;top level&#8221; Apache project</a>, though none of the documentation on the <a href="http://apache.org/">Apache.org</a> site reflects this yet.</p>
<h4>Virtuoso</h4>
<p><a href="http://virtuoso.openlinksw.com/wiki/main/Main/">Virtuoso</a> is a &#8220;high-performance object-relational SQL database&#8221;. <a href="http://www.openlinksw.com/weblog/oerling/?id=1484">It apparently can perform well</a>. As I came across it through the <a href="http://planetrdf.com/">Planet RDF</a> aggregator, this may be something you want to look into if you&#8217;re working on an <a href="http://en.wikipedia.org/wiki/Resource_Description_Framework">RDF</a>/<a href="http://www.w3.org/TR/rdf-sparql-query/">SPARQL</a> project.</p>
<h4>Amazon Web Services Hosted Data Sets</h4>
<p>That&#8217;s a mouthful, isn&#8217;t it? <a href="http://aws.amazon.com/publicdatasets/">Amazon is offering to host public datasets</a> on <a href="http://aws.amazon.com/ec2/">EC2</a> for free. What&#8217;s the catch? It will host the data, but you have to pay for the computing resources to use that data in the normal EC2 manner. Still, if you&#8217;re using a large public dataset and you&#8217;re already EC2-friendly, you might want to consider this program. An even more interesting thought occurs (though I&#8217;m not sure if it will fly): if you&#8217;re using large amounts of your own data on EC2, you may want to offer it up as a free resource.</p>
<p>There&#8217;s more on this by <a href="http://www.readwriteweb.com/archives/amazon_web_services_seeks_publ.php">Lidija Davis on Read/Write Web</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2008/11/24/database-roundup/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Use unittest</title>
		<link>http://code.davidjanes.com/blog/2008/11/20/use-unittest/</link>
		<comments>http://code.davidjanes.com/blog/2008/11/20/use-unittest/#comments</comments>
		<pubDate>Thu, 20 Nov 2008 11:08:08 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[python]]></category>
		<category><![CDATA[tips]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=238</guid>
		<description><![CDATA[When developing Python code there&#8217;s a tendency to add a __main__ section to test the code:
def add(a, b):
    return  a + b

if __name__ == '__main__':
    print add(3, 4)
Don&#8217;t. Python has a great little package called unittest that lets you quickly frame functions in test cases.
If the example above [...]]]></description>
			<content:encoded><![CDATA[<p>When developing Python code there&#8217;s a tendency to add a <code>__main__</code> section to test the code:</p>
<pre>def add(a, b):
    return  a + b

if __name__ == '__main__':
    print add(3, 4)</pre>
<p>Don&#8217;t. Python has a great little package called <a href="http://www.python.org/doc/2.6/library/unittest.html">unittest</a> that lets you quickly frame functions in test cases.</p>
<p>If the example above is called <code>add.py</code>, I&#8217;ll generally make a subdirectory called <code>tests</code> and add a test program called <code>test_add.py</code>. This can be as simple as:</p>
<pre>
import unittest

from add import add  # assumes add.py is on the import path

class TestAdd(unittest.TestCase):
    def setUp(self):
        pass    

    def test_1(self):
        self.assertEqual(add(3, 4), 7)
        self.assertEqual(add(4, 4), 8)
        self.assertEqual(add(4, -4), 0)

if __name__ == '__main__':
    unittest.main()
</pre>
<p>But I prefer to use the following pattern:</p>
<pre>class TestAdd(unittest.TestCase):
    def test_add(self):
        checkds = [
            {
                "a" : 4,
                "b" : 3,
                "@result": 7
            },
            {
                "a" : 4,
                "b" : 4,
                "@result": 8
            },
            {
                "a" : 4,
                "b" : -4,
                "@result": 0
            },
        ]

        for checkd in checkds:
            expected_result = checkd.pop("@result")
            actual_result = add(**checkd)

            if expected_result == -1:
                print checkd, actual_result
                continue

            try:
                self.assertEqual(expected_result, actual_result)
            except:
                print checkd, actual_result
                raise</pre>
<p>In particular:</p>
<ul>
<li>the individual tests are defined in the <code>checkds</code> list of dictionaries</li>
<li>the bottom part (the <code>for</code> loop) is boilerplate
<ul>
<li>it removes the <code>@result</code> from the dictionary</li>
<li>it calls <code>add</code> with the remaining dictionary</li>
<li>and it then asserts that the <code>actual_result</code> was the same as the <code>expected_result</code></li>
</ul>
</li>
<li>if the <code>expected_result</code> is -1, it doesn&#8217;t run the test, it just prints the <code>actual_result</code>. This is great for setting up your tests in the first place. Obviously you might want to change this marker for testing functions that can return -1, but you get the idea</li>
</ul>
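<p>For readers on Python 3, the same table-driven idea can be sketched with <code>unittest</code>&#8217;s <code>subTest</code> context manager, which reports the failing arguments itself and so stands in for the print-and-reraise step. This is a variation on the pattern, not the original code; <code>add</code> is defined inline only to keep the sketch self-contained, and the -1 &#8220;just print it&#8221; marker is left out.</p>

```python
import unittest


def add(a, b):
    return a + b


class TestAdd(unittest.TestCase):
    def test_add(self):
        checkds = [
            {"a": 4, "b": 3, "@result": 7},
            {"a": 4, "b": 4, "@result": 8},
            {"a": 4, "b": -4, "@result": 0},
        ]

        for checkd in checkds:
            # Copy first so the table itself is left unmodified.
            kwargs = dict(checkd)
            expected_result = kwargs.pop("@result")

            # subTest labels any failure with the arguments in use and
            # lets the remaining rows run even if one row fails.
            with self.subTest(**kwargs):
                self.assertEqual(expected_result, add(**kwargs))
```

<p>Saved as <code>test_add.py</code>, this can be run with <code>python -m unittest test_add</code>.</p>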
<p>The advantage of using unittest like this is that you&#8217;re now not depending on visual inspection or remembering which files you put a <code>__main__</code> in to test your code. As a secondary benefit, unittest helps you think about edge cases, how other people might call <em>your</em> code.</p>
<p>Just go to your test directory and run them all and you&#8217;ll be sure your libraries are behaving as designed.</p>
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2008/11/20/use-unittest/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>Work API Teaser III &#8211; Google API</title>
		<link>http://code.davidjanes.com/blog/2008/11/18/work-api-teaser-iii-google-api/</link>
		<comments>http://code.davidjanes.com/blog/2008/11/18/work-api-teaser-iii-google-api/#comments</comments>
		<pubDate>Tue, 18 Nov 2008 10:26:50 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[python]]></category>
		<category><![CDATA[work]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=229</guid>
		<description><![CDATA[Here&#8217;s an example of implementing an API with many different endpoints. It&#8217;s the Google AJAX Search API which lets you access all of Google&#8217;s search engines programmatically! A few notes:

In the Javascript API Google provides &#8220;branding&#8221; functions to make sure search results are properly attributed. There doesn&#8217;t seem to be a corresponding AJAX call &#8212; [...]]]></description>
			<content:encoded><![CDATA[<p>Here&#8217;s an example of implementing an API with many different endpoints. It&#8217;s the <a href="http://code.google.com/apis/ajaxsearch/documentation/reference.html#_intro_fonje">Google AJAX Search API</a> which lets you access all of Google&#8217;s search engines programmatically! A few notes:</p>
<ul>
<li>In the Javascript API Google provides &#8220;branding&#8221; functions to make sure search results are properly attributed. There doesn&#8217;t seem to be a corresponding AJAX call &#8212; that is, it&#8217;s probably implemented directly in the Javascript &#8212; but I&#8217;d still like to provide a corresponding function. It would be nice if API providers actually gave a branding end-point</li>
<li>The code doesn&#8217;t support (yet) multi-page results: coming soon</li>
<li>The clever bit is in <code>_item_path</code>, which describes how to pull <a href="http://code.davidjanes.com/blog/2008/11/11/work-web-object-records/">WORK</a> result objects out of the AJAX result</li>
<li>All this code is actually available right now, via SVN: the instructions are <a href="http://code.google.com/p/pybm/source/checkout">here</a>. This library is standalone (and is in fact the basis for many of the other projects I have on Google Code)</li>
<li>The Google API requires a <code>_http_referer</code>: the URL of the site that&#8217;s using the results</li>
<li>The Google API <em>does not</em> require an API key, but you can pass one (in the constructor or in individual search calls) under the key <code>api_key</code>. <em>You can use the same API key that <a href="http://code.davidjanes.com/blog/2008/11/07/how-to-use-the-google-maps-api/">you&#8217;ve created for Google Maps</a>.</em></li>
</ul>
<p>Here&#8217;s the Google API class: quite simple. I&#8217;ll probably extend each individual search function to provide all the known parameters by name, rather than passing in a <code>**ad</code> catch-all.</p>
<pre>class Google(bm_api.API):
    _base_query = {
        "v" : "1.0",
    }

    _item_path = "responseData.results"
    _meta_path = "responseData.cursor"
    _convert2work = bm_work.JSON2WORK()

    def __init__(self, _http_referer, **ad):
        bm_api.API.__init__(self, _http_referer = _http_referer, **ad)

    def WebSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/web"
        self.SearchOn(q = q, **ad)

    def LocalSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/local"
        self.SearchOn(q = q, **ad)

    def VideoSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/video"
        self.SearchOn(q = q, **ad)

    def BlogSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/blogs"
        self.SearchOn(q = q, **ad)

    def NewsSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/news"
        self.SearchOn(q = q, **ad)

    def BookSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/books"
        self.SearchOn(q = q, **ad)

    def ImageSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/images"
        self.SearchOn(q = q, **ad)

    def PatentSearch(self, q, **ad):
        self._uri_base = "http://ajax.googleapis.com/ajax/services/search/patentNew"
        self.SearchOn(q = q, **ad)</pre>
<p>Here&#8217;s how you use it:</p>
<pre>
import os
import pprint

api_key = os.environ["GMAPS_APIKEY"]
referer = "http://code.davidjanes.com"
query = "Paris Hilton"

api = Google(key = api_key, _http_referer = referer)
api.VideoSearch(query)

for item in api.IterItems():
    pprint.pprint(item)
</pre>
<p>Here&#8217;s an example of a result, searching for &#8220;Paris Hilton&#8221; in Videos. I tried searching in Patents without luck.</p>
<pre>{'@Index': 0,
 '@Page': 1,
 u'GsearchResultClass': u'GvideoSearch',
 u'content': u"Paris Hilton's new video clip for 'Nothing In This World'",
 u'duration': u'204',
 u'playUrl': u'http://www.youtube.com/v/...',
 u'published': u'Thu, 12 Oct 2006 09:33:23 PDT',
 u'publisher': u'www.youtube.com',
 u'rating': u'4.52872',
 u'tbHeight': u'240',
 u'tbUrl': u'http://0.gvt0.com/vi/Ki2M3-2W-cQ/0.jpg',
 u'tbWidth': u'320',
 u'title': u'Paris Hilton - Nothing In This World',
 u'titleNoFormatting': u'Paris Hilton - Nothing In This World',
 u'url': u'http://www.google.com/url?q=...',
 u'videoType': u'YouTube'}</pre>
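<p>For the curious, a dotted path such as <code>responseData.results</code> can be resolved against a decoded JSON response with a few lines of code. The sketch below (Python 3 syntax) is only my guess at the general idea: <code>resolve_path</code> is a hypothetical helper, not the function <code>bm_api</code> actually uses, and the sample response is made up.</p>

```python
def resolve_path(data, dotted_path):
    """Follow a dotted path such as "responseData.results" into nested dicts."""
    current = data
    for key in dotted_path.split("."):
        if not isinstance(current, dict) or key not in current:
            return None  # an unresolvable path yields an "empty" value
        current = current[key]
    return current


# Made-up response, shaped like the result shown above.
response = {
    "responseData": {
        "results": [{"title": "first"}, {"title": "second"}],
        "cursor": {"pages": 1},
    }
}

items = resolve_path(response, "responseData.results") or []
for index, item in enumerate(items):
    print(index, item["title"])
# prints:
# 0 first
# 1 second
```

<p>Returning <code>None</code> for a missing key, rather than raising, means a malformed response simply produces no items.</p>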
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2008/11/18/work-api-teaser-iii-google-api/feed/</wfw:commentRss>
		<slash:comments>3</slash:comments>
		</item>
		<item>
		<title>Work API Teaser II &#8211; Praized API</title>
		<link>http://code.davidjanes.com/blog/2008/11/12/work-api-teaser-ii-praized-api/</link>
		<comments>http://code.davidjanes.com/blog/2008/11/12/work-api-teaser-ii-praized-api/#comments</comments>
		<pubDate>Wed, 12 Nov 2008 23:46:19 +0000</pubDate>
		<dc:creator>David Janes</dc:creator>
				<category><![CDATA[demo]]></category>
		<category><![CDATA[ideas]]></category>
		<category><![CDATA[python]]></category>
		<category><![CDATA[search]]></category>
		<category><![CDATA[semantic web]]></category>
		<category><![CDATA[work]]></category>

		<guid isPermaLink="false">http://code.davidjanes.com/blog/?p=212</guid>
		<description><![CDATA[Implementing a merchant search using the Praized API took about 10 minutes (mainly finding the right documentation), using my WORK framework:
class PraizedMerchants(bm_api.API):
    """See: http://code.google.com/p/praized/wiki/A_Second_Tutorial_Search"""

    _uri_base = "http://api.praized.com/apitribe/merchants.xml"
    _meta_path = "community"
    _item_path = "merchants.merchant"
    _page_max_path = 'pagination.page_count'
    _page_max [...]]]></description>
			<content:encoded><![CDATA[<p>Implementing a merchant search using the <a href="http://praizedmedia.com/">Praized</a> <a href="http://praizedmedia.com/en/api">API</a> took about 10 minutes (<a href="http://code.google.com/p/praized/wiki/A_Second_Tutorial_Search">mainly finding the right documentation</a>), using my WORK framework:</p>
<pre>class PraizedMerchants(bm_api.API):
    """See: http://code.google.com/p/praized/wiki/A_Second_Tutorial_Search"""

    _uri_base = "http://api.praized.com/apitribe/merchants.xml"
    _meta_path = "community"
    _item_path = "merchants.merchant"
    _page_max_path = 'pagination.page_count'
    _page_max = -1

    def __init__(self, api_key, slug = "apitribe", **ad):
        bm_api.API.__init__(self, api_key = api_key, **ad)

        self._uri_base = "http://api.praized.com/%s/merchants.xml" % slug

    def CustomizePageURI(self, page_index):
        if page_index &gt; 1:
            return  "page=%s" % page_index</pre>
<p>Partially hardcoding &#8216;apitribe&#8217; as a &#8216;community slug&#8217; is probably a bad idea. Anyhoo, here&#8217;s how you call it&#8230;</p>
<pre>import json
import os

api_key = os.environ["PRAIZED_APIKEY"]
api = PraizedMerchants(api_key = api_key, slug = "david-janess-code")
api.SearchOn(
    q = "Bistro",
    l = "Toronto",
)
for item in api.IterItems():
    print json.dumps(item, indent = 1)</pre>
<p>&#8230; and a set of results, somewhat edited below. I&#8217;ll have to figure out what that &#8220;permalink&#8221; is all about (I&#8217;ve edited it to shorten it) &#8230; it could be something neat, but I haven&#8217;t quite grasped all the ins and outs of what Praized wants to accomplish as a business.</p>
<pre>{
 "@Index": 0,
 "@Page": 1,
 "short_url": "http://przd.com/zAU-7",
 "pid": "af5bebd604f3d1517a8113e0a2e8cc58",
 "updated_at": "2008-10-04T20:49:34Z",
 "phone": "(416) 585-7896",
 "permalink":
   ".../praized/places/ca/ontario/toronto/coffee-supreme-bistro?l=Toronto&amp;q=Bistro",
 "name": "Coffee Supreme Bistro",
 "created_at": "2008-10-04T20:49:34Z",
 "location": {
  "city": {
   "name": "Toronto"
  },
  "country": {
   "code": "CA",
   "name_fr": "Canada",
   "name": "Canada"
  },
  "longitude": "-79.384071",
  "regions": {
   "province": "Ontario"
  },
  "postal_code": "M5J 1T1",
  "latitude": "43.646347",
  "street_address": "40 University Avenue"
 }
}</pre>
]]></content:encoded>
			<wfw:commentRss>http://code.davidjanes.com/blog/2008/11/12/work-api-teaser-ii-praized-api/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
