<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom"><title>Guido Kollerie's Blog</title><link href="http://blog.kollerie.com/" rel="alternate"></link><link href="http://blog.kollerie.com/feeds/Programming.atom.xml" rel="self"></link><id>http://blog.kollerie.com/</id><updated>2012-12-06T07:00:00+01:00</updated><entry><title>Oracle and Python on OS X Mountain Lion</title><link href="http://blog.kollerie.com/2012/12/06/oracle_and_python_on_osx_mountain_lion/" rel="alternate"></link><updated>2012-12-06T07:00:00+01:00</updated><author><name>Guido Kollerie</name></author><id>tag:blog.kollerie.com,2012-12-06:2012/12/06/oracle_and_python_on_osx_mountain_lion/</id><summary type="html">&lt;p&gt;I had been running OS X &lt;a class="reference external" href="https://en.wikipedia.org/wiki/OS_X#Version_10.6:_.22Snow_Leopard.22"&gt;Snow Leopard&lt;/a&gt;, released on August 28th, 2009, relatively happily for many years
and never saw the need to upgrade to newer incarnations of Apple's operating system.
With &lt;a class="reference external" href="https://en.wikipedia.org/wiki/OS_X#Version_10.6:_.22Snow_Leopard.22"&gt;Snow Leopard&lt;/a&gt; Apple focussed on under the hood changes
and not so much on end user features.
A laudable effort as many small and not so small things just did not work correctly;
better make what's already there work than introduce more broken features.&lt;/p&gt;
&lt;p&gt;That all changed with OS X &lt;a class="reference external" href="https://en.wikipedia.org/wiki/OS_X#Version_10.7:_.22Lion.22"&gt;Lion&lt;/a&gt;, released on July 20th, 2011.
&lt;a class="reference external" href="https://en.wikipedia.org/wiki/OS_X#Version_10.7:_.22Lion.22"&gt;Lion&lt;/a&gt; was not particularly well received.
It was buggy and presented Apple's first forays into dumbing OS X down to &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Ios"&gt;iOS&lt;/a&gt; levels.
OS X &lt;a class="reference external" href="https://en.wikipedia.org/wiki/OS_X#Version_10.8:_.22Mountain_Lion.22"&gt;Mountain Lion&lt;/a&gt; continued that effort but supposedly did solve many of &lt;a class="reference external" href="https://en.wikipedia.org/wiki/OS_X#Version_10.7:_.22Lion.22"&gt;Lion&lt;/a&gt;'s issues.&lt;/p&gt;
&lt;p&gt;The other day I was handed a relatively new MacBook Pro to be my main development machine at work.
It ran &lt;a class="reference external" href="https://en.wikipedia.org/wiki/OS_X#Version_10.7:_.22Lion.22"&gt;Lion&lt;/a&gt; and might have run &lt;a class="reference external" href="https://en.wikipedia.org/wiki/OS_X#Version_10.6:_.22Snow_Leopard.22"&gt;Snow Leopard&lt;/a&gt;, but I opted for installing &lt;a class="reference external" href="https://en.wikipedia.org/wiki/OS_X#Version_10.8:_.22Mountain_Lion.22"&gt;Mountain Lion&lt;/a&gt;.
One can only postpone the inevitable for so long.&lt;/p&gt;
&lt;p&gt;Although some people strongly prefer &lt;a class="reference external" href="http://mxcl.github.com/homebrew/"&gt;HomeBrew&lt;/a&gt;,
I am generally happy with &lt;a class="reference external" href="http://www.macports.org/"&gt;MacPorts&lt;/a&gt;.
It works just fine,
though it does require you to have installed Apple's Xcode
Developer Tools and Apple's Command Line Developer Tools.&lt;/p&gt;
&lt;p&gt;With &lt;a class="reference external" href="http://www.macports.org/"&gt;MacPorts&lt;/a&gt; installing Python, and some basic tools, is a simple matter of:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;$ sudo port install python26 py26-distribute py26-pip py26-virtualenv
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Yes I know, I'm stuck at version 2.6 for deployment reasons out of my control.&lt;/p&gt;
&lt;p&gt;For a particular project I needed to connect to an Oracle database.
In order to install the Python interface to Oracle, &lt;a class="reference external" href="http://pypi.python.org/pypi/cx_Oracle/5.1.2"&gt;cx_Oracle&lt;/a&gt;, in my &lt;a class="reference external" href="http://pypi.python.org/pypi/virtualenv/"&gt;virtualenv&lt;/a&gt;
I needed to install the &lt;a class="reference external" href="http://www.oracle.com/technetwork/database/features/instant-client/index-097480.html"&gt;Oracle Instant Client&lt;/a&gt; first.
Running on a 64 bit processor (Intel Core i7)
and a 64 bit operating system
with a 64 bit version of Python installed as demonstrated by:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;$ python -c &amp;#39;import struct;print( 8 * struct.calcsize(&amp;quot;P&amp;quot;))&amp;#39;
64
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I, logically, chose the 64 bit version of the &lt;a class="reference external" href="http://www.oracle.com/technetwork/database/features/instant-client/index-097480.html"&gt;Oracle Instant Client&lt;/a&gt;.
Big mistake!&lt;/p&gt;
&lt;p&gt;Even though &lt;a class="reference external" href="http://pypi.python.org/pypi/cx_Oracle/5.1.2"&gt;cx_Oracle&lt;/a&gt; installed just fine,
executing code that tried to connect to the Oracle database
consistently resulted in a &lt;cite&gt;segmentation fault 11&lt;/cite&gt; error message.&lt;/p&gt;
&lt;p&gt;Under &lt;a class="reference external" href="https://en.wikipedia.org/wiki/OS_X#Version_10.6:_.22Snow_Leopard.22"&gt;Snow Leopard&lt;/a&gt;, which I had running in 64 bit mode as well, this worked without any issues.
In &lt;a class="reference external" href="https://en.wikipedia.org/wiki/OS_X#Version_10.8:_.22Mountain_Lion.22"&gt;Mountain Lion&lt;/a&gt;, however,
and supposedly in &lt;a class="reference external" href="https://en.wikipedia.org/wiki/OS_X#Version_10.7:_.22Lion.22"&gt;Lion&lt;/a&gt; as well,
the &lt;a class="reference external" href="http://www.oracle.com/technetwork/database/features/instant-client/index-097480.html"&gt;Oracle Instant Client&lt;/a&gt; cannot be run in 64 bit mode.
Fortunately I don't have strict requirements to run this particular application in 64 bit.
So the solution is simple:
use the 32 bit &lt;a class="reference external" href="http://www.oracle.com/technetwork/database/features/instant-client/index-097480.html"&gt;Oracle Instant Client&lt;/a&gt;
and run Python in 32 bit mode.&lt;/p&gt;
&lt;p&gt;The latter turned out to be not so obvious.
How do I tell &lt;a class="reference external" href="http://www.macports.org/"&gt;MacPorts&lt;/a&gt; to build me a 32 bit version of Python?
And how do I tell &lt;a class="reference external" href="http://pypi.python.org/pypi/pip"&gt;pip&lt;/a&gt; to build me a 32 bit library?&lt;/p&gt;
&lt;div class="section" id="building-32-bit-ports-using-macports"&gt;
&lt;h2&gt;Building 32 bit ports using MacPorts&lt;/h2&gt;
&lt;p&gt;Contrary to the Apple provided binary of the Python interpreter,
the binaries as built by &lt;a class="reference external" href="http://www.macports.org/"&gt;MacPorts&lt;/a&gt; do not contain both a 32 bit and a 64 bit version.
&lt;a class="reference external" href="http://www.macports.org/"&gt;MacPorts&lt;/a&gt; builds a 64 bit version when it runs in a 64 bit environment.
To tell it to build a 32 bit version we need to edit &lt;cite&gt;macports.conf&lt;/cite&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;$ sudo mvim /opt/local/etc/macports/macports.conf
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;And uncomment the line:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
#build_arch        i386
&lt;/pre&gt;
&lt;p&gt;From this moment on &lt;a class="reference external" href="http://www.macports.org/"&gt;MacPorts&lt;/a&gt; will build 32 bit binaries exclusively.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="building-32-bit-libraries-using-pip"&gt;
&lt;h2&gt;Building 32 bit libraries using pip&lt;/h2&gt;
&lt;p&gt;Without any extra information pip, too, builds 64 bit libraries
for those libraries that need C extensions to be built.
To tell it we want a 32 bit library instead we run pip as follows:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;$ ARCHFLAGS=&amp;quot;-arch i386&amp;quot; pip install cx-Oracle
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;To make this the default we can add the variable &lt;cite&gt;ARCHFLAGS=&amp;quot;-arch i386&amp;quot;&lt;/cite&gt; to our shell environment
by adding it to our shell startup scripts.&lt;/p&gt;
&lt;p&gt;Now we can connect to Oracle without any issues.&lt;/p&gt;
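&lt;p&gt;Since a mismatch shows up as a segmentation fault rather than a Python exception, it can help to check the interpreter's word size before connecting. A minimal sketch using the same &lt;tt class="docutils literal"&gt;struct.calcsize&lt;/tt&gt; trick as above; &lt;tt class="docutils literal"&gt;pointer_bits&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;require_bits&lt;/tt&gt; are hypothetical helper names of my own, not part of cx_Oracle:&lt;/p&gt;
&lt;pre class="literal-block"&gt;
```python
import struct


def pointer_bits():
    # Size of a C pointer ("P") in bytes, times 8, is the interpreter's word size.
    return 8 * struct.calcsize("P")


def require_bits(expected):
    # Hypothetical guard: fail early with a readable message when the
    # interpreter's word size is not the one the client library was built for.
    actual = pointer_bits()
    if actual != expected:
        raise RuntimeError(
            "expected a %d bit Python, found a %d bit one" % (expected, actual))
    return actual


if __name__ == "__main__":
    print("This interpreter is %d bit" % pointer_bits())
```
&lt;/pre&gt;
&lt;p&gt;Calling &lt;tt class="docutils literal"&gt;require_bits(32)&lt;/tt&gt; before importing cx_Oracle should surface an architecture mismatch as an ordinary error message, assuming the mismatch is indeed what causes the crash.&lt;/p&gt;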
&lt;/div&gt;
</summary><category term="Python"></category><category term="Oracle"></category><category term="OS X"></category></entry><entry><title>Python Users Netherlands Meeting</title><link href="http://blog.kollerie.com/2012/09/22/python_users_netherlands_meeting/" rel="alternate"></link><updated>2012-09-22T15:46:00+02:00</updated><author><name>Guido Kollerie</name></author><id>tag:blog.kollerie.com,2012-09-22:2012/09/22/python_users_netherlands_meeting/</id><summary type="html">&lt;p&gt;Last night I attended the Python Users Netherlands (&lt;a class="reference external" href="http://wiki.python.org/moin/PUN/"&gt;PUN&lt;/a&gt;) Meeting
hosted by &lt;a class="reference external" href="http://www.nelen-schuurmans.nl/downloads/124/Routebeschrijving.pdf"&gt;Nelen &amp;amp; Schuurmans&lt;/a&gt; in Utrecht.
This must have been the third or fourth time I attended a &lt;a class="reference external" href="http://wiki.python.org/moin/PUN/"&gt;PUN&lt;/a&gt; Meeting.
Each time I'm struck by the small size of the Python community in The Netherlands.
I do realize not every Python developer in The Netherlands attends every or even any &lt;a class="reference external" href="http://wiki.python.org/moin/PUN/"&gt;PUN&lt;/a&gt; meeting.
But still, it is an excellent way to keep in touch with fellow Python developers
and get a feel for what Python developers in The Netherlands are working on.&lt;/p&gt;
&lt;p&gt;If you are a Python developer looking for work there is even more reason to attend.
This time two companies formally announced they were looking for Python developers to join their development teams.
Even more people in the audience raised their hands when asked whether their company was looking for Python developers as well.
I heard similar sounds when I attended &lt;a class="reference external" href="http://pygrunn.nl/"&gt;PyGrunn&lt;/a&gt; earlier this year.
Even in these challenging times there seems to be plenty of opportunity for good Python developers.&lt;/p&gt;
&lt;div class="section" id="the-30-min-talks"&gt;
&lt;h2&gt;The 30 min Talks&lt;/h2&gt;
&lt;p&gt;On the &lt;a class="reference external" href="http://wiki.python.org/moin/PUN/nens_sept_2012"&gt;agenda&lt;/a&gt;
the two half hour talks both mentioned &lt;a class="reference external" href="http://www.pylonsproject.org/"&gt;Pyramid&lt;/a&gt;.
I have played with &lt;a class="reference external" href="http://www.pylonsproject.org/"&gt;Pyramid&lt;/a&gt; once before and came away very impressed.
So I was hoping to see more details on how people were using &lt;a class="reference external" href="http://www.pylonsproject.org/"&gt;Pyramid&lt;/a&gt; in production.
The two talks were a little light on specific &lt;a class="reference external" href="http://www.pylonsproject.org/"&gt;Pyramid&lt;/a&gt; details,
but did give a very good overview of the products being built on top of it.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="http://www.wiggy.net/"&gt;Wichert Akkerman&lt;/a&gt;'s talk on &lt;a class="reference external" href="http://www.wiggy.net/presentations/2012/2Style4You%20en%20Python/"&gt;How 2Style4You uses Pyramid (and more Python)&lt;/a&gt;
intriguingly showed the complexities of developing &lt;a class="reference external" href="https://en.wikipedia.org/wiki/I18n"&gt;i18n and l10n&lt;/a&gt; applications.
Something most Dutch developers normally don't have to take into account.&lt;/p&gt;
&lt;p&gt;Wichert was also kind enough to lend me his laptop for my 5 min lightning talk
as mine had broken down earlier that day.&lt;/p&gt;
&lt;p&gt;Marcel van den Elst seemed to have a lot of fun developing with &lt;a class="reference external" href="http://www.mongodb.org/"&gt;MongoDB&lt;/a&gt;.
His talk on &lt;a class="reference external" href="http://prezi.com/_hx6kdevlxab/mongoengine-relational-privileges-goodness/"&gt;MongoEngine + Relational + Privileges&lt;/a&gt; (on Pyramid!) showed
the &lt;a class="reference external" href="https://github.com/ProgressiveCompany"&gt;open source tools&lt;/a&gt; his company had built on top of &lt;a class="reference external" href="http://www.mongodb.org/"&gt;MongoDB&lt;/a&gt;
to implement their progressive planning tool.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="the-5-min-lightning-talks"&gt;
&lt;h2&gt;The 5 min Lightning Talks&lt;/h2&gt;
&lt;p&gt;I started off the lightning talks with my 5 min presentation on &lt;a class="reference external" href="http://discoproject.org/"&gt;Disco&lt;/a&gt;.
Let me tell you that 5 minutes to present something is really, really short.
I should have prepared myself a little better because I ran out of time
as soon as I had started.&lt;/p&gt;
&lt;div class="section" id="disco"&gt;
&lt;h3&gt;Disco&lt;/h3&gt;
&lt;p&gt;Anyway, back to &lt;a class="reference external" href="http://discoproject.org/"&gt;Disco&lt;/a&gt;.
A couple of months ago I was researching the means
to process (OCR'ing, shape detection in diagrams, cross referencing, etc)
a fairly large number of documents (60K - 100K).
Processing that many documents on a single machine took a number of weeks.
How could we speed that up?&lt;/p&gt;
&lt;p&gt;One of the possible solutions I came across was &lt;a class="reference external" href="http://discoproject.org/"&gt;Disco&lt;/a&gt;.
&lt;a class="reference external" href="http://discoproject.org/"&gt;Disco&lt;/a&gt; is a &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Mapreduce"&gt;MapReduce&lt;/a&gt; implementation written
in &lt;a class="reference external" href="http://www.erlang.org/"&gt;Erlang&lt;/a&gt; (core)
and Python (tools and the &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Mapreduce"&gt;MapReduce&lt;/a&gt; jobs).
As I was working for a C# development shop at the time, &lt;a class="reference external" href="http://discoproject.org/"&gt;Disco&lt;/a&gt; was not immediately applicable.
It did, however, remain in the back of my head to return to one day.
The 5 min lightning talks seemed like a good excuse to play with it a bit more.&lt;/p&gt;
&lt;p&gt;Installation of &lt;a class="reference external" href="http://discoproject.org/"&gt;Disco&lt;/a&gt; was fairly simple.
However I did have to patch one file (&lt;tt class="docutils literal"&gt;lib/disco/comm.py&lt;/tt&gt;)
to get &lt;a class="reference external" href="http://discoproject.org/doc/disco/howto/chunk.html"&gt;chunking in its ddfs tool&lt;/a&gt; working.
When I later realized a fix for this issue had been available in &lt;a class="reference external" href="http://discoproject.org/"&gt;Disco&lt;/a&gt;'s repository for over six months
as a &lt;a class="reference external" href="https://github.com/discoproject/disco/pull/302"&gt;pull request&lt;/a&gt;
I prematurely drew the conclusion that &lt;a class="reference external" href="http://discoproject.org/"&gt;Disco&lt;/a&gt; was not actively developed.&lt;/p&gt;
&lt;p&gt;Today,
while writing this blog post,
I took another look at &lt;a class="reference external" href="http://discoproject.org/"&gt;Disco&lt;/a&gt;'s &lt;a class="reference external" href="https://github.com/discoproject/disco/"&gt;project page at Github&lt;/a&gt;
and noticed the &lt;a class="reference external" href="https://github.com/discoproject/disco/commits/master"&gt;steady stream of commits&lt;/a&gt;.
Hence it is, contrary to what I said at the &lt;a class="reference external" href="http://wiki.python.org/moin/PUN/"&gt;PUN&lt;/a&gt; meeting, definitely actively developed.&lt;/p&gt;
&lt;p&gt;So, what makes &lt;a class="reference external" href="http://discoproject.org/"&gt;Disco&lt;/a&gt; so interesting
that I wanted to bring it to the attention of other Python developers? An example shows that best:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;disco.core&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Job&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result_iterator&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;split&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;iter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;params&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;disco.util&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;kvgroup&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;counts&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;kvgroup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;iter&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="k"&gt;yield&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;counts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;__main__&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;job&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;Job&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="nb"&gt;input&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;quot;http://discoproject.org/media/text/chekhov.txt&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
        &lt;span class="nb"&gt;map&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;map&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nb"&gt;reduce&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nb"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;result_iterator&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;job&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wait&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;show&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt; &lt;span class="n"&gt;word&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;count&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;That's plain and simple Python code!
Furthermore it's a complete &lt;a class="reference external" href="http://discoproject.org/"&gt;Disco&lt;/a&gt; &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Mapreduce"&gt;MapReduce&lt;/a&gt; job.
As you can see
it only takes two functions, &lt;tt class="docutils literal"&gt;map&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;reduce&lt;/tt&gt;,
without a lot of boiler plate code to implement a &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Mapreduce"&gt;MapReduce&lt;/a&gt; job.
Now compare that to writing a &lt;a class="reference external" href="http://hadoop.apache.org/docs/current/hadoop-yarn/hadoop-yarn-site/WritingYarnApplications.html"&gt;simple Apache Hadoop client&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Something slightly more complicated,
&lt;a class="reference external" href="http://discoproject.org/doc/disco/start/tutorial_2.html"&gt;an inner_join operation on arbitrarily large datasets&lt;/a&gt;
still looks simple.
I think it is a testimony to good design
if problems can be expressed easily and succinctly in a framework.&lt;/p&gt;
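&lt;p&gt;To make the data flow of the word count job above concrete, here is a rough single-process sketch of what happens between the map and reduce steps. It does not use Disco at all: &lt;tt class="docutils literal"&gt;itertools.groupby&lt;/tt&gt; stands in for Disco's &lt;tt class="docutils literal"&gt;kvgroup&lt;/tt&gt;, and the function names are my own, chosen only for illustration.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
```python
from itertools import groupby
from operator import itemgetter


def map_words(line):
    # Map step: emit a (word, 1) pair for every whitespace-separated word.
    for word in line.split():
        yield word, 1


def reduce_counts(pairs):
    # Reduce step: sort so that equal words are adjacent, group them,
    # and sum the counts of each group.
    for word, group in groupby(sorted(pairs), key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def word_count(lines):
    # Run both steps locally; Disco would spread this work over many nodes.
    pairs = [pair for line in lines for pair in map_words(line)]
    return list(reduce_counts(pairs))


if __name__ == "__main__":
    for word, count in word_count(["to be or not to be"]):
        print(word, count)
```
&lt;/pre&gt;
&lt;p&gt;The point of a framework like Disco is that the two small functions stay this simple while the sorting, grouping and distribution are taken care of for you.&lt;/p&gt;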
&lt;p&gt;You might wonder whether &lt;a class="reference external" href="http://discoproject.org/"&gt;Disco&lt;/a&gt; actually scales.
After all, &lt;a class="reference external" href="https://en.wikipedia.org/wiki/Mapreduce"&gt;MapReduce&lt;/a&gt; problems crave to be distributed over as many nodes as you can dedicate to them.
Well, &lt;a class="reference external" href="http://research.nokia.com/"&gt;Nokia Research Center in Palo Alto&lt;/a&gt; runs &lt;a class="reference external" href="http://discoproject.org/"&gt;Disco&lt;/a&gt; on an 800 node cluster.
That should give you an idea of its scalability.&lt;/p&gt;
&lt;p&gt;According to the &lt;a class="reference external" href="http://discoproject.org/doc/disco/intro.html"&gt;What is Disco&lt;/a&gt; page,
&lt;a class="reference external" href="http://discoproject.org/"&gt;Disco&lt;/a&gt;'s main features are:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;Proven to scale to hundreds of CPUs and tens of thousands of simultaneous tasks.&lt;/li&gt;
&lt;li&gt;Used to process datasets in the scale of tens of terabytes.&lt;/li&gt;
&lt;li&gt;Extremely simple to use: A typical task consists of two functions written in Python and two calls to the Disco API.&lt;/li&gt;
&lt;li&gt;Tasks can be specified in any other language as well, by implementing the Disco worker protocol.&lt;/li&gt;
&lt;li&gt;Input data can be in any format, even binary data such as images. The data can be located on any source that is accessible by HTTP or it can be distributed to local disks.&lt;/li&gt;
&lt;li&gt;Fault-tolerant: Server crashes don’t interrupt jobs. New servers can be added to the system on the fly.&lt;/li&gt;
&lt;li&gt;Flexible: In addition to the core map and reduce functions, a combiner function, a partition function and an input reader can be provided by the user.&lt;/li&gt;
&lt;li&gt;Easy to integrate to larger applications using the standard Disco module and the Web APIs.&lt;/li&gt;
&lt;li&gt;Comes with a built-in distributed storage system (&lt;a class="reference external" href="http://discoproject.org/doc/disco/howto/ddfs.html"&gt;Disco Distributed Filesystem&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;5 minutes to intrigue you, I hope it worked.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="the-other-5-min-lightning-talks"&gt;
&lt;h3&gt;The other 5 min Lightning Talks&lt;/h3&gt;
&lt;p&gt;The other talks covered a wide range of topics:&lt;/p&gt;
&lt;ul class="simple"&gt;
&lt;li&gt;&lt;a class="reference external" href="https://speakerdeck.com/u/joelcox/p/really-naive-data-mining"&gt;(Really) naive data mining&lt;/a&gt;, Joël Cox&lt;/li&gt;
&lt;li&gt;&amp;quot;Requests&amp;quot; library for easy json api access + testing dikes, &lt;a class="reference external" href="http://reinout.vanrees.org/"&gt;Reinout van Rees&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Python for those little throwaway scripts (that you end up not throwing away), Tikitu de Jager&lt;/li&gt;
&lt;li&gt;&lt;a class="reference external" href="http://www.slideshare.net/sshanx/shell-pearls-not-perl-shells-pun-21092012"&gt;Shell pearls (not to be confused with Perl shells)&lt;/a&gt;, Remco Wendt&lt;/li&gt;
&lt;li&gt;A script for running shell commands from the OS X command line but executed in Virtual Machines, &lt;a class="reference external" href="http://reinout.vanrees.org/"&gt;Reinout van Rees&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;All in all an evening well spent!&lt;/p&gt;
&lt;/div&gt;
&lt;/div&gt;
</summary><category term="Python"></category><category term="Erlang"></category></entry><entry><title>Converting a Tumblr blog to a rstblog</title><link href="http://blog.kollerie.com/2011/06/21/converting_a_tumblr_blog_to_a_rstblog/" rel="alternate"></link><updated>2011-06-21T00:00:00+02:00</updated><author><name>Guido Kollerie</name></author><id>tag:blog.kollerie.com,2011-06-21:2011/06/21/converting_a_tumblr_blog_to_a_rstblog/</id><summary type="html">&lt;p&gt;I have been writing a private (password protected) blog for family and friends
on Tumblr for almost half a year and it suddenly freaked me out that Tumblr had
all my carefully written blog posts. What if they lost it all, what if they went
out of business, etc? Shortly before I freaked out I had discovered &lt;a class="reference external" href="/2011/05/03/rstblog/"&gt;rstblog&lt;/a&gt;, a
static blog generator that generates a blog from &lt;a class="reference external" href="http://docutils.sourceforge.net/docs/user/rst/quickref.html"&gt;reStructuredText&lt;/a&gt; formatted
text files. Could I somehow get all my posts out of Tumblr, convert them to
&lt;a class="reference external" href="http://docutils.sourceforge.net/docs/user/rst/quickref.html"&gt;reStructuredText&lt;/a&gt; and generate a &lt;a class="reference external" href="/2011/05/03/rstblog/"&gt;rstblog&lt;/a&gt;? Turns out I could, though it did
involve a fair bit of manual editing and writing a Python script.&lt;/p&gt;
&lt;div class="section" id="getting-my-posts-out-of-tumblr"&gt;
&lt;h2&gt;Getting my posts out of Tumblr&lt;/h2&gt;
&lt;p&gt;Tumblr has a &lt;a class="reference external" href="http://staff.tumblr.com/post/286303145/tumblr-backup-mac-beta"&gt;backup tool&lt;/a&gt; that was
released as a beta on Dec 16th, 2009. It has not been updated since and never
got bumped to release status. It works just fine though. However the fact that
it has not been updated in 1.5 years seems to suggest that they do not
consider it a high priority for users to be able to back up their data. That
means you will either have to trust Tumblr or get your data out while you still
can.&lt;/p&gt;
&lt;p&gt;Tumblr's backup tool does not work with password protected blogs (nor does
their search feature, but that's something entirely different). Hence I had to
temporarily turn off the password protection, run the backup and re-enable the
password protection. The backup tool writes all blog posts, images and other
relevant data into the following directory structure:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="gp"&gt;$&lt;/span&gt; ls -la
&lt;span class="go"&gt;total 32&lt;/span&gt;
&lt;span class="go"&gt;drwxr-xr-x    9 gkoller  staff   306 May 31 14:15 .&lt;/span&gt;
&lt;span class="go"&gt;drwx------   15 gkoller  staff   510 Jun 17 22:16 ..&lt;/span&gt;
&lt;span class="go"&gt;drwxr-xr-x    6 gkoller  staff   204 May 31 12:50 archive.noindex&lt;/span&gt;
&lt;span class="go"&gt;-rw-r--r--    1 gkoller  staff  5616 May 31 12:50 avatar.png&lt;/span&gt;
&lt;span class="go"&gt;drwxr-xr-x   17 gkoller  staff   578 May 31 12:50 images&lt;/span&gt;
&lt;span class="go"&gt;-rw-r--r--    1 gkoller  staff  1138 May 31 12:50 index.html&lt;/span&gt;
&lt;span class="go"&gt;drwxr-xr-x  216 gkoller  staff  7344 Jun  8 11:41 posts&lt;/span&gt;
&lt;span class="go"&gt;-rw-r--r--    1 gkoller  staff   498 May 31 12:50 style.css&lt;/span&gt;
&lt;span class="go"&gt;drwxr-xr-x    3 gkoller  staff   102 May 31 14:15 theme&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;The &lt;tt class="docutils literal"&gt;posts&lt;/tt&gt; directory contains all the posts. The &lt;tt class="docutils literal"&gt;images&lt;/tt&gt; directory
contains all the images. Both directories are flat; they have no further
hierarchy.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="converting-html-to-restructuredtext"&gt;
&lt;h2&gt;Converting HTML to reStructuredText&lt;/h2&gt;
&lt;p&gt;Each post in the &lt;tt class="docutils literal"&gt;posts&lt;/tt&gt; directory is a simple HTML file. For &lt;a class="reference external" href="/2011/05/03/rstblog/"&gt;rstblog&lt;/a&gt; to
be able to generate a complete static blog these posts need to be converted
to &lt;a class="reference external" href="http://docutils.sourceforge.net/docs/user/rst/quickref.html"&gt;reStructuredText&lt;/a&gt;. Initially I thought I had to write a script to do this
conversion for me, but as it turns out, there is a tool that already does this:
&lt;a class="reference external" href="http://johnmacfarlane.net/pandoc/"&gt;Pandoc&lt;/a&gt;. Converting (most) HTML files to &lt;a class="reference external" href="http://docutils.sourceforge.net/docs/user/rst/quickref.html"&gt;reStructuredText&lt;/a&gt; files was as
simple as:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="k"&gt;for &lt;/span&gt;f in *.html
&lt;span class="k"&gt;do&lt;/span&gt;
&lt;span class="k"&gt;    &lt;/span&gt;&lt;span class="nb"&gt;echo &lt;/span&gt;processing &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;f&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;
    pandoc -f html -t rst &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;f&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt; -o &lt;span class="k"&gt;${&lt;/span&gt;&lt;span class="nv"&gt;f&lt;/span&gt;&lt;span class="p"&gt;%%html&lt;/span&gt;&lt;span class="k"&gt;}&lt;/span&gt;rst
&lt;span class="k"&gt;done&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Unfortunately &lt;a class="reference external" href="http://johnmacfarlane.net/pandoc/"&gt;Pandoc&lt;/a&gt; is not perfect. Converting HTML tables to their
&lt;a class="reference external" href="http://docutils.sourceforge.net/docs/user/rst/quickref.html"&gt;reStructuredText&lt;/a&gt; equivalents was a no-go; &lt;a class="reference external" href="http://johnmacfarlane.net/pandoc/"&gt;Pandoc&lt;/a&gt; simply hung. For those posts
with tables I took an intermediate route via Markdown (also supported by
&lt;a class="reference external" href="http://johnmacfarlane.net/pandoc/"&gt;Pandoc&lt;/a&gt;); HTML -&amp;gt; Markdown -&amp;gt; &lt;a class="reference external" href="http://docutils.sourceforge.net/docs/user/rst/quickref.html"&gt;reStructuredText&lt;/a&gt;. The remaining posts generally
converted reasonably well. I say reasonably, as posts containing images with &lt;a class="reference external" href="http://en.wikipedia.org/wiki/Alt_attribute"&gt;alt text&lt;/a&gt; produced incorrect
&lt;a class="reference external" href="http://docutils.sourceforge.net/docs/user/rst/quickref.html"&gt;reStructuredText&lt;/a&gt; markup in which the &lt;tt class="docutils literal"&gt;:alt:&lt;/tt&gt; marker and the
corresponding text were on separate lines instead of on the same line.  This
required a fair bit of manual editing to fix.&lt;/p&gt;
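&lt;p&gt;In hindsight part of that editing could probably have been scripted. The following is only a sketch of the idea, assuming the broken output has a line consisting of just the &lt;tt class="docutils literal"&gt;:alt:&lt;/tt&gt; marker followed by a line holding the text; &lt;tt class="docutils literal"&gt;join_alt_lines&lt;/tt&gt; is a made-up helper and I have not run it against my actual posts.&lt;/p&gt;
&lt;pre class="literal-block"&gt;
```python
def join_alt_lines(lines):
    # Merge a bare ":alt:" line with the line that follows it, so that the
    # marker and its text end up on the same line. Other lines pass through.
    fixed = []
    pending = None
    for line in lines:
        if pending is not None:
            fixed.append(pending + " " + line.strip())
            pending = None
        elif line.strip() == ":alt:":
            # Remember the marker (with its indentation) until the text arrives.
            pending = line.rstrip()
        else:
            fixed.append(line)
    if pending is not None:
        # A trailing marker without any text is kept as it was.
        fixed.append(pending)
    return fixed


if __name__ == "__main__":
    broken = [".. image:: cat.png", "   :alt:", "   A sleeping cat"]
    print("\n".join(join_alt_lines(broken)))
```
&lt;/pre&gt;
&lt;p&gt;It only handles alt text that fits on a single line; anything longer would still need a manual look.&lt;/p&gt;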
&lt;/div&gt;
&lt;div class="section" id="rstblog-metadata-and-directory-structure"&gt;
&lt;h2&gt;rstblog Metadata and Directory Structure&lt;/h2&gt;
&lt;p&gt;&lt;a class="reference external" href="/2011/05/03/rstblog/"&gt;rstblog&lt;/a&gt; requires more than just a bunch of &lt;a class="reference external" href="http://docutils.sourceforge.net/docs/user/rst/quickref.html"&gt;reStructuredText&lt;/a&gt; files. First,
each post needs to have some metadata in &lt;a class="reference external" href="http://en.wikipedia.org/wiki/YAML"&gt;YAML&lt;/a&gt; at the top; this includes, among
other things, tags. Furthermore &lt;a class="reference external" href="/2011/05/03/rstblog/"&gt;rstblog&lt;/a&gt; requires the posts to be in a specific
directory hierarchy where year, month and day each are a directory.  Last, all
the images needed to be copied to a subdirectory of &lt;a class="reference external" href="/2011/05/03/rstblog/"&gt;rstblog&lt;/a&gt;'s &lt;tt class="docutils literal"&gt;static&lt;/tt&gt;
directory and the references to them from the &lt;a class="reference external" href="http://docutils.sourceforge.net/docs/user/rst/quickref.html"&gt;reStructuredText&lt;/a&gt; files adjusted.
Doing all of this manually for 100+ posts would have been a major pain. Hence I
wrote a small Python 3 script to do exactly that.&lt;/p&gt;
&lt;p&gt;It is run by executing:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;python converter.py -tumblr_path ~/tmp/tumblr_backup -rstblog_path ~/tmp/rstblog
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;This assumes that &lt;tt class="docutils literal"&gt;~/tmp/rstblog&lt;/tt&gt; contains a rudimentary rstblog setup.&lt;/p&gt;
&lt;p&gt;The source code for the script (mind the dependency on &lt;a class="reference external" href="http://pytz.sourceforge.net/"&gt;pytz&lt;/a&gt;):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;glob&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;sys&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;argparse&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;xml.etree.ElementTree&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;fromstring&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;locale&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;os&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;re&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="nn"&gt;datetime&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;pytz&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="nn"&gt;shutil&lt;/span&gt;

&lt;span class="n"&gt;parser&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;argparse&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ArgumentParser&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;description&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;Convert Tumblr posts backup into rstblog posts.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;-tumblr_path&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;required&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;The root directory of the Tumblr backup.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;add_argument&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;-rstblog_path&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;required&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;help&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;The root directory of the rstblog blog.&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parse_args&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;argv&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:])&lt;/span&gt;


&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;verify_path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path_argname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path_name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;path_sig&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;format_params&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;path&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                     &lt;span class="s"&gt;&amp;#39;path_argname&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;path_argname&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                     &lt;span class="s"&gt;&amp;#39;path_name&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;path_name&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;isdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;missing&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;path_sig&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;listdir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;missing&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;format_params&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;missing&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;missing&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;{path_argname} does not look like a {path_name} directory. It&amp;#39;s missing &amp;#39;{missing}&amp;#39;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;format_params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;else&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;{path_argname} does not refer to a directory.&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="n"&gt;format_params&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;

&lt;span class="c"&gt;# set of files or directories that are typical for the type of directories&lt;/span&gt;
&lt;span class="c"&gt;# we&amp;#39;re dealing with. This allows us to perform a few sanity checks before we&lt;/span&gt;
&lt;span class="c"&gt;# start reading and writing files.&lt;/span&gt;
&lt;span class="n"&gt;rstblog_sig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;config.yml&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;static&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;_templates&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;tumblr_sig&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;posts&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;index.html&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;archive.noindex&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;# verify_path returns an error message (or None); abort on the first failure.&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="n"&gt;verify_path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tumblr_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;-tumblr_path&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;Tumblr Backup&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tumblr_sig&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;verify_path&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rstblog_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;-rstblog_path&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;rstblog&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rstblog_sig&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;sys&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# create image dir in rstblog_dir.&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;get_local_timezone&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
    &lt;span class="n"&gt;locale&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;setlocale&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;locale&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;LC_ALL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;quot;&amp;quot;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;# (&amp;#39;nl_NL&amp;#39;, &amp;#39;ISO8859-1&amp;#39;) -&amp;gt; NL&lt;/span&gt;
    &lt;span class="c"&gt;# works even with explicit char encoding present, eg nl_NL.UTF-8&lt;/span&gt;
    &lt;span class="n"&gt;country&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;locale&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;getlocale&lt;/span&gt;&lt;span class="p"&gt;()[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;timezone_names&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pytz&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;country_timezones&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timezone_names&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;&amp;quot;Multiple timezones {} found for country &amp;#39;{}&amp;#39;, picking the first one &amp;#39;{}&amp;#39;&amp;quot;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                &lt;span class="n"&gt;timezone_names&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;country&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                &lt;span class="n"&gt;timezone_names&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]))&lt;/span&gt;
    &lt;span class="n"&gt;tz&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pytz&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;timezone_names&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;tz&lt;/span&gt;

&lt;span class="n"&gt;local_tz&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;get_local_timezone&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

&lt;span class="n"&gt;tumblr_img_dir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tumblr_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;images&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;tumblr_posts_glob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;tumblr_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;posts/*.html&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tumblr_post&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;glob&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;iglob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tumblr_posts_glob&lt;/span&gt;&lt;span class="p"&gt;)):&lt;/span&gt;
    &lt;span class="n"&gt;rst_post_src&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;splitext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tumblr_post&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;.rst&amp;#39;&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tumblr_post&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;tp&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;contents&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;tp&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;read&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
        &lt;span class="c"&gt;# The Tumblr backup stores the tags in an XML document contained in an&lt;/span&gt;
        &lt;span class="c"&gt;# HTML comment. Before we can use an XML parser to extract the tags&lt;/span&gt;
        &lt;span class="c"&gt;# we need to extract the comment from the HTML.&lt;/span&gt;
        &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;r&amp;#39;&amp;lt;!--\s*BEGIN TUMBLR XML\s*(.*)\s*END TUMBLR XML\s*--&amp;gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                      &lt;span class="n"&gt;contents&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;DOTALL&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;groups&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;comment&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;post_elem&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fromstring&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;comment&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="c"&gt;# strptime ignores timezones even if %Z is specified. This is&lt;/span&gt;
            &lt;span class="c"&gt;# because it relies on time.strptime which does not handle&lt;/span&gt;
            &lt;span class="c"&gt;# timezones. Hence we need to take care of it ourselves. We expect&lt;/span&gt;
            &lt;span class="c"&gt;# the string dates to be in the following format:&lt;/span&gt;
            &lt;span class="c"&gt;# 2011-05-13 07:32:00 GMT&lt;/span&gt;
            &lt;span class="c"&gt;# BTW given the attribute name one could argue the date will&lt;/span&gt;
            &lt;span class="c"&gt;# always be in GMT. However it is not that much work to actually&lt;/span&gt;
            &lt;span class="c"&gt;# parse the date including timezone information. They violated the&lt;/span&gt;
            &lt;span class="c"&gt;# SPOT rule, so there must be a reason for it (eg other timezones&lt;/span&gt;
            &lt;span class="c"&gt;# possible?)&lt;/span&gt;
            &lt;span class="n"&gt;date_str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tz_str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;post_elem&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;date-gmt&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rsplit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#39; &amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;datetime&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strptime&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date_str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;%Y-%m-&lt;/span&gt;&lt;span class="si"&gt;%d&lt;/span&gt;&lt;span class="s"&gt; %H:%M:%S&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="c"&gt;# localize() is the pytz-documented way to attach a timezone;&lt;/span&gt;
            &lt;span class="c"&gt;# replace(tzinfo=...) can pick a wrong offset for non-UTC zones.&lt;/span&gt;
            &lt;span class="n"&gt;date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pytz&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timezone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tz_str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;localize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;local_date&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;local_tz&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;normalize&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;date&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;astimezone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;local_tz&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

            &lt;span class="c"&gt;# clean up the tags. Apparently it is possible for Tumblr tags&lt;/span&gt;
            &lt;span class="c"&gt;# to contain &amp;#39;,&amp;#39;. rstblog can&amp;#39;t deal with them.&lt;/span&gt;
            &lt;span class="n"&gt;tags&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;,&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt;
                    &lt;span class="n"&gt;post_elem&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;findall&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;tag&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;tag&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
            &lt;span class="k"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;date_components&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;{:02d}&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt;
                &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;local_date&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;local_date&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;month&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;local_date&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;day&lt;/span&gt;&lt;span class="p"&gt;)]&lt;/span&gt;
            &lt;span class="n"&gt;rst_post_dir&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;args&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rstblog_path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;date_components&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;makedirs&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rst_post_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;o744&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="n"&gt;day_order_glob&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rst_post_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;*.rst&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;day_order&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;glob&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;glob&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;day_order_glob&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;
            &lt;span class="n"&gt;shutil&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rst_post_src&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;rst_post_dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;rst_post_dst&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rst_post_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                                        &lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;basename&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rst_post_src&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

            &lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;re&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;compile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;r&amp;#39;\.\./images/(\w+\.\w+)&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nb"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rst_post_dst&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;&amp;#39;r+&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;rpd&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;rpd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readlines&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                &lt;span class="n"&gt;header&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
                &lt;span class="n"&gt;header&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;public: yes&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;header&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;tags: [{}]&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;, &amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tags&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
                &lt;span class="n"&gt;header&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;day-order: {:d}&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;format&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;day_order&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                &lt;span class="n"&gt;header&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="s"&gt;&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;header&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:]&lt;/span&gt;

                &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nb"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
                    &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;search&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                        &lt;span class="n"&gt;image_fn&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;group&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="n"&gt;shutil&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;copy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;os&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;path&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tumblr_img_dir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image_fn&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
                                    &lt;span class="n"&gt;rst_post_dir&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                        &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;sub&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;r&amp;#39;/static/images/\1&amp;#39;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

                &lt;span class="n"&gt;rpd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;seek&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="n"&gt;rpd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;truncate&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
                &lt;span class="n"&gt;rpd&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;writelines&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
</summary><category term="Python"></category></entry><entry><title>rstblog</title><link href="http://blog.kollerie.com/2011/05/03/rstblog/" rel="alternate"></link><updated>2011-05-03T00:00:00+02:00</updated><author><name>Guido Kollerie</name></author><id>tag:blog.kollerie.com,2011-05-03:2011/05/03/rstblog/</id><summary type="html">&lt;p&gt;For quite some time I have wanted to self-host my blog. However I did not feel
like administering a dynamic system, such as Wordpress, that is in constant
need of patches and upgrades. Hence a static blog generator was an obvious
choice. As a Python developer I have a slight preference for a system written
in Python.  &lt;a class="reference external" href="http://www.blogofile.com/"&gt;Blogofile&lt;/a&gt; is such a system and seems to be gaining some traction
within the Python community lately. For good reason: it is well documented,
actively developed and easy to work with. However, when I found out &lt;a class="reference external" href="http://lucumr.pocoo.org/"&gt;Armin
Ronacher&lt;/a&gt; wrote an even simpler system I knew I had found what I was looking
for: &lt;a class="reference external" href="https://github.com/mitsuhiko/rstblog"&gt;rstblog&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a class="reference external" href="https://github.com/mitsuhiko/rstblog"&gt;rstblog&lt;/a&gt; is a very simple script that allows you to write your posts in
&lt;a class="reference external" href="http://docutils.sourceforge.net/docs/user/rst/quickref.html"&gt;reStructuredText&lt;/a&gt;. It is however lacking in documentation. Fortunately &lt;a class="reference external" href="http://sbhr.dk/"&gt;Morten
Siebuhr&lt;/a&gt; wrote a blog post on &lt;a class="reference external" href="http://sbhr.dk/2010/11/30/using_rstblog/"&gt;using rstblog&lt;/a&gt; that should get you up and
running in no time. At least, it did for me.&lt;/p&gt;
</summary><category term="Python"></category></entry><entry><title>Setting up PostgreSQL on OS X for development</title><link href="http://blog.kollerie.com/2010/01/02/setting_up_postgresql_on_os_x_for_development/" rel="alternate"></link><updated>2010-01-02T00:00:00+01:00</updated><author><name>Guido Kollerie</name></author><id>tag:blog.kollerie.com,2010-01-02:2010/01/02/setting_up_postgresql_on_os_x_for_development/</id><summary type="html">&lt;p&gt;Currently I’m working on my third web based project that uses PostgreSQL as its
backend. Two of these projects were/are being developed under OS X. Installing
PostgreSQL under OS X is a breeze when one uses MacPorts. However I have seen
more than one developer being confused about the steps that should follow the
installation and the post installation instructions as printed out by the
PostgreSQL port.&lt;/p&gt;
&lt;p&gt;The installation instructions can be compressed into three steps:&lt;/p&gt;
&lt;ol class="arabic simple"&gt;
&lt;li&gt;Install/update MacPorts&lt;/li&gt;
&lt;li&gt;Execute: sudo port install postgresql83 postgresql83-server&lt;/li&gt;
&lt;li&gt;Follow post-install instructions as printed out by above command&lt;/li&gt;
&lt;/ol&gt;
&lt;div class="section" id="trying-to-connect-to-postgresql"&gt;
&lt;h2&gt;Trying to connect to PostgreSQL&lt;/h2&gt;
&lt;p&gt;Now that PostgreSQL is installed you might be tempted to connect to it by
starting the PostgreSQL interactive terminal. This is what will happen (gkoller
is the user I’m currently logged in as under OS X):&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="go"&gt;gkoller@Kinchenna $ psql&lt;/span&gt;
&lt;span class="go"&gt;psql: FATAL: database &amp;quot;gkoller&amp;quot; does not exist&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;So by default it looks for a database named identically to the currently logged
in user. Should we want to connect to a different database we should specify
the database’s name after the ‘psql’ command.  E.g.:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="go"&gt;psql hgh&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;I have chosen the name &lt;tt class="docutils literal"&gt;hgh&lt;/tt&gt; as it is the (abbreviated) name of my latest
project. The above command will fail with a message similar to that of the first
command. So let’s create the ‘hgh’ database:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="go"&gt;gkoller@Kinchenna $ createdb hgh&lt;/span&gt;
&lt;span class="go"&gt;createdb: database creation failed: ERROR: role &amp;quot;gkoller&amp;quot; does not exist&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Again the error message is clear. The currently logged in user does not have
access to PostgreSQL, as no role with the same name is defined there.
Simply executing a &lt;tt class="docutils literal"&gt;createuser gkoller&lt;/tt&gt; will not help as we don’t have enough
privileges to do that. More importantly, neither does root, so a ‘sudo createuser
gkoller’ does not work either. This is what stumps most developers who try
to get PostgreSQL up and running for web development on OS X.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="granting-privileges-to-the-currently-logged-in-user"&gt;
&lt;h2&gt;Granting privileges to the currently logged in user&lt;/h2&gt;
&lt;p&gt;When PostgreSQL was installed it was configured with one superuser, namely
&lt;tt class="docutils literal"&gt;postgres&lt;/tt&gt;. Hence adding new users with superuser privileges should be done as
user &lt;tt class="docutils literal"&gt;postgres&lt;/tt&gt;:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="go"&gt;sudo su postgres -c &amp;#39;createuser -P --superuser gkoller&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Now the currently logged in user has PostgreSQL superuser privileges. This
means we don’t have to use &lt;tt class="docutils literal"&gt;sudo&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;su&lt;/tt&gt; to execute PostgreSQL commands
to create databases, roles, and other users.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="creating-a-database-and-user-for-project-hgh"&gt;
&lt;h2&gt;Creating a database and user for project HGH&lt;/h2&gt;
&lt;p&gt;Now that I am a superuser it is easy to create additional users and databases.
For my HGH project I want a separate database and user. I’ll name them both
&lt;tt class="docutils literal"&gt;hgh&lt;/tt&gt;.&lt;/p&gt;
&lt;p&gt;This is how:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="go"&gt;gkoller@Kinchenna $ createuser hgh -P&lt;/span&gt;
&lt;span class="go"&gt;Enter password for new role: &amp;lt;password&amp;gt;&lt;/span&gt;
&lt;span class="go"&gt;Enter it again: &amp;lt;password&amp;gt;&lt;/span&gt;
&lt;span class="go"&gt;Shall the new role be a superuser? (y/n) n&lt;/span&gt;
&lt;span class="go"&gt;Shall the new role be allowed to create databases? (y/n) y&lt;/span&gt;
&lt;span class="go"&gt;Shall the new role be allowed to create more new roles? (y/n) n&lt;/span&gt;

&lt;span class="go"&gt;gkoller@Kinchenna $ createdb -E utf8 -O hgh -W -U hgh hgh&lt;/span&gt;
&lt;span class="go"&gt;Password: &amp;lt;password&amp;gt;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;/div&gt;
&lt;div class="section" id="conclusion"&gt;
&lt;h2&gt;Conclusion&lt;/h2&gt;
&lt;p&gt;After PostgreSQL installation and post-installation you should create a new
superuser named after your OS X login account. This allows for access to
PostgreSQL commands without the need to use &lt;tt class="docutils literal"&gt;sudo&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;su&lt;/tt&gt;.&lt;/p&gt;
&lt;p&gt;This is achieved by executing the following command:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="go"&gt;sudo su postgres -c &amp;#39;createuser -P --superuser &amp;lt;your_username&amp;gt;&amp;#39;&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
&lt;p&gt;Where &lt;tt class="docutils literal"&gt;&amp;lt;your_username&amp;gt;&lt;/tt&gt; should be replaced with the short name of your
OS X account.&lt;/p&gt;
&lt;/div&gt;
</summary><category term="OS X"></category><category term="PostgreSQL"></category></entry><entry><title>Devoxx ’08: The Good &amp; The Bad</title><link href="http://blog.kollerie.com/2008/12/14/devoxx_08_the_good_and_the_bad/" rel="alternate"></link><updated>2008-12-14T00:00:00+01:00</updated><author><name>Guido Kollerie</name></author><id>tag:blog.kollerie.com,2008-12-14:2008/12/14/devoxx_08_the_good_and_the_bad/</id><summary type="html">&lt;p&gt;This year I attended &lt;a class="reference external" href="http://devoxx.com/display/JV08/Home"&gt;Devoxx&lt;/a&gt;; the largest Java conference in Europe. I had high
expectations as I previously had heard many good things about JavaPolis
(&lt;a class="reference external" href="http://devoxx.com/display/JV08/Home"&gt;Devoxx&lt;/a&gt;’ former name). However after three days of attending presentations I
returned home with mixed feelings. First the good.&lt;/p&gt;
&lt;div class="section" id="the-good"&gt;
&lt;h2&gt;The Good&lt;/h2&gt;
&lt;p&gt;A conference of this size is bound to give you a good feeling of what is going
on in the Java world. It might be my pick of the presentations but I got the
feeling that nothing shockingly new was happening (except for arguably JavaFX,
but more on that later). Instead the focus seemed to be on consolidating
what is working for Java developers and addressing real issues Java
developers are dealing with.&lt;/p&gt;
&lt;p&gt;For instance in Thursday’s keynote Joshua Bloch talked about enums and generics
and how to use them effectively. A good thing as I have seen many situations
where developers should have used enums, but opted for a more conventional
solution instead. As for generics, if things get more complicated than
ArrayList&amp;lt;String&amp;gt; I have to slow down and be really careful about what I do.
Any tips on using generics effectively are always welcome.&lt;/p&gt;
&lt;p&gt;A good presentation was John Ferguson Smart’s presentation on “Behavior Driven
Development in Java with easyb”. Again nothing shockingly new, but it did
clearly point out a problem many developers have with Test Driven Development:
what to test? By focussing on the required behavior that needs to be
implemented and using a testing framework that allows this to be expressed
easily, writing meaningful tests should be a lot easier for many developers.&lt;/p&gt;
&lt;p&gt;There was a strong focus on running languages other than Java on the JVM. Bill
Venners presented on the “Feel of Scala” though I didn’t attend that one as it
had been scheduled at the same time as the presentation on easyb. Nor did I
attend Charles Nutter’s and Thomas Enebo’s presentation on JRuby. However I did
attend Jim Baker’s and Tobias Ivarsson’s presentation on Jython and Brian
Goetz’ “Towards a Dynamic VM”. Both were great.&lt;/p&gt;
&lt;p&gt;I was thrilled to hear Jython is alive and kicking. With all the focus on
JRuby, Scala, and more recently Clojure in the blogosphere, one might have gotten
the impression that Jython was all but dead. Instead a release that’s
compatible with Python 2.5 is imminent and they demonstrated Django running on
Jython.&lt;/p&gt;
&lt;p&gt;Dynamic languages on the JVM are already quite fast compared to their relatives
written in C. However with the enhancements planned for the JVM this should
improve significantly in the future. Brian Goetz’ presentation was fairly
technical and as such one of the better ones. It made me well aware of all the
work the JVM has to do in running dynamic languages. Knowing this made me
appreciate the current speed of dynamic languages on the JVM even more.&lt;/p&gt;
&lt;p&gt;Another thing worth mentioning was the organization of the conference. They did
a great job. From registration and sending my badge to my home address to lunch,
coffee, snacks, drinks and announcements, everything was organized very well,
allowing me to focus on what I had come for: attending presentations and
talking to people.&lt;/p&gt;
&lt;/div&gt;
&lt;div class="section" id="the-bad"&gt;
&lt;h2&gt;The Bad&lt;/h2&gt;
&lt;p&gt;Even though there were a fair number of good presentations some were really
awful. Most notably IBM’s keynote presentation on Java and RFID. IBM is a large
multinational. &lt;a class="reference external" href="http://devoxx.com/display/JV08/Home"&gt;Devoxx&lt;/a&gt; is a huge conference with 3300 attendees. You would have
figured they would have made an effort in delivering an interesting keynote.
Looking at the result they did not. It was embarrassingly amateurish and
clumsy.&lt;/p&gt;
&lt;p&gt;The same holds for many other presentations. Some had potentially interesting
content, but due to the way it was presented it was difficult to follow, or
outright incomprehensible. I do realize it is easy for me to criticize people
while sitting behind my keyboard and writing a blog entry. After all I did not have
to stand in front of a huge room filled with anywhere between 60 and 1000
people. But even so, more presenters should have done significantly better for
a conference of this size.&lt;/p&gt;
&lt;p&gt;And last JavaFX. Not necessarily bad as the presentations on it that I did
attend were well executed and interesting. However I failed to see JavaFX’
relevance for enterprise Java developers. I might be wrong, but I would expect
most of the attendees to fall into this class of developers. Sun however seemed
to target a different class of developers. Nice graphics, animations, sound and
video are all great. However, as an enterprise Java developer I am more
interested in how I can construct forms, the kind of controls that are
available, communication with back end systems, etc. None of that was
addressed. A pity and a missed opportunity for Sun.&lt;/p&gt;
&lt;/div&gt;
</summary><category term="Java"></category><category term="Devoxx"></category></entry><entry><title>Implementing Java’s equals method</title><link href="http://blog.kollerie.com/2008/06/16/implementing_java_s_equals_method/" rel="alternate"></link><updated>2008-06-16T00:00:00+02:00</updated><author><name>Guido Kollerie</name></author><id>tag:blog.kollerie.com,2008-06-16:2008/06/16/implementing_java_s_equals_method/</id><summary type="html">&lt;p&gt;There’s a lot of incorrect information on the web and even in published books
on how to implement Java’s &lt;tt class="docutils literal"&gt;equals&lt;/tt&gt; method. Ivan Memruk in his blog post &lt;a class="reference external" href="http://thewrongcode.blogspot.com/2007/12/equalsinstanceof-pitfall.html"&gt;The
equals/instanceof pitfall&lt;/a&gt;
describes this issue nicely. There is however a small error in his post; what
he calls the &lt;em&gt;reflexive property&lt;/em&gt; is actually the &lt;em&gt;symmetric property&lt;/em&gt; of the
equals method.&lt;/p&gt;
&lt;p&gt;Reflexive is: a=a, whereas symmetric is: if a=b then also b=a. It’s this
property, among others, that should hold for Java’s &lt;tt class="docutils literal"&gt;equals&lt;/tt&gt; method. When
&lt;tt class="docutils literal"&gt;instanceof&lt;/tt&gt; is used in &lt;tt class="docutils literal"&gt;equals&lt;/tt&gt;, the symmetric property can break as soon as a
subclass overrides &lt;tt class="docutils literal"&gt;equals&lt;/tt&gt; with a stricter check. Instead use the
&lt;tt class="docutils literal"&gt;getClass&lt;/tt&gt; method. A good template for Java’s &lt;tt class="docutils literal"&gt;equals&lt;/tt&gt; method would be:&lt;/p&gt;
&lt;div class="highlight"&gt;&lt;pre&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="kt"&gt;boolean&lt;/span&gt; &lt;span class="nf"&gt;equals&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Object&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;o&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt; &lt;span class="o"&gt;||&lt;/span&gt; &lt;span class="n"&gt;getClass&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;o&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getClass&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;

    &lt;span class="c1"&gt;// your customized check for equality goes here&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/pre&gt;&lt;/div&gt;
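&lt;p&gt;To make the symmetry problem concrete, here is a minimal sketch. The classes
&lt;tt class="docutils literal"&gt;Point&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;ColorPoint&lt;/tt&gt;, &lt;tt class="docutils literal"&gt;StrictPoint&lt;/tt&gt; and &lt;tt class="docutils literal"&gt;StrictColorPoint&lt;/tt&gt; are hypothetical
names made up for this illustration; they do not come from either of the linked articles. The first
pair uses &lt;tt class="docutils literal"&gt;instanceof&lt;/tt&gt;, the second pair follows the &lt;tt class="docutils literal"&gt;getClass&lt;/tt&gt; template above.&lt;/p&gt;

```java
import java.util.Objects;

class EqualsSymmetryDemo {

    // Hypothetical base class whose equals() relies on instanceof.
    static class Point {
        final int x;
        final int y;

        Point(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o)
                return true;
            if (!(o instanceof Point))
                return false;
            Point other = (Point) o;
            if (x != other.x)
                return false;
            return y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    // Hypothetical subclass that adds a field and a stricter instanceof check.
    static class ColorPoint extends Point {
        final String color;

        ColorPoint(int x, int y, String color) {
            super(x, y);
            this.color = color;
        }

        @Override
        public boolean equals(Object o) {
            if (!(o instanceof ColorPoint))
                return false;
            if (!super.equals(o))
                return false;
            return color.equals(((ColorPoint) o).color);
        }
    }

    // The same base class, written with the getClass() template.
    static class StrictPoint {
        final int x;
        final int y;

        StrictPoint(int x, int y) {
            this.x = x;
            this.y = y;
        }

        @Override
        public boolean equals(Object o) {
            if (this == o)
                return true;
            if (o == null || getClass() != o.getClass())
                return false;
            StrictPoint other = (StrictPoint) o;
            if (x != other.x)
                return false;
            return y == other.y;
        }

        @Override
        public int hashCode() {
            return Objects.hash(x, y);
        }
    }

    static class StrictColorPoint extends StrictPoint {
        final String color;

        StrictColorPoint(int x, int y, String color) {
            super(x, y);
            this.color = color;
        }

        @Override
        public boolean equals(Object o) {
            // super.equals() already guarantees that o has exactly our class.
            if (!super.equals(o))
                return false;
            return color.equals(((StrictColorPoint) o).color);
        }
    }

    public static void main(String[] args) {
        Point p = new Point(1, 2);
        Point cp = new ColorPoint(1, 2, "red");
        System.out.println(p.equals(cp));   // true
        System.out.println(cp.equals(p));   // false: symmetry is broken

        StrictPoint sp = new StrictPoint(1, 2);
        StrictPoint scp = new StrictColorPoint(1, 2, "red");
        System.out.println(sp.equals(scp)); // false
        System.out.println(scp.equals(sp)); // false: symmetric again
    }
}
```

&lt;p&gt;The price of the &lt;tt class="docutils literal"&gt;getClass&lt;/tt&gt; variant is that an instance of a subclass is never equal to an
instance of its superclass, even when the subclass adds no state of its own.&lt;/p&gt;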
&lt;p&gt;Another good article on implementing equals() is Manish Hatwalne’s &lt;a class="reference external" href="http://www.geocities.com/technofundo/tech/java/equalhash.html"&gt;Equals and
Hash Code&lt;/a&gt;.&lt;/p&gt;
</summary><category term="Java"></category></entry></feed>