I have the Hitchhiker's Guide to the Galaxy

The problem is that I only have the shell . . . unless I have an internet connection. As others (Karoliina Salminen and Andrew Flegg) have mentioned, were someone to combine Wikipedia and the Nokia 770, we would have that damned book right in our pockets. Henri Bergius briefly mentions what it takes to get the Hitchhiker's Guide into his pocket. In this post, I'm going to outline what I need to do to make this a reality.

Getting Data

Wikipedia Contents

Wikipedia provides raw XML dumps of its article database on a rolling basis (as soon as the data is dumped, it's on the site). The first dump of enwiki is not yet available, but it looks like I'm going to need to do a lot of chopping to get it to even fit on the 1GB RS-MMC card I purchased for my device. The uncompressed raw XML dump is 4.5G. I'm going to have to trim over 3.5G of data if I want to view Wikipedia on my n770.
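As a rough sketch of the chopping, the dump could be streamed one page at a time, dropping redirects and stopping once a byte budget is used up. This assumes the dump is a run of page elements that each contain a title and a text element. The sample data, the names iter_pages and trim_to_budget, and the budget value are made up for illustration, and the code targets a current Python 3 rather than the device's Python.

```python
import io
import xml.etree.ElementTree as ET

# Tiny stand-in for the real dump; the real file layout is an assumption here.
SAMPLE_DUMP = b"""<mediawiki>
  <page><title>AaA</title><text xml:space="preserve">#REDIRECT [[AAA]]</text></page>
  <page><title>AAA</title><text xml:space="preserve">AAA is a disambiguation page.</text></page>
  <page><title>Towel</title><text xml:space="preserve">A towel is a massively useful thing.</text></page>
</mediawiki>"""


def local_name(tag):
    """Strip an XML namespace prefix such as '{uri}page' down to 'page'."""
    return tag.rsplit('}', 1)[-1]


def iter_pages(stream):
    """Yield (title, text) for each <page>, holding one page in memory at a time."""
    for _event, elem in ET.iterparse(stream, events=('end',)):
        if local_name(elem.tag) != 'page':
            continue
        fields = {local_name(child.tag): (child.text or '') for child in elem.iter()}
        yield fields.get('title', ''), fields.get('text', '')
        elem.clear()  # free the page once it has been handled


def trim_to_budget(stream, budget_bytes):
    """Keep non-redirect pages until their UTF-8 text would exceed the budget."""
    kept, used = [], 0
    for title, text in iter_pages(stream):
        if text.lstrip().upper().startswith('#REDIRECT'):
            continue  # redirects carry no article text of their own
        size = len(text.encode('utf-8'))
        if used + size > budget_bytes:
            break
        kept.append(title)
        used += size
    return kept, used


titles, used = trim_to_budget(io.BytesIO(SAMPLE_DUMP), budget_bytes=40)
print(titles, used)
```

Stopping at the first page that does not fit is the crudest possible policy; ranking pages by some notion of usefulness before cutting would be the obvious next step.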

Wikitravel (?) Contents

Another option is to use a smaller corpus of data as the base for my Hitchhiker's Guide.


Using this iBlue GPS receiver, I will be able to determine the user's location and record it for future downloads of Wikipedia data.
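A minimal sketch of the location step, assuming the receiver emits standard NMEA 0183 sentences (the usual output of Bluetooth GPS units, though I have not checked this one). The function names and the sample sentence are illustrative, and the checksum is ignored rather than verified.

```python
def nmea_to_degrees(value, hemisphere):
    """Convert NMEA (d)ddmm.mmmm plus N/S/E/W into signed decimal degrees."""
    raw = float(value)
    degrees = int(raw // 100)
    minutes = raw - degrees * 100
    decimal = degrees + minutes / 60.0
    return -decimal if hemisphere in ('S', 'W') else decimal


def parse_gga(sentence):
    """Return (lat, lon) from a GGA sentence, or None if there is no fix."""
    body = sentence.strip().split('*')[0]  # drop the trailing checksum
    fields = body.split(',')
    if not fields[0].endswith('GGA') or len(fields) < 7:
        return None
    if fields[6] in ('', '0'):  # fix quality 0 means no position yet
        return None
    lat = nmea_to_degrees(fields[2], fields[3])
    lon = nmea_to_degrees(fields[4], fields[5])
    return lat, lon


SAMPLE = '$GPGGA,123519,4807.038,N,01131.000,E,1,08,0.9,545.4,M,46.9,M,,*47'
print(parse_gga(SAMPLE))
```

The resulting pair could be appended to a small log file so the next sync knows which places to fetch articles for.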

Parsing the Raw Data

With Python, parsing XML is pretty easy as long as it is well formed. I believe Wikipedia's data is.

Sample Wikipedia XML article

<text xml:space="preserve">#REDIRECT [[AAA]]</text>

The easy python:

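from xml import sax
from xml.sax import handler

class DelHandler(handler.ContentHandler):
    # Wikitext lives in the character data of <text>, not in its attributes.
    in_text = False

    def startElement(self, name, attrs):
        if name == 'text':
            self.in_text, self.chunks = True, []

    def characters(self, content):
        if self.in_text:
            self.chunks.append(content)

    def endElement(self, name):
        if name == 'text':
            self.in_text = False
            print(''.join(self.chunks))

parser = sax.make_parser()
parser.setFeature(handler.feature_namespaces, 0)
dh = DelHandler()
parser.setContentHandler(dh)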

The schema looks pretty simple. I will have to find a wikitext Python module (or write one myself) to do any sort of formatting of the article text, which of course I have to. That will be the harder part.
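To give a feel for the write-one-myself route, here is a small sketch that handles only redirects, bold, italics, and double-bracket links. It is nowhere near a full wikitext parser, the /wiki/ link prefix and the function names are my own assumptions, and it targets a current Python 3.

```python
import html
import re

REDIRECT_RE = re.compile(r'^\s*#REDIRECT\s*\[\[([^\]|#]+)', re.IGNORECASE)
LINK_RE = re.compile(r'\[\[([^\]|]+)(?:\|([^\]]+))?\]\]')


def redirect_target(wikitext):
    """Return the page a redirect points at, or None for a normal article."""
    match = REDIRECT_RE.match(wikitext)
    return match.group(1).strip() if match else None


def _link(match):
    """Turn [[Target]] or [[Target|label]] into an anchor tag."""
    target = match.group(1).strip()
    label = match.group(2) or target
    return '<a href="/wiki/%s">%s</a>' % (target.replace(' ', '_'), label)


def wikitext_to_html(wikitext):
    """Convert a tiny subset of wikitext markup to HTML."""
    text = html.escape(wikitext, quote=False)  # leaves the ' markers intact
    text = re.sub(r"'''(.+?)'''", r'<b>\1</b>', text)  # bold before italics
    text = re.sub(r"''(.+?)''", r'<i>\1</i>', text)
    return LINK_RE.sub(_link, text)


print(redirect_target('#REDIRECT [[AAA]]'))
print(wikitext_to_html("'''Towel''' is ''useful'', see [[Deep Thought]]."))
```

Templates, tables, and nested markup are where the real difficulty lies, and none of that is attempted here.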

Implementation Details

Using python2.4 I will extend HTTPServer, since it makes sense that the wikipages are served like a website. I also think the application should have a GUI component. Teemu's Blog will help with that endeavor. To make it easy to launch the hhgttg and to know that it is running, the GUI will internally launch the webserver and provide some useful functionality for getting updates to pages. I have to flesh this out a lot more.

If there is anything I've learned from working at Google, it's that design docs go a long way. This blog entry is a precursor to a more detailed design spec. I find DDs useful because they help keep me on track and organize what I have to do.
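The HTTPServer idea might look roughly like the sketch below. It is written against Python 3's http.server (under python2.4 the equivalent module would be BaseHTTPServer), and the GuideServer and GuideHandler names, the /wiki/ path layout, the port, and the in-memory article store are all placeholders rather than a settled design.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.parse import unquote

# Stand-in article store; the real data would come from the trimmed dump.
ARTICLES = {'Towel': '<p>A towel is a massively useful thing.</p>'}


def lookup(articles, path):
    """Map a request path such as /wiki/Towel to (status, html_body)."""
    title = unquote(path.rsplit('/', 1)[-1]).replace('_', ' ')
    if title in articles:
        return 200, '<h1>%s</h1>%s' % (title, articles[title])
    return 404, '<h1>Not found</h1>'


class GuideHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        # self.server is the GuideServer instance that accepted the request.
        status, body = lookup(self.server.articles, self.path)
        payload = body.encode('utf-8')
        self.send_response(status)
        self.send_header('Content-Type', 'text/html; charset=utf-8')
        self.send_header('Content-Length', str(len(payload)))
        self.end_headers()
        self.wfile.write(payload)


class GuideServer(HTTPServer):
    """HTTPServer extended to carry the article store with it."""

    def __init__(self, address, articles):
        HTTPServer.__init__(self, address, GuideHandler)
        self.articles = articles


# To run it (this call blocks until interrupted):
#   GuideServer(('127.0.0.1', 8770), ARTICLES).serve_forever()
```

Binding to 127.0.0.1 keeps the guide private to the device, and the GUI could start the server in a background thread and point the browser at it.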

I will be posting my photos of Taiwan here:

I will also post photo highlights as individual posts, as seen in the previous mouse pad post.

Good Morning

I just woke up. I have a migraine but I'll survive . . . I think. Thank you, Excedrin; you'll have me back in shape in no time!

Yesterday was a flurry of time travel. It's now tomorrow morning and I left two days ago. It took me a long time to figure out what day it was yesterday, but now I have it figured out. I haven't been hit by jet lag; I'm glad that my 2 hours of sleep Thursday paid off. Too bad I didn't get the "Your soul has to catch up" feeling. That would have been cool.

The beds in the hotel are really hard. I woke up today and felt less sore than I have in a while. The air in here is really dry and I'm a little cold. I think the air is why I have a headache. I'll get used to it then.

In Japan

Just a two-hour layover, hardly a chance to do anything aside from grab a bite to eat at an airport shop and hop on the next plane. I had a teriyaki chicken rice bowl (a bit salty) and "Seasonal" Toppo.

Ugh! The batteries on my Nokia 770 died! Blast! I have 2:44 remaining on my second extended battery and at least 1.5 hours on my slimline. Thanks, Help Desk!

Big metal thing in the sky.

The plane I'm on is huge. It's about a million rows long and 10,000 seats wide. It can carry 1 trillion tons and flies at a speed of about ninety million miles an hour; for the Canadians among us, that's like 120 kph, like the speed limit. I can't take a picture of it because it won't fit into the frame.

I'm two hours early for boarding, like I always try to be, but I'm also running on two hours of sleep. I used my passport for the first time, and the ticket attendant laughed at me because I first handed her my license: "You're traveling internationally, you need a passport. haha."

I have three bottles of water in my bag. I feel sorry for the people between me and the aisle. I have two books with me as well: Freakonomics and The World Is Flat. I started Freakonomics on the way to San Francisco and I do plan on writing a bit about it once I finish. I have one plan for the flight: Sleep.

I feel like I'm going to lose my valuable air-flight-can't-be-interrupted-by-anything-but-death time because I'll be sleeping; disappointing indeed but I have a weekend ahead of me where I'm sure I'm going to have a hard time sleeping. Ever try flipping your sleeping schedule a full 180 degrees?

Boarding begins in 30 minutes. Yippie!

Oh. Wait a second . . . Do they have different power sockets in Taiwan? I brought my favorite gadgets (camera, Nokia 770, celly) but will I be able to charge them? And(!) I am not paying T-Mobile any money to use their bad bad wireless access points at SFO . . . unless it's less than $5 . . . which it's not. It's $10. So now you get to read this only after I get to Taiwan and find internet access, aka my lifeblood.

Mountain View Today

Because I had a layover in San Francisco anyway, I took an extra day (today) to stop by the Googleplex in Mountain View, CA.

Today I:

  1. Rode an electric scooter three blocks to a meeting
  2. Ate a tasty hunan beef lunch
  3. Drank lots of liquid from the 7-11-if-7-11-were-free-like snack room.
  4. Ate some cookies.
  5. Prepped for a super long flight!
  6. Got kinda nervous about my trip (for the first time)
  7. Mmmm.... Orange juice
  8. Got three new laptop batteries
  9. Ate a tasty dinner, cooked-to-order pasta!
  10. Played lots of foosball
  11. Did some work

Python is the awesome

On Saturday night I was feeling particularly productive and threw together the infrastructure for an HTTP-request-driven, multithreaded queueing server that can prioritize tasks based on a set of provided tests. The most successful tasks are queued to execute first.
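This is not the 500 lines themselves, just a small sketch of the core idea under my own assumptions: score each task by how many of the provided tests it passes, then let worker threads pull the best-scoring tasks first. The HTTP front end is left out, and score, run_prioritized, and the toy tasks are illustrative names. It targets a current Python 3.

```python
import itertools
import queue
import threading


def score(task, tests):
    """Count how many of the provided tests a task passes."""
    return sum(1 for test in tests if test(task))


def run_prioritized(tasks, tests, execute, workers=1):
    """Queue tasks so the ones passing the most tests are executed first."""
    pending = queue.PriorityQueue()
    order = itertools.count()  # tie-breaker keeps submission order stable
    for task in tasks:
        # PriorityQueue pops the smallest entry, so negate the score.
        pending.put((-score(task, tests), next(order), task))

    results, lock = [], threading.Lock()

    def worker():
        # Everything is queued before the workers start, so Empty means done.
        while True:
            try:
                _neg_score, _n, task = pending.get_nowait()
            except queue.Empty:
                return
            outcome = execute(task)
            with lock:
                results.append(outcome)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return results


tests = [lambda t: len(t) > 1, lambda t: len(t) > 2]
print(run_prioritized(['a', 'bbb', 'cc'], tests, str.upper))
```

With a single worker the results come back in strict priority order; with several workers the start order still follows priority, but completion order can vary.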

500 lines of python, 10 hours of work. I'm pleased with the result.

web design

I changed things a bit, tell me what you think.

Whisky bottle PC

Nokia 770

I just picked up the Nokia 770 on Friday and I've been playing with it all weekend long. It's quite the device, with a 200MHz processor and 128MB of onboard shared memory. The included software provides an Opera-based web browser, feed reader, email client, audio player, video player, PDF reader and some more.


  1. 770 Software
  2. Application Catalogue
  3. Maemo Planet


I back up using a Synchronization Makefile. This is pretty simple . . . Quite Amazing.

Things to do

  1. Get a Bluetooth GPS receiver and wham, instant navigation device. I have this running on my server so now I have my lastfm streams in my Audio Player favorites.

Things I've done

  1. Got root
  2. Enabled 24M of swap space: just create a standard init script for it and link to it from /etc/rc2.d
  3. Installed tons of stuff
  4. Streamed music from lastfm using the proxy.
  5. SSH'd into it and ran dpkg to install vim.
  6. Bricked my device, requiring that I reflash it.