Monday, June 22, 2009

Dumpcatcher: Python Client

For the Python dumpcatcher client I decided to go with the standard Python logging infrastructure. The Python documentation for the logging module was definitely lacking, but I found all the information I needed by combining the docs with the source code itself. Creating a Python client for the dumpcatcher is now pretty simple:
import logging

import handler  # from dumpcatcher.clients.python.logging

logger = logging.getLogger('my_tag')
logger.setLevel(logging.DEBUG)
logger.addHandler(
    handler.DumpcatcherHandler('agtkdW1wY2F0Y2hlcnINCxIHUHJvZHVjdBgCDA',
                               '9b6c65910428419db0d0b730278b72e3',
                               'http://localhost:8080/add',
                               5))

try:
    'a' + 1  # TypeError!
except Exception:
    # logger.exception() logs at ERROR level and appends the traceback.
    logger.exception('Oh man! Look at this!')
The output of which looks something like:
2009-06-21 06:36:57 demo 40
handler.py:110:broken Oh man! Look at this!
handler.py:110:broken Oh man! Look at this!
Traceback (most recent call last):
File "/home/jlapenna/code/dumpcatcher/clients/python/logging/handler.py", line 108, in broken
  'a' + 1
TypeError: cannot concatenate 'str' and 'int' objects
File "/home/jlapenna/code/dumpcatcher/clients/python/logging/handler.py", line 110, in broken
  logger.exception('Oh man! Look at this!')
Looking at this now, it seems that all I have left is the Java client and data aggregation before I'm feature complete!
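For anyone curious how a handler plugs into the logging module, here is a minimal sketch. It is not the real DumpcatcherHandler: the class name CollectingHandler and its send argument are made up for illustration, and instead of POSTing to a server it just hands the formatted record to whatever callable you give it.

```python
import logging


class CollectingHandler(logging.Handler):
    """Hypothetical stand-in for DumpcatcherHandler: passes each formatted
    record to a callable instead of POSTing it to a server."""

    def __init__(self, send):
        logging.Handler.__init__(self)
        self.send = send  # any callable that accepts one string

    def emit(self, record):
        try:
            # format() appends the traceback when record.exc_info is set,
            # which is the case for logger.exception().
            self.send(self.format(record))
        except Exception:
            self.handleError(record)


sent = []
demo_logger = logging.getLogger('handler_sketch')
demo_logger.setLevel(logging.DEBUG)
demo_logger.propagate = False  # keep the demo quiet on stderr
demo_logger.addHandler(CollectingHandler(sent.append))

try:
    'a' + 1  # TypeError!
except Exception:
    demo_logger.exception('Oh man! Look at this!')
```

After this runs, sent holds a single string containing the message followed by the traceback, which is the payload a real handler would ship off.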


Dumpcatcher: Python HTTP and Dates libraries

Sheesh,

Python really shows its age when you have to deal with dates, HTTP and URLs. I was just able to implement my signing and verification code, and it was quite a pain. I have a sample client in clients/python/example.py, and the endpoint/crash.Add handler should show you the other side of the handshake.

The most challenging part was definitely writing the example client. Because the Python libraries for HTTP and friends evolved over time, there are about seventy-five ways to do an HTTP request instead of Python's normal "one."

The same can be said for the datetime modules. Long ago, when we had to use mx time, there was a DateTime object; now it's built into Python, so everything should be peachy? Yeah, date handling in Python was improved by this change, but timezones are still outrageous! Considering I'm now about 10 hours off UTC, I have had to think about how to submit a request time in string form. For now though I'm going to punt and require that submitted times be in the UTC time zone.
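Here is a rough sketch of what "always submit UTC" could look like with today's datetime module (the timezone class did not exist when I wrote this post). The wire format and the names WIRE_FORMAT, to_wire and from_wire are assumptions for illustration, not what the dumpcatcher client actually uses.

```python
from datetime import datetime, timedelta, timezone

# Assumed wire format: year-month-day, then 24-hour time, always in UTC.
WIRE_FORMAT = '%Y-%m-%d %H:%M:%S'


def to_wire(moment):
    """Convert a timezone-aware datetime to a UTC string."""
    return moment.astimezone(timezone.utc).strftime(WIRE_FORMAT)


def from_wire(text):
    """Parse a UTC string back into a timezone-aware datetime."""
    return datetime.strptime(text, WIRE_FORMAT).replace(tzinfo=timezone.utc)


# A client ten hours ahead of UTC still submits UTC.
local = datetime(2009, 6, 21, 16, 36, 57,
                 tzinfo=timezone(timedelta(hours=10)))
wire = to_wire(local)
```

The server only ever sees the UTC string, so it never has to guess which zone the client was in.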

Now that I've got that baked in, I can actually write the code that logs the requests to the data store. After that, I need to write a bit of code to do sorting and aggregating of the crash dump data.


Dumpcatcher: HTML

Wow, it's been quite a while since I've written any HTML. Probably about one year. It's also been that long since I last used App Engine, and as a result I'm really, really rusty.

I'm now about 3 hours into the hack-a-thon (on a plane instead of a coffee shop) and I have about 10 hours of battery life remaining.

Things I've learned so far:

I totally forgot the basic IO operations for the App Engine datastore, and with this kind of knowledge it's actually pretty hard to re-learn: I remember such random bits and pieces that I can't tell which parts are bad memory and which are accurate.

The design doc was fun to write. I had to think about a few different problems that I figured I would encounter. I even realized that my initial idea for request signing had some security vulnerabilities, which I had to think about how to resolve.

I have one third of the whole project done, but it's the smallest third. Users are now able to register their account and create an unlimited number of products.

Next, I need to write the server side component of crash logging. E.g., when a user submits a crash, the request must be authorized and then it needs to get into the datastore.


Dumpcatcher: Design Doc

As I do with any project at work, I want to put together a short doc describing the scope and scale at which I will write this app.

Summary

Dumpcatcher is a simple web service that takes authorized requests from remote clients and logs key-value pairs in its datastore for future analysis. These pairs are typically an arbitrary identifier and an exception/stack-trace.

Features

  • Crash Stack Message storage: Clients must be able to submit stack traces (well, arbitrary strings) along with various bits of metadata: version, app name, etc.
  • Client libraries for Python, Java: I am targeting my Foursquared Android Client (http://joelapenna.com/git/foursquared.git) as well as any other pet projects I may use in the future.
  • Data Aggregation: I plan on allowing data aggregation by exception type, custom label and line.
  • Authenticated Client Requests: All requests must be sent by authorized clients to prevent the service from becoming a black hole for spam. See Design below.

Design

App Engine has a very simple data store and webapp framework that I intend to utilize for the basic functionality of the app.

Users

Users represent a single Google Id and a particular developer using the system.

Products

A product is an application that uses the dumpcatcher to log crashes. A user may have multiple products.

Each registered product will have two values associated with it: a productKey, which will be passed as a parameter in all HTTP requests to the server, and a secret, which will be used to HMAC sign a request.

Product secrets will be randomly generated UUIDs.
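Generating such a secret takes only a couple of lines. This is a minimal sketch, and the function name new_product_secret is made up for illustration.

```python
import uuid


def new_product_secret():
    """Return a random version 4 UUID as a 32-character hex string."""
    return uuid.uuid4().hex


secret = new_product_secret()
```

The standard library's secrets.token_hex(16) would give a string of the same length and is worth considering if the secret does not need to be a well-formed UUID.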

HTTP Request

All requests to the dumpcatcher service will be secured with an HMAC hash. The hash will be keyed by a unique identifier provided to the client.

productKey

Each client -> server request will include a productKey, an identifier used to differentiate between different products using the service.

HMAC

All requests must be submitted with an HMAC-SHA1 hex digest of the request query parameters as well as an increasing "request" identifier. The message consists of a standard HTTP "query", sorted by key name and quoted, request

For: http://localhost/add?product_key=12345&some=pair&other=pair we would construct the digest like so:

TODO(jlapenna): Probably don't want to split on & if the contents of the request might contain one though, they should already be encoded. Something like that...

import hashlib
import hmac
sorted_query = ''.join(sorted(request.query_string.split('&')))
hash = hmac.new(b'SOME KEY', sorted_query.encode('utf-8'), hashlib.sha1).hexdigest()

And, as such, the actual request made to the server will be:

'http://localhost/add?product_key=12345&some=pair&other=pair&hmac=%(hash)s'

On the backend the server will take the reverse steps: using the secret associated with the provided productKey, it will verify the authenticity of the request by encoding the query parameters the same way the client does and keying the HMAC with that secret.
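Here is one way both halves of that handshake could be sketched. It is not the code in the dumpcatcher repository: the function names are hypothetical, and instead of splitting on & and joining with an empty string, it parses the query with urllib.parse and re-encodes the sorted pairs, which sidesteps the TODO about values that contain an ampersand.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode


def sign_query(query_string, secret):
    """Return the HMAC-SHA1 hex digest of the sorted, re-quoted query."""
    pairs = sorted(parse_qsl(query_string, keep_blank_values=True))
    message = urlencode(pairs)
    return hmac.new(secret.encode('utf-8'), message.encode('utf-8'),
                    hashlib.sha1).hexdigest()


def signed_url(base_url, params, secret):
    """Client side: append the digest as the hmac parameter."""
    query = urlencode(sorted(params.items()))
    return '%s?%s&hmac=%s' % (base_url, query, sign_query(query, secret))


def verify_query(query_string, secret):
    """Server side: recompute the digest without the hmac parameter."""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    supplied = [value for key, value in pairs if key == 'hmac']
    if len(supplied) != 1:
        return False
    unsigned = urlencode([(k, v) for k, v in pairs if k != 'hmac'])
    expected = sign_query(unsigned, secret)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(supplied[0].encode('utf-8'),
                               expected.encode('utf-8'))


url = signed_url('http://localhost/add',
                 {'product_key': '12345', 'some': 'pair', 'other': 'pair'},
                 'SOME KEY')
query = url.split('?', 1)[1]
is_valid = verify_query(query, 'SOME KEY')
```

In a real server the secret would be looked up from the product_key in the request rather than passed in directly.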

Datastore

Initially there will be three models, one corresponding to "crashes," another to "users" and the third to "products."

Each user will be associated with a specific Google ID but a single Google ID can have many products.

Security

Security and validity of client -> server requests will be handled by using HTTPS to secure communications and HMAC to verify the authenticity of a client request.

Replay Attack

An attacker with access to the HTTP stream that a client -> server request is sent over will be able to execute a replay attack by capturing the HTTP POST made by the client and submitting it as their own, at any rate they desire.

The solution, as such, is to only allow requests over HTTPS. This has the added advantage of preventing any private data from leaking to a network observer sniffing packets.

Caveats

It is likely, and highly reasonable, that an app like this already exists in a far more polished and featureful form. I chose this project because I felt it would be a good way to explore some new technologies and have a fun time, not because it is in any way "new" or "exciting."


Sunday, October 26, 2008

Synchronization

I have spent the past two weeks procrastinating (to some extent) on working on my top secret project. The problem I've been struggling with is just a really hard one to solve and I've gotten quite used to instant gratification with my code. Much of what I've been working on has ended up with results by the end of a hacking session. Not so much with my current task -- synchronization.

One of the key features of the top secret project is that it is always available. Whether through a browser or a native app like an Android client, the user is expected to be able to interact with the app whether or not they have internet connectivity. This means I have to spend a lot of time working on offline access as a requirement for letting anyone use the app. I thought I could get away with dogfooding my app while I was in Toronto, but I quickly realized that without 3G data on the phone the top secret project would just not function correctly.

One of the challenges I've faced so far is a temporal one. My first thought when deciding to do offline access was that the client would do the synchronization and call back to the server to push a canonical dataset into the datastore. After several nights of hacking I was fed up. I couldn't get the synchronization to work at all. I found other things (like Statusinator) to work on instead.

On the flight to Chicago I had a "breakthrough" that really should have been my first thought. Do synchronization on the server side! My idea is as such:

  1. Client creates local data, assigns hypothetically-unique UUID to record, tags it as existing locally only.


    1. Stores:
      {"key": "possibly-unique-key", "value": "some-value", "updated": "2008-10-27 04:41:01", "created": "2008-10-27 04:41:01", "is_pending": true}



  2. Client requests sync session with server, gets data for min/max records to sync. All further RPCs have a sync_uuid.


    1. Client Sends:

      {"device_uuid": "some-possibly-id"}
    2. Client Receives:

      {"sync_uuid": "some-unique-session-based-on-device-uuid-and-user", "last_sync": null,
      "max_checkin": "2008-10-27 04:41:01"}

  3. Client pushes record to server, with is_pending set to denote that the server is receiving a note with an unrecognizable key.


    1. Client sends:

      [{"key": "possibly-unique-key", "value": "some-value", "updated": "2008-10-27 04:41:01",
      "created": "2008-10-27 04:41:01", "is_pending": true}]



  4. Server processes record, changes the key to a valid server key, and stores the original key as an attribute on the record for future bookkeeping: local_id.


    1. Server Stores:


      {"key": "some-real-key", "local_id": "possibly-unique-key", "value": "some-value",
      "updated": "2008-10-27 04:41:01", "created": "2008-10-27 04:41:01", "is_pending": false}



  5. Client requests updates from the server, Server responds with all new or updated records modified serverside, along with the newly added records from step 3.


    1. Client requests, asking for all records modified after last_sync (or all records, if None) but before max_checkin.
    2. Server responds:


      [{"key": "some-real-key", "local_id": "possibly-unique-key", "value": "some-value",
      "updated": "2008-10-27 04:41:01", "created": "2008-10-27 04:41:01"}]



  6. Client parses record for "local_id" attribute, and replaces the record in the local datastore with the copy from the server, stripping the local_id attribute and removing the pending bit.


    1. Client stores:


      {"key": "some-real-key", "value": "some-value", "updated": "2008-10-27 04:41:01",
      "created": "2008-10-27 04:41:01", "is_pending": false}



  7. Client tells server sync session is complete, using the newest record received from the "push/pull" to specify the end date of the session (just as last_sync is the start). To save a write during the server-to-client record update, the client is the one noting the end date for the session here, instead of the server.


    1. Client sends: last_update.
    2. Server stores: last_update as last_sync.
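
The steps above can be sketched with an in-memory stand-in for the server. Everything here is hypothetical (the class and function names, the uuid-based server keys, and treating max_checkin as an inclusive upper bound so the example record is returned), and it ignores the sync_uuid session handling, errors and conflicts entirely.

```python
import uuid


class SyncServer:
    """In-memory stand-in for the server side of the sync steps."""

    def __init__(self):
        self.records = {}  # server key -> record

    def push(self, pending):
        """Steps 3-4: re-key each pending record, remembering its local key."""
        for record in pending:
            stored = dict(record)
            stored['local_id'] = stored['key']
            stored['key'] = uuid.uuid4().hex  # hypothetical server key scheme
            stored['is_pending'] = False
            self.records[stored['key']] = stored

    def pull(self, last_sync, max_checkin):
        """Step 5: records updated after last_sync, up to max_checkin.

        Timestamps are 'YYYY-MM-DD HH:MM:SS' strings, which sort
        chronologically when compared as text.
        """
        return [dict(r) for r in self.records.values()
                if (last_sync is None or r['updated'] > last_sync)
                and r['updated'] <= max_checkin]


def apply_updates(local_store, updates):
    """Step 6: swap pending local copies for the server's versions."""
    for record in updates:
        record = dict(record)
        local_id = record.pop('local_id', None)
        if local_id is not None:
            local_store.pop(local_id, None)
        record['is_pending'] = False
        local_store[record['key']] = record
    # Step 7: the newest timestamp received becomes the next last_sync.
    return max((r['updated'] for r in updates), default=None)


local = {'possibly-unique-key': {
    'key': 'possibly-unique-key', 'value': 'some-value',
    'updated': '2008-10-27 04:41:01', 'created': '2008-10-27 04:41:01',
    'is_pending': True}}
server = SyncServer()
server.push([r for r in local.values() if r['is_pending']])
last_update = apply_updates(local, server.pull(None, '2008-10-27 04:41:01'))
```

After the round trip the local store holds one record under the server's key, with no local_id and no pending bit, and last_update is what the client would send in step 7.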

I haven't thought this through all the way yet, but I think this will work just as well for a Google Gears based browser client as it will for my Android client. There is something that still bothers me and I'm having a hard time scoping it out in my head: What happens when clock skews occur?

Remaining questions:
  • Do I leave "local_id" around? When is an appropriate time to strip it from those records? I don't want the server modifying the records without confirmation from the client that the info is no longer needed.
  • What are the error conditions when the sync fails at each of the above steps? How does this pattern resolve conflicts that occur when records are modified after a failed sync?


Saturday, September 27, 2008

Top Secret Project Update


I've spent another weekend hacking away at my top secret project, and I feel that every day I work on it I get closer to being able to use it. In fact, I'm just a few major usability bugs away from using it day to day! After I spend a few weeks working on it, then a few more weeks fixing what I see as show stoppers, I hope to be able to show it off to a couple of people.

Looking at what I've done so far this summer leaves me both depressed and excited. I'm a bit sad because I am so far away from what I hoped to have accomplished by this point. At the same time, I'm excited about what I have been able to do. I've learned a ton about Java and realized that it is a pretty awesome language. I feel it is a good way for me to stretch my brain. With Python you can throw everything against a wall, 'import antigravity' and have the crap float away, leaving you with something usable. With Java, I feel I have to plan things out a bit better. I have to live with the consequences of my decisions and deal with every shortcut I inevitably take. As a result I'm much more careful about what I do. Even if I do take a shortcut, these days they don't tend to last very long, as I get irritated with the way my code looks or interacts and I refactor until I'm happy.

One thing I haven't gotten around to learning is unit testing. I thought I would have more time to work on the project this summer and fully expected to be twice as far along as I am today, with solid unit testing coverage. Instead I'm not even at a usable point with my project and I've not written one Java unit test. The Python side of my app has a few tests but still nothing worthy of being called "coverage."


Wednesday, March 19, 2008

So they say . . .

Wow, it has been 14 hours since I woke up today and I feel so much better than I did yesterday. It's been quite some time since I've worked on my own project for such a streak. I didn't write any work email. Didn't talk to anyone about work. Didn't send in any code reviews for work. But, best of all, I didn't stress out about work.

See, work has been pretty tough the past few weeks as I work on a project that seems to be progressing in the wrong direction. It's been tough enough that I, the dude who's not "professional" in the first place, have slid some place far south of sane.

So, what did I do today? Well, I wrote a neat little app that I'm going to clean up before I publish. It's very simple: a quotes database with neat cross-linking between posters and the people they've quoted. It has four models and it is only a few hundred lines of Python and a bit of HTML, but after a day of working on this little guy I'm actually a bit proud of it.

Frameworks... This is the first time I've put together a website with a "web framework" of any sort. I never realized how wonderful it would be to not have to worry about writing SQL schemas, handling CGI and dealing with HTTP. There are now thousands of aborted lines of code that will never be written because someone thought ahead and saw that some code only has to be written once.


The views and opinions expressed in the blog are of Joe LaPenna. Google has nothing to do with these pages.