Monday, June 22, 2009

Dumpcatcher: Design Doc

As I do with any project at work, I want to put together a short doc describing the scope and scale at which I will write this app.

Summary

Dumpcatcher is a simple web service that takes authorized requests from remote clients and logs key-value pairs in its datastore for future analysis. These pairs are typically an arbitrary identifier and an exception/stack-trace.

Features

  • Crash Stack Message storage
  • Clients must be able to submit stack traces (well, arbitrary strings) along with various bits of meta-data. Version, app name, etc.
  • Client libraries for Python, Java
  • I am targeting my Foursquared Android Client (http://joelapenna.com/git/foursquared.git) as well as any other pet projects I may use in the future.
  • Data Aggregation
  • I plan on allowing data aggregation by exception type, custom label and line.
  • Authenticated Client Requests
  • All requests by clients must be sent by authorized clients to prevent the service from becoming a black hole for spam.

Design

App Engine has a very simple data store and webapp framework that I intend to utilize for the basic functionality of the app.

Users

Users represent a single Google Id and a particular developer using the system.

Products

A product is an application that uses the dumpcatcher to log crashes. A user may have multiple products.

Each product registered will have two values associated with it: a productKey, which will be passed as a parameter in all HTTP requests to the server, and a secret, which will be used to HMAC sign a request.

Product secrets will be randomly generated UUIDs.

HTTP Request

All requests to the dumpcatcher service will be secured with an HMAC hash. The hash will be keyed by a unique secret provided to the client.

productKey

Each client -> server request will include a productKey, an identifier used to differentiate between different products using the service.

HMAC

All requests must be submitted with an HMAC-SHA1 hex digest of the request query parameters as well as an increasing "request" identifier. The message consists of a standard HTTP query string, sorted by key name and quoted.

For: http://localhost/add?product_key=12345&some=pair&other=pair we would construct the digest like so:

TODO(jlapenna): Probably don't want to split on & if the contents of the request might contain one though, they should already be encoded. Something like that...

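import hashlib
import hmac

# Sort the raw "key=value" pairs so client and server sign the same message.
sorted_query = ''.join(sorted(request.query_string.split('&')))
# Key the digest with the product secret; hexdigest() gives the string sent as "hmac".
hash = hmac.new(b'SOME KEY', sorted_query.encode('utf-8'), hashlib.sha1).hexdigest()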

And, as such, the actual request made to the server will be:

'http://localhost/add?product_key=12345&some=pair&other=pair&hmac=%(hash)s'

On the backend the server will repeat the same steps: using the secret associated with the provided productKey, it will encode the query parameters the same way the client does, compute the HMAC keyed by that secret, and verify the authenticity of the request by comparing the result with the hash the client sent.
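The signing and verification steps can be sketched roughly as follows. This is a minimal sketch under stated assumptions, not the service's actual code: the function names `canonical_query`, `sign_query` and `verify_query` and the example secret are made up for illustration. It also parses the query into pairs and re-encodes them instead of splitting on '&', which is one possible answer to the TODO above rather than the method the doc settles on.

```python
import hashlib
import hmac
from urllib.parse import parse_qsl, urlencode


def canonical_query(query_string):
    """Return the query re-encoded with its pairs sorted by key name."""
    pairs = parse_qsl(query_string, keep_blank_values=True)
    # Drop any existing signature so client and server sign the same message.
    pairs = [(k, v) for k, v in pairs if k != 'hmac']
    return urlencode(sorted(pairs))


def sign_query(secret, query_string):
    """Hex HMAC-SHA1 digest of the canonical query, keyed by the secret."""
    message = canonical_query(query_string).encode('utf-8')
    return hmac.new(secret.encode('utf-8'), message, hashlib.sha1).hexdigest()


def verify_query(secret, query_string):
    """Server side: recompute the digest and compare it to the hmac value."""
    supplied = dict(parse_qsl(query_string)).get('hmac', '')
    expected = sign_query(secret, query_string)
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(supplied.encode('utf-8'),
                               expected.encode('utf-8'))


# Hypothetical secret; in the design it would be looked up by productKey.
secret = 'hypothetical-product-secret'
query = 'product_key=12345&some=pair&other=pair'
signed = query + '&hmac=' + sign_query(secret, query)
```

Because the pairs are sorted before signing, the client can send its parameters in any order and the server will still arrive at the same digest.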

Datastore

Initially there will be three models, one corresponding to "crashes," another to "users" and the third to "products."

Each user will be associated with a specific Google ID but a single Google ID can have many products.
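A rough, framework-neutral sketch of the three models might look like this. It uses plain Python dataclasses rather than the App Engine datastore API, and every field name other than the productKey and secret (for example `google_id`, `message`, `metadata`) is hypothetical.

```python
import uuid
from dataclasses import dataclass, field


@dataclass
class User:
    google_id: str  # hypothetical field: the developer's Google ID


@dataclass
class Product:
    owner: User
    name: str
    # The design only says the secret is a random UUID; generating the
    # product key the same way is an assumption made for this sketch.
    product_key: str = field(default_factory=lambda: uuid.uuid4().hex)
    secret: str = field(default_factory=lambda: uuid.uuid4().hex)


@dataclass
class Crash:
    product: Product
    message: str  # the stack trace, or any arbitrary string
    metadata: dict = field(default_factory=dict)  # version, app name, etc.


dev = User(google_id='developer@example.com')
foursquared = Product(owner=dev, name='Foursquared')
crash = Crash(foursquared, 'java.lang.NullPointerException', {'version': '0.1'})
```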

Security

Security and validity of client -> server requests will be handled by using HTTPS to secure communications and HMAC to verify the authenticity of a client request.

Replay Attack

An attacker with access to the HTTP stream that a client -> server request is sent over will be able to execute a replay attack by capturing the HTTP POST made by the client and submitting it as his own, at any rate he so desires.

The solution, as such, is to only allow requests over HTTPS. This has the added advantage of preventing any private data from leaking to a network observer sniffing packets.
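The increasing "request" identifier mentioned in the HMAC section could also let the server reject a captured request that is sent again. The following is only a sketch of that idea, assuming the identifier is an integer covered by the signature; the `ReplayGuard` class and its in-memory dictionary are hypothetical and not part of the design above.

```python
class ReplayGuard:
    """Remembers the highest request identifier seen for each productKey."""

    def __init__(self):
        self._last_seen = {}

    def accept(self, product_key, request_id):
        """Return True only if request_id is newer than anything seen so far."""
        if request_id <= self._last_seen.get(product_key, -1):
            return False  # replayed or out-of-order request
        self._last_seen[product_key] = request_id
        return True


guard = ReplayGuard()
first = guard.accept('12345', 1)     # a fresh request
replayed = guard.accept('12345', 1)  # the same request captured and resent
```

This only helps if the identifier is part of the signed message; otherwise an attacker could simply bump the number.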

Caveats

It is likely, and highly reasonable, that an app like this already exists in a far more polished and featureful form. I chose this project because I felt it would be a good way to explore some new technologies and have a fun time, not because this is in any way "new" or "exciting."


Sunday, October 26, 2008

Synchronization

I have spent the past two weeks procrastinating (to some extent) on working on my top secret project. The problem I've been struggling with is just a really hard one to solve and I've gotten quite used to instant gratification with my code. Much of what I've been working on has ended up with results by the end of a hacking session. Not so much with my current task -- synchronization.

One of the key features of the top secret project is that it is always available. Whether by a browser or a native app like an android client, the user is expected to be able to interact with the app no matter if they have internet connectivity or not. This means I have to spend a lot of time working on offline access as a requirement for letting anyone use the app. I thought I could get away with dogfooding my app while I was in Toronto, but I quickly realized that without 3G data on the phone the top secret project would just not function correctly.

One of the challenges I've faced so far is a temporal one. My first thought when deciding to do offline access was that the client would do the synchronization and call back to the server to push a canonical dataset into the datastore. After several nights of hacking I was fed up. I couldn't get the synchronization to work at all. I found other things (like Statusinator) to work on instead.

On the flight to Chicago I had a "breakthrough" that really should have been my first thought. Do synchronization on the server side! My idea is as such:

  1. Client creates local data, assigns hypothetically-unique UUID to record, tags it as existing locally only.


    1. Stores:
      {"key": "possibly-unique-key", "value": "some-value", "updated": "2008-10-27 04:41:01", "created": "2008-10-27 04:41:01", "is_pending": true}



  2. Client requests sync session with server, gets data for min/max records to sync. All further RPCs include a sync_uuid.


    1. Client Sends:

      {"device_uuid": "some-possibly-id"}
    2. Client Receives:

      {"sync_uuid": "some-unique-session-based-on-device-uuid-and-user", "last_sync": null,
      "max_checkin": "2008-10-27 04:41:01"}

  3. Client pushes record to server, with is_pending set to denote that the server is receiving a note with an unrecognizable key.


    1. Client sends:

      [{"key": "possibly-unique-key", "value": "some-value", "updated": "2008-10-27 04:41:01", 
      "created": "2008-10-27 04:41:01", "is_pending": true}]



  4. Server processes record, changes the key to a valid server key, and stores the original as an attribute on the record for future bookkeeping: local_id.


    1. Server Stores:


      {"key": "some-real-key", "local_id": "possibly-unique-key", "value": "some-value", 
      "updated": "2008-10-27 04:41:01", "created": "2008-10-27 04:41:01", "is_pending": false}



  5. Client requests updates from the server, Server responds with all new or updated records modified serverside, along with the newly added records from step 3.


    1. Client requests, asking for all records modified after last_sync (or all records, if None) but before max_checkin.
    2. Server responds:


      [{"key": "some-real-key", "local_id": "possibly-unique-key", "value": "some-value", 
      "updated": "2008-10-27 04:41:01", "created": "2008-10-27 04:41:01"}]



  6. Client parses record for "local_id" attribute, and replaces the record in the local datastore with the copy from the server, stripping the local_id attribute and removing the pending bit.


    1. Client stores:


      {"key": "some-real-key", "value": "some-value", "updated": "2008-10-27 04:41:01", 
      "created": "2008-10-27 04:41:01", "is_pending": false}



  7. Client tells server sync session is complete, using the newest record received from the "push/pull" to specify the end date of the session (just as last_sync is the start). To save a write during the server-to-client record update, the client is the one noting the end date for the session here, instead of the server.


    1. Client sends: last_update.
    2. Server stores: last_update as last_sync.
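Steps 3 through 6 can be sketched in a few lines of Python. This is an in-memory illustration only, not code from the project: the `SyncServer` class, the `apply_updates` function and the use of a random UUID as the "real" server key are assumptions. It also treats max_checkin as inclusive, because the sample record above carries the same timestamp as max_checkin, and it compares timestamps as strings, which works only because they share the fixed "YYYY-MM-DD HH:MM:SS" format.

```python
import uuid


class SyncServer:
    """In-memory stand-in for the server side of steps 3 through 5."""

    def __init__(self):
        self.records = {}  # server key -> record

    def push(self, pending_records):
        """Steps 3-4: store pushed records under real server keys."""
        for record in pending_records:
            stored = dict(record)
            stored['local_id'] = stored['key']  # keep the client's key
            stored['key'] = uuid.uuid4().hex    # assign a server key
            stored['is_pending'] = False
            self.records[stored['key']] = stored

    def pull(self, last_sync, max_checkin):
        """Step 5: records updated after last_sync, up to max_checkin."""
        return [r for r in self.records.values()
                if (last_sync is None or r['updated'] > last_sync)
                and r['updated'] <= max_checkin]


def apply_updates(local_store, updates):
    """Step 6: swap pending local records for the server's copies."""
    for record in updates:
        record = dict(record)
        local_id = record.pop('local_id', None)
        if local_id is not None:
            local_store.pop(local_id, None)  # drop the pending copy
        record['is_pending'] = False
        local_store[record['key']] = record


# Step 1: one record that so far exists only on the client.
local = {'possibly-unique-key': {
    'key': 'possibly-unique-key', 'value': 'some-value',
    'updated': '2008-10-27 04:41:01', 'created': '2008-10-27 04:41:01',
    'is_pending': True}}

server = SyncServer()
server.push([r for r in local.values() if r['is_pending']])
apply_updates(local, server.pull(None, '2008-10-27 04:41:01'))
```

After the round trip the client holds a single record under the server's key, with the pending bit cleared, while the server still keeps local_id for bookkeeping.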

I haven't thought this through all the way yet, but I think this will work just fine for a Google Gears based browser client just as it will work for my Android client. There is something that still bothers me and I'm having a hard time scoping it out in my head: What happens when clock skew occurs?

Remaining questions:
  • Do I leave "local_id" around? When is an appropriate time to strip it from those records? I don't want the server modifying the records without confirmation from the client that the info is no longer needed.
  • What are the error conditions when the sync fails at each of the above steps? How does this pattern resolve conflicts that occur when records are modified after a failed sync?


Saturday, September 27, 2008

Top Secret Project Update


I've spent another weekend hacking away at my top secret project and I feel that every day I work on it that I am getting closer to being able to use it. In fact, I'm just a few major-usability bugs away from using it day to day! After I spend a few weeks working on it, then a few more weeks fixing what I see as show stoppers I hope to be able to show it off to a couple of people.

Looking at what I've done so far this summer leaves me both depressed and excited. I'm a bit sad because I am so far away from what I hoped to have accomplished by this point. At the same time, I'm excited about what I have been able to do. I've learned a ton about Java and realized that it is a pretty awesome language. I feel it is a good way for me to stretch my brain. With Python you can throw everything against a wall, 'import antigravity' and have the crap float away, leaving you with something usable. With Java, I feel I have to plan things out a bit better. I have to live with the consequences of my decisions and deal with every shortcut I inevitably take. As a result I'm much more careful about what I do. Even if I do take a shortcut, these days they don't tend to last very long as I get irritated with the way my code looks or interacts and I refactor until I'm happy.

One thing I haven't gotten around to learning is unit testing. I thought I would have more time to work on the project this summer and fully expected to be twice as far along as I am today, with solid unit testing coverage. Instead I'm not even at a usable point with my project and I've not written one Java unit test. The python side of my app has a few tests but still nothing worthy of being called "coverage."


Monday, September 8, 2008

How to work on a Project

This comes from a conversation I was having with my friend Chris the other day about how you can keep yourself and friends working on a project.

So this is what worked for me... Something that has allowed my project to not die after a day of working on it -- what usually happens to projects that I start.

First and foremost: Get everyone to set aside a specific time each week to work on the project. Ideally it would be everyone at the same time at the same place, but that's not likely to be possible. At the very least you should have pairs of people. Most of my friends are already highly motivated but having someone to bounce ideas off of is invaluable. If you have to wait until the next time someone is online that idea will probably bounce away instead of back. There are other reasons for this that I will come to shortly.

Because you're working on a personal project, the only initial rewards are going to be the feeling of satisfaction of having produced a good idea or from helping another person on your team.

Make sure everyone shares their ideas, and document those ideas. You'll need some reference material later when you think "Well, why didn't we do this in the first place?" This is the second reason for having pairs or groups of people working at the same time. At the end of each session you can suggest people show their work with an example or through some documentation. This not only keeps people concentrating on goals but it establishes that each person on the team is working for someone else -- the most important part of this plan is to make sure people are invested in the work everyone is doing. It is vital that each member of the team takes as much pride in the work of each of their teammates as in their own, or more.

Next, spend some time planning what the team is going to focus on. Even if you don't have a clear idea of what the project will be, give people a focus area to lead. Have someone investigate javascript frameworks while you have someone else analyze some patterns for the client-server interaction. Or have someone start writing up use cases for how they expect the project to be used. If you immediately jump into writing code you might leave your teammates who have a less clear idea of what to do in the lurch. Know that each person has some series of talents that nobody else does so be sure to spend the time upfront to realize what those are.

Do not expect the same quantity of work from each person, but do expect each person to make progress, even if that progress is a discussion about how they had to re-write some system for a third time. If someone is not completing any work you can expect that attitude to be more influential than that of the other people completing some tasks. Even if someone's time is spent researching, a short summary to another teammate or in a document will help spread some knowledge and capture some progress.

As the project moves along, the tasks people will be doing will tend to be more tightly coupled with the work from different people. By this point your team is probably working well together and, because you've already encouraged people to help each other, being a dependency of, or dependent on, someone else is not a new feeling.

Finally and most importantly, make sure everyone is having fun. The best way to kill a personal project is to treat your teammates poorly. The second best way is to make the work people are doing boring, so be sure that people find something they enjoy to work on.


Sunday, August 24, 2008

Don't get distracted!

One of the things I've learned from starting my last few projects is that if you get distracted by project infrastructure you get distracted from what made you excited about the project in the first place.

An example would be a project Kevin and I tried starting a few months back -- an everyblock-like appliance that could sit on your mantel and passively provide you interesting information about your neighborhood. After several hours spent setting up an SVN repository, the correct mailing lists, researching frameworks and languages we were tired; not because of the amount of work involved, rather it was because we had grown weary of the boringness of the project.

Even though we hadn't written a single line of code for the project, the energy we expended on the infrastructure was the exact energy we should have used to sketch out a more serious design, start prototyping and collaborating. Instead of getting more interested in our project we grew tired of it and, like so many other projects, it died.

There are a few things you can do to mitigate this problem.

  1. Spend some time *now*, before you have a project, researching project resources. Things like SVN repositories, mailing lists and web sites are easy to get going but suck up the most valuable moments of your project's birth.
  2. Your initial decisions are both temporary and permanent. Think of it this way: You may decide to change from django, to ruby-on-rails, to C# but the design vs feature tradeoffs you think of with any of them will probably have lasting impact on your future design decisions.
  3. Make sure you work on the project you were excited about, not some derivative of it that you think you need to get working on the real project.


Sunday, July 20, 2008

Design Patterns... BAH!

I started reading "Head First Design Patterns" last week, and taking advantage of my new-found knowledge I tried to refactor Missing's http client stack. It's taken me about eight hours and it works again. It's as ugly as it was before, maybe even more so.

The kookiness of the design stems from the fact that I want the ability to respond to an HTTP response, possibly from a different thread. Android has a cool thread/message-queuing class called a Handler, that allows you to post a Runnable to the handler and it will get executed on the thread in which the handler was instantiated.

Because I wanted to keep my http stack android-agnostic (not that I had a good reason for it; in fact I didn't even think of doing it at first), I removed the knowledge of the Handler from the .http package.

Instead I created a decorator for the HttpResponseRunnable that does know about Android's handlers. For the sake of re-use on another project, I put it in the .http package... DOH!

On the positive side, I did manage to implement a design that allows me to dynamically create http requests and responses without having to subclass all the time. I don't know if in the long run that will be better.

Things I need to think about:
  1. What is the best way to chain http operations where each request depends on data from the previous request?
  2. How would I design the HttpService from the ground up? Where can I look at code that has similar features to the ones I'm coding up to see what design they used?
  3. Why am I constantly re-writing this stack?
  4. If I am going to have a large number of http callbacks, and if there are many types of end points, do I want to keep the ability to define them quickly, or do I go with something more verbose and more abstract?

Do you know Java? Want to discuss some of this with me? I'm a newb and could use insight from someone with more experience than I.


Monday, June 30, 2008

Missing

One advantage of being several thousand miles from home is that you have the chance to sit down and work on things that you're normally too distracted to deal with.

As such, out here I've had the chance to work a bit more on Missing. I've re-written the http client stack I was using to use HttpClient 4.x instead of 3.x, with better abstraction and a multithread-but-threadsafe client. I've also gotten the LocationManager code working and the phone now supports a location update round trip. The server component needed a few tweaks to make this work but the majority of my time has been spent working on the Android client.

I became more enthused about this when jgib pointed me at sf0.org where they're running with the whole play-a-game-but-in-real-life-and-get-levels-and-stuff.


Wednesday, June 4, 2008

How to run the hacking part of a cyberpunk game. Part I

This was taken from a recent email exchange I had with some friends in regards to running our own cyberpunk game.

First off, to dash Josh's hopes and dreams: This is a cool idea but of all the games we play, this is going to be one of the more expensive ones. Plastic tubes, foam and duct tape are cheap compared to any projectile driving device. Second, if you're going to represent technology, then it has to be to some degree realistic; what game are you playing if your "internets" is just a hand wavy alternate dimension/plane with minor environmental effects? I think in both cases you'll require some sort of investment in enabling technology. GPS, airsoft guns, light or wireless access points. Something that will bring out the cyber part.

Making the interwebs is a very difficult prospect. I've tried games like "uplink" and have spent much time figuring out how to make "hacking" anything other than a drab and boring experience; mostly because it's a drab, boring experience. If life were more like the movie Hackers we'd be much better off but there would probably be more capsized oil tankers in the seas and far more dead artists threatening the world for 25 million dollars. Row ... row ... row ...

I think one thing that could make this whole game more fun is distributed involvement. Because you have communication devices at your disposal, it's possible to involve people who are not physically located at the game. For example, Joe Schmoe is busy one weekend and cannot be onsite. He happens to be the puzzle solver sort of person so he knows a lot of what is going on. During game play something comes up, a player calls Joe Schmoe, and Joe Schmoe looks up the solution to the puzzle on Wikipedia. Winner! Or, do you not want to consider the world wide web at your disposal?

I think another source of inspiration for this sort of game is ARGs. They do a good job of blending reality with gaming and since we're already in a derivative of cyberpunk, dealing with real world tech in game would not be a bad thing. This game is also going to need much more preparation than another rpg because writing a website is not as easy as writing in script on a parchment.

Because we're not playing a tabletop game we can look at what shadowrun /current/ edition does with Decking. Because of the way networks have firewalled themselves from the internet a lot of decking is done inside. The running team gets the decker into the corp network, then they do their thing. Or they kidnap someone with appropriate access and use their accounts. They did a good job of taking a character class that was essentially a solo gig and making it an integral part of the runner team. In other words even your hacker characters are going to have to get in there and be involved in modules. This also means that where people are in meatspace is relevant to where they are on the "interwebs."

Now, here is my pitch, and Josh is in disagreement with me here. I think that you need a software framework to be easily able to build and interact with virtual puzzles. I think you'll need multiple GPS-enabled smart phones (iphone, android or s60) or wireless network + PCs, a server, and wireless networking to get this game off the ground. Communications play a large part in the cyberpunk world -- especially when they go out and the players who have come to rely on their decker or their off-site drones lose some control. I also think you need some evolution of my software (http://missing.googlecode.com) or something like it (hopefully there is something better than the crap I have put together) to build any sort of technological layer on top of your game. Now the problem is you need this stuff to be accessible, which means, like I mentioned earlier, smart phones and laptops for the game.


Monday, May 26, 2008

Statusinator

Last year after the Android SDK was released, I wrote a small app that allowed me to upload photos and update my facebook status with a native application. Here is the result. I re-implemented a subset of the official facebook API because I didn't understand it. (It wasn't "not invented here syndrome" it was more of a "I am not smart syndrome"). Anyways, I've posted the code at statusinator.googlecode.com and there is a Facebook app page as well.


Sunday, May 11, 2008

quotesdb - Quotes Database on Google App Engine

I took a day off of work recently to take a stab at Google App Engine. I'm already using it experimentally for my game, Missing. But I wanted to see what I could put together in one day. I'm not much of a programmer but the results are impressive to me. In about 6 hours I now have an app running on my site that hosts profiles, quotes and descriptive text for any quotesdb need... in 300 lines of code. If you go to quotesdb.joelapenna.com/fortune it exports the database to a fortune file. I will add an ATOM feed shortly and upload the source as well.

I had to write a small decorator to handle authentication and authorization but beyond that I had to do no work to manage users. I had to write a small data schema but I don't have to host a database on my webserver and I didn't even have to set up mod_python. App Engine takes care of it all for me. I think it's the coolest app we've released since Google Maps.

team-quotesdb project page


Wednesday, March 19, 2008

So they say . . .

Wow, it has been 14 hours since I woke up today and I feel so much better than I did yesterday. It's been quite some time since I've worked on my own project for such a streak. I didn't write any work email. Didn't talk to anyone about work. Didn't send in any code reviews for work. But, best of all I didn't stress out about work.

See, work has been pretty tough the past few weeks as I work on a project that seems to be progressing in the wrong direction. It's been tough enough that I, the dude who's not "professional" in the first place, have slid some place far south of sane.

So, what did I do today? Well, I wrote a neat little app that I'm going to clean up before I publish. It's very simple: a quotes database with neat cross-linking between posters and the people they've quoted. It has four models to it and it is only a few hundred lines of python and a bit of html but after a day of working on this little guy I'm actually a bit proud of it.

Frameworks... This is the first time I've put together a website with a "web framework" of any sort. I never realized how wonderful it would be to not have to worry about writing SQL schemas, handling CGI and dealing with HTTP. There are now thousands of aborted lines of code that will never be written because someone thought ahead and saw that some code only has to be written once.


Thursday, April 12, 2007

Bye bye photos

There was a serious disk failure on my web server and my website disappeared for a few days, as you may have noticed. It's back now, sans photos. I have them all backed up on my laptop so I'll get them back up sometime this week.

Have you seen Grindhouse? You should!

Who's going to meet me in some state other than California or Illinois in the next month for a weekend? Now accepting applications.


Saturday, March 31, 2007

Java

Started poking around with Java again. Geeze, after working so much with Python, this is a pain to deal with.

I've also been sick the past two days. In between sleeping I haven't done much besides work. How is that any different than the rest of my life?


The views and opinions expressed in the blog are of Joe LaPenna. Google has nothing to do with these pages.
For information about Google please visit: Google Press Center