(updated 7/21/08)
On the cusp of the 0.3 release, coinciding roughly with my first month of full-time work on Melkjug, I thought it’d be a good time to blog a bit about what I’ve been up to. While wetting my whistle with cookies, controllers, and captchas, I’ve also been getting my first real taste of some not-so-small potatoes like Pylons, jQuery, and CouchDB. Somehow, somewhere along the way I almost started to feel like a web programmer. And that’s not all: Recently we stumbled upon the precipice of none other than Greenspun’s Tenth Rule, basked momentarily in its profound and infinite wisdom, and then plowed heedlessly ahead. My past few weeks have been as exciting as they have been formative. And some people think programming is dull!
Now to talk about some of the more interesting things I’ve been working on…
The 0.3 release will include some work toward #173: directly email/share articles from melkjug to popular services. After a survey of some available options, Luke noted:
ShareThis has some blood surrounding logo takedown notices so I’m inclined to skip this one altogether on principle.
I’ve seen AddThis around and it seems to work and is extremely comprehensive. It’s probably recognizable and everyone who does this sort of thing would be comfortable with it. It’s also ugly and divulges information to a centralized server owned by some strange LLC.
I’ve never actually seen iBeginShare used, but it appears to be open source (MIT), decentralized, and uses the alternative open sharing logo. It seems to cover most of the huge services.
So I went with iBeginShare and set to work writing a small integration layer to see how it felt. The code turned out to be a bit hard to work with, and the UI was also not as lightweight as we were looking for, so I decided to scrap it and roll my own.
Here’s what I’m calling the “iBeginWhere?!” widget (working title):

clicking the green “open share” icon reveals the various services to its left
Still a work in progress UI-wise, but it’s a decent approximation.
Implementing this was educational in a number of different ways. For one, I got to play with jQuery (which I was excited to do after attending rmarianski’s excellent TOPP Talk). But hooking up the email button was particularly interesting. First, a screen shot:

email link form (logged-in)
Getting the above to happen was pretty easy using the javascript Luke had already written. But then we realized it would be nice if, when you filled out the form incorrectly, it would reappear with the errors indicated, like this:

This was interesting because it wasn’t the basic case of synchronous form validation. Instead, the user clicks the email button, and some javascript makes an Ajax request to fetch the form and then display it in that nice modal dialog. If the user submits an invalid form, Pylons has to respond with an error code as well as a snippet containing the form with the indicated errors, which the javascript then pops up again. This was easily the most Ajax I’ve ever done (as well as my first time using FormEncode), and it was pretty cool getting to see how it works.
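To make that round trip concrete, here is a minimal, framework-free sketch in present-day Python 3. It is not the Melkjug controller and does not use Pylons or FormEncode; the field names, the email pattern, and the function names are all hypothetical. It only shows the shape of the response: an error status plus the form snippet with errors marked, or a success status.

```python
import re
from html import escape

# Deliberately loose pattern, only good enough for an illustration.
EMAIL_RE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def validate_share_form(fields):
    """Return a dict mapping field name to error message (empty if valid)."""
    errors = {}
    if not EMAIL_RE.match(fields.get("to", "")):
        errors["to"] = "Please enter a valid email address"
    if not fields.get("message", "").strip():
        errors["message"] = "Please enter a message"
    return errors


def render_form(fields, errors):
    """Render the form snippet, marking any field that failed validation."""
    rows = []
    for name in ("to", "message"):
        value = escape(fields.get(name, ""), quote=True)
        rows.append('<input name="%s" value="%s">' % (name, value))
        if name in errors:
            rows.append('<span class="error">%s</span>' % escape(errors[name]))
    return "<form>%s</form>" % "".join(rows)


def handle_share_post(fields):
    """Return (status, body) for the Ajax POST of the email form."""
    errors = validate_share_form(fields)
    if errors:
        # The error status tells the javascript to pop the dialog up again
        # with the returned snippet instead of closing it.
        return 400, render_form(fields, errors)
    return 200, "Sent!"
```

On the client side, the javascript would branch on the status code: re-display the returned snippet on an error, close the dialog on success.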
Another thing that made this interesting was the non-logged-in case: Unauthenticated users should not be able to send emails without first solving a captcha. Looks a little something like this:
email link form (not logged in)
As you can see, we’re using recaptcha, which has a nice ajax api to generate the captcha. This was interesting not only because it was more ajax, but also because it complicated validation of the form: An incorrectly-solved captcha invalidates the form when the user is not logged in, but when the user is logged in, the captcha is not even present. How do you represent that with a FormEncode schema? After pairing with novalis on this (shout out), we found a nice solution: use the compound validator “Any” with two custom validators, one which validates iff the user is logged in, and another which actually makes the request to recaptcha.net to see if the captcha was solved correctly, and validates accordingly. The implementation of this is here.
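The "either logged in, or captcha solved" idea can be sketched in plain Python 3 without FormEncode. This is not the linked implementation and not FormEncode's API: `Invalid`, `LoggedIn`, `CaptchaSolved`, and `AnyOf` are hypothetical stand-ins, and `check_captcha` is an assumed callable that replaces the real request to recaptcha.net.

```python
class Invalid(Exception):
    """Raised by a validator that rejects its input."""


class LoggedIn(object):
    """Validates iff the current user is logged in."""

    def __init__(self, is_logged_in):
        self.is_logged_in = is_logged_in  # callable returning True/False

    def validate(self, value):
        if not self.is_logged_in():
            raise Invalid("not logged in")
        return value


class CaptchaSolved(object):
    """Validates iff the captcha response checks out."""

    def __init__(self, check_captcha):
        # Hypothetical callable standing in for the recaptcha.net request.
        self.check_captcha = check_captcha

    def validate(self, value):
        if not self.check_captcha(value):
            raise Invalid("captcha solved incorrectly")
        return value


class AnyOf(object):
    """Passes if at least one validator passes, trying them in order."""

    def __init__(self, *validators):
        self.validators = validators

    def validate(self, value):
        last_error = Invalid("no validators given")
        for validator in self.validators:
            try:
                return validator.validate(value)
            except Invalid as error:
                last_error = error
        raise last_error
```

Putting `LoggedIn` first means a logged-in user never triggers the captcha check at all, which matches the form not even showing a captcha in that case.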
I should also note that along the way Luke found and fixed a pretty critical bug in the recaptcha python plugin and submitted a patch to the recaptcha mailing list, but unfortunately we haven’t heard back yet. Here’s to good open source karma, anyway.
Working on this ticket was interesting for one last reason: I got to work with MIMEMultipart messages (which is what gets sent when a user actually fills out the form correctly). Here’s a screenshot of what the resulting email looks like:

“share link” email
The interesting thing about implementing this was figuring out how to correctly attach an html message and a plaintext message to the MIMEMultipart. It turns out that order matters: in a multipart/alternative message the last part is the preferred one, so you have to attach the plaintext message first and the html message last, or else mail clients will show the plaintext message instead. Learned that one the hard way.
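For reference, RFC 2046 orders the parts of a multipart/alternative message from least to most preferred, so the plaintext part goes in first and the html part last. A minimal sketch with the standard library's email package, written for Python 3 (the function name and arguments are hypothetical, not Melkjug's):

```python
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_share_email(sender, recipient, subject, text_body, html_body):
    """Build a message carrying plaintext and html versions of one body."""
    # "alternative" tells clients the parts are versions of the same content.
    msg = MIMEMultipart("alternative")
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    # Least preferred part first, most preferred part last.
    msg.attach(MIMEText(text_body, "plain"))
    msg.attach(MIMEText(html_body, "html"))
    return msg
```

Calling `build_share_email(...).as_string()` gives the text that would be handed to an SMTP client.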
Also in the next release will be work toward #187: promote articles based on a set of tags. The motivation for this was a use case Bryan thought of while working on a demo of Melkjug for new users: Suppose Blog A uses one tag, but Blog B uses a different tag to refer to the same concept (not to name names). It would be great to have a single filter that controls the presence of articles with any of the several tags you’re interested in. Another use case for the same functionality would be if you wanted to group tags referring to different concepts that are still somehow related. For instance, baseball, golf, and competitive eating could all be grouped under “sports”. Enter the MultiTagFilter.
To effect this feature, we discussed the possibility of transfiguring the simple filters of the mixing spec into meta-filters, entities no longer limited to containing a single atomic filter, but rather allowed to contain an arbitrary function of other filters. The MultiTagFilter then just becomes some SingleTagFilters or’ed together. Implementing this was apparently tantamount to defining the lambda calculus (hence the reference to Greenspun’s Tenth), but for better or worse, we decided to do it anyway. Keep it under wraps though, I hear he sics his enormous Siberian dog on serious offenders. The upside, at least, now that filters can be arbitrarily composed, is that if we ever want to offer a MultiAuthorFilter, for example, it will be trivial.
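A rough sketch of the composition idea, in Python 3. This is not the mixing spec or Melkjug's filter code: the `matches` method, the dict-shaped article, and the `OrFilter` and `multi_tag_filter` names are assumptions made for illustration.

```python
class SingleTagFilter(object):
    """Atomic filter: matches articles carrying one particular tag."""

    def __init__(self, tag):
        self.tag = tag

    def matches(self, article):
        return self.tag in article.get("tags", ())


class OrFilter(object):
    """Meta-filter: matches if any of the filters it contains matches."""

    def __init__(self, *filters):
        self.filters = filters

    def matches(self, article):
        return any(f.matches(article) for f in self.filters)


def multi_tag_filter(*tags):
    """A MultiTagFilter is just some SingleTagFilters or'ed together."""
    return OrFilter(*[SingleTagFilter(tag) for tag in tags])
```

Because `OrFilter` only relies on `matches`, it can wrap any other filter, including another `OrFilter`, which is what would make something like a MultiAuthorFilter cheap to add.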
The last thing I’ll mention is work on #176: logins should be able to last longer. We accomplished this as follows:
- set beaker session cookies to never expire
- add a random string to the session, the ‘authenticator’
- if the user selects ‘keep me logged in’ when logging in, store an additional ‘melkauth’ cookie which contains the authenticator and never expires
- if the user restarts the browser and then hits melkjug, check if the authenticator in the melkauth cookie matches the authenticator in the session. if so, they will be logged in.
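The steps above can be sketched with plain dicts standing in for the beaker session and the browser's cookie jar. This is a Python 3 illustration under stated assumptions, not the Melkjug code: the function names are hypothetical, `secrets.token_hex` is swapped in for the authenticator generator, and the constant-time comparison via `hmac.compare_digest` is an addition of the sketch.

```python
import hmac
import secrets


def start_session():
    """Create server-side session data holding a random authenticator."""
    return {"authenticator": secrets.token_hex(32), "user": None}


def log_in(session, cookies, user, keep_me_logged_in):
    """Record the login; optionally hand the browser a melkauth cookie."""
    session["user"] = user
    if keep_me_logged_in:
        # In a real response this cookie would be set to never expire.
        cookies["melkauth"] = session["authenticator"]


def can_resume_login(session, cookies):
    """After a browser restart: does melkauth match the session?"""
    presented = cookies.get("melkauth")
    if presented is None:
        return False
    # Constant-time comparison, so timing does not leak a partial match.
    return hmac.compare_digest(presented, session["authenticator"])
```

A user who did not tick ‘keep me logged in’ never receives the melkauth cookie, so `can_resume_login` returns False for them.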
I’m blogging about this one because it was my first foray into cookie auth stuff and I found it interesting, but also to solicit feedback. For instance, I suspect this is overkill:
    import random
    from string import ascii_letters

    def _make_authenticator(self):
        return ''.join(random.choice(ascii_letters) for i in xrange(64)) # 52^64 distinct strings
If anyone with more experience and intuition about this wants to chime in, I’d love to hear it. :)
Thanks for reading,
Josh

Jeez, I saw Paul’s link to the “Computerworld - How CAPTCHA got trashed” article (http://www.computerworld.com.au/index.php/id;489635775;pp;1;fp;;fpid;) minutes after I posted this. Hopefully the countermeasures listed on http://recaptcha.net/security.html will remain effective as captcha crackers get better.
Comment by magicbronson on July 17, 2008 at 3:15 pm
Hey Josh,
I’m sorry you had so many problems with iBegin Share. I will make sure I look into the issues marked in your changeset. As for the “pretty shocking” code, it’s designed to be extendable, and somewhat DRY (thus the createElement chunks, it’s based off of mootools). The UI is also completely customizable via CSS.
Anyways, nice work, and good luck with Melkjug!
Comment by David on July 18, 2008 at 5:59 am
Hey David,
We should say in fairness that we did get iBeginShare working, and, IMO, it’s a great standalone tool and there is definitely a need for what you’re up to. Our needs were somewhat more limited than what you’re providing overall and the bits that we did want didn’t turn out to be a great fit in our own shocking javascript (which is currently already wedging together two other frameworks). Also we owe a good deal of the solution we ultimately chose to the work you put in on this, so thanks for that in addition :)
Comment by ltucker on July 18, 2008 at 9:09 am
Josh,
For your authentication, you might also consider using the ‘nonce_str’ function in the melk.util package which has similar intent (but might also be in need of some improvement)
https://melkjug.openplans.org/trac/melkjug/browser/melk.util/trunk/melk/util/nonce.py
In both cases, we should probably take a peek into how reasonable it is to rely on python’s randomness. Although I suspect it’s much better than your average C library, it’s possible we should be using something better or being careful about how the prng is initialized.
Comment by ltucker on July 18, 2008 at 9:17 am
Hey David,
First off, apologies for the tone of my comments. They were heavy on criticism and short on appreciation. As Luke says, we owe a lot of what we did to being able to look at what you did. As open source developers ourselves, we greatly value the availability of open source alternatives like iBeginShare to proprietary tools like addthis. So above all, thank you for the work put into developing and releasing iBeginShare, and please know that you are more than welcome in our IRC channel (#melkjug on freenode) or to email me directly if you would like to talk about the code further.
Comment by magicbronson on July 18, 2008 at 9:53 am
Hey Luke,
Thanks, nonce_str is much nicer (not to mention already written :). I’ll use that instead.
Just noticed python2.5 deprecates the md5 module in favor of a new module, hashlib: http://www.python.org/doc/current/lib/module-hashlib.html . The 2.4/2.5 straddle grows ever wider…
Comment by magicbronson on July 18, 2008 at 10:33 am
from http://docs.python.org/lib/module-random.html :
“Almost all module functions depend on the basic function random(), which generates a random float uniformly in the semi-open range [0.0, 1.0). Python uses the Mersenne Twister as the core generator. It produces 53-bit precision floats and has a period of 2**19937-1. The underlying implementation in C is both fast and threadsafe. The Mersenne Twister is one of the most extensively tested random number generators in existence. However, being completely deterministic, it is not suitable for all purposes, and is completely unsuitable for cryptographic purposes.”
For a non-deterministic prng, the pycrypto module provides Crypto.Util.randpool: http://www.amk.ca/python/writing/pycrypt/pycrypt.html#SECTION000720000000000000000
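A standard-library option also exists: random.SystemRandom draws from os.urandom rather than the Mersenne Twister. A minimal sketch of the authenticator written that way, in Python 3 (the `make_authenticator` name and `length` parameter are hypothetical, and this has not been tried against Melkjug):

```python
import random
from string import ascii_letters

# SystemRandom uses the operating system's randomness source (os.urandom)
# instead of the deterministic Mersenne Twister.
_sysrand = random.SystemRandom()


def make_authenticator(length=64):
    """Return a random string of ascii letters from an OS-backed source."""
    return "".join(_sysrand.choice(ascii_letters) for _ in range(length))
```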
Comment by magicbronson on July 19, 2008 at 3:39 pm