Archive for February, 2009

The learner-driver problem

Posted on February 11, 2009, under general.

That humans find statistics counter-intuitive is not news; there are many problems that have distinctly unsettling results. The Unfinished Game problem and The Monty Hall Problem are two of my favourites, but I thought I’d add another one. It’s not as good as those problems, but it can be insightful.

L driver

The problem

“Government data shows that 20% of qualified drivers are recently-qualified, but that one third of accidents are caused by this same group. How much more likely is Bob, a recently-qualified driver, to have caused an accident than Alice, a longer-qualified driver?”

On the face of it, this problem appears simple. If the 1 in 5 group is responsible for 1 in 3 accidents, then they are twice as likely to cause an accident as their complement. If this isn’t clear from arithmetic, we can put numbers on it. Take 99 accidents and 1000 drivers: 200 drivers cause 33 accidents, and 800 cause 66. So the rates are 0.165 accidents per recently-qualified driver, and 0.0825 accidents per longer-qualified driver. It might take a minute of thought and maybe a calculator, but no bother, right? Wrong.
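For anyone who would rather check the naive arithmetic in code than on a calculator, here is a minimal Python sketch. The figures are the illustrative ones from the paragraph above (not real government data), and the variable names are my own.

```python
from fractions import Fraction

# Illustrative figures from the text, not real data.
recent_drivers, recent_accidents = 200, 33
longer_drivers, longer_accidents = 800, 66

# Accidents per driver in each group: a rate, not a probability.
recent_rate = Fraction(recent_accidents, recent_drivers)
longer_rate = Fraction(longer_accidents, longer_drivers)

print(float(recent_rate))         # 0.165
print(float(longer_rate))         # 0.0825
print(recent_rate / longer_rate)  # 2
```

Using exact fractions keeps the ratio at exactly 2 rather than something like 1.9999.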

The real answer

The naive answer is, well, naive. We can’t make that kind of statement with any validity, because we haven’t really accounted for the fact that a single driver can cause multiple accidents; they are statistically independent. We weren’t comparing probabilities, just abstract rates.

Let’s again suppose that there are 1000 drivers and 99 accidents, but that of the 200 recently-qualified drivers only 1 really, really bad driver caused all 33 accidents (the rest are more cautious), and that of the 800 longer-qualified drivers, 66 caused one each. The group means are still the same as above, but now the likelihood of Alice or Bob being one of those accident-prone drivers is radically different. Bob has only a 1/200 probability of having caused an accident, and Alice 66/800.
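The same scenario can be written out driver by driver. This is only a sketch: the two lists are hypothetical per-driver accident counts built to match the numbers above, and the helper names are mine.

```python
# Hypothetical per-driver accident counts matching the scenario above.
recent = [33] + [0] * 199      # one very bad driver, 199 with no accidents
longer = [1] * 66 + [0] * 734  # 66 drivers with one accident each

def mean(counts):
    """Average number of accidents per driver in the group."""
    return sum(counts) / len(counts)

def share_with_accident(counts):
    """Fraction of drivers in the group who caused at least one accident."""
    return sum(1 for c in counts if c > 0) / len(counts)

print(mean(recent), mean(longer))   # 0.165 0.0825
print(share_with_accident(recent))  # 0.005
print(share_with_accident(longer))  # 0.0825
```

The means match the naive calculation, but the chance that a randomly picked driver from each group caused any accident at all now points the other way.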

Now this is the most extreme case, but it’s valid along the continuum. The point is really that without knowing about the distribution of accidents within the groups, we can’t make comparisons between particular members of those groups. In other words, the question is an abuse of statistics.
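To get a feel for the continuum, here is a rough Python sketch that keeps the recently-qualified group’s total at 33 accidents but varies how many accidents each culprit caused. It assumes, purely for illustration, that accidents are split evenly among the culprits and that Alice’s group stays at one accident per culprit (66/800); the function name p_bob is hypothetical.

```python
from fractions import Fraction

GROUP_SIZE = 200           # recently-qualified drivers
ACCIDENTS = 33             # accidents attributed to that group
ALICE = Fraction(66, 800)  # assumed fixed: 66 longer-qualified culprits

def p_bob(accidents_per_culprit):
    """P(Bob caused at least one accident) when the group's accidents are
    split evenly among culprits who caused this many accidents each."""
    culprits = ACCIDENTS // accidents_per_culprit
    return Fraction(culprits, GROUP_SIZE)

# Divisors of 33, so the accidents split exactly in every case.
for k in (1, 3, 11, 33):
    print(k, p_bob(k), float(p_bob(k) / ALICE))
```

With one accident per culprit Bob really is twice as likely as Alice; as the accidents concentrate in fewer drivers the ratio drops below 1, all without changing the group totals.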

It’s a great starting-off point because it immediately gets at exactly what a standard deviation really means, it validates very basic group statistics, and it gets the right people excited. Most statisticians get it straight away, nearly everyone else doesn’t, and it’s memorable.

Even more interestingly, according to my friends in the insurance industry, it turns out that the standard deviation of accidents for recently-qualified drivers really is higher than for the more experienced/jaded. The probable explanation is that there is more of a mix of brazen young nutcases, extremely cautious prudes and people who, having qualified, then barely drive at all.

Photo from flickr user tz1_1zt.

Broken state programming

Posted on February 1, 2009, under general.

Cory Doctorow has shared some of his thoughts on how to write productively, and daily, in the modern age of distractions and interruptions. Cory was our keynote speaker at Apachecon San Diego, and he was kind enough to give JM and me a great deal of advice and contacts when we met up with him, as well as some very cool business cards. He’s been friendly ever since, too.

But what struck me most about Cory, apart from his friendliness and generosity, was that he appeared to have an attention-span that’s divided into micro-slices, with a cosmic ability to schedule between them. That’s a compliment: he was doing many things at once, and still giving each more coverage than should be reasonable, and certainly wasn’t ignoring anyone. “Wow, how does this guy write so much?” must occur to a lot of people that meet Cory.

A few months later when Suw Charman, who shared an office with Cory, came over to visit, she seemed all the more awed by Cory’s productivity for the proximity. But after reading his essay, I think one of his tricks is key, and through coincidence or common influence I’ve been using that same trick to avoid programming stalls.

Procrastination and Paranoia.

I don’t quite know how, but over the last few years I seem to have become a professional software developer. I certainly don’t mind, I quite like programming, and enjoy making things go fast. But nobody ever sat me down and taught me what it is to be a programmer. I’ve read plenty, but opinions differ, and there are many common memes that don’t suit.

I don’t think that programmers need zen-like concentration-zone bubbles in order to get things done; life is a series of interruptions, and when stuff breaks – or somebody calls – or whatever, you have to attend to it. I’ve had a personal office, and I hated it. I’ve worked from home, and it’s not so great. I hate isolation, and it doesn’t make me work any better. A pair of headphones is bubble enough for me, and please interrupt any time – you’re more important than something that can be restarted later. For some reason or another, I don’t find it hard to re-start.

What I do find hard is to start in the first place. Sometimes it’s procrastination; I don’t know whether it’s out of laziness (I’ve worked 120 hour weeks though, so there is some cause for doubt), or whether it’s a daunting sense of imperfection or pointlessness (“how can I start on something that won’t be perfect?”) or what. But it’s the hardest part. And I don’t just mean starting on day 1, actually that’s not so bad, but starting on every new feature, or even every day’s work.

There were three common fixes: 1.) for no explicable reason at all, I would find myself in a mood to start, 2.) I would get sufficiently paranoid about everyone thinking I was a complete lazy failure that I’d start something, in sort of a half-panic state, 3.) there’d be some insane deadline or sufficient external pressure that getting it done was imperative. That was that, “glad I’m not a real developer” I’d think to myself.

Leave your code in a broken state.

While thinking out the Joel on software “every developer needs an office” principle a few months ago – I was trying to get to the root of why I didn’t feel it was general. “I don’t seem to have much problem re-starting from interruptions” …. hmmm …. and I asked myself “Why?” and the answer; “Well, duh, I haven’t finished what I’m doing, it doesn’t work, and it needs to be fixed, it’s obvious”.

And like a lightbulb that has, no doubt, gone off for you a lot more quickly than it originally took me, I thought “well if that makes it easy to restart things, why don’t I use it to start things too?”. So instead of an obsessive-compulsive drive to at least leave the office with my code compiling cleanly, making make exit with zero, or whatever other neat little unit of “fixedness” there is for the project, I now feel better if I leave it broken.

The more broken the better. If I make any kind of a late-day rush it’s to half-add something; a broken function stub that at least reminds me what I had in mind, or a bunch of debug printfs (shush you, I use real debuggers plenty) that simply could never be left in a release. All the better if it stops the dev build from even running.

Then the next day, or the next feature comes; there it is, a starting point, glaring imperfection that has to be fixed. And once you’ve fixed that, well the editor is already open, and you might as well move on to the very next thing. I find it a lot easier. Of course it requires care, you have to really break it properly, not just add a FIXME that you may forget about, and you never check in broken code to the real branch, but for me at least – it has worked, well I feel better at least.

Ok, so all of this is probably on page 4 of about a hundred basic time-management books I haven’t read, and page 1 of “how to be a competent developer”, but nobody ever taught it to me … so I thought I’d share; try leaving your code broken, and see if it helps you start any more quickly.

Update:
Based on comments and offline feedback, I can already see at least two camps emerging. These are caricatures, but they establish the range of the spectrum.

For some developers, programming is also an art form. This camp holds to notions that, say, something like OO design can be elegant and artful and has an aesthetic. Or that the choices made between the limited number of ways in which you can in fact code something are akin to poetry. This camp hates distraction, but in a cruel twist of fate is also prone to distracting ideological arguments about programming.

The other end of the scale, which I’m closer to, sees code as a mere tool for the job, with the primary focus being on doing a task well, quickly and efficiently. Like a good wrench, it’s allowed to get oily, as long as that helps to do the task better. I suspect people closer to this end of the spectrum cope with interruption more easily.

Another trend in the comments is that what really makes the difference when it comes to procrastination is interest. If you’re really interested in the task, you’ll be motivated. This may be true, but I also think it’s possible to make nearly any programming task interesting. If the underlying objective is boring, just write it in a new way, or better than it’s ever been done before, or try using a new language. Even if it’s something really boring and humdrum, try automating it, or write new analyser checks for it, and make that the real task … and so on.