Update 11/30/2010: See this post for some clarification and reflection on this post.

I’m frequently irked by budding programmers deciding to waste their time writing bad tutorials about a subject immediately after gaining a basic understanding of it. As I discussed in that previous post, this “by-beginners-for-beginners” approach to writing tutorials is harmful. Tutorials, I repeat, are one of the worst ways for beginners to learn. But there is one way that is even more dangerous.

Source code.

You might think I’m joking… but I’m not. Source code itself, pure, unadulterated and unaccompanied, is a rather impotent learning tool — from an entirely objective standpoint. This isn’t an issue you can reasonably contest with abstract arguments about a person’s “learning style.” You cannot learn from what isn’t there. Unlike a tutorial, which is at least attempting to be educational, the purpose of source code is to communicate a programmer’s intent to a machine.

Source code is devoid of context. It’s simply a miscellaneous block of instructions, often riddled with a fair number of implicit assumptions about preconditions, postconditions, and where that code will fit into the grand scheme of the original author’s project. Lacking that information, one can’t be sure that the code even does what the author wanted it to do! An experienced developer may be able to apply their insight and knowledge to the code and divine some utility from it (“code scavenging” is waxing in popularity and legitimacy, after all), but a beginner can’t do that.
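To make that concrete, here is a minimal sketch of the kind of thing I mean. The function and its name are hypothetical, invented for illustration rather than taken from any real project. It looks complete, but it depends on a precondition that the code itself never states:

```python
import math

def normalize(v):
    """Scale a 2D vector (x, y) to unit length.

    Hypothetical helper, written only to illustrate an implicit
    precondition: the caller is assumed never to pass a zero-length
    vector. Nothing in the code enforces or even mentions that.
    """
    length = math.hypot(v[0], v[1])
    # Divides by zero if v is (0, 0); the imagined original author
    # "knew" that could never happen in their project.
    return (v[0] / length, v[1] / length)
```

In the imagined original project, every caller might filter out zero-length vectors long before reaching this routine. A beginner who lifts the function on its own has no way to know that, and will eventually be greeted by a ZeroDivisionError with no idea why.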

Also lacking is any kind of rationale. The common beginner is looking for code from big-name games to “learn how professionals do it.” The first flawed assumption here is that professionals are good at what they do; that isn’t always true. But setting that can of worms aside, the deeper problem is that those beginners are often under the impression that there is some singular “best way” to complete every development task, and that professionals always employ this holy grail of solutions in every applicable case. Those of us who are professionals are all too painfully aware of the realities of software development and the horrors that lurk within our projects. A beginner may stumble across a routine or a subsystem in a professional’s code that was hacked together to meet a milestone or quickly patch over a bug, and may start using that bit of chaff in their own code. When this happens, it’s usually very difficult to wean them off it (after all, those kinds of hacks are typically designed to be quick and easy, which are attractive qualities to the beginner). All too frequently they will fall back on that most dreadful of excuses: “But I saw it in the Quake source code! Don’t you think id knows what they’re doing?”

Finally, source code is just too easy to copy. This is perhaps the biggest danger. Blindly copying and pasting imparts absolutely no understanding at all, and when things go wrong (as they invariably will, due in large part to the above issues) the poor beginner will lack the comprehension necessary to fix the problem. They may not be able to find the problem in the first place. When dealing with topics of nontrivial complexity, such as computer graphics, the knee-jerk response is to start fiddling with the code at random. Changing values and adding, removing or reordering function calls may ultimately alter the output of the code such that it appears correct, but this really only introduces more instability into the entire system. This is perhaps best demonstrated by the rather poor NeHe tutorials, which are popular despite their shoddy quality and consist largely of sketchy source code with bad comments. The copy-and-pasted remnants of this code are easily recognizable in the files of many a hapless, dumbfounded beginner begging for help with their mysteriously vanishing triangles.

Self-learning is what drives the desire to turn to source code as an educational conduit. I have no particular problem with self-learning — I was entirely self-taught for almost three quarters of what would have been my high school career. But there are well-known dangers to that path, most notably the challenge of selecting appropriate sources of knowledge for a discipline when you are rather ill-informed about that selfsame discipline. The process must be undertaken with care. Pure source code offers no advantages and so many pitfalls that it is rarely a good choice.