Monday, February 19, 2007

Where are the stable versions?

It's becoming increasingly evident that one of the major problems that Wikipedia is facing is the way that articles slowly "rot" over time. There are several different things that appear to me to cause article rot. Fortunately, it seems that the cure for all of them is the same thing.

One of them is the vandalism/reversion cycle. Rot gets in when an article is vandalized twice and the reverting patroller reverts only one of the vandalisms. This is relatively common, and is becoming more of a problem as vandalism patrol turns ever more into a "point and shoot" game for the people who play it: these people combine such a sense of urgency with such a lack of interest in actually reading the articles that they often don't look to see whether they've reverted to a vandalized version (or even look at what they've reverted at all).

Another is so-called sneaky vandalism. Not that "sneaky vandalism" needs to be sneaky: "medulla oblongata" currently claims that the medulla was "Made famous in the comedy 'Waterboy' starring Adam Sandler"; it has said this now for over a week without anybody noticing. But, since this edit didn't set off the vandalism sensors that the vandalism patrollers rely on too much, it goes ignored, and will continue to be until somebody with the interest to fix it reads this blog entry. (Hopefully that'll be soon; I've noticed that none of the articles I mentioned in my "ten random articles" has been edited yet, not even the one that is clearly being used for linkspam.) This example appears to be merely silly; there is certainly plenty of malicious "sneaky" vandalism too, ranging from changing dates or other numeric data in small ways to inserting seemingly plausible but false information into articles. An example of the latter is this snippet about the so-called "Miller valve" modification for the Boeing 737, which was entirely fabricated. (VATSIM is a game; Miller valves are used on trombones, not aircraft.)

Yet another is the need that so many people have to "make their mark". So many articles, especially those on highly pertinent topics, started out decent but have over time accumulated so much randomly added crap that they no longer read well. This typically takes the form of randomly added bits of trivia, added without any concern for organization, tone, or any of the other factors that a good encyclopedia author would keep in mind. This process slowly turns articles into a mishmash of disconnected facts. A good example of an article currently displaying such a lack of organization is Tony Blair's. This is a very common problem for biographies of living people, especially controversial ones; it is difficult to maintain coherency with so many cooks peeing in, er, collaborating on the soup.

Anyone who has ever attempted collaborative authoring knows that it can be very difficult to collaborate with even one or two co-authors. Textbooks sometimes have as many as four or five authors, but on examination it often becomes apparent that the authors did not all collaborate on every word; rather, they divvied up the work in some manner, each working mostly independently, reviewing one another's work and making suggestions instead of directly editing it. The wiki model, in which hundreds or thousands of editors supposedly collaborate on an article, appears to me to be really quite unworkable toward a goal of producing, and more importantly maintaining, a quality article. (Not to mention that we already know that most quality articles are the result of the work of at most a handful of authors.) The best we can hope for is an article which recites relatively few factual falsehoods and is not too terribly unreadable. In order to obtain and retain quality, it is clear to me that mature articles need to have the number of editors actually editing the article limited to a relatively small number, preferably people who are actually familiar with the article.

I realize that the above seems to contradict "wikiphilosophy" and especially Wikipedia's doctrine regarding "ownership of articles". And it's intended to. My reason for recommending that Wikipedia change its attitude toward article ownership is not so that authors of articles can claim ownership and thereby protect their precious egos (the egomania of so-called "highly productive article authors" is well-established, and certainly has no need to be fed further). Rather, I recommend this because I believe that it would help to slow the rate at which quality articles rot; in short, I believe this change would benefit the overall quality of the encyclopedia.

Now, back to my original thesis: that all of the above problems can be solved by the same thing. That thing is the long-promised "stable versions" feature. I propose using (at least) two levels of stability on the English Wikipedia. The lowest level would be the "unvandalized" level; articles tagged with this stability tag are certified as not containing vandalism. This would markedly reduce the exposure level that vandalism receives (which has the further benefit of reducing the incentive for vandals to vandalize) and make vandalism easier to remove completely. Higher levels would represent increasingly higher "quality" levels; articles tagged with these stability tags have been certified by their principal maintainers as containing verified, accurate content presented in an organized and well-edited manner.
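For the programmers in the audience, here is a rough sketch of how the two-level scheme described above might behave. To be clear, this is my own illustration and not how MediaWiki works or will work: every name in it (Stability, Revision, Article, displayed) is made up, and the fallback to the newest revision when nothing has been tagged is simply an assumption on my part.

```python
from dataclasses import dataclass, field
from enum import IntEnum
from typing import List, Optional


class Stability(IntEnum):
    """Hypothetical stability levels; higher means more thoroughly certified."""
    UNREVIEWED = 0    # assumed default for any new edit
    UNVANDALIZED = 1  # certified as not containing vandalism
    QUALITY = 2       # certified by the article's principal maintainers


@dataclass
class Revision:
    rev_id: int
    text: str
    stability: Stability = Stability.UNREVIEWED


@dataclass
class Article:
    title: str
    revisions: List[Revision] = field(default_factory=list)

    def add_revision(self, text: str) -> Revision:
        """Record a new edit; it starts out untagged."""
        rev = Revision(len(self.revisions) + 1, text)
        self.revisions.append(rev)
        return rev

    def tag(self, rev_id: int, level: Stability) -> None:
        """Certify an existing revision at the given stability level."""
        self.revisions[rev_id - 1].stability = level

    def displayed(self, minimum: Stability = Stability.UNVANDALIZED) -> Optional[Revision]:
        """Return the newest revision tagged at `minimum` or higher.

        Assumption: if no revision qualifies, fall back to the newest one.
        """
        for rev in reversed(self.revisions):
            if rev.stability >= minimum:
                return rev
        return self.revisions[-1] if self.revisions else None


article = Article("Medulla oblongata")
article.add_revision("The medulla is the lower half of the brainstem.")
article.tag(1, Stability.UNVANDALIZED)
article.add_revision("Made famous in the comedy 'Waterboy'.")  # untagged edit
shown = article.displayed()
```

In this sketch the untagged second edit is stored but not shown; readers keep seeing revision 1 until somebody certifies a newer one. That is the whole point: vandalism loses its audience without anybody having to win the race to revert it.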

Here's the rub: this feature has been promised to the community now since August 2006; it was announced at Wikimania 2006 by Jimbo Wales himself. However, it has yet to appear anywhere. And I can't find out why not. It's clear to me that this is quite possibly the single most important feature currently under consideration for the long-term success of the Wikipedia project, and yet not only has it not been developed, but I can't even find any statement as to when it might become available or why it's not yet available. At the same time, an attempt to implement this in wetware using existing features of MediaWiki (protected versions and subpages) was rebuffed in August, with the reason being that "stable versions will be here soon, so let's just wait for it". How long is soon? How much longer will Wikipedia continue to put up with this situation?