Monday, March 10, 2008

Wikipedia's quality

So, people are constantly finding new and interesting ways to evaluate Wikipedia's quality. These often rely on random pagewalks, which is a really poor way to choose articles for evaluation. Finally, though, we have a good basis for choosing articles for evaluation: some dude named Henrik has, presumably in cooperation with someone on the Wikimedia developers' team, come up with stats on the most frequently viewed articles. This is by far the best way to choose articles for a quality evaluation: the articles that people actually do look at will show, better than anything else, how well Wikipedia's readership is being served by Wikipedia content.

Based on a sample window of 23 days in February 2008, there were 9956 distinct page names (not all of which correspond to articles) that were viewed at least once per minute on average over that timeframe. I don't have time to evaluate them all, but I will be looking at some of the top-ranked ones and making some comments in the near future. Just glancing down the list indicates that politics, popular culture, and sex dominate the topics. I admit being perplexed at the prominence of "canine reproduction", however. (Andrew Gray has some good comments on his first impressions of the top 9956 in his LiveJournal; I shall not repeat them here.)
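For the curious, here is the arithmetic behind that once-per-minute cutoff, as a quick Python sketch. The function names and the idea of a plain dictionary of view counts are my own assumptions for illustration; I have no idea how Henrik's tool actually stores or filters its data.

```python
import math

MINUTES_PER_DAY = 24 * 60


def min_views_for_rate(days, views_per_minute=1.0):
    """Smallest total view count that averages at least
    `views_per_minute` over a window of `days` days."""
    return math.ceil(days * MINUTES_PER_DAY * views_per_minute)


def pages_over_threshold(view_counts, days, views_per_minute=1.0):
    """Return (page name, views) pairs meeting the cutoff, most viewed first.

    `view_counts` is a hypothetical mapping of page name to total views
    over the window; it is not the format of any real stats dump.
    """
    threshold = min_views_for_rate(days, views_per_minute)
    selected = [(name, views) for name, views in view_counts.items()
                if views >= threshold]
    return sorted(selected, key=lambda item: item[1], reverse=True)


# A 23-day window has 23 * 1440 minutes, so the cutoff is 33120 views.
print(min_views_for_rate(23))
```

In other words, every one of those 9956 page names was viewed at least 33,120 times during the sample window.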

I actually expect Wikipedia will acquit itself better here than under many of the other evaluative metrics people use. These high-traffic articles tend to be watched closely and many of them are semiprotected (and in fact my early observation of this led me to wonder what percentage of pageviews are of protected content, a statistic I would love to see collected, or at least estimated). Their content is likely to be at least decent, if not actually good. It'll be interesting to see to what degree this is the case, and how far down the list one has to go before hitting a really bad article.
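The protected-pageviews statistic I'm wishing for is just a view-weighted share, and would be easy to estimate if someone joined the view counts with protection status. A minimal Python sketch of the calculation follows. The function name and the input shape are hypothetical, and the numbers in the example call are made up purely to show the mechanics, not real data.

```python
def protected_view_share(pages):
    """Fraction of all pageviews that land on protected pages.

    `pages` is an iterable of (views, is_protected) pairs. This input
    shape is an assumption for illustration, not a real data format.
    """
    total_views = 0
    protected_views = 0
    for views, is_protected in pages:
        total_views += views
        if is_protected:
            protected_views += views
    if total_views == 0:
        raise ValueError("no pageviews to compute a share from")
    return protected_views / total_views


# Invented numbers: one heavily viewed protected page, one lightly
# viewed unprotected page.
share = protected_view_share([(900, True), (100, False)])
print(f"{share:.0%} of views were of protected content")
```

Note that this weights by views rather than by page count, which is the whole point: a handful of protected pages could account for a large share of what readers actually see.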

Anyway. Look for this to be the focus of at least the next several posts. Hopefully this will be a welcome change from the less pleasant discussions of the past few days.