Comments on Nonbovine Ruminations

Date: 2006-09-25 10:14 -05:00
I wonder if Bayesian analysis, such as is used in many spam-filtering programs these days, would help in edit-classifying? If a sufficiently large sample of edits were manually classified by humans, and then a program analyzed what mechanically determinable characteristics were present in each category, then it might be able to categorize other edits automatically (with humans spot-checking and moving incorrectly classified edits to fine-tune the algorithm).
Posted by: Anonymous

Date: 2006-09-23 17:19 -05:00
As I understood it, Aaron's method measures how much of the final article was written by which user. If an article was deleted in vandalism along the way, then restored, it's still (mostly) the same groups of letters which were contributed by, I think he demonstrated quite convincingly, the casual contributor.
Posted by: Anonymous

Date: 2006-09-23 13:50 -05:00
What is history flow? Where can I look at it?
Posted by: Anonymous

Date: 2006-09-22 09:59 -05:00
I assume you mean "seems a bit dishonest". I'm not sure why.
I talked to the history flow folks but didn't end up using their algorithm because it didn't work particularly well for this task (as Martin notes, it doesn't handle pageblanking vandalism well). Should I simply have mentioned every other study on the subject? (There's also Denise Anthony's work and Tom Cross's, neither of which I had read before I published mine.) That doesn't seem like usual practice for an essay, though I will of course refer to all of them (and Seth) when I put up more details about my work.

As for significance, I have run the test on a significant number of articles -- according to statistical sampling techniques, the 500 I ran it on is statistically significant with a reasonable margin of error. Still, I'm eager to run it on more, but the process is actually quite slow and Wikipedia is very large. Thankfully, some people have offered more compute time. I hope to work out a deal with one of them and run the algorithm on all the pages.
Posted by: AaronSw (https://www.blogger.com/profile/16298637002177499821)

Date: 2006-09-10 23:58 -05:00
"who first adds a piece of text" ... Gee, that sounds a lot like historyflow to me... Seems a bit honest that you failed to mention history flow. If that's all you're doing, why have you not run this test on a significant number of articles?

Perhaps the results wouldn't support your position?
Posted by: Anonymous

Date: 2006-09-08 08:54 -05:00
"Aaron's letter-counting metric also incidentally heavily rewards people who revert pageblanking vandalism (which is actually quite common on high-traffic articles)."

Not true.
Someone who reverts pageblanking vandalism gets no credit under my metric, which counts who *first* adds a piece of text. I'm not classifying edits at all, I'm looking at the history of each piece of text in the final version.

I'm sorry I somehow missed Seth's presentation at Wikimania, but I hope it's somewhat helpful to get independent evidence on this question.

But am I right in understanding that you agree that most of Wikipedia is written by casual editors?
Posted by: AaronSw (https://www.blogger.com/profile/16298637002177499821)
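
The "who first adds a piece of text" idea in the comment above can be sketched roughly as follows. This is only an illustration, not Aaron's actual algorithm, which the thread does not spell out. It assumes a revision history given as `(user, text)` pairs in chronological order and counts whole words rather than letters, ignoring position. The function name `credit_by_first_author` and the sample users are hypothetical.

```python
from collections import Counter


def credit_by_first_author(revisions):
    """Credit each word of the final revision to the user who first added it.

    `revisions` is a chronological list of (user, text) pairs. This is a
    simplified, word-level stand-in for the metric discussed in the thread.
    """
    first_author = {}
    for user, text in revisions:
        for word in text.split():
            # setdefault keeps the earliest contributor of each word
            first_author.setdefault(word, user)
    if not revisions:
        return Counter()
    final_text = revisions[-1][1]
    return Counter(first_author[word] for word in final_text.split())


# Hypothetical history: a vandal blanks the page and carol reverts it.
revisions = [
    ("alice", "cows are ruminants"),
    ("bob", "cows are ruminants with four stomach compartments"),
    ("vandal", ""),
    ("carol", "cows are ruminants with four stomach compartments"),
]
credit = credit_by_first_author(revisions)
print(credit)
```

In this toy history carol, who only reverts the blanking, receives no credit, because every word she restores was first added by alice or bob.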