Monday, February 05, 2007

Copyleft, copyright, Richard Stallman, Larry Lessig, and the Wikimedia Foundation

The Free Software Foundation is apparently close to announcing a rewrite of the GFDL. There have been some interesting political battles over this. It is the conflict of the goals of two of the major players that leads me to write this.

The GFDL is the "documentation" version of the venerable GPL, the license under which a great deal of -- probably a majority of -- open source software is released. It was originally created because people trying to write documentation for various open source products realized that the GPL is a poor fit for documentation: the GPL is fundamentally a software license and its terms are crafted to that purpose. Documentation is not software, and so many of the terms made little or no sense in the context of material intended to be printed instead of run. The GFDL really got a big boost, though, when Wikipedia adopted it as its default license; at this time it is almost certainly the case that the contents of the various Wikipedia editions represent the largest corpus of GFDL-licensed text in existence, and that the Wikimedia Foundation is the largest publisher of GFDL'd content.

Fundamentally, this is a problem with copyright itself: computer programs are not works of literature (the Copyright Office's insistence to the contrary notwithstanding). But the copyright law is what the copyright law is; you take it as you find it, and write your license to control the worst aspects of it and take advantage of them to accomplish what you really want. At least, that's the approach the Free Software Foundation, with Richard Stallman at its head, has taken; their goal is and always has been to encourage people to write more and more code under licenses that allow free reuse without allowing people who do reuse to "take claim" for someone else's work and directly exploit it. The GPL has been very successful at this; the GFDL less so, probably because writing books for free is less fun than writing software for free. The Free Software Foundation, in any case, is about software (hence the word "software" in its organizational name). Its interest in documentation is derivative, to the extent that good documentation makes software better, and so the GFDL served the FSF's interests by making it possible for people to write documentation and release it under a license that carries the same spirit as the GPL, but makes a bit more sense for documentation.

Wikipedia's choice to require licensing of contributions under the GFDL was a bad one. The GFDL is a license for software documentation. It's not really intended to be used as a license for just any random hunk'o'text that you might have sitting about, and even less appropriate for a standalone photograph. The GFDL is written with the assumption that any printed copy would be in the form of a bound book, and also includes a bunch of hairy language intended to protect "author's rights". Using the GFDL for Wikipedia content requires some twisting of the language -- twisting that doesn't harm the intent of the license too much, but which tends to make people dislike the license. So, there's quite a bit of pressure on the Wikimedia Foundation, as the publisher of Wikipedia, to find a way to get Wikipedia under a license that is less difficult to work with than the GFDL. (Part of the problem is that at the time Wikipedia was looking for a license, the GFDL was the only open source license that encompassed text at all. While it is not a good fit, it was the least bad fit available at the time.)

Now, enter Larry Lessig and the Creative Commons. Larry's not about using copyright to encourage people to write stuff and make it freely reusable. Best I can tell, Larry's main drive is to change the nature of copyright law. Specifically, Larry wants to, inter alia, shorten the term of copyright substantially and also wants to have the "right to remix" recognized as either within fair use or as a separate enumerated exception to exclusive rights. (The "right to remix" is the right to take elements of one person's copyrighted material and to assemble them together with elements of another person's copyrighted material, so as to convey a message not necessarily intended by any of the original authors, and without requiring any of the original authors' consent. The "mashup" is brought to us courtesy of the right to remix, or so we are told.) Larry's interests are not about software. He is not particularly interested in code reuse. In fact, it seems to me that his attitude is that content reuse is something that, in many cases, should be beyond the ability of the original author to control. Larry's interests are not in ensuring that code remains available for reuse into the future, and that people won't expropriate and exploit other people's work. His interests appear to lie more with allowing artists creative freedom, ensuring that artists are free to publish their original (by his definition) works, and that artists have the right to demand attribution, integrity, and (if they want) compensation. That's why out of the dozen or so "Creative Commons" licenses, only one (CC-BY-SA) is even close to compatibility with the GFDL, and several (especially the "noncommercial" and "no derivative" variants) are flatly in conflict with the core purposes of the FSF: Larry's goals are simply different.
Larry's goal seems to me to be to provide authors with the tools to enable them to broadly distribute their works ("obtain exposure") without having to rely on Big Media (and their restrictive copyright policies). With the FSF, it's not about the programmer, it's about the code. With Larry, it is about the author.

Now, here we find the conflict. The Creative Commons has several licenses which are "more suited" for Wikipedia content than the GFDL. One of these would be the CC-BY-SA license, which fundamentally provides the same base level permissions and restrictions as the GFDL. The differences are minor; most people who would be willing to license under the GFDL would probably be willing to license under CC-BY-SA, and vice versa. However, there are enough small differences between the two licenses that they are not intercompatible. The GFDL, particularly, forbids relicensing: anybody who takes a license under the GFDL may redistribute (or derive from) the work in question only under the GFDL, and any such redistribution or derivation must also be licensed under the GFDL. This is a problem for Wikipedia: it would very much like to have a "better fitting" license than the GFDL.

There are basically two ways out of this situation. The first is to get everyone who has ever contributed to Wikipedia to agree to allow their original contributions to be relicensed under some other license. This is easy to say but quite difficult to do. The second is for the FSF to issue a revised GFDL which remedies the defects of the first version with respect to Wikipedia. The GFDL, like the GPL, contains language allowing a subsequent licensee to elect to use a later version of the license, if one exists, in place of the version the original licensor used. This was originally intended to allow the FSF to rectify defects in the license should they become apparent. However, it can also be used, by sufficiently pressuring the FSF, to give people an "escape clause" from the GFDL into other licensing regimes not controlled by the FSF which might eventually erode the GFDL into meaninglessness.

There are at least three ways I see that the FSF could modify the GFDL to serve Wikipedia's apparent interests.

One, they could write a "simplified GFDL" and allow for relicensing under that license. The simplified GFDL would have the same core concepts as the GFDL but without all the language and conditions that were designed specifically for the GFDL's role as a license for printed software documentation. The "viral" nature of the GFDL would be preserved, and the licenses would remain under the jurisdiction of the FSF. (To the best of my knowledge, a draft of such a license exists.)

Two, they could add language that declares that a licensee may choose to relicense the work under CC-BY-SA instead of keeping it under the GFDL. This would allow Wikimedia, as a licensee of GFDL content, to relicense its entire corpus under CC-BY-SA, and thereby escape the onerous and poorly applicable clauses of the GFDL. The problem with this is that the CC licenses, since 2.5, have had clauses in them that are potentially exploitable to undermine the "sharealike" aspect of the licenses. Sharealike is the very reason for the existence of the GFDL in the first place, which means that such a clause could easily break the so-called "viral" nature of the GFDL, which quite frankly is the main feature of the FSF licenses. FSF licenses are designed to make it very, very hard to "unring the bell". The Creative Commons licenses, far less so. Some of us consider this a major plus of the FSF licenses.

Three, they could add language allowing relicensing (again at the licensee's discretion) under some other license under the control of some other organization (such as, for example, the Wikimedia Foundation). This has the same problems as the CC licenses, really; there is no guarantee that that organization would keep its licenses compatible in spirit with the GFDL.

As I see it, for the FSF to do anything other than the first option is to act outside its interests. However, and here's the rub: The Free Software Foundation is being pressured to do either the second or third option, by powerful parties in the open content arena. And they must resist this. The Free Software Foundation must keep its commitment to open content first, and the convenience of other parties second. Most importantly, they must remember that other parties do not necessarily share their commitment to open content as open content, but instead have other agendas, and they must make sure that they do not hand off to others their responsibility to ensure that content licensed under the GFDL stays free in perpetuity -- especially when those others cannot be trusted to have the same commitment that they themselves have.